Ruby Iterators: Enumerable Methods, Part I

#each and #map

As someone new to Ruby with a fairly limited vocabulary of method names, I often have no idea that Ruby already contains a method that does exactly what I am trying to achieve. Sometimes, after constructing a sequence of method calls or wrapping some logic in my own method, I find that Ruby has a built-in method that abstracts the same common pattern and produces the same result in just one or two lines of code.

A few methods that I’ve recently come to understand are the map, inject and each_with_object methods. Anything that can be accomplished with these methods could also be accomplished in several more lines of code using a variation on the each method, so they are by no means absolutely necessary, but they have much more precise uses than the more general each method. Building these three methods from scratch using each helped me to understand (1) how they work, (2) how easy it is to assign the result of some code to a variable, and (3) how “yield” works. Today I’m going to examine each and map.

#each

each is our basic iterator. It pretty much does what the name implies. To print the elements of an array, we can write:

array = [1, 2, 3, 4, 5]
array.each do |i|
  puts i
end

The result:

1
2
3
4
5
 => [1, 2, 3, 4, 5] 

each, our iterator, takes each element from its receiver, the array (an Enumerable object), and yields it to the block, which is the code between do and the end keyword. It does this five times, as there are 5 elements in the array. With each iteration, the element currently passed to the block is temporarily assigned to i. end marks the end of the code to be executed with each iteration; once it is reached, the next iteration starts with the next element in the receiver.
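To see that i only exists while the block is running, here is a small sketch. The seen array is a name made up for this illustration, and the block uses the curly-brace form, which behaves the same as do ... end here.

```ruby
array = [1, 2, 3, 4, 5]
seen = []

# Curly braces are an alternative to do ... end for short blocks.
array.each { |i| seen << i }

p seen          # the elements arrived one at a time, in order
p defined?(i)   # prints nil: i is not visible outside the block
```

The first line printed is the full list of elements, and the second is nil, because the block parameter disappears once the block finishes.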

Even the each method could be written in more basic terms:

array = [1, 2, 3, 4, 5]
i=0
while i < array.length
  puts array[i]
  i = i+1
end

produces:

1
2
3
4
5
 => nil 

This isn’t a method; it’s just some code. Notice that both samples print the elements of the array, but the first one, using each, returned the original array (as indicated by the => in the output), while the second returned nil. each always returns the original array on which it was called. In both cases we were simply printing each element of the array, not trying to modify it in any way. But what if we wanted to transform the array in some way? For example:

array = [1, 2, 3, 4, 5]
array.each do |i|
  i + 5
end
=> [1, 2, 3, 4, 5] 

Once again, each returns the original array; the result of i + 5 is computed and then thrown away. In order to create an array with new values, each five more than the values of the original array, we would have to do the following:

array = [1, 2, 3, 4, 5]
result_array = []
array.each do |i|
  result_array << i + 5
end
 => [1, 2, 3, 4, 5] 
result_array
 => [6, 7, 8, 9, 10]

The value of the variable array has not changed, but we now have a new array, result_array, containing the modified values. Note that the variable result_array must be assigned an empty array before array.each is called. Then with each iteration, a new value, i + 5, is appended to result_array.
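Putting the pieces of this section together, here is a rough sketch of how an each-like method could be built from the while loop shown earlier. my_each is a hypothetical name, not part of Ruby, and it is added to Array only so the example runs on its own. The yield keyword hands each element to the block. Ruby’s real each is implemented differently under the hood, so treat this as an illustration of the behavior only.

```ruby
class Array
  # Hypothetical helper for illustration; not part of Ruby.
  def my_each
    i = 0
    while i < length
      yield self[i]   # hand the current element to the block
      i += 1
    end
    self              # like each, return the receiver itself
  end
end

array = [1, 2, 3, 4, 5]
result_array = []

returned = array.my_each do |i|
  result_array << i + 5
end

p result_array             # [6, 7, 8, 9, 10]
p returned.equal?(array)   # true: the very same array object came back
```

The last line checks object identity rather than equality, which is a stricter way of saying that the method returns the array it was called on.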

#map

Why should we have to declare an array outside the each loop and then individually add values to it? We can’t just assign the return value of each to a new variable because, as we know, each always returns the original array. Ruby’s #map method is built to do exactly what we are asking each to do!

array = [1, 2, 3, 4, 5]
array.map do |i|
  i+5
end
 => [6, 7, 8, 9, 10] 

Unlike each, map returns an array whose elements are the result of whatever code was evaluated within the block, which is what we really want. Take note that map doesn’t “overwrite” the original array with the return array, but we can easily assign this return array to a variable.

array = [1, 2, 3, 4, 5]
result_array = array.map do |i|
  i+5
end
 => [6, 7, 8, 9, 10] 

array
 => [1, 2, 3, 4, 5] 

result_array
 => [6, 7, 8, 9, 10]

I sometimes forget how easy it is to assign the return value of an entire block of code to a variable because it’s not as intuitive as writing n = 1, but it’s really no different than assigning an integer, array, string or any other class of object to a variable. If you did not want to retain the original array and wanted to assign it the value of the new array, you could easily do so:

array = [1, 2, 3, 4, 5]
array = array.map do |i|
  i+5
end
 => [6, 7, 8, 9, 10] 

array
 => [6, 7, 8, 9, 10]
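As an aside, Array also provides map!, which replaces the elements of the receiver in place instead of building a new array. The sketch below contrasts the two approaches. The variable names are made up for the example. The practical difference shows up when another variable points at the same array.

```ruby
array = [1, 2, 3, 4, 5]
alias_of_array = array        # a second name for the same object

# Reassignment: array now points at a brand-new array.
array = array.map { |i| i + 5 }
p array             # [6, 7, 8, 9, 10]
p alias_of_array    # [1, 2, 3, 4, 5] (the old object is untouched)

# map! changes the existing object, so every name for it sees the change.
other = [1, 2, 3, 4, 5]
alias_of_other = other
other.map! { |i| i + 5 }
p alias_of_other    # [6, 7, 8, 9, 10]
```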

The difference between map and each is fairly straightforward. We already saw how certain uses of each are really begging for map. This was pretty easy to recognize, but building my own method that mimicked the functionality of map was a bit harder. The method can only use the each method within its definition to iterate over the receiver array. It also must be versatile: like map, it must yield to a block that specifies exactly what should happen to each element in the receiver, since I don’t always want to add 5 to every element. It must then return the modified array.

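array = [1,2,3,4,5]

# Defined inside Array: a top-level def is private outside irb,
# so array.my_map would raise NoMethodError in a script.
class Array
  def my_map
    return_array = []
    self.each do |n|
      return_array << yield(n)
    end
    return_array
  end
end

array.my_map do |i|
  i * 2
end
 => [2, 4, 6, 8, 10] 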

The thing I really struggled to understand was how exactly yield makes this work. When we call array.my_map, Ruby immediately jumps to the definition of my_map. Before running any of the code in the do |i| block, Ruby inserts a bookmark there and starts reading the first line of the definition of my_map. On this line, a new empty array is initialized. On the next, the method each is called on self. In the context of a method definition, self is equal to the method’s receiver. In this case, the receiver is array. In this particular method call, this line of code means array.each do |n|.

As always, each signals an iteration over its receiver where each element in that receiver will temporarily be assigned to a variable, in this case n, and then passed to the block that follows. In the first iteration, n is equal to 1.

Inside the block on the following line, we see that some value is appended to the array return_array, but Ruby does not yet know what that value will be. yield(n) puts another bookmark in our code and tells Ruby to go back to where it left off earlier, just before do |i|, bringing the value of n with it. The variable i is temporarily assigned the value of n. If this is our first iteration, n is equal to 1, so i is equal to 1 within this block. The block is then evaluated. The value of the last expression evaluated within the block is its implied return value.

In the first iteration, starting from the point where Ruby interprets yield(n), here’s an expanded version of what’s happening:

n = 1     
i = n
i * 2
 => 2 

After this, Ruby reads the keyword end and knows that it has reached the end of the block. Thus the return value of the block is equal to 2 because the last line of code in the block evaluates to 2. Think back to when Ruby placed its last bookmark. It was back inside a different block of code, within the my_map method. Ruby paused evaluation at this point because it interpreted instructions to yield. Now that the block has ended, Ruby returns to this point. yield(n) evaluates to the return value of the block to which it yielded. In this case, yield(n) where n = 1 evaluates to 2, as we saw. Ruby now knows what value to append to the receiver of the shovel method (<<), return_array.

Upon reaching the end keyword on the following line, Ruby performs the next iteration over array: it assigns n the next value, passes that value to the block given where my_map was originally called, evaluates that block, and shovels its return value into return_array. This repeats for each element of the array. After the last iteration, Ruby proceeds to the line following the end keyword. Because we want the method to return the new, modified array, the last line of code in the method definition must be return_array. Whenever this method is called, it will return the array of transformed values.
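One way to watch this back-and-forth happen is to add puts calls on both sides of the yield. The sketch below is a traced variant of the method above. my_map_traced is a hypothetical name used only for this illustration, and it is defined inside Array so that the snippet runs as a standalone script.

```ruby
class Array
  # Hypothetical traced variant of my_map, for illustration only.
  def my_map_traced
    return_array = []
    each do |n|
      puts "method: about to yield #{n}"
      value = yield(n)                 # control jumps to the caller's block
      puts "method: block returned #{value}"
      return_array << value            # shovel the block's result
    end
    return_array
  end
end

result = [1, 2, 3].my_map_traced do |i|
  puts "  block: i is #{i}"
  i * 2
end

p result
```

For every element, the output shows a "method" line, then the indented "block" line, then another "method" line reporting the block’s return value, which mirrors the bookmarks described above.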

Abstracting behaviors that will be used again and again is always a good decision. Utilizing yield in this method allowed us to abstract everything except the exact way in which each element of the original array would be transformed. The distracting and confusing logic of iteration and return values is separated from the code where the method is called, where all we really want is the result of all that logic. Using yield makes the method highly versatile: the logic is standardized, but we can specify the way in which the elements of the original array should be transformed. So long as we want to iterate over an array and return a new modified array, this method is applicable. This is of course why the makers of Ruby included it!
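To illustrate that versatility, here is the same my_map called with a few different blocks. The method is repeated inside Array so the snippet runs on its own, and the sample arrays are made up for the example.

```ruby
class Array
  # Same my_map as earlier in this post, reopened on Array
  # so this snippet runs as a standalone script.
  def my_map
    return_array = []
    each do |n|
      return_array << yield(n)
    end
    return_array
  end
end

p [1, 2, 3].my_map { |i| i * i }              # [1, 4, 9]
p %w[ruby map yield].my_map { |w| w.upcase }  # ["RUBY", "MAP", "YIELD"]
p [1, 2, 3].my_map { |i| i.even? }            # [false, true, false]
```

The iteration and the collecting of results never change; only the block does.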

There are other methods that can be called on Enumerable objects that behave in slightly different ways. They follow a similar model: iteration and return values are abstracted. These too can be built using each. Building these methods from scratch helps to understand the subtle differences between them. For me, it is a useful way to practice using yield and to add another built-in Ruby method to my vocabulary.
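As one example of the same model, here is a minimal sketch of a select-like method built on each. my_select is a hypothetical name, not part of Ruby, and this is only an approximation of what the built-in select does for arrays when given a block.

```ruby
class Array
  # Hypothetical helper mimicking select; not part of Ruby.
  def my_select
    kept = []
    each do |n|
      kept << n if yield(n)   # keep n only when the block returns a truthy value
    end
    kept
  end
end

p [1, 2, 3, 4, 5].my_select { |i| i.odd? }   # [1, 3, 5]
```

Compared with my_map, the only change is what happens to the block’s return value: instead of being stored, it decides whether the original element is stored.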