Ruby Iterators: Enumerable Methods, Part I
#each and #map
As someone new to Ruby who has a fairly limited lexicon of method names, I often have no idea that Ruby already contains a method to accomplish exactly what I am trying to achieve. Sometimes, after constructing a sequence of method calls or wrapping some logic in my own method, I find that Ruby has a built-in method that abstracts the same common pattern and produces the same result in just one or two lines of code.
A few methods that I’ve recently come to understand are `map`, `inject` and `each_with_object`. Anything that can be accomplished with these methods could also be accomplished in several more lines of code using a variation on the `each` method, so they are by no means absolutely necessary, but they have much more precise uses than the more general `each` method. Building these three methods from scratch using `each` helped me to understand (1) how they work, (2) how easy it is to assign the result of some code to a variable and (3) how `yield` works. Today I’m going to examine `each` and `map`.
#each
`each` is our basic iterator. It does what the name implies: it visits each element of a collection in turn. To print the elements of an array, we can write:
    array = [1, 2, 3, 4, 5]
    array.each do |i|
      puts i
    end

The result:

    1
    2
    3
    4
    5
    => [1, 2, 3, 4, 5]
`each`, our iterator, takes each element from its receiver, the Enumerable `array`, and yields it to the block, the indented line between our `each` call and the `end` keyword. It does this five times, as there are 5 elements in the array. With each iteration, the element currently passed to the block is temporarily assigned to `i`. `end` marks the end of the code to be executed with each iteration and signals the start of the following iteration using the next element in the receiver.
Even the `each` method could be written in more basic terms:
    array = [1, 2, 3, 4, 5]
    i = 0
    while i < array.length
      puts array[i]
      i += 1
    end

This produces:

    1
    2
    3
    4
    5
    => nil
This isn’t a method, it’s just some code. Notice how both print the elements of the array, but our first code sample, using `each`, returned the original array (as indicated by the `=>` that the Ruby console prints before a return value), while the second returned nil. When called with a block, `each` returns the original array on which it was called. In both the above cases, we were simply printing each element of the array, not trying to modify it in any way. But what if we wanted to transform the array in some way? For example:
    array = [1, 2, 3, 4, 5]
    array.each do |i|
      i + 5
    end
    => [1, 2, 3, 4, 5]
Once again, `each` returns the original array. In order to create an array with new values, each five more than the values of the original array, we would have to do the following:
    array = [1, 2, 3, 4, 5]
    result_array = []
    array.each do |i|
      result_array << i + 5
    end
    => [1, 2, 3, 4, 5]
    result_array
    => [6, 7, 8, 9, 10]
The value of the variable `array` has not changed, but we now have a new array, `result_array`, containing the modified values. Note that the variable `result_array` must be assigned an empty array before `array.each` is called. Then with each iteration, a new value, `i + 5`, is appended to `result_array`.
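To tie the two versions together, the `while` loop from earlier can be wrapped in a method that hands each element to a block and then returns its receiver, which is roughly how `each` behaves. This is only a sketch: `my_each` is a made-up name, reopening `Array` is an assumption made so the method can be called on an array, and `yield` is covered in more detail in the #map section.

```ruby
# Hypothetical helper, not part of Ruby: a hand-rolled each built on a while loop.
class Array
  def my_each
    i = 0
    while i < length
      yield(self[i])   # hand the current element to the caller's block
      i += 1
    end
    self               # like each, return the receiver rather than nil
  end
end

array = [1, 2, 3, 4, 5]
seen = []
returned = array.my_each { |n| seen << n }

p seen      # the elements, in the order they were yielded
p returned  # the original array
```

Returning `self` on the last line is what separates this from the bare `while` loop, which evaluated to nil.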
#map
Why should we have to declare an array outside the `each` loop and then individually add values to it? We can’t just assign the return value of the `each` call to a new variable because, as we know, `each` always returns the original array. Ruby’s `map` method is built to do exactly what we are asking `each` to do!
    array = [1, 2, 3, 4, 5]
    array.map do |i|
      i + 5
    end
    => [6, 7, 8, 9, 10]
Unlike `each`, `map` returns a new array whose elements are the results of evaluating the block for each element, which is what we really want. Take note that `map` doesn’t overwrite the original array with the return array, but we can easily assign this return array to a variable.
    array = [1, 2, 3, 4, 5]
    result_array = array.map do |i|
      i + 5
    end
    => [6, 7, 8, 9, 10]
    array
    => [1, 2, 3, 4, 5]
    result_array
    => [6, 7, 8, 9, 10]
I sometimes forget how easy it is to assign the return value of a method call with a block to a variable because it’s not as intuitive as writing `n = 1`, but it’s really no different than assigning an integer, array, string or any other kind of object to a variable. If you did not want to retain the original array and wanted to assign it the value of the new array, you could easily do so:
    array = [1, 2, 3, 4, 5]
    array = array.map do |i|
      i + 5
    end
    => [6, 7, 8, 9, 10]
    array
    => [6, 7, 8, 9, 10]
The difference between `map` and `each` is fairly straightforward. We already saw how certain uses of `each` are really begging for `map`. This was pretty easy to recognize, but building my own method that mimicked the functionality of `map` was a bit harder. The method can only use the `each` method within its definition to iterate over the receiver array. It also must be versatile: like `map`, it must yield to a block that specifies exactly what should happen to each element in the receiver, since I don’t always want to add 5 to every element. It must then return the new, transformed array.
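    # Reopen Array so that my_map can be called on an array
    # (a top-level def is private outside the console).
    class Array
      def my_map
        return_array = []
        self.each do |n|
          return_array << yield(n)
        end
        return_array
      end
    end

    array = [1, 2, 3, 4, 5]
    array.my_map do |i|
      i * 2
    end
    => [2, 4, 6, 8, 10]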
The thing I really struggled to understand was how exactly `yield` makes this work. When we call `array.my_map`, Ruby immediately shifts to the first line of code within the definition of `my_map`. Before even running the block that begins with `do |i|`, Ruby inserts a bookmark and starts reading the first line of the definition of `my_map`. On this line, a new empty array is initialized. On the next, the method `each` is called on `self`. In the context of a method definition, `self` is equal to the method’s receiver. In this case, the receiver is `array`. In this particular method call, this line of code means `array.each do |n|`.
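The claim that `self` is the receiver can be checked directly. A minimal sketch, where `receiver_of_call` is a hypothetical method name invented for this illustration and reopening `Array` is an assumption:

```ruby
# receiver_of_call is a hypothetical method name used only for this illustration.
class Array
  def receiver_of_call
    self   # inside an instance method, self is the object the method was called on
  end
end

array = [1, 2, 3, 4, 5]
same_object = array.receiver_of_call.equal?(array)  # equal? compares object identity
p same_object
```

Because `equal?` compares identity rather than contents, a true result here means `self` is the very same object as `array`, not a copy.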
As always, `each` iterates over its receiver: each element in that receiver is temporarily assigned to a variable, in this case `n`, and then passed to the block that follows. In the first iteration, `n` is equal to 1.
Inside the block on the following line, we see that some value is appended to the array `return_array`, but Ruby does not yet know what that value will be. `yield(n)` puts another bookmark in our code and tells Ruby to go back to where it left off earlier, just before `do |i|`, bringing the value of `n` with it. The variable `i` is temporarily assigned the value of `n`. If this is our first iteration, `n` is equal to 1, so `i` is equal to 1 within this block. The block is then evaluated. The last expression evaluated within the block is its implied return value.
In the first iteration, starting from the point where Ruby interprets `yield(n)`, here’s an expanded version of what’s happening:
    n = 1
    i = n
    i * 2
    => 2
After this, Ruby reads the keyword `end` and knows that it has reached the end of the block. Thus the return value of the block is 2, because the last expression in the block evaluates to 2. Think back to when Ruby placed its last bookmark. It was back inside a different block of code, within the `my_map` method. Ruby paused evaluation at this point because it was told to yield. Now that the block has ended, Ruby returns to this point. `yield(n)` evaluates to the return value of the block to which it yielded. In this case, `yield(n)` where `n` is 1 evaluates to 2, as we saw. Ruby now knows what value to append to the receiver of the shovel method, `return_array`.
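The idea that `yield(n)` evaluates to whatever the block returns can be seen in isolation, without any iteration. A small sketch; `capture_block_value` is a hypothetical method name made up for this example:

```ruby
# capture_block_value is a hypothetical method name used only for this illustration.
def capture_block_value(n)
  value = yield(n)   # runs the block with n; evaluates to the block's last expression
  value
end

doubled = capture_block_value(1) { |i| i * 2 }
p doubled  # => 2
```

The method itself knows nothing about doubling. It only knows that the block will hand back some value.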
Upon reaching the `end` keyword on the following line, Ruby performs the next iteration over `array`: it again assigns `n` a value, passes that value to the block where `my_map` was originally called, evaluates that block and shovels its return value into `return_array`, and so on for each array element. After the last iteration, Ruby proceeds to the line following the `end` keyword. Because we want the method to return the new array, the last line of code in the method definition must be `return_array`. Whenever this method is called, it will return the new array.
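One way to watch this back-and-forth is to record a note on each side of the `yield`. The following is a sketch only: `traced_map` is a hypothetical variant of `my_map` written for this illustration, and unlike `my_map` it returns the log alongside the results so the order of events can be inspected.

```ruby
# traced_map is a hypothetical variant of my_map that logs each step.
class Array
  def traced_map
    log = []
    return_array = []
    each do |n|
      log << "yielding #{n}"
      value = yield(n)                   # control jumps to the caller's block here
      log << "block returned #{value}"   # and comes back here with the block's value
      return_array << value
    end
    [return_array, log]                  # return the log too, so the order can be inspected
  end
end

result, log = [1, 2].traced_map { |i| i * 2 }
p result
puts log
```

The log alternates between "yielding" and "block returned" entries, one pair per element, which matches the bookmark picture described above.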
Abstracting behaviors that will be used again and again is always a good decision. Using `yield` in this method allowed us to abstract everything except the exact way in which each element of the original array is transformed. The distracting and confusing logic of iteration and return values is separated from the code where the method is called, where all we really want is the result of all that logic. Using `yield` makes the method highly versatile: the logic is standardized, but we can specify the way in which the elements of the original array should be transformed. So long as we want to iterate over an array and return a new, transformed array, this method is applicable. This is of course why the makers of Ruby included it!
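To make that versatility concrete, here is the same `my_map` called with three different blocks. Defining it on `Array` is an assumption made so that it can be called on any array, and the variable names are invented for the example.

```ruby
# my_map as built earlier; defined on Array here (an assumption) so it can be called on arrays.
class Array
  def my_map
    return_array = []
    each { |n| return_array << yield(n) }
    return_array
  end
end

plus_five = [1, 2, 3].my_map { |i| i + 5 }      # the transformation from the each examples
squares   = [1, 2, 3].my_map { |i| i * i }      # a different numeric transformation
shouted   = %w[a b c].my_map { |s| s.upcase }   # the elements need not be numbers

p plus_five
p squares
p shouted
```

Only the block changes between the three calls. The iteration and the building of the return array stay inside the method.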
There are other methods that can be called on Enumerable objects that behave in slightly different ways. They follow a similar model: iteration and return values are abstracted. These too can be built using `each`. Building these methods from scratch helps me to understand the subtle differences between them. For me, it is a useful way to practice using `yield` and to add another built-in Ruby method to my vocabulary.
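As a small taste of that, here is how `each_with_object`, one of the methods mentioned at the start, might be sketched on top of `each`. The name `my_each_with_object` is hypothetical, defining it on `Array` is an assumption, and this is a rough approximation rather than Ruby's actual implementation.

```ruby
# Hypothetical sketch: an each_with_object look-alike built only on each.
class Array
  def my_each_with_object(memo)
    each { |element| yield(element, memo) }  # pass both the element and the shared object
    memo                                     # return the object the block has been filling
  end
end

totals = [1, 2, 3].my_each_with_object([]) { |n, acc| acc << n + 5 }
p totals
```

The difference from `my_map` is subtle: here the caller supplies the object that collects the results, and the block decides how to fill it.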