foldl and foldr

You may have heard of the MapReduce framework for parallel programming before. The idea is simple: Many computations on sequences of elements can be expressed in terms of two primitives:

Map: Apply a function to every element in the sequence. The function applications to different elements in the sequence are independent of each other and thus can be carried out in parallel on multiple cores or even multiple computers.
Reduce: Take the values in a sequence and combine them into a single value. For most combination operators—all those that are associative—this can also be done efficiently in parallel.

By providing a framework that offers efficient parallel implementations of the map and reduce primitives, we put parallel computing at the finger tips of many programmers, whereas writing customized parallel code that is correct and efficient requires significant skill.

In Haskell, the map function discussed before provides the "map" portion of the MapReduce framework. The "reduce" portion is provided by the functions foldl and foldr.

Both foldl and foldr take a combinator function f, an initial accumulator value init, and a list of elements xs. Initially, the accumulator has value init. Then foldl and foldr traverse the list xs. For every list element x, we apply the combinator function f to x and the current accumulator value. The result is the new accumulator value. Whatever the accumulator value is after we have traversed the entire list is the result that foldl or foldr returns.

The difference between the two functions is the direction in which they traverse the list: foldl traverse the list from left to right. foldr traverse the list from right to left. For some operators, the direction of traversal is irrelevant, at least as far as the result goes, but not as far as efficiency goes:

GHCi

>>> foldl (+) 0 [1,2,3,4,5]
15
>>> foldr (+) 0 [1,2,3,4,5]
15

For others, the direction of traversal is important. For example, we can reverse a list like this:

GHCi

>>> foldl (flip (:)) [] [1,2,3,4,5]
[5,4,3,2,1]

For foldl, the combinator function takes the current accumulator value as its first argument and the list element as the second argument. Thus, in this example, the first argument of the combinator function should be a list, and the second argument should be a list element. The order of arguments of (:) is the opposite. flip is a useful helper function that reverses the order of the first two arguments of a function:

flip :: (a -> b -> c) -> (b -> a -> c)
flip f y x = f x y

Thus, foldl (flip (:)) [] [1,2,3,4,5] prepends the elements 1 through 5 to the empty list, which has the effect of reversing this list. The prelude includes the function reverse :: [a] -> [a] to reverse a list, and this function is implemented simply as

reverse :: [a] -> [a]
reverse = foldl (flip (:)) []

foldr traverse the list from right to left. The combinator function expects the current list element as the first argument and the current accumulator value as the second argument. In other words, the argument order is reversed compared to foldl. Intuitively, this makes sense if you think about foldl as putting the initial accumulator value on the left and then folding the list elements onto it from left to right; foldr places the initial accumulator value on the right and then folds the list elements onto it from right to left:

Illustration of left and right folds — `foldl (+) 0 [5,2,3,5,1,10]` starts with the accumulator `0` and "folds the list elements onto it" (i.e., adds each list element to the accumulator) from left to right. That's like folding up the list on top of the accumulator with the first list element on the bottom and the last list element on the top. `foldr (+) 0 [5,2,3,5,1,10]` starts with the accumulator `0` and "folds the list elements onto it" from right to left. That's like folding up the list on top of the accumulator with the last list element on the bottom and the first list element on the top.

If we tried to implement reverse using foldr, then we wouldn't have to use flip (:) as the combinator function. Simply (:) suffices. But then we'd just get the original list back because we'd be adding the list elements to the empty list, from right to left:

GHCi

>>> foldr (:) [] [1,2,3,4,5]
[1,2,3,4,5]

Here are the implementations of foldl and foldr:

foldl :: (a -> b -> a) -> a -> [b] -> a
foldl f = go
  where
    go accum (x:xs) = go (f accum x) xs
    go accum _      = accum

foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f = go
  where
    go accum (x:xs) = f x $ go accum xs
    go accum _      = accum

Note that the implementation of foldl is tail-recursive, while the implementation of foldr is not. Since foldr needs to traverse the list to find the end and then backtrack to accumulate the list elements in reverse, it really has no choice. This has important consequences for efficiency. I think that discussing them in detail here goes a bit too far. I'll just give a brief summary:

When the fold produces a number of some other constant-size result of accumulating the list elements, then a left fold is more efficient because it avoids building up stack frames as the non-tail-recursive implementation of foldr would. However, foldl is not as efficient as you would think. The problem is that due to lazy evaluation, foldl still builds up an expression of linear size that gets reduced to a constant-size value only once we ask for its result. As a result, foldl is used rarely. Most of the time, when a left fold is beneficial, what we really want is the strict version of foldl, called foldl'. foldl' behaves exactly like foldl, but it fully evaluates each application of the combinator function f before proceeding to the next element instead of building up an expression that is reduced only at the end.
Right folds are often used to produce lists from lists. If each list element in the output list can be produced from the current element of the input list and the consumer of the output list does not hold on to the whole list and simply consumes the list elements as they arrive, then lazy evaluation ensures that foldr still runs in constant space because it produces the output list elements on demand as they are requested, and each list element is discarded immediately after being consumed. Under certain conditions, the compiler can completely optimize the list passed from foldr to the consumer away and instead turn this into a tight loop that invokes the consumer on each value produced by foldr.

I should also mention that there exist similar functions foldl1 and foldr1. The 1 means that they expect their input lists to be non-empty. These functions panic on empty lists. For a non-empty list, they take the first or last list element, respectively, as the initial accumulator value, and then they "add" the other list elements to the accumulator in the same fashion that foldl and foldr do. Note that the accumulator can have a different type than the list elements for foldl and foldr. For foldl1 and foldr1, it necessarily has the same type as the list elements because we take the first or last list element as the initial accumulator value. Here are the implementations of foldl1 and foldr1:

foldl1 :: (a -> a -> a) -> [a] -> a
foldl1 _ []     = error "Prelude.foldl1: empty list"
foldl1 f (x:xs) = foldl f x xs

foldr1 :: (a -> a -> a) -> [a] -> a
foldr1 f = go
  where
    go []     = error "Prelude.foldr1: empty list"
    go [x]    = x
    go (x:xs) = f x $ go xs

Since map and foldl/foldr are very useful for expressing a lot of computations, they have been generalized to containers beyond lists. We will have a look at them in the chapter on Functors and Folds. A functor is the abstraction of a container type that supports application of a function to each of its elements, in the same way that map applies a function to every list element. In particular, the list type is a functor. A foldable container is one that supports foldl and foldr operations. Indeed, in today's standard library, foldl and foldr do not have the types I listed above at all. They are defined for arbitrary foldable containers:

GHCi

>>> :t foldl
foldl :: Foldable t => (b -> a -> b) -> b -> t a -> b
>>> :t foldr
foldr :: Foldable t => (a -> b -> b) -> b -> t a -> b