Well-Behaved Containers: Functors and Folds
The naming of abstractions in Haskell is grounded in mathematics, more specifically algebra and category theory. The Haskell haters are put off by this and question why such "complicated" terminology is necessary. The Haskell lovers relish that this naming reflects connections between programming and well-understood mathematical theories. It brings order to the madness. However, whether you love or hate Haskell and its choice of names for various abstractions, there is a pragmatic reason why you should become familiar at least with the most basic abstractions that Haskell offers: they are the key to writing reusable code.
-
We have functions that implement operations that can be applied to many different arguments of the same types in many different places in our program. That's the first step towards reusability.
-
We have generic/polymorphic functions and containers that aren't only applicable to arguments of specific types but to arguments of many different types, as long as these types support the operations required by the function application, expressed using interfaces, traits or type classes. This pushes reusability further. We no longer need to implement lists of integers, lists of strings, lists of student records, etc. separately. We can just use one implementation of a list and stick whatever objects we want into the list.
-
Now look at lists, arrays, binary search trees, hash tables, and skip lists. Their implementations differ, but they have one thing in common: they are container types. They all store elements of some type
a
. Clearly, the difference in implementation of these different containers matters for some operations: Insertions into binary search trees are fast, but insertions into an array are slow. Searching a binary search tree or sorted array is fast, but searching a list is slow. In some situations, however, we care only that we're looking at a container storing some elements, and we want to do something with every element in the container. It would be nice if we didn't have to write separate implementations of our function that take lists, arrays, binary search trees, etc. as arguments. We would like to be able to implement a single function that works for any container type. This takes code reusability to a yet higher level. This abstraction of containers is the focus of this chapter.
This chapter aims to identify common programming patterns that can be applied to any container. If we write our code in terms of these patterns, then our code is applicable to any container type, not just lists or arrays or trees.
The two patterns we will look at here are:
-
Transforming the contents of a container. Specifically, we want to keep the container essentially unchanged. All we want to do is replace every element in the container with a new element, obtained by applying a function to the old element. That's what our
map
function does for lists, and it is the "Map" part in the MapReduce framework I mentioned when introducing the various list functions. In Haskell, containers that support this type of transformation are calledFunctor
s. -
Combining all the elements in the container into a single result value. For example, we might want to add or multiply all the elements in the container if these elements are numbers. This is the "Reduce" part of the MapReduce framework and is what
foldl
andfoldr
do for lists. In Haskell, we call containers that support this accumulation of their elements into a single result valueFoldable
containers, andfoldl
andfoldr
in fact work for anyFoldable
container, not only for lists:GHCi>>> :t foldl foldl :: Foldable t => (b -> a -> b) -> b -> t a -> b >>> :t foldr foldr :: Foldable t => (a -> b -> b) -> b -> t a -> b
As the success of the MapReduce framework for implementing parallel algorithms demonstrates, a lot of computations can be expressed in terms of these two primitives. If we have such a computation, we can implement it on any container that is a functor and is foldable.