Function Composition

How else do mathematicians define functions? They build more complex functions from simpler ones. If we define

\[\begin{gathered} g : \mathbb{Z} \rightarrow \mathbb{Z}\\ g(x) = 2 \cdot x + 1, \end{gathered}\]

that's really the chaining of two functions: First, we apply a function that doubles the value of the function argument \(x\), and then we apply to the result another function that increases it by one. Our function \(f\) from the previous section takes care of doubling the value of \(x\). Let's define another function \(h\) that takes care of increasing the value of its argument by one:

\[\begin{gathered} h : \mathbb{Z} \rightarrow \mathbb{Z}\\ h(x) = x + 1 \end{gathered}\]

or, in Haskell,

h :: Int -> Int
h x = x + 1

Given these two functions, we can define \(g\) as

\[g(x) = h(f(x))\]

or, in Haskell, as

g :: Int -> Int
g x = h (f x)

Here, we do need parentheses to indicate that we want to apply f to x and then apply h to the result. If we had written h f x, then this would mean that we apply h to two arguments, f and x, and this wouldn't compile because h expects only a single argument.

Mathematicians have a shorter way of writing the composition of two functions. The function \(g\) defined as the composition of \(h\) and \(f\) is written as

\[g = h \cdot f,\]

defined exactly as above:

\[(h \cdot f)(x) = h(f(x)).\]

The key is that we have this composition operator that allows us to directly build new functions from existing ones by composing them, by applying one after the other.

Haskell also has a function composition operator:

(.) :: (b -> c) -> (a -> b) -> (a -> c)
(g . f) x = g (f x)

Now, this definition of the composition operator in Haskell requires some unpacking. We'll do that in a second. For now, let's believe that this is exactly the equivalent of mathematics' function composition. Then we can simply write our definition of our function g above as

g :: Int -> Int
g = h . f

As we will see, directly manipulating functions instead of explicitly referring to their arguments is a very powerful and expressive way to program, albeit very different from how you used to program in the past.

Back to our composition operator. First of all, let's look at the function signature

(.) :: (b -> c) -> (a -> b) -> (a -> c)

I suspect you have at least three issues with it: Why the parentheses around the .? What are a, b and c? And what's with all the parentheses in the function's type?

First let's talk about the parentheses around the .. In Haskell, mathematical operators such as +, -, and so on are in fact plain old functions. Indeed, the addition operator takes two number arguments and maps them to a number. Looked at it this way, it's nothing but a two-argument function. The only difference is that we don't write + 2 3 but 2 + 3. The syntactic rule in Haskell is that

Functions whose names are composed of letters, digits, underscores, and apostrophes are "normal" functions, which should be written in prefix notation.

Like so: myFunction 1 2.

Two-argument functions whose names are composed of special characters are interpreted as infix operators and should be written in infix notation.

Like so: 2 + 3 or x >>= f. Yes, (>>=) is indeed a rather important infix operator in Haskell which we will talk about much later in this book.

Now it turns out that we are allowed to use operators in prefix notation in Haskell, and we are allowed to use ordinary two-argument functions in infix notation. It just requires a bit of special syntax.

Instead of 2 + 3, we can also write (+) 2 3.

To use an operator in prefix notation, we need to enclose it in parentheses. The same notation applies when we want to use an operator by itself, without its arguments.

The latter is useful in type signatures, as above, and when passing an operator as an argument to some other function that expects a two-argument function as argument.

Functions are ordinary values in Haskell that can be passed as arguments to other functions and returned as the results of other functions.

That's a really powerful way to program once you get used to it.

In order to write an ordinary function in infix notation, you need to include it in backticks.

For example, there is the elem function that checks whether a given value belongs to a list. Normally, we'd write elem x list, but writing x `elem` list reads a bit more like English or even math: \(x \in \textit{list}\).

ShelL

$ stack ghci
>>> 2 `elem` [1,2,3,4]
True
>>> 5 `elem` [1,2,3,4]
False

So that's the parenthesis around the .. Now what about a, b, and c. That's another Haskell syntax rule.

Concrete values and concrete types (just another kind of value) start with uppercase letters.

For example, Int and Bool are concrete types, True and False are concrete values of type Bool.

Variables start with lowercase letters.

In the type signature of our function composition operator, a, b, and c are type variables. They can refer to any type!

With that, let's take apart that type signature. b -> c is the type of a function that maps a value of type b to a value of type c. a -> b is the type of a function that maps a value of type a to a value of type b. a -> c is the type of a function that maps a value of type a to a value of type c. If we compose a function f :: b -> c with a function g :: a -> b, we expect to obtain a function f . g :: a -> c, because f . g first applies g to its argument, that is, it expects a value of the same type expected by g, type a. Applying g to this value produces a value of type b. That's luckily exactly the argument type that f expects, so it is okay to apply f to the result of g, and f produces a value of type c from it. And this is exactly what our type signature says: (.) takes two arguments, a function of type b -> c and a function of type a -> b, and it produces from them a function of type a -> c, by composing its two arguments.

Finally, the definition of the composition of two functions matches exactly the definition from mathematics:

\[(f \cdot g)(x) = f(g(x)).\]

(f . g) x = f (g x)

Let me emphasize once more that a, b, and c in the type signature of (.) are type variables, not concrete types. Consider three functions:

toInt    :: Char -> Int
toDouble :: Int -> Double
toString :: Double -> String

Then we can construct the function toDouble . toInt. Since toInt has type Char -> Int, it matches the type of the second argument of (.), which is a -> b, by setting a = Char and b = Int. Similarly, toDouble, having type Int -> Double, matches the type of the first argument of (.), which is b -> c, if we set b = Int and c = Double. The only constraint that the type signature of (.) imposes is that we choose the same type as the value of the type variable b in both substitutions. Thankfully, that's the case here, since we choose b = Int both to match the type of toInt to the type a -> b and to match the type of toDouble to b -> c. The type of toDouble . toInt is Char -> Double because we have a = Char and c = Double.

Similarly, we would be able to compose toString and toDouble to obtain a function toString . toDouble of type Int -> String. What we can't do is construct the function toString . toInt. That's because matching toInt to the second argument of (.) would require that we set b = Int, while matching toString to the first argument of (.) would require that we set b = Double. But they're two different types, so we cannot simultaneously have b = Int and b = Double. This is an important constraint because remember, the function f . g first applies g to its argument and then applies f to the result of g. This works only if the argument type of f and the result type of g are the same.

If you've used generic programming in Java, Rust, Go, or C++ before, then type variables are nothing but generic types. The difference is that to use a generic type in Java, you'd have to write something like int myFunction<T>(T val), whereas in Haskell, the syntax is more lightweight: myFunction :: t -> Int. The argument type is generic by virtue of being a type variable. You'll soon come to appreciate that polymorphic functions (Haskell's name for generic functions) are everywhere in Haskell, and Haskell programmers tend to prefer to write polymorphic functions wherever they can.

We digressed a bit. We were talking about how mathematicians define functions: simple functions can be defined directly. More complex functions can be defined by composing simpler functions. Let's look at a third way mathematicians use to define functions: case distinctions. As we will see, these three tools for building functions are all we really need to have an expressive and elegant programming language.