
Laziness

In class, we discuss the difference between applicative order evaluation and normal order evaluation of function arguments. In a nutshell, the former evaluates all function arguments before calling the function and passes the resulting values to the function. The latter passes function arguments to the function without evaluating them, and the function evaluates them only if it needs to know their value. Here's an example in pseudo-code:

function f(x, y) {
    if x > 10 {
        return x
    } else {
        return y
    }
}

If we call f(3 + 7 * 4, 1000 / 10), then applicative order evaluation evaluates 3 + 7 * 4 to 31, and 1000 / 10 to 100. It then calls f(31, 100) and the result is 31. With normal order evaluation, the expressions 3 + 7 * 4 and 1000 / 10 are passed to f without evaluating them, that is, x = 3 + 7 * 4 and y = 1000 / 10. f then evaluates x because it needs to know whether x > 10. Since it is, it returns x. In particular, y is never evaluated, because we never ask for its value. Also note that normal order evaluation does not remember the result of evaluating x. So f returns 3 + 7 * 4 as its result, and the caller of f evaluates this result only if and when it needs to know its value.
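The two strategies can be simulated in a real language by wrapping each argument expression in a zero-argument function (a "thunk"). This is only a sketch in Python, not how any evaluator is actually implemented; the thunk-based f mirrors the pseudo-code above:

```python
def f(x, y):
    # x and y are thunks; calling one evaluates the argument expression.
    if x() > 10:
        return x()   # re-evaluates x: normal order remembers nothing
    else:
        return y()

# Normal order: pass the unevaluated expressions themselves.
print(f(lambda: 3 + 7 * 4, lambda: 1000 / 10))   # 31; 1000 / 10 is never computed

# Applicative order: evaluate both arguments first, then pass their values.
x_val = 3 + 7 * 4    # 31
y_val = 1000 / 10    # 100.0, computed even though f never needs it
print(f(lambda: x_val, lambda: y_val))           # 31
```

Note that the normal-order call evaluates 3 + 7 * 4 twice (once for the comparison, once for the return), which is exactly the re-evaluation problem discussed below.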

Why do we have applicative order and normal order evaluation? Applicative order evaluation is efficient to implement. Passing unevaluated expressions around is much harder than passing around their values. Normal order evaluation is what you get if you mimic how we evaluate mathematical expressions: it is computation by substitution. Here is how they compare:

  • Applicative order evaluation is more efficient.

  • Normal order evaluation is more general, as there exist computations that produce a result with normal order evaluation but fail to do so when using applicative order evaluation. For example, consider calling f(31, g()), where

    function g() {
        return g()
    }
    

    With normal order evaluation, f never asks for the value of its second argument, so it happily returns 31. With applicative order evaluation, we evaluate g() before calling f, but g() never returns.

  • When both applicative and normal order evaluation produce a result, they produce the same result.

  • If a function argument has side effects, then normal order evaluation may result in this effect happening more than once. Indeed, the effect happens every time we evaluate the argument. For example, if

    function h() {
        print("Calling h")
        return 31
    }
    

    and

    function i() {
        x = f(h(), g())
        if x < 100 {
            return x
        } else {
            return 2 * x
        }
    }
    

    then i() prints "Calling h" twice. First, f evaluates its first argument, h(), to decide whether to return its first or second argument; this prints "Calling h" and yields 31. Since 31 > 10, f returns the unevaluated expression h() as its result, so x = h(). i() then evaluates x a second time to decide whether to return x or 2 * x. With applicative order evaluation, we would see "Calling h" on our screen only once.

So, normal order evaluation is more general than applicative order evaluation, whereas applicative order evaluation is more efficient and interacts as expected with side effects. Is there a middle ground?

Yes. It is called lazy evaluation and is what Haskell uses. The idea is simple: equip normal order evaluation with caching of the result of evaluating an expression. In other words, once we evaluate an expression for the first time, we remember the result and never re-evaluate the expression if we need its value again later. This avoids (most of) the downsides of normal order evaluation: if an expression is costly to evaluate, we evaluate it only once, and side effects also happen only once. However, lazy evaluation is a little harder still to implement than normal order evaluation and incurs significant runtime overhead; this is a primary reason why C, C++, and Rust programs are significantly faster than Haskell code.
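The caching can be grafted onto the thunk idea with a few lines of Python. This is a toy sketch of what Haskell's runtime does for every deferred expression, not its real implementation:

```python
class Thunk:
    """A deferred expression that is evaluated at most once."""

    def __init__(self, expr):
        self.expr = expr    # a zero-argument function
        self.done = False
        self.value = None

    def force(self):
        if not self.done:
            self.value = self.expr()   # first force: evaluate the expression...
            self.done = True           # ...and cache the result
            self.expr = None           # the expression is never run again
        return self.value

def h():
    print("Calling h")
    return 31

t = Thunk(h)
print(t.force())   # prints "Calling h", then 31
print(t.force())   # prints just 31: the cached value is reused
```

With arguments wrapped in such thunks, the i() example above would print "Calling h" only once, just like applicative order evaluation, while still never touching g().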

Most of the time, you won't notice that Haskell uses lazy evaluation (apart from its performance penalty). There are three areas where lazy evaluation is particularly handy:

  • We get short-circuit evaluation of Boolean expressions, which is hardwired into most modern programming languages, for free. In other words, we can define Boolean operators as we would ordinary functions, and lazy evaluation ensures that they short-circuit.

  • Evaluation of expressions only when it is safe to do so. This is hard to explain without an example. We'll look at one when we discuss this.

  • Infinite data structures. Lazy evaluation lets us define infinite data structures. As long as we only inspect a finite portion of them, we're fine. We'll look at an example where infinite lists result in more elegant and more efficient code than doing the same thing using only finite lists.
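The first and third points can be approximated in Python (again a sketch; Haskell gets both directly from lazy evaluation): short-circuiting falls out of taking the second operand as a thunk, and generators play the role of infinite lists, producing elements only on demand.

```python
from itertools import islice

def my_and(x, y):
    # y is a thunk; although my_and is an ordinary function, it
    # short-circuits, because y is only forced when x is True.
    return y() if x else False

print(my_and(False, lambda: 1 // 0))   # False; the division never happens

def fibs():
    # Conceptually an infinite list of Fibonacci numbers.
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Inspect only a finite prefix of the infinite structure.
print(list(islice(fibs(), 8)))   # [0, 1, 1, 2, 3, 5, 8, 13]
```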

The evaluation of a finite portion of an infinite data structure illustrates that it must be possible to evaluate only part of an expression. We will look at exactly how Haskell decides how much of an expression to evaluate at any given point in time.