Exception Handling: For Real Now

Let's put CPS to use to finish our Haskell implementation of inverses. Let's first do this all without using CPS. Since invert may fail to produce a result, it should produce a result of type Maybe Double:

invert :: Double -> Maybe Double
invert 0 = Nothing
invert x = Just (1 / x)

If we now used map to apply invert to every element in the input list, we would obtain a list of type [Maybe Double]. What we want is a Maybe a list of Doubles, Maybe [Double]. That's exactly what we get if we use mapM, not map, to map invert over the elements in the given list:

inverses :: [Double] -> Maybe [Double]
inverses = mapM invert

GHCi

>>> :{
  | invert 0 = Nothing
  | invert x = Just (1 / x)
  | :}
>>> inverses = mapM invert
>>> inverses [1,2,3,4,5]
Just [1.0,0.5,0.3333333333333333,0.25,0.2]
>>> inverses [1,2,0,4,5]
Nothing

But this solution exploits that Maybe is a monad and, for now, we'd like to do things without monads. So let's implement our own version of map,

mapAbortable :: (a -> Maybe b) -> [a] -> Maybe [b]

The first argument of map has type a -> b. For mapAbortable, the type is a -> Maybe b. We use Nothing to indicate failure. Then, if we apply this function to every element in a list of type [a], we obtain Just a list of type [b] if all function applications succeeded. If at least one of them failed, produced Nothing, then the result of mapAbortable is also Nothing. Here's the complete implementation:

mapAbortable :: (a -> Maybe b) -> [a] -> Maybe [b]
mapAbortable _ []     = Just []
mapAbortable f (x:xs) = case f x of
    Nothing -> Nothing
    Just y  -> case mapAbortable f xs of
        Just ys -> Just (y : ys)
        Nothing -> Nothing

inverses :: [Double] -> Maybe [Double]
inverses = mapAbortable invert

This is nothing but the implementation of mapM specialized to Maybe as the monad and with the (>>=) operator of the Maybe monad expanded into the corresponding case expressions.

However, the goal of our exercise here is to do all this using continuations. Our first step is to convert the return type of invert and the argument and return types of mapAbortable to CPS. First invert:

invert :: Double -> (forall r. (Maybe Double -> r) -> r)
invert 0 cont = cont Nothing
invert x cont = cont $ Just (1 / x)

And mapAbortable:

mapCPS :: (a -> (forall r. (Maybe b -> r) -> r)) -> [a] -> (forall r. (Maybe [b] -> r) -> r)
mapCPS _ []     cont = cont (Just [])
mapCPS f (x:xs) cont = f x $ \my -> case my of
    Nothing -> cont Nothing
    Just y  -> mapCPS f xs $ \mys -> case mys of
        Nothing -> cont Nothing
        Just ys -> cont $ Just (y : ys)

Once again, this is a purely mechanical translation of the non-CPS version into CPS. The non-CPS version starts by invoking f x and then branches on whether the result is Nothing or Just y. Here, we pass a continuation to f x. This continuation binds the result that the non-CPS version of f x would have produced to its argument my ("maybe y"), and then we branch on whether my is Nothing or Just y. If the result is Just y, the non-CPS version calls mapAbortable f xs and then once again branches on the result. The CPS version calls mapCPS f xs and passes to it a continuation whose argument mys ("maybe ys") captures the result that mapAbortable f xs would have produced. We then branch on the value of mys.

This produces the correct result:

GHCi

>>> :{
  | invert 0 cont = cont Nothing
  | invert x cont = cont $ Just (1 / x)
  | :}
>>> :{
  | mapCPS _ [] cont = cont (Just [])
  | mapCPS f (x:xs) cont = f x $ \my -> case my of
  |     Nothing -> cont Nothing
  |     Just y  -> mapCPS f xs $ \mys -> case mys of
  |         Nothing -> cont Nothing
  |         Just ys -> cont $ Just (y : ys)
  | :}
>>> inverses = mapCPS invert
>>> inverses [1,2,3,4,5] id
Just [1.0,0.5,0.3333333333333333,0.25,0.2]
>>> inverses [1,2,0,4,5] id
Nothing

Unfortunately, it doesn't quite behave the way we would hope. Neither does our implementation of inverses via mapAbortable above.

Let's look at mapAbortable first. mapAbortable invert [1,2,0,4,5] calls invert 1. This produces Just 1, so we apply mapAbortable invert [2,0,4,5]. This calls invert 2, which produces Just 0.5, so we apply mapAbortable invert [0,4,5]. This calls invert 0, which produces Nothing. Thus, mapAbortable invert [0,4,5] produces Nothing. Back in mapAbortable invert [2,0,4,5], the inner case statement branches on the return value of mapAbortable [0,4,5] and once again produces Nothing. This brings us back to mapAbortable invert [1,2,0,4,5], which branches on the return value of mapAbortable [2,0,4,5] and finally returns Nothing.

What I'm taking issue with is that if the first 100 elements in the input list are non-zero and the 101st element is zero, then the recursive implementation of mapAbortable unwinds the 100 recursive calls we have made so far. Each of these recursive calls detects that its child returned Nothing, so it has to return Nothing itself.

mapCPS behaves the same. The closure passed to the call of invert on the 101st element is the composition of 100 closures built up by the invocations on the 100 non-zero elements before it. Calling this composite closure with Nothing as argument then threads this Nothing value through this entire sequence of 100 individual closures to produce the final result of Nothing. That's not exactly aborting the computation. The Scheme implementation does not have this problem. When the scheme version of invert invokes the abort continuation, there is no unwinding of map involved. We jump directly to the point where call/cc returns.

To obtain the same behaviour in Haskell, we need to look a little more deeply into different encodings of data types. We have discussed in the previous section that we can encode a value of type a as the type (forall r. (a -> r) -> r). Maybe is a data type with multiple data constructors. The general form of a function of type Maybe a -> r is

f :: Maybe a -> r
f x = case x of
    Nothing -> def
    Just y  -> g y

Thus, we can describe f using a value def :: r and a function g :: a -> r. In fact, this is exactly what maybe does:

maybe :: r -> (a -> r) -> (Maybe a -> r)
maybe def g x = case x of
    Nothing -> def
    Just y  -> g y

Given a value of type Maybe a, maybe chooses between def and g to apply the result of type r. We can do this directly, without using maybe and without explicitly constructing a value of type Maybe a. We directly encode a value of type Maybe a as a function of the type r -> (a -> r) -> r. This is our CPS encoding of the Maybe type. Specifically,

nothing :: r -> (a -> r) -> r
nothing def _ = def

just :: a -> (forall r. r -> (a -> r) -> r)
just x _ g = g x

nothing is a function that returns def, just as maybe def g Nothing would have done. just x is a function that applies g to x, just as maybe def g (Just x) would have done. To convert these functional encodings back to the standard Maybe type, we have

toMaybe :: (forall r. r -> (a -> r) -> r) -> Maybe a
toMaybe x = x Nothing Just

GHCi

>>> nothing def _ = def
>>> just x _ g = g x
>>> toMaybe x = x Nothing Just
>>> toMaybe nothing
Nothing
>>> toMaybe (just 4)
Just 4

This does in fact work for any data type. The CPS encoding of the data type takes as many continuations as arguments as there are data constructors of the type, and the number of arguments of the continuation equals the number of arguments of the data constructor. Let's look at a second example then:

data Either a b = Left a | Right b

A function of type Either a b -> r has the form

f :: Either a b -> r
f x = case x of
    Left  a -> g a
    Right b -> h b

It is thus described by two functions g :: a -> r and h :: b -> r. Once again, we have a standard library function that implements exactly this idea:

either :: (a -> r) -> (b -> r) -> (Either a b -> r)
either g h x = case x of
    Left  a -> g a
    Right b -> h b

We obtain the CPS encoding of the Either type by taking the type of either and dropping the argument of type Either. We obtain a function of type (a -> r) -> (b -> r) -> r. We then have CPS versions of Left and Right:

left :: a -> (forall r. (a -> r) -> (b -> r) -> r)
left x g h = g x

right :: b -> (forall r. (a -> r) -> (b -> r) -> r)
right x g h = h x

left x is a function that applies g to x, just as either g h (Left x) would have done. right x is a function that applies h to x, just as either g h (Right x) would have done. To convert these functional encodings back to the standard Either type, we have

toEither :: (forall r. (a -> r) -> (b -> r) -> r) -> Either a b
toEither x = x Left Right

GHCi

>>> left x g h = g x
>>> right x g h = h x
>>> toEither x = x Left Right
>>> toEither (left 1)
Left 1
>>> toEither (right True)
Right True

This example using Either was only for illustration. What we do need now is our encoding of the type Maybe a, which is r -> (a -> r) -> r. With this encoding, we have

invert :: Double -> (forall r. r -> (Double -> r) -> r)
invert 0 fail cont = fail
invert x fail cont = cont (1 / x)

The implementation of mapCPS becomes

mapCPS :: (a -> (forall r. r -> (b -> r) -> r)) -> [a] -> (forall r. r -> ([b] -> r) -> r)
mapCPS _ []     fail cont = cont []
mapCPS f (x:xs) fail cont = f x fail $ \y -> mapCPS f xs fail $ \ys -> cont (y : ys)

We use this to implement inverses as

inverses :: [Double] -> Maybe [Double]
inverses xs = mapCPS invert xs Nothing Just

And with this, we finally have a version of inverses that does behave exactly like the Scheme version. fail is now a proper escape continuation that invert can use to completely abort the remainder of the computation and directly produce a value of type r. cont is the continuation that invert calls to continue the computation with the value 1 / x when invert is able to produce this value because x /= 0.

Let's have a closer look at how inverses [1,2,0,4,5] = mapCPS invert [1,2,0,4,5] Nothing Just behaves. Since [1,2,0,4,5] /= [], we use the second equation of mapCPS. This calls invert 1 with Nothing and \y -> mapCPS invert [2,0,4,5] Nothing $ \ys -> Just (y : ys) as arguments. Since 1 /= 0, this uses the second equation of invert, which calls \y -> mapCPS invert [2,0,4,5] Nothing $ \ys -> Just (y : ys) with argument y = 1. This calls mapCPS invert [2,0,4,5] Nothing $ \ys -> Just (1 : ys). Again, [2,0,4,5] /= [], so we use the second equation of mapCPS. This calls invert 2 with Nothing and \y -> mapCPS invert [0,4,5] Nothing $ \ys -> (\ys -> Just (1 : ys)) (y : ys) as arguments. Since 2 /= 0, this uses the second equation of invert, which calls \y -> mapCPS invert [0,4,5] Nothing $ \ys -> (\ys -> Just (1 : ys)) (y : ys) with argument y = 0.5. This calls mapCPS invert [0,4,5] Nothing $ \ys -> (\ys -> Just (1 : ys)) (0.5 : ys). Since [0,4,5] /= [], we use the second equation of mapCPS again. This calls invert 0 with Nothing and \y -> mapCPS invert [4,5] Nothing $ \ys -> (\ys -> (\ys -> Just (1 : ys)) (0.5 : ys)) (y : ys) as arguments. Now we have 0 == 0, so we use the first equation of invert, which returns Nothing. And that's it. That's the result. We no longer thread a whole series of Nothings through a sequence of case statements to produce the final Nothing.

You can try out that inverses works correctly in GHCi:

GHCi

>>> :{
  | invert 0 fail cont = fail
  | invert x fail cont = cont (1 / x)
  | :}
>>> :{
  | mapCPS _ []     fail cont = cont []
  | mapCPS f (x:xs) fail cont = f x fail $ \y -> mapCPS f xs fail $ \ys -> cont (y : ys)
  | :}
>>> inverses xs = mapCPS invert xs Nothing Just
>>> inverses [1,2,3,4,5]
Just [1.0,0.5,0.3333333333333333,0.25,0.2]
>>> inverses [1,2,0,4,5]
Nothing

The general wisdom is that in Haskell, we don't need continuations to achieve many things that require continuations in other languages, thanks to Haskell's lazy evaluation. Pretty much the only reason why one may want to use continuations in Haskell is because, in some situations, programming in CPS leads to more efficient code. Our implementation of inverses is one such example. Without continuations, and even with the wrong use of continuations, the aborting behaviour when encountering a 0 was achieved by explicitly unwinding the stack of recursive calls. Combined with the correct encoding of the Maybe type, we were able to eliminate this and make aborts immediate, as in Scheme.

However, be warned. Continuations do not always lead to faster code. Even when they do, this may depend on the particular encoding of the data types in CPS that we choose. A paper on different encodings of recursive types discusses a number of recursive data types whose manipulation using standard algebraic data types or using a Scott encoding is much faster than when encoding these types using a Church encoding, asymptotically faster.