filterM

Let's look at another motivating example. In CSCI 3110, you will learn how to use dynamic programming to compute a longest monotonically increasing subsequence of a sequence of numbers. That's the longest increasing sequence of numbers that we can obtain by removing an appropriate subset of numbers from the given sequence. For example, if we are given the sequence

[1,5,2,4,7,3,8]

then the unique longest monotonically increasing subsequence is

[1,2,4,7,8]

which is obtained by dropping 5 and 3. This seems like a toy problem, but finding such a sequence actually has applications in a range of algorithms. For example, a really simple and practically efficient range searching data structure can be obtained by decomposing a set of points into as few sequences of points as possible such that the points in each sequence are sorted from left to right and either their $y$-coordinates increase or the $y$-coordinates decrease.

We won't discuss the dynamic programming solution to this problem. Let's look at a much simpler greedy strategy, which is not guaranteed to give us a longest monotontically increasing subsequence, but hopefully it gives us a fairly long monotontically increasing subsequence. The idea is simple: We always keep the first number in the input sequence. Each subsequent element in the input sequence is kept if it is no less than the last element added to the output. Here is what this looks like in Python:

increasing_subsequence.py

def increasing_subsequence(xs):
    if not xs:
        return []

    last   = xs[0]
    incseq = [last]

    for x in xs[1:]:
        if x >= last:
            last = x
            incseq.append(x)

    return incseq

Python

>>> from increasing_subsequence import increasing_subsequence
>>> increasing_subsequence([1,5,2,4,7,3,8])
[1, 5, 7, 8]

The initial if-statement takes care of empty input lists because the remainder of the function cannot handle empty input lists: the array access xs[0] would throw an exception. For a non-empty list, we start by adding xs[0] to incseq. For each element added to incseq, we update last to this element. We use this value to check for each element x whether x >= last. If so, we can add it to incseq. Otherwise, we can't.

Once again, this computation is stateful, because it needs last to remember the last element added to incseq. This makes it impossible to express this computation using filter. Remember, the type of filter is

filter :: (a -> Bool) -> [a] -> [a]

In the expression filter pred xs, pred decides whether each element of xs satisfies some condition or not. This test is completely independent of the decision that filter made for any list elements that came before the current element. The type of pred says so. A function of type a -> Bool produces its result based only on the value of its argument. To replicate the behaviour of the Python function, we need a predicate that depends on the last element we have added to the output. To do so, it uses the State monad; it has type Ord a => a -> State a Bool. But this means that we cannot use filter to filter our list. What we need is

filterM :: Monad m => (a -> m Bool) -> [a] -> m [a]
filterM pred = go
  where
    go []     = return []
    go (x:xs) = do
        keepThis <- pred x
        keptRest <- go xs
        return $ if keepThis then x : keptRest else keptRest

The predicate does not return just a Bool. It returns a Bool decorated using some monad m. The list produced by filterM is also not just a plain list, but it is also decorated by the monad m. To filter the list, go [] returns the empty list. Just as filter, go (x:xs) decides whether to keep x using the predicate pred and then filters xs recursively. If pred says that x should be kept, the result is x : keptRest. Otherwise, it is simply keptRest.

Using filterM, we can now implement our Haskell version of increasing_subsequence:

increasingSubsequence :: Ord a => [a] -> [a]
increasingSubsequence xs = evalState (filterM notLess xs) Nothing
  where
    notLess x = do
        last <- get
        let keep = maybe True (x >=) last
        if keep then put (Just x) else return ()
        return keep

GHCi

>>> :{
  | increasingSubsequence :: Ord a => [a] -> [a]
  | increasingSubsequence xs = evalState (filterM notLess xs) Nothing
  |   where
  |     notLess x = do
  |         last <- get
  |         let keep = maybe True (x >=) last
  |         if keep then put (Just x) else return ()
  |         return keep
  | :}
>>> increasingSubsequence [1,5,2,4,7,3,8]
[1,5,7,8]

increasingSubsequence uses filterM notLess xs to filter the given list xs. The notLess predicate needs the last element added to the output to decide whether to keep the next element or not. Initially, there is no last element added to the output, so we represent the last element added to the output using the type Maybe a and initialize this state to Nothing: no element added to the output yet. The first thing that notLess x does is to retrieve the last element added to the output, using last <- get. Then it calculates a Boolean keep that decides whether to keep x. This is true if last is Nothing because this means that x is the first element in the input, which we always keep. If last = Just y, we keep x if x >= y. This is captured by the expression maybe True (x >=) last. Next, if we keep x, then we need to remember that it was the last element we have kept, using put (Just x). If we don't keep x, no state updates are necessary. Since if then else is an expression, with a mandatory else branch, we cannot simply do nothing if we don't keep x.¹ Thus, we return () if keep is false. This produces the same return type as put (Just x), which is important because both branches of an if then else expression must produce values of the same type, but does not do anything else. Finally, to tell filterM whether to keep x or not, we return keep.

Again, this example is only meant to demonstrate the use of filterM in conjunction with the State monad. The experienced Haskell programmer does not need monads to compute this type of increasing subsequence:

increasingSubsequence :: Ord a => [a] -> [a]
increasingSubsequence []     = []
increasingSubsequence (x:xs) = x : go x xs
  where
    go _    []                 = []
    go last (y:ys) | last <= y = y : go y    ys
                   | otherwise =     go last ys

or

increasingSubsequence :: Ord a => [a] -> [a]
increasingSubsequence = unfoldr next
  where
    next []     = Nothing
    next (x:xs) = Just (x, dropWhile (< x) xs)

You shouldn't have any problems understanding how the first version works. The second one is more interesting. Remember, unfoldr f seed uses the function f to generate a list from the seed value seed. It calls f on the seed value seed. If the result is Nothing, then the generated list is empty. If the result is Just (x, seed'), then x is the first element in the generated list, and the remainder of the list is produced recursively using seed' as the new seed value.

Here, the input list is the seed value we use. If this seed value is empty, then the result is also empty. We achieve this by having next return Nothing. If the input is x:xs, then the first element in the increasing subsequence is the first list element, x. The next element in the increasing subsequence is the first element in xs that is no less than x. Thus, to obtain a seed from which to generate the remaining elements of the increasing subsequence, we use dropWhile (< x) xs to remove all elements less than x from xs until we encounter the first element that is not less than x.

We discussed why the else branch in an if then else expression is mandatory. Since if then else is an expression that needs to have a value no matter whether the condition after if is true, we need both branches.

The story changes a bit when we use monads. As this example shows, it is perfectly reasonable to do something if a condition is satisfied, and do nothing if it is not, just as when programming imperatively. That's because monads capture effects such as state updates.

We cannot easily change the behaviour of if then else inside a do-block—because a do-block is only syntactic sugar for the composition of expressions using (>>=) and (>>). What we can do is to introduce functions that behave like if then or if not then without an else-branch. The Control.Monad module provides two such functions
```
when :: Bool -> m () -> m ()
when cond f = if cond then f else return ()

unless :: Bool -> m () -> m ()
unless cond f = if not cond then f else return ()
```
Thus, when keep $ put (Just x) behaves exactly like if keep then put (Just x) else return (). This gives the more readable implementation of increasingSubsequence:
```
increasingSubsequence :: Ord a => [a] -> [a]
increasingSubsequence xs = evalState (filterM notLess xs) Nothing
  where
    notLess x = do
        last <- get
        let keep = maybe True (x >=) last
        when keep $ put (Just x)
        return keep
```
↩