filterM
Let's look at another motivating example. In CSCI 3110, you will learn how to use dynamic programming to compute a longest monotonically increasing subsequence of a sequence of numbers. That's the longest increasing sequence of numbers that we can obtain by removing an appropriate subset of numbers from the given sequence. For example, if we are given the sequence
[1,5,2,4,7,3,8]
then the unique longest monotonically increasing subsequence is
[1,2,4,7,8]
which is obtained by dropping 5 and 3. This seems like a toy problem, but finding such a sequence actually has applications in a range of algorithms. For example, a really simple and practically efficient range searching data structure can be obtained by decomposing a set of points into as few sequences of points as possible such that the points in each sequence are sorted from left to right and either their \(y\)-coordinates increase or the \(y\)-coordinates decrease.
We won't discuss the dynamic programming solution to this problem. Let's look at a much simpler greedy strategy, which is not guaranteed to give us a longest monotontically increasing subsequence, but hopefully it gives us a fairly long monotontically increasing subsequence. The idea is simple: We always keep the first number in the input sequence. Each subsequent element in the input sequence is kept if it is no less than the last element added to the output. Here is what this looks like in Python:
def increasing_subsequence(xs):
if not xs:
return []
last = xs[0]
incseq = [last]
for x in xs[1:]:
if x >= last:
last = x
incseq.append(x)
return incseq
>>> from increasing_subsequence import increasing_subsequence
>>> increasing_subsequence([1,5,2,4,7,3,8])
[1, 5, 7, 8]
The initial if-statement takes care of empty input lists because the remainder
of the function cannot handle empty input lists: the array access xs[0]
would
throw an exception. For a non-empty list, we start by adding xs[0]
to
incseq
. For each element added to incseq
, we update last
to this element.
We use this value to check for each element x
whether x >= last
. If so, we
can add it to incseq
. Otherwise, we can't.
Once again, this computation is stateful, because it needs last
to remember
the last element added to incseq
. This makes it impossible to express this
computation using filter
. Remember, the type of filter
is
filter :: (a -> Bool) -> [a] -> [a]
In the expression filter pred xs
, pred
decides whether each element of xs
satisfies some condition or not. This test is completely independent of the
decision that filter
made for any list elements that came before the current
element. The type of pred
says so. A function of type a -> Bool
produces its
result based only on the value of its argument. To replicate the behaviour of
the Python function, we need a predicate that depends on the last element we
have added to the output. To do so, it uses the State
monad; it has type
Ord a => a -> State a Bool
. But this means that we cannot use filter
to
filter our list. What we need is
filterM :: Monad m => (a -> m Bool) -> [a] -> m [a]
filterM pred = go
where
go [] = return []
go (x:xs) = do
keepThis <- pred x
keptRest <- go xs
return $ if keepThis then x : keptRest else keptRest
The predicate does not return just a Bool
. It returns a Bool
decorated using
some monad m
. The list produced by filterM
is also not just a plain list,
but it is also decorated by the monad m
. To filter the list, go []
returns
the empty list. Just as filter
, go (x:xs)
decides whether to keep x
using
the predicate pred
and then filters xs
recursively. If pred
says that x
should be kept, the result is x : keptRest
. Otherwise, it is simply
keptRest
.
Using filterM
, we can now implement our Haskell version of
increasing_subsequence
:
increasingSubsequence :: Ord a => [a] -> [a]
increasingSubsequence xs = evalState (filterM notLess xs) Nothing
where
notLess x = do
last <- get
let keep = maybe True (x >=) last
if keep then put (Just x) else return ()
return keep
>>> :{
| increasingSubsequence :: Ord a => [a] -> [a]
| increasingSubsequence xs = evalState (filterM notLess xs) Nothing
| where
| notLess x = do
| last <- get
| let keep = maybe True (x >=) last
| if keep then put (Just x) else return ()
| return keep
| :}
>>> increasingSubsequence [1,5,2,4,7,3,8]
[1,5,7,8]
increasingSubsequence
uses filterM notLess xs
to filter the given list xs
.
The notLess
predicate needs the last element added to the output to decide
whether to keep the next element or not. Initially, there is no last element
added to the output, so we represent the last element added to the output using
the type Maybe a
and initialize this state to Nothing
: no element added to
the output yet. The first thing that notLess x
does is to retrieve the last
element added to the output, using last <- get
. Then it calculates a Boolean
keep
that decides whether to keep x
. This is true if last
is Nothing
because this means that x
is the first element in the input, which we always
keep. If last = Just y
, we keep x
if x >= y
. This is captured by the
expression maybe True (x >=) last
. Next, if we keep x
, then we need to
remember that it was the last element we have kept, using put (Just x)
. If we
don't keep x
, no state updates are necessary. Since if then else
is an
expression, with a mandatory else
branch, we cannot simply do nothing if we
don't keep x
.1 Thus, we return ()
if keep
is false. This produces
the same return type as put (Just x)
, which is important because both branches
of an if then else
expression must produce values of the same type, but does
not do anything else. Finally, to tell filterM
whether to keep x
or not, we
return keep
.
Again, this example is only meant to demonstrate the use of filterM
in
conjunction with the State
monad. The experienced Haskell programmer does not
need monads to compute this type of increasing subsequence:
increasingSubsequence :: Ord a => [a] -> [a]
increasingSubsequence [] = []
increasingSubsequence (x:xs) = x : go x xs
where
go _ [] = []
go last (y:ys) | last <= y = y : go y ys
| otherwise = go last ys
or
increasingSubsequence :: Ord a => [a] -> [a]
increasingSubsequence = unfoldr next
where
next [] = Nothing
next (x:xs) = Just (x, dropWhile (< x) xs)
You shouldn't have any problems understanding how the first version works. The
second one is more interesting. Remember, unfoldr f seed
uses the function f
to generate a list from the seed value seed
. It calls f
on the seed value
seed
. If the result is Nothing
, then the generated list is empty. If the
result is Just (x, seed')
, then x
is the first element in the generated
list, and the remainder of the list is produced recursively using seed'
as the
new seed value.
Here, the input list is the seed value we use. If this seed value is empty, then
the result is also empty. We achieve this by having next
return Nothing
. If
the input is x:xs
, then the first element in the increasing subsequence is the
first list element, x
. The next element in the increasing subsequence is the
first element in xs
that is no less than x
. Thus, to obtain a seed from
which to generate the remaining elements of the increasing subsequence, we use
dropWhile (< x) xs
to remove all elements less than x
from xs
until we
encounter the first element that is not less than x
.
-
We discussed why the
else
branch in anif then else
expression is mandatory. Sinceif then else
is an expression that needs to have a value no matter whether the condition afterif
is true, we need both branches.The story changes a bit when we use monads. As this example shows, it is perfectly reasonable to do something if a condition is satisfied, and do nothing if it is not, just as when programming imperatively. That's because monads capture effects such as state updates.
We cannot easily change the behaviour of
if then else
inside ado
-block—because ado
-block is only syntactic sugar for the composition of expressions using(>>=)
and(>>)
. What we can do is to introduce functions that behave likeif then
orif not then
without anelse
-branch. TheControl.Monad
module provides two such functionswhen :: Bool -> m () -> m () when cond f = if cond then f else return () unless :: Bool -> m () -> m () unless cond f = if not cond then f else return ()
Thus,
when keep $ put (Just x)
behaves exactly likeif keep then put (Just x) else return ()
. This gives the more readable implementation ofincreasingSubsequence
:↩increasingSubsequence :: Ord a => [a] -> [a] increasingSubsequence xs = evalState (filterM notLess xs) Nothing where notLess x = do last <- get let keep = maybe True (x >=) last when keep $ put (Just x) return keep