zip, zipWith, and unzip

It is not uncommon to want to combine multiple lists into a single list, pairing the first element in one list with the first element in another list, and the same for the second element, third element, and so on:

GHCi

>>> zip [1,2,3] ["Hello","you","there"]
[(1,"Hello"),(2,"you"),(3,"there")]

We may also want to take a list of pairs apart into two lists:

GHCi

>>> unzip [(1,"Hello"),(2,"you"),(3,"there")]
([1,2,3],["Hello","you","there"])

zip and unzip are the functions for the job.

zip :: [a] -> [b] -> [(a,b)]
zip (x:xs) (y:ys) = (x,y) : zip xs ys
zip _      _      = []

unzip :: [(a,b)] -> ([a], [b])
unzip []          = ([], [])
unzip ((x,y):xys) = (x:xs, y:ys)
  where
    (xs, ys) = unzip xys

If you squint at this implementation of zip, you will see that the list it produces has the length of the shorter of the two input lists. The remaining elements in the longer list are discarded:

GHCi

>>> zip [1,2,3] "Hello"
[(1,'H'),(2,'e'),(3,'l')]

This makes perfect sense: Since the output list needs to contain pairs of elements from the two input lists, what should we pair an element in the longer list with if there are no more elements in the shorter list left?

Combined with lazy evaluation (see previous chapter), this behaviour is tremendously useful.

zip also has a more general sibling,

zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
zipWith f = go
  where
    go (x:xs) (y:ys) = f x y : go xs ys
    go _      _      = []

Instead of forming pairs of elements taken from the two input lists, zipWith uses the provided function to compute a value of an arbitrary type c from each pair of elements taken from the two input lists. For example,

GHCi

>>> zipWith (+) [1,2,3] [4,5,6]
[5,7,9]

Since (,) :: a -> b -> (a, b) is a data constructor, we have

zip = zipWith (,)

Conversely,

zipWith f xs ys = map (uncurry f) $ zip xs ys

GHCi

>>> map (uncurry (+)) $ zip [1,2,3] [4,5,6]
[5,7,9]

I'll leave it as an exercise to figure out that this relationship holds. As a bonus exercise, figure out why we weren't able to define

zipWith f = map (uncurry f) . zip

As a final comment, the Data.List module also has versions of zip, unzip, and zipWith that take more than two lists as arguments or, in the case of unzip, produce more than two lists. These functions are called zip3, unzip3, zipWith3, zip4, unzip4, zipWith4`, and so on. For example,

GHCi

>>> zip3 [1,2,3] [False,True,False] ["Hello","beautiful","world"]
[(1,False,"Hello"),(2,True,"beautiful"),(3,False,"world")]

Exercise

Use zip, filter, and map to implement a function

everyOther :: [a] -> [a]

that returns a list containing every other element of its argument list:

GHCi

>>> everyOther [1,2,3,4,5]
[1,3,5]
>>> everyOther "Hello"
"Hlo"

To do so elegantly, you'll find the infinite list of integers [1..] helpful.

Solution

everyOther = map snd . filter (odd . fst) . zip [1..]

everyOther xs applies zip [1..] to xs, that is, it pairs every element of xs with its position in the list:

GHCi

>>> zip [1..] "Hello"
[(1,'H'),(2,'e'),(3,'l'),(4,'l'),(5,'o')]

It then keeps only those pairs that satisfy the condition odd . fst, that is, those pairs corresponding to elements at odd positions in xs:

GHCi

>>> filter (odd . fst) $ zip [1..] "Hello"
[(1,'H'),(3,'l'),(5,'o')]

And finally, it strips the positions of the remaining elements, that is, it produces a new list containing only the second component of each pair. We used the positions only to figure out which elements to keep:

GHCi

>>> map snd $ filter (odd . fst) $ zip [1..] "Hello"
"Hlo"

So this gives

everyOther xs = map snd $ filter (odd . fst) $ zip [1..] xs

The above is just the point-free form that uses function composition. If you don't see why these two definitions are equivalent, here is how we arrive at the point-free form:

everyOther xs
    = map snd $ filter (odd . fst) $ zip [1..] xs
    = map snd (filter (odd . fst) (zip [1..] xs))        -- Definition of ($)
    = (map snd . filter (odd . fst)) (zip [1..] xs)      -- Definition of (.)
    = (map snd . filter (odd . fst) . zip [1..]) xs      -- Definition of (.)
everyOther = map snd . filter (odd . fst) . zip [1..]    -- Drop the common argument xs from both sides

Exercise

Implement a function

filterUps :: Ord a => [a] -> [a]

Given an input list xs, it keeps only those elements in xs that are greater than the elements right before them:

GHCi

>>> filterUps []
[]
>>> filterUps [1]
[]
>>> filterUps [1,2,4,3,7,10,5,4,5]
[2,4,7,10,5]

In a list with less than two elements, there is no element that has any element before it that could be smaller. So the result is the empty list. In the last example, 1 has no predecessor, so it is discarded. 2, 4 (the first one), 7, 10, and 5 (the second one) are greater than the elements before them, so they are kept. 3, 5 (the first one), and 4 (the second one) are less than the elements before them, so they are discarded.

You'll likely need zip, filter, and map to implement this function.

Solution

filterUps :: Ord a => [a] -> [a]
filterUps [] = []
filterUps xs = map snd $ filter (uncurry (<)) $ zip xs (tail xs)

Zipping xs with tail xs produces a list of all pairs of consecutive elements in xs:

GHCi

>>> xs = [1,2,4,3,7,10,5,4,5]
>>> zip xs (tail xs)
[(1,2),(2,4),(4,3),(3,7),(7,10),(10,5),(5,4),(4,5)]

filter (uncurry (<)) keeps only those pairs where the first element is greater than the second one:

GHCi

>>> filter (uncurry (<)) $ zip xs (tail xs)
[(1,2),(2,4),(3,7),(7,10),(4,5)]

Those are exactly the pairs whose second components are the elements of xs that should be kept. map snd extracts these second components:

GHCi

>>> map snd $ filter (uncurry (<)) $ zip xs (tail xs)
[2,4,7,10,5]