Characters and Strings

The character type in Haskell is Char. In C and C++, a char is a single byte, typically 8 bits wide, which is enough for ASCII or one of its 8-bit extensions. Haskell's characters, by contrast, are Unicode characters (a Char holds a single Unicode code point), as in many modern programming languages. The standard comparison operators work on characters too:

GHCi
>>> 'a' == 'b'
False
>>> 'a' < 'b'
True
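To make the ordering a little more concrete: comparisons on Char follow the numeric order of the underlying code points. The sketch below uses the Prelude function fromEnum to get at those numbers. The name beforeByCodePoint is a hypothetical helper introduced only for illustration.

```haskell
-- Hypothetical helper (not from the text): compare two characters by their
-- numeric code points, obtained with the Prelude function fromEnum.
beforeByCodePoint :: Char -> Char -> Bool
beforeByCodePoint c d = fromEnum c < fromEnum d
```

One consequence is that every uppercase ASCII letter sorts before every lowercase one: 'Z' has code point 90 and 'a' has code point 97, so 'Z' < 'a' is True.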

Beyond this, there isn't all that much we can do with characters out of the box in Haskell. If we want to use more interesting functions that allow us to inspect and manipulate characters, we need to import the Data.Char module. Then we can, for example, test whether a given character is a digit:

GHCi
>>> import Data.Char
>>> isDigit 'a'
False
>>> isDigit '5'
True
>>> isDigit '+'
False

Or we can convert a character to uppercase or lowercase:

GHCi
>>> toUpper 'a'
'A'
>>> toLower 'Z'
'z'
>>> toUpper '5'
'5'

Data.Char contains a wide range of character functions. Look this module up on Hoogle and see which functions it provides.
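As a taste of what else is in there, here is a minimal sketch that combines a few Data.Char functions: the predicates isDigit, isAlpha and isSpace, and the conversions ord and chr between characters and their code points. The functions describeChar and nextLetter are hypothetical helpers made up for this example and are not part of Data.Char.

```haskell
import Data.Char (chr, isAlpha, isDigit, isSpace, ord)

-- Hypothetical helper (not from Data.Char): a rough label for a character.
describeChar :: Char -> String
describeChar c
  | isDigit c = "digit"
  | isAlpha c = "letter"
  | isSpace c = "whitespace"
  | otherwise = "other"

-- Hypothetical helper: the next lowercase ASCII letter, wrapping 'z' to 'a'.
-- Any other character is returned unchanged.
nextLetter :: Char -> Char
nextLetter c
  | c >= 'a' && c < 'z' = chr (ord c + 1)
  | c == 'z'            = 'a'
  | otherwise           = c
```

The guard syntax used here has not been introduced at this point, so treat this as a preview of what these functions are good for rather than something you need to be able to write yourself yet.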

In Haskell, strings are just lists of characters. The type of a list that contains elements of type a is [a]. So, the String type in Haskell is literally defined as a type alias for [Char]:

type String = [Char]

GHCi
>>> :info String
type String = [Char]    -- Defined in ‘GHC.Base’

We'll learn about the type keyword to define type aliases later.
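A small sketch of what the alias means in practice: a value written as a list of characters and a value written as a string literal have the same type, so they can be compared directly. The names below are hypothetical and chosen only for this illustration.

```haskell
-- Written with list syntax and annotated as [Char].
greetingAsList :: [Char]
greetingAsList = ['H', 'e', 'l', 'l', 'o']

-- Written as a string literal and annotated as String.
greetingAsString :: String
greetingAsString = "Hello"

-- This type checks only because String and [Char] are the same type.
sameGreeting :: Bool
sameGreeting = greetingAsList == greetingAsString
```

Here sameGreeting evaluates to True, since both definitions describe the same five-character list.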

Given that strings are lists, we can manipulate them using all the standard list functions we have at our disposal. Search for Data.List on Hoogle to get an idea of the kinds of functions there are for manipulating lists. You can, of course, also define your own. Here, I mention a few very basic ones. Other commonly used list functions will be introduced when we need them.

First, there is the function that tests whether a list is empty:

null :: [a] -> Bool

Since a string is just a list of characters, we can use it to test whether a string is empty or not:

GHCi
>>> null ""
True
>>> null "Hello"
False

The length function tells us the length of a list:

length :: [a] -> Int

GHCi
>>> length ""
0
>>> length "Hello"
5
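Both null s and length s == 0 test for an empty string, but they do different amounts of work: null only needs to look at the first cell of the list, whereas length walks the entire list. The sketch below illustrates this. Both orDefault and infiniteIsNotEmpty are hypothetical names introduced for this example.

```haskell
-- Hypothetical helper: use a fallback when the given string is empty.
orDefault :: String -> String -> String
orDefault fallback s = if null s then fallback else s

-- null only inspects the first cell of the list, so it even terminates on an
-- infinite list, whereas length would never return here.
infiniteIsNotEmpty :: Bool
infiniteIsNotEmpty = not (null (repeat 'a'))
```

For this reason, null is usually the better choice when all you need to know is whether a list is empty.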

For a non-empty list, head returns the first element of the list, that is, the first character when applied to a string, and tail returns the whole list with the first element removed. Both functions are undefined for the empty list: applying either one to an empty list raises an exception:

GHCi
>>> head "Hello"
'H'
>>> tail "Hello"
"ello"
>>> head ""
*** Exception: Prelude.head: empty list
>>> tail ""
*** Exception: Prelude.tail: empty list
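If a string may be empty, one way to avoid these exceptions is to return a Maybe value instead. The sketch below assumes the Maybe type, which this section has not introduced, and uses listToMaybe from Data.Maybe. The names firstChar and restOfString are hypothetical helpers, not standard functions.

```haskell
import Data.Maybe (listToMaybe)

-- Hypothetical helper: the first character, or Nothing for an empty string.
firstChar :: String -> Maybe Char
firstChar = listToMaybe

-- Hypothetical helper: everything after the first character, or Nothing for
-- an empty string. drop 1 is safe to call on any list.
restOfString :: String -> Maybe String
restOfString s = if null s then Nothing else Just (drop 1 s)
```

With these definitions, firstChar "Hello" is Just 'H' and firstChar "" is Nothing, so the caller is forced to deal with the empty case explicitly.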

Finally, we have the (:) operator (sometimes called "cons" for "construct"), which takes as arguments an element x of type t and a list xs of type [t] and constructs a new list x:xs of type [t] whose first element (head) is x and whose tail is xs:

GHCi
>>> 1 : [2,3,4,5]
[1,2,3,4,5]
>>> 'H' : "ello"
"Hello"
>>> 'H' : 'e' : 'l' : 'l' : 'o' : ""
"Hello"

Note the subtle distinction between single quotes and double quotes here. Single quotes are used to delimit individual characters. Double quotes are used to delimit strings. So 'a' is a single character, of type Char. "a" is a string of length one, of type String, that is, [Char]. Char and [Char] are two different types. That's the same as in Java, where we use single quotes to delimit characters, and double quotes to delimit strings. In Python, we can use single or double quotes interchangeably to delimit strings, because Python does not have a character type. Individual characters in Python are represented as one-character strings, a questionable choice in my opinion.

The cons operator is the most fundamental operator for manipulating lists because, as we will see soon, we can also use it in patterns to decompose a list into its head and tail. All the other list functions, such as null, length, head, tail, and the functions in Data.List, can be defined using cons and pattern matching. For example, length could be written as:

length :: [a] -> Int
length []     = 0
length (_:xs) = 1 + length xs

It's okay if you don't understand this function definition yet. You will soon. This definition says that the length of the empty list [] is 0, and that the length of any non-empty list is one more than the length of its tail. It is a recursive definition that computes the length of any finite list.
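In the same style as the length definition above, here are two more sketches built from nothing but cons and pattern matching. null' is a possible definition of the emptiness test, with a primed name so that it does not clash with the Prelude's null. capitalize is a hypothetical helper that takes a string apart with a pattern and puts it back together with cons.

```haskell
import Data.Char (toUpper)

-- A possible definition of the emptiness test. The primed name avoids a
-- clash with the Prelude's null.
null' :: [a] -> Bool
null' []    = True
null' (_:_) = False

-- Hypothetical helper: convert the first character of a string to uppercase.
-- The pattern (c:cs) splits the string into head and tail, and the
-- right-hand side rebuilds it with cons.
capitalize :: String -> String
capitalize []     = []
capitalize (c:cs) = toUpper c : cs
```

For example, capitalize "hello" is "Hello", and capitalize "" is "" because the first equation handles the empty string.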

Finally, there is a list concatenation operator, (++), which we can also use to concatenate strings:

GHCi
>>> "Hello" ++ ", world!"
"Hello, world!"

That's it for characters and strings for now. As you learn about more advanced list functions that transform, filter, or partition lists, remember that all of them can also be applied to strings, because strings are lists of characters.
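As a brief preview, here is a sketch of general list functions applied to strings. map and filter are standard Prelude functions that this section has not covered. shout and digitsOnly are hypothetical helpers. append is one possible way to define list concatenation with cons and pattern matching, named differently so that it does not clash with the Prelude's (++).

```haskell
import Data.Char (isDigit, toUpper)

-- Hypothetical helper: apply toUpper to every character of a string.
shout :: String -> String
shout = map toUpper

-- Hypothetical helper: keep only the characters that are digits.
digitsOnly :: String -> String
digitsOnly = filter isDigit

-- One possible definition of list concatenation: peel elements off the
-- first list and cons them onto the result of appending the rest.
append :: [a] -> [a] -> [a]
append []     ys = ys
append (x:xs) ys = x : append xs ys
```

None of these functions mention strings specially. shout and digitsOnly only work on strings because toUpper and isDigit work on characters, while append works on lists of any element type, so append "Hello" ", world!" gives "Hello, world!" just as (++) does.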