Type Signatures
So far, I've thrown around things like
(+) :: Num a => a -> a -> a
fairly informally, without properly introducing the Haskell syntax for specifying the types of things. Let's rectify this. Every data object in every programming language has a type. Where languages differ is in what we are allowed to do with objects of certain types and whether these constraints are enforced at compile time or at runtime. Haskell is a language that checks the type-correctness of your program at compile time. Languages that do this are called statically typed.
I've already introduced you to the :type command in GHCi, which we can use to
inspect the types of arbitrary expressions:
>>> :type 'a'
'a' :: Char
Most of the time, we do not need to specify the type of an expression. The
Haskell compiler can figure it out on its own. As I discuss below, there are
still situations when we do want to specify the type of an expression
explicitly. To do this, we write the expression, two colons (::), and the
type of the expression, as in 'a' :: Char above. This says that the
expression 'a' has type Char.
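As a small sketch of this syntax, the definitions below attach explicit types in three ways: a signature on its own line, an annotation on a whole expression, and an annotation on a subexpression. The names letterA, greeting, and three are made up for illustration.

```haskell
-- Hypothetical names, chosen only for illustration.

-- A type signature on its own line, followed by the definition.
letterA :: Char
letterA = 'a'

-- An annotation attached directly to an expression.
greeting = "hello" :: String

-- An annotation on a subexpression: the literal 3 is fixed to Int,
-- so the whole sum is an Int as well.
three = (3 :: Int) + 0
```

Loading these definitions into GHCi and asking for :type three should report three :: Int, because the annotation on the literal determines the type of the whole expression.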
When we try this with a numeric expression, the answer we get is a bit less clear. We'd probably expect something like:
>>> :type 5
5 :: Int
Instead, we get
>>> :type 5
5 :: Num p => p
I mentioned before that Haskell makes it easy to write functions that can be
applied to arguments of many different types, something we would call generic
functions in languages such as Java. In Haskell, such functions are called
polymorphic.[1] Here, 5 is a polymorphic expression. Without a context
that constrains the type of 5, Haskell allows 5 to be of any number type.
And this is exactly what the type expression above tells us. You can read the
=> like an implication: If p is a number type, then 5 can have type p.
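To make this concrete, here is a minimal sketch in which the same literal 5 takes on different number types depending on its context. All names are hypothetical.

```haskell
-- Hypothetical names for illustration.

-- The same literal 5, used at three different number types.
fiveInt :: Int
fiveInt = 5

fiveDouble :: Double
fiveDouble = 5

fiveInteger :: Integer
fiveInteger = 5

-- No signature here: the context decides. Because (/) needs both
-- operands to have the same type and 2 is annotated as Double,
-- the literal 5 must be a Double as well.
halfOfFive = 5 / (2 :: Double)
```

In each definition, the surrounding context supplies the concrete type that the constraint Num p leaves open.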
So that's simple values. What about functions? First of all, functions are ordinary values in Haskell, just like integers, characters, strings, or any other type of data. The type signature
(+) :: Num a => a -> a -> a
states that (+) refers to a function object whose type is a -> a -> a,
where a can be any number type. The arrows separate the argument types from
one another and from the return type: the type after the last arrow is the
return type of the function. Here, this means that (+) takes two arguments.
The first argument is of type a, as is the second argument. The return type
is also a. Here, a is a type variable, that is, a can be almost any type,
but it must be an instance of the Num type class: it must be a number type.
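We can give our own functions signatures of the same shape. The sketch below defines a function addTwice (a made-up name, not a library function) with the same type as (+) and uses it at two different number types.

```haskell
-- addTwice is a hypothetical function, not part of any library.
-- It works for any type a that is an instance of Num.
addTwice :: Num a => a -> a -> a
addTwice x y = x + y + y

-- The same function used at two different number types.
intResult :: Int
intResult = addTwice 1 2 -- here a is Int

doubleResult :: Double
doubleResult = addTwice 1.5 0.25 -- here a is Double
```

Both arguments and the result share the single type variable a, so an expression that mixes types, such as addTwice (1 :: Int) (2.5 :: Double), is rejected at compile time.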
Later, we will see much more complex types, but you can interpret them in the
same fashion. For example, we will soon talk about the map function. Its
type is
map :: (a -> b) -> [a] -> [b]
This says that map takes two arguments. The first is of type a -> b, that
is, it is itself a function that takes an argument of type a and produces a
result of type b. The second one is of type [a], that is, it is a list of
values of type a. The result of map is of type [b], that is, it is a list of
values of type b. Given a function f and a list xs, map f xs applies f to
every element in xs. The output of map f xs is the list of results this
produces:
>>> import Data.Char
>>> map toUpper "Hello"
"HELLO"
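In the example above, a and b happen to be the same type, Char. The following sketch shows a few more instantiations of the type variables, including cases where a and b differ. The names shout, codes, and evens are made up for illustration; ord, toUpper, and even are standard library functions.

```haskell
import Data.Char (ord, toUpper)

-- a = Char, b = Char: toUpper :: Char -> Char
shout :: String
shout = map toUpper "Hello"

-- a = Char, b = Int: ord :: Char -> Int
codes :: [Int]
codes = map ord "abc"

-- a = Int, b = Bool: the annotation on 3 fixes the element type
evens :: [Bool]
evens = map even [1, 2, 3 :: Int]
```

In every case, the element type of the input list matches the argument type of the function, and the element type of the result list matches the function's result type, exactly as the signature of map demands.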
When to Specify Types Explicitly
As I said, the Haskell compiler is rather intelligent at figuring out the types of expressions, so most of the time, you don't have to specify these types. There are four situations when you may still want to specify the types of values, especially of functions. Let's discuss them here:
-
Documentation: Especially given the ease with which we can define custom types in Haskell, the type signature of a function, combined with its name if chosen well, can give us a strong hint as to what the function does and what its arguments are. Thus, it is common practice to document top-level functions (functions defined at the module level, not as local values within a function definition) by providing explicit type signatures.
-
Debugging: Sometimes, an expression doesn't have the type we think it has (because we made a mistake). This may not immediately create a problem because the incorrect type may play nicely with other expressions that use the incorrect expression directly, only their type is now also wrong. These incorrect types may propagate across multiple levels of dependencies between expressions until we finally reach an expression that cannot work with a subexpression of the incorrect type. At this point, the compiler complains. As a result, we get an error message at a point in our code that isn't actually the problem, and the error message seems baffling. Whenever this happens, it helps to (temporarily) annotate the different expressions in our program with explicit types. This ensures that the type error is caught at its source because the compiler can verify whether every expression in our program has the type we think it has (according to the type signatures we provided).
-
Helping the type inferencer: The way the type inferencer determines the types of values is by looking at the functions we call. If we call a function foo with argument x and foo expects an Int argument, x must be an Int. Haskell has strong support for polymorphic functions, functions that can be applied to arguments of many different types. A downside of this is that this may weaken type inference. If the function foo accepts an argument of any type or even just of any number type and there is no other information about the type of x available, then the fact that we pass x to foo does not tell us the type of x. In this case, the type inferencer will tell us that it cannot infer the type of x unambiguously and ask us to specify x's type explicitly.[2]
-
Specialization for efficiency: This use of type signatures is also related to polymorphic functions. Polymorphism is a useful feature to avoid code duplication. We can write a single function that can be used for any possible type or at least any type that meets certain constraints, expressed using type classes (e.g., the Num a constraint on the (+) function above). However, a function that can be used with arguments of many different types needs to determine at runtime how to handle the argument it is given, for example, which implementation of (+) to use. At least, this is true for the way Haskell implements functions constrained by type classes. The type inferencer will always derive the most general types for the values in your program that make all types in your program fit together correctly (if that's possible). As a result, it may infer a polymorphic type for your function, even though you only ever need the version whose first argument has type Int. By specifying explicitly that you want the type of the first argument to be Int, you may enable the compiler to generate more efficient code because it knows that there is no need for a fully polymorphic version of the function.
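The last two situations lend themselves to a short sketch. For the ambiguity case, the standard show/read example stands in for the foo of the text, since both functions are polymorphic and nothing else pins down the intermediate type. For the specialization case, the same function body is given once with a general and once with a specialized signature. All function names are hypothetical, and whether the specialized version is actually faster depends on the compiler and its optimization settings.

```haskell
-- Helping the type inferencer.
-- In a source file, the following definition is rejected as ambiguous,
-- because nothing says which type the string should be parsed as:
--
--   roundTrip s = show (read s)
--
-- An annotation on the intermediate value resolves the ambiguity.
roundTripInt :: String -> String
roundTripInt s = show (read s :: Int)

-- A different annotation gives a different, equally valid program.
roundTripDouble :: String -> String
roundTripDouble s = show (read s :: Double)

-- Specialization.
-- The most general type of this function works for any number type.
sumSquaresPoly :: Num a => [a] -> a
sumSquaresPoly xs = sum (map (\x -> x * x) xs)

-- The same body with a specialized signature. The compiler now knows
-- that only the Int versions of (*) and (+) are needed.
sumSquaresInt :: [Int] -> Int
sumSquaresInt xs = sum (map (\x -> x * x) xs)
```

The two roundTrip variants show that the annotation is not a formality: roundTripInt "42" yields "42", while roundTripDouble "42" yields "42.0".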
[1] "Polymorphic" means "of many forms". If you think about the type of an expression as the "form" that it takes (the type certainly determines the manner in which a given value is represented in memory), then a polymorphic expression is one that can take many forms, that is, one that can be of many types, possibly subject to some constraints on those types.
[2] Actually, most of the time, it is perfectly fine to keep the type of an expression polymorphic. The compiler will complain only when it really needs to know the concrete type of an expression.