One of the most important ingredients of Haskell is its type system. Standard Haskell already provides a lot of useful mechanisms for having things checked at compile time, and the language extensions provided by the Glasgow Haskell Compiler (GHC) improve heavily on this.
In this article, I will present several of Haskell’s type system features. Some of them belong to the standard, others are only available as extensions. This is a write-up of a talk I gave on 31 January 2013 during the Theory Lunch of the Institute of Cybernetics. This talk provided the basics for another Theory Lunch talk, which was about the Constraint
kind.
This whole article was written as a literate Haskell file with ordinary text written in Markdown. You can download this literate Haskell file, read it, and load it into GHCi to play with the code. The HTML for the blog post was generated using Pandoc.
Prerequisites
We first enable some language extensions that we will use in this article:
{-# LANGUAGE MultiParamTypeClasses, TypeFamilies #-}
We will reimplement some bits of the Prelude for illustration purposes, and we will use functions from other modules whose names clash with those of certain Prelude functions. Therefore, we have to hide parts of the Prelude:
import Prelude hiding (Eq (..), Functor (..), lookup, (!!))
Furthermore, we need some additional modules for example code:
import Data.Char
import Data.Natural
import Data.Stream hiding (map)
import Data.Set hiding (map)
import Data.IntSet hiding (map)
These imports require the packages Stream
, natural-numbers
, and containers
to be installed.
Kinds
Types typically denote sets of values. For example, Integer
denotes the set of all integers, Char
denotes the set of all characters, and [Bool]
denotes the set of all truth value lists.
However, Haskell uses a more general notion of type, which also covers functions on types. So for example, the unapplied list type constructor []
is also considered a type, as is a partial application of the function type constructor like (->) Integer
. Clearly, these types do not contain values.
To distinguish between ordinary types and functions on types, Haskell uses a system of kinds. Kinds are the “types of types”, so to say. A kind is either *
or has the form kind1 -> kind2
, where kind1
and kind2
are again kinds. The kind *
is the kind of all ordinary types (that is, types that contain values), and a kind1 -> kind2
is the kind of all type-level functions from kind1
to kind2
. Following are some examples of types and their kinds:
Integer, Char, [Bool], [[Bool]], [val] :: *
[], Maybe :: * -> *
Note that in a kind kind1 -> kind2
, kind1
and kind2
can be kinds of function types again. So higher-order types are possible. As a result, type functions with several arguments can be implemented by means of Currying. For example, Haskell’s pair type and function type constructors are kinded as follows:
(,), (->) :: * -> * -> *
Furthermore, we can have type constructors with type function arguments. Take the following generic types of trees and forests as an example:
data Tree func label = Tree label (Forest func label)
newtype Forest func label = Forest (func (Tree func label))
From these types, we can get various more concrete types by partial application:
type RoseTree = Tree []
type RoseForest = Forest []
type NonEmptyList = Tree Maybe
type PossiblyEmptyList = Forest Maybe
type InfiniteList = Tree Identity
The Identity
type used in the definition of Stream
is defined as follows:
newtype Identity val = Identity val
I also want to mention that if we have a type Event
in functional reactive programming, Tree Event
is the type of behaviors that change only at discrete times, and Forest Event
is the type of event streams.
Type classes
A type class denotes a set of types, which are called the instances of the class. Each class declares a set of methods that its instances have to implement.
Simple type classes
As an example of a type class, let us partially reimplement the Eq
class from the Prelude, whose methods are (==)
and (/=)
:
class Eq val where
(==), (/=) :: val -> val -> Bool
Eq
is supposed to cover all types whose values can be checked for equality. Here is an instance
declaration for the Bool
type:
instance Eq Bool where
False == bool2 = not bool2
True == bool2 = bool2
bool1 /= bool2 = not (bool1 == bool2)
Constructor classes
It is possible to define classes whose instances have a kind other than *
. These are sometimes called constructor classes. An example of such a class is the Functor
class from the Prelude, whose instances have kind * -> *
. Here is a reimplementation:
class Functor func where
fmap :: (val -> val') -> (func val -> func val')
Typical instances of Functor
are []
and Maybe
:
instance Functor [] where
fmap = map
instance Functor Maybe where
fmap _ Nothing = Nothing
fmap fun (Just val) = Just (fun val)
Types Tree func
and Forest func
can also be made instances of Functor
, provided that func
is a Functor
instance itself.
instance Functor func => Functor (Tree func) where
fmap fun (Tree root subtrees) = Tree (fun root)
(fmap fun subtrees)
instance Functor func => Functor (Forest func) where
fmap fun (Forest trees) = Forest (fmap (fmap fun) trees)
Note that these instance
declarations make the specialized versions of Tree
and Forest
that we have defined above automatically instances of Functor
.
Multi-parameter type classes
GHC allows classes to have multiple parameters. While single-parameter classes denote sets of types, multi-parameter classes denote relations between types. An example of a class with two parameters is the class that relates types for which there is a conversion function:
class Convertible val val' where
convert :: val -> val'
We can convert from type Int
to type Integer
, but also between types Int
and Char
:
instance Convertible Int Integer where
convert = toInteger
instance Convertible Int Char where
convert = chr
instance Convertible Char Int where
convert = ord
Type families
Haskell allows us to define new types using data
declarations. An example of such a declaration is the following one, which introduces a type of lists:
data List el = Nil | Cons el (List el)
Furthermore, we can use type
declarations for defining aliases of existing types. For example, we can use the following type
declaration to define types of functions whose domain and codomain are the same:
type Endo val = val -> val
Both data
and type
declarations have in common that the types they define are parametric. Informally speaking, this means that the basic structure of the defined types is independent of type parameters. For example, lists are always either empty or pairs of an element and another list, no matter what the element type is. The choice of an el
parameter only determines the structure of elements. Likewise, values of a type Endo val
are always functions whose domain and codomain are the same. The val
parameter just determines the concrete domain and codomain type in use.
There are situations, however, where we want to define type-level functions that yield completely differently structured types for different arguments. This is possible with the type family extension that GHC provides.
There exist two flavors of type families: data families and type synonym families. Data families introduce new types and use the data
keyword, while type synonym families define aliases for types and use the type
keyword. This is analogous to data
and type
declarations, respectively. Type families can be stand-alone or associated. The former variant is analogous to top-level functions, while the latter is analogous to class methods. We will only deal with the latter in this post.
Data families
As an example of a data family, we define a type of total maps, that is, maps that assign values to every value of a chosen key type. Essential to our definition is that different key types lead to differently structured maps. We declare a class of key types, which contains the data family for total maps:
class Key key where
data TotalMap key :: * -> *
lookup :: key -> TotalMap key val -> val
Let us now give an instance
declaration for Bool
. Total maps with boolean keys are essentially pairs of values, consisting of one value for the False
key and one value for the True
key. Our instance
declaration reflects this:
instance Key Bool where
data TotalMap Bool val = BoolMap val val
lookup False (BoolMap falseVal _) = falseVal
lookup True (BoolMap _ trueVal) = trueVal
Total maps whose keys are natural numbers correspond to infinite lists, that is, streams of values:
instance Key Natural where
data TotalMap Natural val = NaturalMap (Stream val)
lookup nat (NaturalMap str) = str !! fromIntegral nat
More advanced things are possible. For example, pairs of keys can again serve as keys. A total map of a type TotalMap (key1,key2) val
corresponds to a function of type (key1,key2) -> val
, which in turn corresponds to a function of type key1 -> key2 -> val
. This suggests how to implement total maps with pair keys:
instance (Key key1, Key key2) => Key (key1,key2) where
data TotalMap (key1,key2) val
= PairMap (TotalMap key1 (TotalMap key2 val))
lookup (key1,key2) (PairMap curriedMap)
= lookup key2 (lookup key1 curriedMap)
Type synonym families
Let us now look at an example of a type synonym family. We define a class of collection types where a collection is basically anything that contains elements. Here are two examples of collection types:
Set el
for any typeel
that is an instance ofOrd
IntSet
Our class contains a type synonym family that tells for every collection type what the corresponding type of collection elements is. The class
declaration is as follows:
class Collection coll where
type Element coll :: *
member :: Element coll -> coll -> Bool
Now we can form instance
declarations for the above example collection types:
instance Ord el => Collection (Set el) where
type Element (Set el) = el
member = Data.Set.member
instance Collection IntSet where
type Element IntSet = Int
member = Data.IntSet.member
[…] I talked about some of Haskell’s type system features. You can find a write-up of my talk on my personal blog. […]
Thank you for the interesting post. One question I couldn’t answer for myself: How do I create instances of types Tree and Forest?
For example, later when you define type List, there is a non-recursive constructor Nil, that allows to stop recursion. In case of Tree and Forest, the recursion seems to be infinite.
According to the type declarations, every tree contains a forest, but a forest contains something of type
func (Tree func label)
. There may be values of this type that do not contain trees. For example in the case ofRoseTree
,func
is[]
, sofunc (Tree func label)
is[RoseTree label]
. As a result, you have a base in the form of the empty list. For instance, the expressionTree 'X' (Forest [])
denotes a rose tree with just a single node with label'X'
.Note that even without such a base, the
Tree
andForest
types are not useless, since you can have infinite data structures.Ah, indeed. Pretty simple. Thank you for the explanation!