Computer Science, Haskell, Programming, Type Theory

Implementing a Hindley-Milner Type System (Part 1)

By 05st

Recently, I've started working on a rather ambitious project. I have been interested in functional programming for the past few years, and I have also become increasingly interested in programming language development and type theory. So, to satisfy my interests, I decided to begin writing a statically typed, functional programming language. (It's named Artemis, and you can check out the GitHub repository here.)

One key feature I was determined to implement for Artemis was a Hindley-Milner type system, due to its powerful type inference capabilities and its support for parametric polymorphism.

In this blog post, I will walk through an implementation of a Hindley-Milner type system for an example programming language, while sharing one or two things I've learned along the way. (I will be using Haskell in this tutorial. However, the explanations should be accessible to everyone.)

The Language

To begin, I will give an overview of the language we will be working with. It's essentially the lambda calculus, extended with a few types, let-bindings, if-expressions, integer/boolean literals, and two binary operators. I am keeping it simple on purpose; you are encouraged to extend it with any features you desire.

{- Expressions -}
abc                 -- Variable
if a then b else c  -- If-expression
\x -> y : a -> b    -- Abstraction
f x                 -- Application
let a = b in c      -- Let-binding
123456789 : Int     -- Integer
True, False : Bool  -- Boolean
x + y               -- Integer addition
x - y               -- Integer subtraction

(If something is written a : T, it means that a is of type T.)

Syntax Representation

We should start by defining some data types to represent our language. Enable the PatternSynonyms language extension (we will use it in a bit), and before anything else, let's add a type alias for String for legibility.

{-# LANGUAGE PatternSynonyms #-}

type Name = String

The Expr data type will resemble the example syntax given above. Most of it is hopefully self-explanatory. We will have a data type to represent operators as well.

data Expr
    = EVar Name
    | EIf Expr Expr Expr
    | EAbs Name Expr
    | EApp Expr Expr
    | ELet Name Expr Expr
    | EInt Integer
    | EBool Bool
    | EBin Oper Expr Expr
    deriving Show

data Oper = Add | Sub deriving Show
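For example, the expression let id = (\x -> x) in id True would be represented as the following AST (the binding name example is just for illustration):

example :: Expr
example = ELet "id" (EAbs "x" (EVar "x")) (EApp (EVar "id") (EBool True))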

We will also have a data type to represent types. TCon is a concrete type, such as Int, Bool, or List Char. TVar is for type variables, things which are placeholders for concrete types.

newtype TVar = TV Name
    deriving (Eq, Show)

data Type
    = TCon Name [Type]
    | TVar TVar
    deriving (Eq, Show)

(Don't forget to derive Eq and Show for the data types; we will need them later on.)

Let's also define some built-in types. This is where the PatternSynonyms language extension becomes useful.

pattern TInt = TCon "Int" []
pattern TBool = TCon "Bool" []
pattern a :-> b = TCon "->" [a, b]

(The arrow type is the type of functions.)
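These pattern synonyms are bidirectional, so they work both for constructing and for pattern-matching types. A quick illustration (the names intToBool and isFunction are just for this example):

-- Constructing the type Int -> Bool:
intToBool :: Type
intToBool = TInt :-> TBool  -- same value as TCon "->" [TCon "Int" [], TCon "Bool" []]

-- Matching on function types:
isFunction :: Type -> Bool
isFunction (_ :-> _) = True
isFunction _         = False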

Parsing

Since parsing is not going to be a focus of this post, you have a few options. The first is to work without a parser (that is, directly with the AST). Another is to write one yourself. Finally, you can copy the one I wrote for this tutorial here, along with the rest of the source code. (I used the parser combinator library Parsec.)

The Type System

With all of the preliminary steps out of the way, we can begin to get into the main details of a Hindley-Milner type system.

Essentially, type inference is the process of descending the abstract syntax tree, assigning fresh type variables to every expression, and generating a set of constraints based on what information is available. Afterwards, solving the set of constraints (much like solving a system of equations) will result in a substitution. This substitution maps every generated type variable to an inferred principal type. The process of solving the constraints is known as unification. (The principal type is, more or less, the most general type of an expression, i.e. any other possible type for the expression is an instance of the principal type.)
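To make those terms concrete, here is one plausible shape for the pieces just described. This is only a sketch of mine, not the final design; part 2 defines the real versions:

-- An equation between two types that the solver must satisfy.
type Constraint = (Type, Type)

-- A substitution maps type variables to the types they stand for.
-- (A real implementation would likely use Data.Map, which needs an Ord
-- instance for TVar; an association list keeps this sketch dependency-free.)
type Subst = [(TVar, Type)]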

The Hindley-Milner type system is defined by a set of inference rules:

$$
\begin{array}{cl}
\displaystyle\frac{x:\sigma \in \Gamma}{\Gamma \vdash x:\sigma} & \textbf{Var} \\ \\
\displaystyle\frac{\Gamma \vdash e_1:\tau_1 \rightarrow \tau_2 \qquad \Gamma \vdash e_2 : \tau_1}{\Gamma \vdash e_1\ e_2 : \tau_2} & \textbf{App} \\ \\
\displaystyle\frac{\Gamma,\;x:\tau_1 \vdash e:\tau_2}{\Gamma \vdash \lambda\ x\ .\ e : \tau_1 \rightarrow \tau_2} & \textbf{Abs} \\ \\
\displaystyle\frac{\Gamma \vdash e_1:\sigma \qquad \Gamma,\,x:\sigma \vdash e_2:\tau}{\Gamma \vdash \mathtt{let}\ x = e_1\ \mathtt{in}\ e_2 : \tau} & \textbf{Let} \\ \\
\displaystyle\frac{\Gamma \vdash e: \sigma \qquad \overline{\alpha} \notin \mathtt{ftv}(\Gamma)}{\Gamma \vdash e:\forall\ \overline{\alpha}\ .\ \sigma} & \textbf{Gen} \\ \\
\displaystyle\frac{\Gamma \vdash e: \sigma_1 \qquad \sigma_1 \sqsubseteq \sigma_2}{\Gamma \vdash e : \sigma_2} & \textbf{Inst}
\end{array}
$$

If you haven't seen inference rules before, they may look like an alien language to you. They're actually quite simple; in this notation, anything above the horizontal line is a hypothesis, and anything below it is a conclusion. $\Gamma$ (capital gamma) is the context, and $\vdash$ can be read as "asserts" or "proves". The letter $\sigma$ (sigma) usually represents polymorphic types, whereas $\tau$ (tau) usually represents monomorphic types. Polymorphism and monomorphism are explained later on. The letter $\alpha$ (alpha) usually represents a type variable.

Each rule listed above describes how we deal with a part of our language. The first four cover variables, function application, function abstraction, and let-bindings. We will discuss the last two, generalization and instantiation, afterwards.

Let's try to parse the first one. The hypothesis ($x:\sigma\in\Gamma$) is that $x$ is of type $\sigma$ in the context $\Gamma$. The conclusion ($\Gamma\vdash x:\sigma$) of that is: in $\Gamma$, it can be asserted that $x$ is of type $\sigma$. This rule may sound a bit silly, but it's important because it shows us how the context is used. (The context can also be called the typing environment.)

The second one may appear a bit more involved, but it's still equally simple. In $\Gamma$, if there is a function $e_1$ which maps values of type $\tau_1$ to values of type $\tau_2$ ($e_1:\tau_1\to\tau_2$), and there is an expression $e_2$ which is of type $\tau_1$, then, in $\Gamma$, the application of $e_1$ to $e_2$ has type $\tau_2$.

In the third rule, the hypothesis is the context $\Gamma$, extended with a variable $x : \tau_1$, asserting $e:\tau_2$. The conclusion is that a function which takes $x$ and evaluates to $e$ ($\lambda x.\ e$) is of type $\tau_1\to\tau_2$; it maps values of type $\tau_1$ to values of type $\tau_2$.

The fourth rule shows us how to handle let-bindings. If in $\Gamma$ some expression $e_1$ has type $\sigma$, and in $\Gamma$ extended with a variable $x$ of the same type $\sigma$ it is asserted that $e_2:\tau$, then we can conclude that binding $x=e_1$ and evaluating $e_2$ gives us an expression of type $\tau$.

Now that the inference rules - hopefully - make sense, you are most likely wondering: how does one use this information for type inference? It's somewhat simple, actually. All you have to do is work backwards through the typing rules. Given an expression and a conclusion (from the rules) which matches the expression, you can assign fresh (unused) type variables to sub-expressions of unknown types, and use the judgements in the hypotheses to develop constraints on those type variables. If this doesn't make sense, think about it a bit more. It will hopefully make more sense once we begin the implementation.
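To make the idea concrete, here is the process run informally on a tiny expression (t0 is a fresh type variable, and ~ is my shorthand for "must equal"):

Inferring \x -> x + 1:
  1. Assign the parameter a fresh type variable:      x : t0
  2. Addition works on integers, so the body
     generates the constraint t0 ~ Int and has type:  Int
  3. The abstraction therefore has type:              t0 -> Int
  4. Solving t0 ~ Int yields the principal type:      Int -> Int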

Next up are the rules for generalization and instantiation, but first we must figure out what polymorphism and monomorphism are.

Polymorphism and Monomorphism

Polymorphism in general is about allowing functions to take inputs of varying types. The type of polymorphism we will be focusing on is known as parametric polymorphism. This is just a fancy name for generics. It differs from ad-hoc polymorphism (also known as overloading) by the fact that a parametrically polymorphic function must handle every type of input the same way. The trivial example of a parametrically polymorphic function is the identity function.

$\lambda x.x : \forall a.\ a\to a$

The type variable $a$ is universally quantified. We will represent these polymorphic types (polytypes) with type schemes. In general, a type containing type variables can be generalized into a type scheme. (Universal quantification, $\forall$, is read "for all," whereas existential quantification, $\exists$, is read "there exists.")
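Concretely, a type scheme can be represented as a list of quantified type variables paired with a type. A minimal sketch (part 2 will define the version we actually use):

-- forall a b ... . t
data Scheme = Forall [TVar] Type
    deriving Show

To show why we need to generalize at all, let's consider monomorphism.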

Monomorphism is the opposite of polymorphism - a monomorphic function only accepts inputs of one specific type. Even if the inferred type contains type variables, later uses of the function would cause type inference to reduce it to a specific type. For example:

let id = (\x -> x) in (id True)

When the type of id is inferred, it is inferred to be id : a -> a. However, in the body of the let-binding, we apply it to a boolean. Since we haven't explicitly stated that id should be polymorphic, the final inferred type of id would turn out to be id : Bool -> Bool. Type schemes solve this issue, and let us infer id : forall a. a -> a.
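In the Scheme representation sketched above, that inferred type scheme would look like (idScheme is just an illustrative name):

idScheme :: Scheme
idScheme = Forall [TV "a"] (TVar (TV "a") :-> TVar (TV "a"))
-- read: forall a. a -> a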

Generalization and Instantiation

Generalization and instantiation are the two most important concepts in the Hindley-Milner type system. Generalization deals with converting a monotype (monomorphic type) $\tau$ into a polytype $\sigma$ by closing over all occurrences of free type variables in $\tau$. Closing over simply means removing any free variables, and in mathematics, a free variable is one that is not defined in the current "scope." We remove occurrences of free variables by quantifying them.

The rule for generalization is:

$$
\frac{\Gamma \vdash e: \sigma \qquad \overline{\alpha} \notin \mathtt{ftv}(\Gamma)}{\Gamma \vdash e:\forall\ \overline{\alpha}\ .\ \sigma}
$$

Given $\Gamma$ which asserts $e:\sigma$, and any number of type variables $\overline{\alpha}$ which are not free in $\Gamma$, we can conclude that in $\Gamma$ it can be asserted that $e:\forall\ \overline{\alpha}.\ \sigma$. In other words, since $\sigma$ contains type variables, we can universally quantify it over all type variables not in use in the context. This might result in quantifying over type variables that do not appear in $\sigma$, but that does not make a difference.
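In code, generalization boils down to quantifying over the type variables that are free in the type but not free in the context. A minimal sketch of mine, assuming the Scheme type from earlier (lists keep it dependency-free here; a real implementation would likely use Data.Set):

import Data.List (nub, (\\))

-- Collect the (free) type variables occurring in a type.
ftv :: Type -> [TVar]
ftv (TVar tv)   = [tv]
ftv (TCon _ ts) = nub (concatMap ftv ts)

-- Quantify over everything free in the type but not free in the context.
generalize :: [TVar] -> Type -> Scheme
generalize ctxFtvs t = Forall (ftv t \\ ctxFtvs) t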

Instantiation is a bit like the inverse of generalization; you take a type scheme and replace all quantified type variables with fresh monomorphic type variables. This will be done whenever we need to use a type scheme.

$$
\frac{\Gamma \vdash e: \sigma_1 \qquad \sigma_1 \sqsubseteq \sigma_2}{\Gamma \vdash e : \sigma_2}
$$

So, if in $\Gamma$ it can be asserted that $e:\sigma_1$, and $\sigma_2$ is an instantiation of $\sigma_1$, then we can conclude that in $\Gamma$ it can be asserted that $e$ is also of type $\sigma_2$. When we say "is an instantiation of," informally, we mean that $\sigma_2$ resembles (for lack of a better word) $\sigma_1$. This is better explained by examples.

$$
\begin{aligned}
\forall a.\ a\to a &\sqsubseteq \text{Int}\to\text{Int} \\
\forall a.\ a\to a &\sqsubseteq b\to b \\
\forall ab.\ a\to b\to a &\sqsubseteq \text{Int}\to\text{Bool}\to\text{Int} \\
\forall ab.\ a\to b\to a &\sqsubseteq (a\to b)\to c\to(a\to b)
\end{aligned}
$$
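Instantiation can then be sketched as walking a type and swapping each quantified variable for a fresh one. Here is one illustrative version, using a State Int counter (from the mtl package) as the supply of fresh names; the Infer, freshVar, and replace names are mine, and part 2 will build the real machinery:

import Control.Monad.State

-- A counter threaded through inference to generate fresh type variables.
type Infer = State Int

freshVar :: Infer Type
freshVar = do
    n <- get
    put (n + 1)
    pure (TVar (TV ("t" ++ show n)))

-- Replace every quantified variable with a fresh monomorphic type variable.
instantiate :: Scheme -> Infer Type
instantiate (Forall vs t) = do
    freshes <- mapM (const freshVar) vs
    let sub = zip vs freshes
    pure (replace sub t)
  where
    replace sub (TVar tv)   = maybe (TVar tv) id (lookup tv sub)
    replace sub (TCon n ts) = TCon n (map (replace sub) ts)

For example, evalState (instantiate idScheme) 0 evaluates to TVar (TV "t0") :-> TVar (TV "t0"); the quantified a has been replaced by the fresh variable t0.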

Continued

I have decided to split this tutorial into two parts, since this post is already at two thousand words. In the next part, we will go over substitutions and unification. We will also begin our implementation in Haskell.
