in 1930 the mathematician A.Heyting presented a set of rules that characterized intuitionistic logic and differentiated it from classical logic. in addition he formulated intuitionistic First-Order number theory (based on Peano's axioms) as the classical theory without the Law-Of-The-Excluded-Middle

in 1945 the logician S.C.Kleene proposed his interpretation of intuitionistic logic and his number theory. he showed that any function definable in Heyting arithmetic is Turing-computable. this raised the possibility that formal systems of intuitionistic logic could be used as programming languages

in the 1960’s, de Bruijn and Howard introduced dependent types because they wanted to extend Curry’s correspondence to predicate logic. through the work of Scott and Martin-Lof, this correspondence became the basic building block of a new foundational system for constructive mathematics: Martin-Lof’s intuitionistic type theory

# Type Theory

## unformal

we have empty set ∅, and ∅ ⊂ ∅
for any set A function `f : ∅ → A` exists and this function f is unique
1) f ⊂ ∅ ⨯ A (and rhs is the empty set)
2) so function f is the empty set (but exists)

it might be a surprise that `(I → I) ∩ (void → void) = λx.x `

when A is empty, then ` ∅ ∩ λ x:A . B x ` is not empty but has exactly one element. this element is denoted by any closed expression of the theory, e.g.

`   [Void = 1]  ∈  [∅ ∩ λ x:Void . B x]  `
such equalities follow from the understanding that under the assumption that ` x ∈ Void, ` we can infer anything

types such as `λ x:Void . B x` are maximal in the sense that every type is a subtype and every closed term denotes the single member of the type. we pick one such type expression and call it the top type, denoted `T`. for example, `λ x:Void . x` can be `T`

```product type:           a × b          (a ∧ b)
generalization:         a₁ × ... × aₙ     (foreach index there is the value)
dependent product type: ∏ a P          ("you give me the index (i : a), and I give you the result P i")

sum type:               a + b          (a ∨ b)
generalization:         a₁ + ... + aₙ     (exists index such that exists the value)
dependent sum type:     ∑ a P          ("you give me the index (i : a), and I give you the result P i")
```

there's a difference in HOW YOU INTERPRET the 'forall' and 'exists' quantifiers in intuitionistic and classical logic. in fact, Martin-Lof's 1971 "An intuitionistic theory of types" used "cartesian product" to name the Pi-type, and "disjoint union" to name the Sigma-type

## type inference

type inference breaks down into essentially two components

1. constraint generation
2. unification

we inspect the program we’re trying to infer a type for and generate a bunch of statements (constraints) which are of the form

this Type is equal to this Type

these types have “unification variables” in them. these aren’t normal type variables. they’re generated by the compiler, for the compiler, and will eventually be filled in with either

1. a rigid polymorphic variable
2. a normal concrete type

they should be thought of as holes in an otherwise normal type

for example, if we’re looking at the expression

``   f a``

we first just say that `f : 'f` where `'f` is one of those unification variables

next we say that `a : 'a`. since we’re apply `f` to `a` we can generate the constraints that

``````'f ~ 'x -> 'y
'a ~ 'x``````

since we can only apply things with of the form `_ -> _`

we then unify these constraints to produce `f : 'a -> 'x` and `a : 'a`

we’d then using the surrounding constraints to produce more information about what exactly `'a` and `'x` might be. if this was all the constraints we had we’d then “generalize” `'a` and `'x` to be normal type variables, making our expression have the type `x` where `f : a -> x` and `a : a`.

a large part of this procedure is basically “I’ll give you a list of constraints and you give me the solution”. the program solves these proceeds by pattern matching on the constraints

in the empty case, we have no constraints so we give back the empty solution:

`  `fun unify [] = []``

in the next case we actually have to look at what constraint we’re trying to solve:

`  `  | unify (c :: constrs) = case c of ... ``

so in order to make sure our solution is internally consistent it’s important that whenever we add a type to our solution we first apply the solution to it. this ensures that we can substitute a variable in our solution for its corresponding type and not worry about whether we need to do something further. additionally, whenever we add a new binding we substitute for it in the constraints we have left to ensure we never have a solution which is just inconsistent. this prevents us from unifying `v ~ Bool` and `v ~ (Bool, Bool)` in the same solution

the other half of this algorithm is the constraint generation part. we generate constraints and use unify to turn them into solutions

## formal

to develop one of these type theories you start by discussing some syntax. you lay out the syntax for some types:

``````
A ::= Σ x:A . A | Π x:A . A | ⊤ | ⊥ | ... ``````

and some terms:

``````
M ::= M M | λ x:A . M | <M , M> | π₁ M | ⋆ | ...``````

and now we want to describe the important `M:A` relation

we may decide that we want to identify certain terms which other terms: this is called definitional equality. it’s another inductively defined (and decidable) judgment `M ≡ N:A`. two quick things to note here:

1. definitional equality is completely arbitrary; it exists in the way it does because we defined it that way and for no other reason
2. the complexity of proving `M ≡ N:A` is independent of the complexity of `A`

the last point is some concern because it means that equality for functions is never going to be right for what we want. we have this uniformly complex judgment `M ≡ N:A` but when `A = Π x:B . C` the complexity should be greater and dependent on the complexity of `B` and `C`. that’s how it works in math after all, equality at functions is defined pointwise, something we can’t really do here if `≡` is to be decidable or just be of the same complexity no matter the type

now we can do lots of things with our theory. one thing is to go back and build an operational semantics for our terms. this operational semantics should be some judgment `M ↦ M` with the property that `M ↦ N` will imply that `M ≡ N`. This gives us some computational flavor in our type theory and lets us run the pieces of syntax we carved out with `M:A`

but these terms that we’ve written down aren’t really programs. they’re just serializations of the collections of rules we’ve applied to prove a proposition. there’s no ingrained notion of “running” an `M` since it’s built on after the fact. what we have instead is this `≡` relation which just specifies which symbols we consider equivalent but even it is was defined arbitrarily. there’s no reason we `≡` needs to be a reasonable term rewriting system or anything. if we’re good at our jobs it will be, sometimes (HoTT) it’s not completely clear what that computation system is even though we’re working to find it

so I’d describe a (good) formal type theory as an axiomatic system like any other that we can add a computational flavor to

## Computational Type Theory

now we can look at a second flavor of type theory. in this setting the way we order our system is very different

we start with an programming language, a collection of terms and an untyped evaluation relation between them. we don’t necessarily care about all of what’s in the language. as we define types later we’ll say things like “Well, the system has to include at least X” but we don’t need to exhaustively specify all of the system. it follows that we have actually no clue when defining the type theory how things compute. they just compute somehow. we don’t really even want the system to be strongly normalizing, it’s perfectly valid to take the lambda calculus

so we have some terms and `↦`, on top of this we start by defining a notion of equality between terms. this equality is purely computational and has no notion of types yet (like `M ≡ N:A`) because we have no types yet. this equality is sometimes denoted ` ~ `, we usually define it as

```  `M ~ N` iff `M ↦ O(Ms)` iff `N ↦ O(Ns)`
and
if they terminate
than `Ms ~ Ns` ```

by this I mean that two terms are the same if they compute in the same way, either by diverging or running to the same value built from ` ~ ` equal components

so now we still have a type theory with no types... to fix this we go off an define inferences to answer three questions:

1. what other values denote types equal to it? (`A = B`)
2. what values are in the type? (`a ∈ A`)
3. what values are considered equal at that type? (`a = b ∈ A`)

the first questions is usually answered in a boring way, for instance, we would say that `Π x : A. B = Π x : A'. B'` if we know that `A = A'` and `B = B'` under the assumption that we have some `x ∈ A`. we then specify two and three. there we just give the rules for demonstrating that some value, which is a program existing entirely independently of the type we’re building, is in the type. continuing with functions, we might state that

``````  e x ∈ B (x ∈ A)
———————————————————
e ∈ Π x:A. B``````

here I’m using `_ (_)` as syntax for a hypothetical judgment, we have to know that `e ∈ B` under the assumption that we know that `x ∈ A`

next we have to decide what it means for two values to be equal as functions

we’re going to do this behaviourally, by specifying that they behave as equal programs when used as functions. since we use functions by applying them all we have to do is specify that they behave equally on application:

`````` v x = v' x ∈ B (x ∈ A)
————————————————————————
v = v' ∈ Π x:A. B``````

equality is determined on a per type basis. furthermore, it’s allowed to use the equality of smaller types in its definition. this means that when defining equality for `Π x:A. B` we get to use the equalities for `A` and `B`. we make no attempt to maintain either decidability or uniform complexity in the collections of terms specified by `_ = _ ∈ _` as we did with `≡`

as another example, let’s have a look at the equality type.

`````` A = A'  a = a' ∈ A  b = b' ∈ A
————————————————————————————————
I(a; b; A) = I(a'; b'; A')

a = b ∈ A
——————————————
⋆ ∈ I(a; b; A)

a = b ∈ A
——————————————————
⋆ = ⋆ ∈ I(a; b; A)``````

first of all the various rules depend on the rules governing membership and equality in `A` as we should expect. secondly, `⋆` (the canonical occupant of `I(...)`) has no type information. there’s no way to reconstruct whatever reasoning went into proving `a = b ∈ A` because there’s no computational content in it. the thing on the left of the `∈` only describes the portions of our proof that involve computation and equalities in computational type theory are always computationally trivial. therefore, they get the same witness no matter the proof, no matter the types involved. finally, the infamous equality reflection rule is really just the principle of inversion that we’re allowed to use in reasoning about hypothetical judgments

this leads us to the second cast of props-as-types. this one states that constructive proof has computational character. every proof that we write in a logic like this gives us back an (untyped) program which we can run as appropriate for the theorem we’ve proven. this is the idea behind Kleene’s realizability model. similar to what we’d do with a logical relation we define what each type means by defining the class of appropriate programs that fit its specification.

for example, we defined functions to be the class of things that apply and proofs of equality are ⋆ when the equality is true and there are no proofs when it’s false.

another way of phrasing this correspondence is types-as-specs. types are used to identify a collection of terms that may be used in some particular way instead of merely specifying the syntax of their terms

## NuPRL

a different kind of type theory - computational type theory that underlies Nuprl

in this system you could do all these absolutely crazy things that seemed impossible after years of Coq and Agda

* * *

I

the most familiar types and type constructors are similar to those found in typed programming languages such as Algol68, ML or Pascal

these include the atomic types int , atom and void along with the three basic constructors: cartesian product, disjoint union and function space. if A and B are types, then so is

• their cartesian product, A#B ,
• their disjoint union, A|B, and
• the functions from A to B, A->B

II

the second stage of understanding includes the dependent function and dependent product constructors, which are written as x:A->B and x:A#B, respectively, in Nuprl

the dependent product was suggested by logicians studying the correspondence between propositions and types. the intuitive idea is simple

in a dependent function space represented by x:A->B, the range type , B, is indexed by the domain type , A. THIS IS EXACTLY LIKE THE INDEXED PRODUCT OF A FAMILY OF SETS IN SET THEORY

in the dependent product represented by x:A#B , the TYPE of the second member of a pair can depend on the VALUE of the first. THIS IS EXACTLY THE INDEXED DISJOINT SUM OF SET THEORY

III

the third stage of understanding includes the quotient and set types

the set type is written x:A|B and allows to express the notions of constructive set and of partial function

the quotient type allows to capture the idea of equivalence class used extensively in algebra to build new objects

IV

the fourth stage includes propositions considered as types

the atomic types of this form are written a = b in A and express the proposition that a is equal to b in type A. a special case of this is a = a in A, written also as a in A.

V

the fifth stage includes the recursive types and the type of partial functions

#### practical issues

it is very useful to be able to describe functions whose range type depends on the input. for example, we can imagine a function on integers of the form

`    \x. if even(x) then 2 else (\x.2) `

the type of this function on input x is

`    if even(x) then int else (int → int)`

call this type expression F(x) then the function type we want is written:

`    x:int → F(x) `
and denotes those functions f whose value on input n belongs to F(n)

in the general case of pure functions we can introduce such types by allowing declarations of parameterized types or, equivalently, type-valued functions. these are declared as

`    B:(A → U1)`

to introduce these properly we must think of U1 itself as a type, but a large type. we do not want to say U1 ∈ U1 to express that U1 is a type because this leads to paradox in the full theory. it is in the spirit of type theory to introduce another layer of object, or in our terminology, another "universe", called U2. in addition to the types in U1 U2 contains so-called large types, namely U1 and types built from it such as

```    A → U1,
A → (B → U1)
and so forth```

to say that U1 is a large type we write U1 ∈ U2

the new formal system allows the same class of object expressions but a wider class of types.
now a variable A,B,C,... is a type expression,
the constant U1 is a type expression,
if T is a type expression (possibly containing a free occurrence of the variable x of type S) then x:S → T is a type expression, and
if F is an object expression of type S → U1 then F(x) is a type expression

the old form of function space results when T does not depend on x in this case we still write S → T

the type structure hierarchy of resembles that of Principia Mathematica, the ancestor of all type theories. the hierarchy manifests itself in an unbounded cumulative hierarchy of universes,U1,U2,..., where by cumulative hierarchy we mean that Ui is in Ui+1 and that every element of Ui is also an element of Ui+1. universes are themselves types

the concept of a universe in this role, to organize the hierarchy of types, is suggested by Artin, Grothendieck, & Verdier and was used by Martin-Löf. this is a means of achieving a predicative type structure as opposed to an impredicative one as in Girard or Reynolds. and every type occurs in a universe. in fact, A is a type if and only if it belongs to a universe. conversely all the elements of a universe are types

## HoTT

Homotopy type theory (HoTT) interprets type theory from a homotopical perspective. in homotopy type theory, we regard the types as “spaces” or higher groupoids, and the logical constructions (such as the product A × B) as homotopy-invariant constructions on these spaces. to briefly explain this perspective, consider first the basic concept of type theory, namely that the term a is of type A, which is written: `a : A`. this expression is traditionally thought of as akin to: “a is an element of the set A”. however, in homotopy type theory we think of it instead as: “a is a point of the space A”. similarly, every function `f : A → B` in type theory is regarded as a continuous map from the space A to the space B

the idea of interpreting types as structured objects, rather than sets, has a long pedigree, and is known to clarify various mysterious aspects of type theory. for instance, interpreting types as sheaves helps explain the intuitionistic nature of type-theoretic logic, while interpreting them as partial equivalence relations or “domains” helps explain its computational aspects. it also implies that we can use type-theoretic reasoning to study the structured objects, leading to the rich field of categorical logic

the homotopical interpretation fits this same pattern: it clarifies the nature of identity (or equality) in type theory, and allows us to use type-theoretic reasoning

the key new idea of the homotopy interpretation is that the logical notion of identity a = b of two objects a, b : A of the same type A can be understood as the existence of a path p : a ; b from point a to point b in the space A. this also means that two functions f , g : A → B can be identified if they are homotopic, since a homotopy is just a (continuous) family of paths px : f(x) ; g(x) in B, one for each x : A

in type theory, for every type A there is a (formerly somewhat mysterious) type IdA of identifications of two objects of A; in homotopy type theory, this is just the path space AI of all continuous maps I → A from the unit interval. in this way, a term p : IdA(a, b) represents a path p : a ; b in A

the axiom of choice: “if for every x : A there exists a y : B such that R(x, y), there is a function f : A → B such that for all x : A we have R(x, f(x)).” the pure propositions-as-types notion of “there exists” is strong enough to make this statement simply provable — yet it does not have all the consequences of the usual axiom of choice. however, in (−1)-truncated logic, this statement is not automatically true, but is a strong assumption with the same sorts of consequences as its counterpart in classical set theory

on the other hand, consider the law of excluded middle: “for all A, either A or not A.” interpreting this in the pure propositions-as-types logic yields a statement that is inconsistent with the univalence axiom. for since proving “A” means exhibiting an element of it, this assumption would give a uniform way of selecting an element from every nonempty type — a sort of Hilbertian choice operator. Univalence implies that the element of A selected by such a choice operator must be invariant under all self-equivalences of A, since these are identified with self-identities and every operation must respect identity; but clearly some types have automorphisms with no fixed points, e.g. we can swap the elements of a two-element type. however, the “(−1)-truncated law of excluded middle”, though also not automatically true, may consistently be assumed with most of the same consequences as in classical mathematics

```types of level 0  true|false         ⊥ ⊤
types of level 1  int|char|float|..  props                    drop axiom of Excluded Middle
types of level 2  sum|product        sets                     drop axiom of Choice
types of level 3  sigma|pi           objects in a category    drop Turing-completeness

----------------+----------------+-------------------+------------------------------------
HoTT          |   sets         |     logic         |          types
----------------+----------------+-------------------+------------------------------------
path space     | {(x,x) | x∈A}  |  equality         | IdA           identity
----------------+----------------+-------------------+------------------------------------
{}             | empty set      |  false            | _|_           bottom (Void)
{*}            | singleton      |  true             |       T       top    (I)
----------------+----------------+-------------------+------------------------------------
point          | element        |  proof            | a : A         type instance
space          | set of elems   |  proposition      | A             type
func space     | set of func    |  A => B           | A -> B        function
----------------+----------------+-------------------+------------------------------------
prod space     | intersect      |  A /\ B           | A x B         struct
coprod space   | union          |  A \/ B           | A + B         union
----------------+----------------+-------------------+------------------------------------
fibration     | family of sets |  predicate        | P x:A         function with binrange
----------------+----------------+-------------------+------------------------------------
total space    | cartesian prod |  forall x . P x   | x:A , A -> U  Pi-type    dep.funs
sect space     | disjoint union |  exists x . P x   | x:A , U       Sigma-type dep.sums
----------------+----------------+-------------------+------------------------------------
```