# Type Theory

*First published Wed Feb 8, 2006; substantive revision Fri Oct 20, 2006*

The topic of type theory is fundamental both in logic and computer science. We limit ourselves here to sketch some aspects that are important in logic. For the importance of types in computer science, we refer the reader for instance to (Reynolds, 1983 and 1985).

- Paradoxes and Russell's Type Theories
- Simple Type Theory and the λ-Calculus
- Ramified Hierarchy and Impredicative Principles
- Type Theory/Set Theory
- Type Theory/Category Theory
- Extensions of Type System, Polymorphism, Paradoxes
- Bibliography
- Other Internet Resources
- Related Entries

## 1. Paradoxes and Russell's Type Theories

The theory of types was introduced by Russell in order to cope with some contradictions he found in his account of set theory (Russell, 1903). This contradiction was obtained by analysing a theorem of Cantor that no mapping

F:X→ Pow(X)

(where Pow(*X*) is the class of subclasses of a class
*X*) can be surjective; that is, *F* cannot be such that
every member *b* of Pow(*X*) is equal to
*F*(*a*) for some element *a* of *X*. This
can be phrased “intuitively” as the fact that there are
more subsets of *X* than elements of *X*. The proof of
this fact is so simple and basic that it is worthwhile giving it
here. Consider the following subset of *X*:

A= {x∈X|x∉F(x) }.

This subset cannot be in the range of *F*. For if *A* =
*F*(*a*), for some *a*, then

a∈F(a)iff a∈Aiff a∉F(a)

which is a contradiction. Some remarks are in order. First, the proof
does not use the law of excluded middle and is thus valid
intuitionistically. Second, the method that is used, called
*diagonalisation* was already present in the work of du
Bois-Reymond for building real functions growing faster than any
function in a given sequence of functions.

Russell analysed what happens if we apply this theorem to the case where A is the class of all classes, admitting that there is such a class. He was then lead to consider the special class of classes that do not belong to themselves

(*)R= {w|w∉w}

We then have

R∈RiffR∉R

It seems indeed that Cantor was already aware of the fact that the class of all sets cannot be considered itself to be a set.

Russell communicated this problem to Frege, and his letter, together
with Frege's answer appear in van Heijenoort (1967). It is important
to realise that the formulation (*) does not apply as it is to Frege's
system. As Frege himself wrote in his reply to Russell, the expression
“a predicate is predicated of itself” is not exact. Frege
had a distinction between *predicates* (concepts) and
*objects*. A (first-order) predicate applies to an object but
it cannot have a predicate as argument. The exact formulation of the
paradox in Frege's system uses the notion of the *extension* of
a predicate *P*, which we designate as ε*P*.
The extension of a predicate is itself an object. The important axiom
V is:

(Axiom V) εP= εQ≡ ∀x[P(x) ≡Q(x)]

This axiom asserts that the extension of *P* is identical to
the extension of *Q* if and only if *P* and *Q*
are materially equivalent. We can then translate Russell's paradox
(*) in Frege's system by defining the predicate

R(x) iff ∃P[x= εP∧ ¬P(x)]

It can then been checked, using Axiom V in a crucial way, thatd

R(εR) ≡ ¬R(εR)

and we have a contradiction as well. (Notice that for defining the
predicate *R*, we have used an *impredicative*
existential quantification on predicates. It can be shown that the
*predicative* version of Frege's system is consistent (see
Heck, 1996 and for further refinements Ferreira, 2002).

It is clear from this account that an idea of types was already present in Frege's work: there we find a distinction between objects, predicates (or concepts), predicates of predicates, etc. (This point is stressed by Quine (1940).) This hierarchy is called the “extensional hierarchy” by Russell (1959), and its necessity was recognised by Russell as a consequence of his paradox.

## 2. Simple Type Theory and the λ-Calculus

As we saw above, the distinction: objects, predicates, predicate of
predicates, etc., seems enough to block Russell's paradox (and this
was recognised by Chwistek and Ramsey). We first describe the type
structure as it is in *Principia* and later in this section we
present the elegant formulation due to Church 1940 based on
λ-calculus. The types can be defined as

*i*is the type of individuals- ( ) is the type of propositions
- if
*A*_{1},…,*A*_{n}are types then (*A*_{1},…,*A*_{n}) is the type of*n*-ary relations over objects of respective types*A*_{1},…,*A*_{n}

For instance, the type of binary relations over individuals is
(*i*, *i*), the type of binary connectives is
(( ),( )), the type of quantifiers over individuals is
((*i*)).

For forming propositions we use this type structure: thus
*R*(*a*_{1},…,*a*_{n})
is a proposition if *R* is of type
(*A*_{1},…,*A*_{n}) and
*a*_{i} is of type *A*_{i} for
*i* = 1,…,*n*. This restriction makes it
impossible to form a proposition of the form *P*(*P*):
the type of *P* should be of the form (*A*), and
*P* can only be applied to arguments of type *A*, and
thus cannot be applied to itself since *A* is not the same as
(*A*).

However simple type theory is not predicative: we can define an object
*Q*(*x*, *y*) of type (*i*, *i*)
by

∀P[P(x) ⊃P(y)]

Assume that we have two individuals *a* and *b* such
that *Q*(*a*, *b*) holds. We can define
*P*(*x*) to be *Q*(*x*, *a*). It
is then clear that *P*(*a*) holds, since it is
*Q*(*a*, *a*). Hence *P*(*b*) holds
as well. We have proved, in an impredicative way, that
*Q*(*a*, *b*) implies *Q*(*b*,
*a*).

Alternative simpler formulations, which retain only the notion of
classes, classes of classes, etc., were formulated by Gödel and
Tarski. Actually this simpler version was used by Gödel in his
1931 paper on formally undecidable propositions. We have objects of
type 0, for individuals, objects of type 1, for classes of
individuals, objects of type 2, for classes of classes of individuals,
and so on. Functions of two or more arguments, like relations, need
not be included among primitive objects since one can define relations
to be classes of ordered pairs, and ordered pairs to be classes of
classes. For example, the ordered pair of individuals *a, b*
can be defined to be {{*a*},{*a*,*b*}} where
{*x*,*y*} denotes the class whose sole elements are
*x* and *y*. (Wiener 1914 had suggested a similar
reduction of relations to classes.) In this system, all propositions
have the form *a*(*b*), where *a* is a sign of
type n+1 and *b* a sign of type n. Thus this system is built on
the concept of an arbitrary class or subset of objects of a given
domain and on the fact that the collection of *all subsets* of
the given domain can form a new domain of the next type. Starting from
a given domain of individuals, this process is then iterated. As
emphasised for instance in Scott 1993, in set theory this process of
forming subsets is iterated into the *transfinite*.

In these versions of type theory, as in set theory, functions are not
primitive objects, but are represented as functional relation. The
addition function for instance is represented as a ternary relation by
an object of type (*i*,*i*,*i*). An elegant
formulation of the simple type theory which extends it by introducing
functions as primitive objects was given by Church in 1940. It uses
the λ-calculus notation (Barendregt, 1997). Since such a
formulation is important in computer science, for the connection with
category theory, and for Martin-Löf type theory, we describe it
in some detail. The types of this system are defined inductively as
follows

- there are two basic types
*i*(the type of individuals) and*o*(the type of propositions) - if
*A*,*B*are types then*A*→*B*, the type of functions from*A*to*B*, is a type

We can form in this way the types:

i→o(type of predicates) ( i→o) →o(type of predicates of predicates)

which correspond to the types (*i*) and ((*i*)) but also
the new types

i→i(type of functions) ( i→i) →i(type of functionals)

It is convenient to write

A_{1},…,A_{n}→B

for

A_{1}→ (A_{2}→ … → (A_{n}→B))

In this way

A_{1},…,A_{n}→o

corresponds to the type
(*A*_{1},…,*A*_{n}).

First-order logic considers only types of the form

i,…,i→i(type of function symbols), and i,…,i→o(type of predicate, relation symbols)

Notice that

A→B→C

stands for

A→ (B→C)

(association to the right).

For the terms of this logic, we shall not follow Church's account but a slight variation of it, due to Curry (who had similar ideas before Church's paper appeared) and which is presented in detail in R. Hindley's book on type theory. Like Church, we use λ-calculus, which provides a general notation for functions

M::=x|MM| λx.M

Here we have used the so-called BNF notation, very convenient in computing science. This gives a syntactic specification of the λ-terms which, when expanded, means:

- every variable is a function symbol;
- every juxtaposition of two function symbols is a function symbol;
- every λ
*x*.*M*is a function symbol; - there are no other function symbols.

The notation for function application *M* *N* is a
little different than the mathematical notation, which would be
*M*(*N*). In general,

M_{1}M_{2}M_{3}

stands for

(M_{1}M_{2})M_{3}

(association to the left). The term
λ*x*. *M* represents the function which to
*N* associates *M*[*x*:=*N*]. This
notation is so convenient that one wonders why it is not widely used
in mathematics. The main equation of λ-calculus is then
(β-conversion)

(λx.M)N=M[x:=N]

which expresses the meaning of λ*x*. *M*
as a function. We have used *M*[*x*:=*N*] as a
notation for the value of the expression that results when *N*
is substituted for the variable *x* in the matrix *M*.
One usually sees this equation as a rewrite rule
(β-reduction)

(λx.M)N→M[x:=N]

In *untyped* lambda calculus, it may be that such rewriting does
not terminate. The canonical example is given by the term Δ =
λ*x*. *x* *x* and the application

Δ Δ → Δ Δ

(Notice the similarity with Russell's paradox.) The idea of Curry is
then to look at types as predicates over lambda terms, writing
*M*: *A* to express that *M* satisfies the
predicate/type *A*. The meaning of

N:A→B

is then

∀M, ifM:AthenNM:B

which justifies the following rules

N:A→BM:A

NM:B

M:B[x:A]

λ x.M:A→B

In general one works with judgements of the form

x_{1}:A_{1},...,x_{n}:A_{n}⊢M:A

where *x*_{1},..., *x*_{n} are distinct
variables, and *M* is a term having all free variables among
*x*_{1},..., *x*_{n}. In order to be
able to get Church's system, one adds some constants in order to form
propositions. Typically

not:o→o

imply:o→o→o

and:o→o→o

forall: (A→o) →o

The term

λx. ¬(xx)

represents the predicate of predicates that do not apply to themselves.
This term does not have a type however, that is, it is not possible to
find *A* such that

λx. ¬(xx) : (A→o) →o

which is the formal expression of the fact that Russell's paradox cannot be expressed. Leibniz equality

Q:i→i→o

will be defined as

Q= λx. λy. ∀(λP. imply (Px) (Py))

One usually writes ∀*x*[*M*] instead of
∀(λ*x*. *M*), and the definition of
*Q* can then be rewritten as

Q= λx. λy. ∀P[imply (Px) (Py)]

This example again illustrates that we can formulate impredicative definitions in simple type theory.

The use of λ-terms and β-reduction is most convenient for
representing the complex substitution rules that are needed in simple
type theory. For instance, if we want to substitute the predicate
λ*x*. *Q* *a* *x* for
*P* in the proposition

imply (Pa) (Pb)

we get

imply ((λx.Qax)a) ((λx.Qax)b)

and, using β-reduction,

imply (Qaa) (Qab)

In summary, simple type theory forbids self-application but not the circularity present in impredicative definitions.

The λ-calculus formalism also allows for a clearer analysis of Russell's paradox. We can see it as the definition of the predicate

Rx= ¬(xx)

If we think of β-reduction as the process of unfolding a definition, we see that there is a problem already with understanding the definition of R R

RR→ ¬(RR) → ¬(¬(RR)) → …

In some sense, we have a non-wellfounded definition, which is as
problematic as a contradiction (a proposition equivalent to its
negation). One important theorem, the normalisation theorem, says that
this cannot happen with simple types: if we have
*M*: *A* then *M* is normalisable in a
strong way (any sequence of reductions starting from *M*
terminates).

For more information on this topic, we refer to the entry on Church's simple type theory.

## 3. Ramified Hierarchy and Impredicative Principles

Russell introduced another hierarchy, that was not motivated by any formal paradoxes expressed in a formal system, but rather by the fear of “circularity” and by informal paradoxes similar to the paradox of the liar. If a man says “I am lying”, then we have a situation reminiscent of Russell's paradox: a proposition which is equivalent to its own negation. Another informal such paradoxical situation is obtained if we define an integer to be the “least integer not definable in less than 100 words”. In order to avoid such informal paradoxes, Russell thought it necessary to introduce another kind of hierarchy, the so-called “ramified hierarchy”. Since this notion has been extremely influential in logic and especially proof theory, we describe it in some details.

In order to further motivate this hierarchy, here is one example due to Russell. If we say

Napoleon was Corsican.

we do not refer in this sentence to any assemblage of properties. The
property “to be Corsican” is said to be
*predicative*. If we say on the other hand

Napoleon had all the qualities of a great general

we are referring to a totality of qualities. The property “to
have all qualities of a great general” is said to be
*impredicative*.

Another example, also coming from Russell, shows how impredicative properties can potentially lead to problems reminiscent of the liar paradox. Suppose that we suggest the definition

A typical Englishman is one who possesses all the properties possessed by a majority of Englishmen.

It is clear that most Englishmen do not possess *all* the
properties that most Englishmen possess. Therefore, a typical
Englishman, according to this definition, should be untypical. The
problem, according to Russell, is that the word “typical”
has been defined by a reference to all properties and has been treated
as itself a property. (It is remarkable that similar problems arise
when defining the notion of *random* numbers, cf.
Martin-Löf “Notes on constructive mathematics”
(1970).) Russell introduced the *ramified hierarchy* in order
to deal with the apparent circularity of such impredicative
definitions. One should make a distinction between the
*first-order* properties, like being Corsican, that do not
refer to the totality of properties, and consider that the
*second-order* properties refer only to the totality of
*first-order properties*. One can then introduce third-order
properties, that can refer to the totality of second-order property,
and so on. This clearly eliminates all circularities connected to
impredicative definitions.

At about the same time, Poincaré carried out a similar analysis. He stressed the importance of “predicative” classifications, and of not defining elements of a class using a quantification over this class (Poincaré, 1909). Poincaré used the following example. Assume that we have a collection with elements 0, 1 and an operation + satisfying

x+0 = 0

x+(y+1) = (x+y)+1

Let us say that a property is *inductive* if it holds of
0 and holds for *x*+1 if it holds for *x*.

An impredicative, and potentially “dangerous”, definition
would be to define an element to be a *number* if it satisfies
*all* inductive properties. It is then easy to show that this
property “to be a number” is itself inductive. Indeed,
0 is a number since it satisfies all inductive properties,
and if *x* satisfies all inductive properties then so does
*x*+1. Similarly it is easy to show that *x*+*y* is a
number if *x*,*y* are numbers. Indeed the property
*Q*(*z*) that *x*+*z* is a number is inductive:
*Q*(0) holds since *x*+0=*x* and
if *x*+*z* is a number then so
is *x*+(*z*+1) = (*x*+*z*)+1. This whole
argument however is circular since the property “to be a
number” is not predicative and should be treated with
suspicion.

Instead, one should introduce a ramified hierarchy of properties and
numbers. At the beginning, one has only *first-order* inductive
properties, which do not refer in their definitions to a totality of
properties, and one defines the numbers of order 1 to be the elements
satisfying all first-order inductive properties. One can next consider
the second-order inductive properties, that can refer to the
collection of first-order properties, and the numbers of order 2, that
are the elements satisfying the inductive properties of order 2. One
can then similarly consider numbers of order 3, and so
on. Poincaré emphasizes the fact that a number of order 2 is
*a fortiori* a number of order 1, and more generally, a number
of order n+1 is *a fortiori* a number of order n. We have thus
a sequence of more and more restricted properties: inductive
properties of order 1, 2, … and a sequence of more and more
restricted collections of objects: numbers of order 1, 2, …
Also, the property “to be a number of order n” is itself
an inductive property of order n+1.

It does not seem possible to prove that *x*+*y* is a
number of order *n* if *x*,*y* are numbers of
order n. On the other hand, it is possible to show that if *x*
is a number of order *n*+1, and *y* a number of
order *n*+1 then *x*+*y* is a number of
order *n*. Indeed, the property *P*(*z*) that
“*x*+*z* is a number of order *n*” is an
inductive property of order
*n*+1: *P*(0) holds since *x*+0 = *x* is a number
of order *n*+1, and hence of order *n*, and if
*P*(*z*) holds, that is if *x*+*z* is a number of
order *n*, then so is *x*+(*z*+1) =
(*x*+*z*)+1, and so *P*(*z*+1)
holds. Since *y* is a number of order *n*+1, and
*P*(*z*) is an inductive property of order *n*+1,
*P*(*y*) holds and so *x+y* is a number of order
*n*. This example illustrates well the complexities introduced
by the ramified hierarchy.

The complexities are further amplified if one, like Russell as for Frege, defines even basic objects like natural numbers as classes of classes. For instance the number 2 is defined as the class of all classes of individuals having exactly two elements. We again obtain natural numbers of different orders in the ramified hierarchy. Besides Russell himself, and despite all these complications, Chwistek tried to develop arithmetic in a ramified way, and the interest of such an analysis was stressed by Skolem. See Burgess and Hazen (1998) for a recent development.

Another mathematical example, often given, of an impredicative
definition is the definition of least upper bound of a bounded class
of real numbers. If we identify a real with the set of rationals that
are less than this real, we see that this least upper bound can be
defined as the union of all elements in this class. Let us identify
subsets of the rationals as predicates. For example, for rational
numbers *q*, *P*(*q*) holds iff *q* is a
member of the subset identified as *P*. Now, we define the
predicate *L*_{C} (a subset of the rationals)
to be the least upper bound of class *C* as:

∀q[L_{C}(q) ↔ ∃P[C(P) ∧P(q)]]

which is impredicative: we have defined a predicate *L* by an
existential quantification over all predicates. In the ramified
hierarchy, if *C* is a class of first-order classes of
rationals, then *L* will be a second-order class of
rationals. One obtains then not *one* notion or real numbers,
but real numbers of different orders 1, 2, … The least upper
bound of a collection of reals of order 1 will then be at least of
order 2 in general.

As we saw earlier, yet another example of an impredicative definition
is given by Leibniz' definition of equality. For Leibniz, the
predicate “is equal to *a*” is true for *b*
iff *b* satisfies all the predicates satisfied by *a*.

How should one deal with the complications introduced by the ramified
hierarchy? Russell showed, in the introduction to the second edition
to
*Principia
Mathematica*, that these complications can be avoided in some
cases. He even thought, in Appendix B of the second edition of
*Principia Mathematica*, that the ramified hierarchy of natural
numbers of order 1,2,… collapses at order 5. However,
Gödel later found a problem in his argument, and indeed, it was
shown by Myhill 1974 that this hierarchy actually *does not*
collapse at any finite level. A similar problem, discussed by Russell
in the introduction to the second edition to
*Principia
Mathematica* arises in the proof of Cantor's theorem that
there cannot be any injective functions from the collection of all
predicates to the collection of all objects (the version of Russell's
paradox in Frege's system that we presented in the introduction). Can
this be done in a ramified hierarchy? Russell doubted that this could
be done within a ramified hierarchy of predicates and this was indeed
confirmed indeed later (Heck, 1996).

Because of these problems, Russell and Whitehead introduced in the
first edition of *Principia Mathematica* the following
*reducibility axiom*: the hierarchy of predicates, first-order,
second-order, etc., collapses at level 1. This means that for any
predicate of any order, there is a predicate of the first-order level
which is equivalent to it. In the case of equality for instance, we
postulate a first-order relation “*a*=*b*”
which is equivalent to “*a* satisfies all properties that
*b* satisfies”. The motivation for this axiom was purely
pragmatic. Without it, all basic mathematical notions, like real or
natural numbers are stratified into different orders. Also, despite
the apparent circularity of impredicative definitions, the axiom of
reducibility does not seem to lead to inconsistencies.

As noticed however first by Chwistek, and later by Ramsey, in the presence of the axiom of reducibility, there is actually no point in introducing the ramified hierarchy at all! It is much simpler to accept impredicative definitions from the start. The simple “extensional” hierarchy of individuals, classes, classes of classes, … is then enough. We get in this way the simpler systems formalised in Gödel 1931 or Church 1940 that were presented above.

The axiom of reducibility draws attention to the problematic status of impredicative definitions. To quote Weyl 1946, the axiom of reducibility “is a bold, an almost fantastic axiom; there is little justification for it in the real world in which we live, and none at all in the evidence on which our mind bases its constructions”! So far, no contradictions have been found using the reducibility axiom. However, as we shall see below, proof-theoretic investigations confirm the extreme strength of such a principle.

The idea of the ramified hierarchy has been extremely important in
mathematical logic. Russell considered only the finite iteration of
the hierarchy: first-order, second-order, etc., but from the
beginning, the possibility of extending the ramification transfinitely
was considered. Poincaré (1909) mentions the work of Koenig in
this direction. For the example above of numbers of different order,
he also defines a number to be inductive of order ω if it is
inductive of all finite orders. He then points out that *x+y*
is inductive of order ω if both *x* and *y*
are. This shows that the introduction of transfinite orders can in
some case play the role of the axiom of reducibility. Such transfinite
extension of the ramified hierarchy was analysed further by Gödel
who noticed the key fact that the following form of the reducibility
axiom is actually *provable*: when one extends the ramified
hierarchy of properties over the natural numbers into the transfinite
this hiearchy collapses at level ω_{1}, the least
uncountable ordinal (Gödel, 1995, and Prawitz,
1970). Furthermore, while at all levels < ω_{1}, the
collection of predicates is countable, the collection of predicates at
level ω_{1} is of cardinality ω_{1}. This
fact was a strong motivation behind Gödel's model of
constructible sets. In this model the collection of all subsets of the
set of natural numbers (represented by predicates) is of cardinality
ω_{1} and is similar to the ramified hierarchy. This
model satisfies in this way the Continuum Hypothesis, and gives a
relative consistency proof of this axiom. (The motivation of
Gödel was originally only to build a model where the collection
of all subsets of natural numbers is well-ordered.)

The ramified hierarchy has been also the source of much work in proof
theory. After the discovery by Gentzen that the consistency of
Arithmetic could be proved by transfinite induction (over decidable
predicates) along the ordinal ε_{0}, the natural
question was to find the corresponding ordinal for the different
levels of the ramified hierarchy. Schütte (1960) found that for
the first level of the ramified hierarchy, that is if we extend
arithmetic by quantifying only over first-order properties, we get a
system of ordinal strength
ε_{ε0}. For the second level we get
the ordinal strength
ε_{εε0}, etc. We
recall that ε_{α} denotes the α-th
ε-ordinal number, an ε-ordinal number being an ordinal
β such that ω^{β} = β, see Schütte
(1960).

Gödel stressed the fact that his approach to the problem of the
continuum hypothesis was not constructive, since it needs the
uncountable ordinal ω_{1}, and it was natural to study
the ramified hierarchy along constructive ordinals. After preliminary
works of Lorenzen and Wang, Schütte analysed what happens if we
proceed in the following more constructive way. Since arithmetic has
for ordinal strength ε_{0} we consider first the
iteration of the ramified hierarchy up to ε_{0}.
Schütte computed the ordinal strength of the resulting system and
found an ordinal strength *u*(1)>ε_{0}. We
iterate then ramified hierarchy up to this ordinal *u*(1) and
get a system of ordinal strength *u*(2)> *u*(1) ,
etc. Each *u*(*k*) can be computed in terms of the
so-called Veblen hierarchy: *u*(*k*+1) is
φ_{u(k)}(0). The limit of this process
gives an ordinal called Γ_{0}: if we iterate the
ramified hierarchy up to the ordinal Γ_{0} we get a
system of ordinal strength Γ_{0}. Such an ordinal was
obtained independently about the same time by S. Feferman. It has been
claimed that Γ_{0} plays for predicative systems a role
similar to ε_{0} for Arithmetic. Recent
proof-theoretical works however are concerned with systems having
bigger proof-theoretical ordinals that can be considered predicative
(see for instance Palmgren (1995)).

Besides these proof theoretic investigations related to the ramified hierarchy, much work has been devoted in proof theory to analysing the consistency of the axiom of reducibility, or, equivalently, the consistency of impredicative definitions. Following Gentzen's analysis of the cut-elimination property in the sequent calculus, Takeuti found an elegant sequent formulation of simple type theory (without ramification) and made the bold conjecture that cut-elimination should hold for this system. This conjecture seemed at first extremely dubious given the circularity of impredicative quantification, which is well reflected in this formalism. The rule for quantifications is indeed

Γ ⊢ ∀ X[A(X)]

Γ ⊢ A[X:=T]

where *T* is *any* term predicate, which may itself
involve a quantification over all predicates. Thus the formula
*A*[*X*:=*T*] may be itself much more complex
than the formula *A(X)*.

One early result is that cut-elimination for Takeuti's impredicative system implies in a finitary way the consistency of second-order Arithmetic. (One shows that this implies the consistency of a suitable form of infinity axiom, see Andrews, 2002.) Following work by Schütte, it was later shown by W. Tait and D. Prawitz that indeed the cut-elimination property holds (the proof of this has to use a stronger proof theoretic principle, as it should be according to the incompletness theorem.)

What is important here is that these studies have revealed the extreme power of impredicative quantification or, equivalently, the extreme power of the axiom of reducibility. This confirms in some way the intuitions of Poincaré and Russell. The proof-theoretic strength of second-order Arithmetic is way above all ramified extensions of Arithmetic considered by Schütte. On the other hand, despite the circularity of impredicative definitions, which is made so explicit in Takeuti's calculus, no paradoxes have been found yet in second-order Arithmetic.

Another research direction in proof theory has been to understand how
much of impredicative quantification can be explained from principles
that are available in intuitionistic mathematics. The strongest such
principles are strong forms of inductive definitions. With such
principles, one can explain a limited form a impredicative
quantification, called Π_{1}^{1}-comprehension,
where one uses only one level of impredicative quantification over
predicates. Interestingly, almost all known uses of impredicative
quantifications: Leibniz equality, least upper bound, etc., can be
done with Π_{1}^{1}-comprehension. This reduction
of Π_{1}^{1}-comprehension was first achieved by
Takeuti in a quite indirect way, and was later simplified by Buchholz
and Schütte using the so-called Ω-rule. It can be seen as
a constructive explanation of some restricted, but nontrivial, uses of
impredicative definitions.

## 4. Type Theory/Set Theory

Type theory can be used as a foundation for mathematics, and indeed, it was presented as such by Russell in his 1908 paper, which appeared the same year as Zermelo's paper, presenting set theory as a foundation for mathematics.

It is clear intuitively how we can explain type theory in set theory:
a type is simply interpreted as a set, and function types *A*
→ *B* can be explained using the set theoretic notion of
function (as a functional relation, i.e. a set of pairs of
elements). The type *A* → *o* corresponds to the
powerset operation.

The other direction is more interesting. How can we explain the notion
of sets in term of types? There is an elegant solution, due to
A. Miquel, which complements previous works by P. Aczel (1978) and
which has also the advantage of explaining non necessarily
well-founded sets a la Finsler. One simply interprets a *set*
as a *pointed graph* (where the arrow in the graph represents
the membership relation). This is very conveniently represented in
type theory, a pointed graph being simply given by a type A and a pair
of elements

a:A,R:A→A→o

We can then define in type theory when two such sets *A*,
*a*, *R* and *B*, *b*, *S* are
equal: this is the case iff there is a bisimulation *T* between
*A* and *B* such that *T* *a* *b*
holds. A bisimulation is a relation

T:A→B→o

such that whenever *T* *x* *y* and *R*
*x* *x*′; hold, there exists *y*′
such that *T* *x*′ *y*′ and
*S* *y* *y*′ hold, and whenever *T*
*x* *y* and *R* *y* *y*′
hold, there exists *x*′ such that *T*
*x*′ *y*′ and *R* *x*
*x* hold. We can then define the membership relation: the set
represented *B*, *b*, *S* is a member of the set
represented by *A*, *a*, *R* iff there exists
*a*_{1} such that *R* *a*_{1}
*a* and *A*, *a*_{1}, *R* and
*B*, *b*, *S* are bisimilar.

It can then be checked that all the usual axioms of set theory
extensionality, power set, union, comprehension over bounded formulae
(and even antifoundation, so that the membership relation does not
need to be well-founded) hold in this simple model. (A bounded formula
is a formula where all quantifications are of the form
∀*x* ∈ *a* … or ∃*x*
∈ *a* …). In this way it can been shown that
Church's simple type theory is equiconsistent with the bounded version
of Zermelo's set theory.

## 5. Type Theory/Category Theory

There are deep connections between type theory and category theory. We limit ourselves to presenting two applications of type theory to category theory: the constructions of the free cartesian closed category and of the free topos (see the entry on category theory for an explanation of "cartesian closed" and "topos").

For building the free cartesian closed category, we extend simple type
theory with the type 1 (unit type) and the product type *A*
× *B*, for *A*, *B* types. The terms are
extended by adding pairing operations and projections and a special
element of type 1. As in Lambek and Scott (1986) one can then define a
notion of *typed conversions* between terms, and show that this
relation is decidable. One can then show (Lambek and Scott, 1986) that
the category with types as objects and as morphisms from *A* to
*B* the set of closed terms of type *A* →
*B* (with conversion as equality) is the free cartesian closed
category. This can be used to show that equality between arrows in
this category is decidable.

The theory of types of Church can also be used to build the free
topos. For this we take as objects pairs *A*,*E* with
*A* type and *E* a partial equivalence relation, that is
a closed term *E*: *A* → *A* →
*o* which is symmetric and transitive. We take as morphisms
between *A*, *E* and *B*, *F* the
relations
*R*: *A*→*B*→*o* that are
*functional* that is such that for any
*a*: *A* satisfying *E* *a*
*a* there exists one, and only one (modulo *F*) element
*b* of *B* such that *F* *b* *b*
and *R* *a* *b*. For the subobject classifier we
take the pair *o*, *E* with
*E*: *o*→*o*→*o* defined
as

EMN= and (implyMN) (implyNM)

One can then show that this category forms a topos, indeed the free topos.

It should be noted that the type theory in Lambek and Scott (1986) uses a variation of type theory, introduced by Henkin and refined by P. Andrews (2002) which is to have an extensional equality as the only logical connective, i.e. a polymorphic constant

eq :A→A→o

and to define all logical connectives from this connective and
constants *T*, *F* : *o*. For instance, one defines

∀P=_{df}eq (λx.T)P

The equality at type *o* is logical equivalence.

One advantage of the intensional formulation is that it allows for a direct notation of proofs based on λ-calculus (Martin-Löf, 1971 and Coquand, 1986).

## 6. Extensions of Type System, Polymorphism, Paradoxes

We have seen the analogy between the operation A → A → o on types and the powerset operation on sets. In set theory, the powerset operation can be iterated transfinitely along the cumulative hierarchy. It is then natural to look for analogous transfinite versions of type theory.

One such extension of Church's simple type theory is obtained by
adding universes (Martin-Löf, 1970). Adding a universe is a
*reflection* process: we add a type *U* whose objects
are the types considered so far. For Church's simple type theory we
will have

o:U,i:UandA→B:UifA:U,B:U

and, furthermore, *A* is a type if
*A*: *U*. We can then consider types such as

(A:U) →A→A

and functions such as

id = λA.λx.x: (A:U) →A→A

The function id takes as argument a “small” type
*A*: *U* and an element *x* of type
*A*, and outputs an element of type *A*. More generally
if *T*(*A*) is a type under the assumption
*A*: *U*, one can form the dependent type

(A:U) →T(A)

That *M* is of this type means that *M*
*A*: *T*(*A*) whenever
*A*: *U*. We get in this way extensions of type
theory whose strength is similar to the one of Zermelo's set theory
(Miquel, 2001). More powerful form of universes are considered in
(Palmgren, 1998). Miquel (2003) presents a version of type theory of
strength equivalent to the one of Zermelo-Fraenkel.

One very strong form of universe is obtained by adding the axiom
*U*: *U*. This was suggested by P. Martin-Löf in
1970. J.Y. Girard showed that the resulting type theory is
inconsistent as a logical system (Girard, 1972). Though it seems at first
that one could directly reproduce Russell's paradox using a set of all
sets, such a direct paradox is actually not possible due to the
difference between sets and types. Indeed the derivation of a
contradiction in such a system is subtle and has been rather indirect
(though, as noticed by Miquel (2001), it can now be reduced to
Russell's paradox by representing sets as pointed graphs). J.Y. Girard
first obtained his paradox for a weaker system. This paradox was
refined later (Coquand, 1994 and Hurkens, 1995). (The notion of pure
type system, introduced by Barendregt (1992), is convenient for
getting a sharp formulation of these paradoxes.) Instead of the axiom
*U*: *U* one assumes only

(A:U) →T(A) :U

if *T*(*A*) : *U*
[*A*: *U*]. Notice the circularity, indeed of the
same kind as the one that is rejected by the ramified hierarchy: we
define an element of type *U* by quantifying over all elements
of *U*. For instance the type

(A:U) →A→A:U

will be the type of the polymorphic identity function. Despite this circularity, J.Y. Girard was able to show normalisation for type systems with this form of polymorphism. However, the extension of Church's simple type theory with polymorphism is inconsistent as a logical system, i.e. all propositions (terms of type o) are provable.

J.Y. Girard's motivation for considering a type system with polymorphism was to extend Gödel's Dialectica (Gödel, 1958) interpretation to second-order arithmetic. He proved normalisation using the reducibility method, that had been introduced by Tait (1967) while analysing Gödel, 1958. It is quite remarkable that the circularity inherent in impredicativity does not result in non-normalisable terms. (Girard's argument was then used to show that cut-elimination terminates in Takeuti's sequent calculus presented above.) A similar system was introduced independently by J. Reynolds (1974) while analysing the notion of polymorphism in computer science.

Martin-Löf's introduction of a type of all types comes from the identification of the concept of propositions and types, suggested by the work of Curry and Howard. It is worth recalling here his three motivating points:

- Russell's definition of types as ranges of significance of propositional functions
- the fact that one needs to quantify over all propositions (impredicativity of simple type theory)
- identification of proposition and types

Given (1) and (2) we should have a type of propositions (as in simple type theory), and given (3) this should also be the type of all types. Girard's paradox shows that one cannot have (1),(2) and (3) simultaneously. Martin-Löf's choice was to take away (2), restricting type theory to be predicative (and, indeed, the notion of universe appeared first in type theory as a predicative version of the type of all types). The alternative choice of taking away (3) is discussed by Coquand (1986). For more information on this topic, we refer to the entry on constructive type theory.

## Bibliography

- Aczel, P., 1978, “The type theoretic interpretation of
constructive set theory”,
*Logic Colloquium '77*, North-Holland: 55–66. - Andrews, P., 2002,
*An introduction to mathematical logic and type theory: to truth through proof*, Second edition. Applied Logic Series, 27. Kluwer Academic Publishers, Dordrecht. - Barendregt, H., 1997, “The impact of the lambda calculus in
logic and computer science”,
*Bull. Symbolic Logic*3, no. 2: 181–215. - –––, 1992,
*Lambda calculi with types. Handbook of logic in computer science*, Vol. 2, Oxford Sci. Publ., Oxford Univ. Press, New York: 117–309. - de Bruijn, N. G., 1980, “A survey of the project
AUTOMATH”, in
*To H. B. Curry: essays on combinatory logic, lambda calculus and formalism*, Academic Press, London-New York: 579–606. - Burgess J. P. and Hazen A. P., 1998, “Predicative Logic and
Formal Arithmetic Source,”
*Notre Dame J. Formal Logic 39*, no. 1: 1-17 - Buchholz, W. Schütte, 1988,
*Proof theory of impredicative subsystems of analysis. Studies in Proof Theory*, Monographs, 2. Bibliopolis, Naples. - Church, A. , 1940, “A formulation of the simple theory of
types,”
*J Symbolic Logic*, vol. 5: 56–68 - Chwistek, L., 1948,
*The Limits of Science: Outline of Logic and of the Methodology of the Exact Sciences*, Routledge and Kegan Paul, London. - Coquand, T., 1986, “An analysis of Girard's paradox,”
*Proceedings of the IEEE Symposium on Logic in Computer Science*: 227–236. - –––, 1994, “A new paradox in type theory,”
*Stud. Logic Found. Math.*, 134, North-Holland, Amsterdam: 555–570. - du Bois-Reymond, P., 1875, “Ueber asymptotische Werthe,
infintaere Appproximationen und infitaere Aufloesung von
Gleichungen,”
*Mathematische Annalen*, vol. 8: 363–414. - Feferman, S., 1964, “Systems of predicative analysis,”
*J. Symbolic Logic*29: 1–30. - Ferreira, F. and Wehmeier, K., 2002, “On the consistency of
the Delta-1-1-CA fragment of Frege's Grundgesetze,”
*Journal of Philosophic Logic*31: 301–311. - Girard, J.Y., 1972,
*Interpretation fonctionelle et eleimination des coupures dans l'arithmetique d'ordre superieure*, These d'Etat, Université Paris 7. - Gödel, K., 1995,
*Collected Works Volume III, Unpublished Essays and Lectures*, Oxford, 1995. - –––, 1931, “Uber formal untentschedibare Saetze
der Principia Mathematica und verwandter Systeme I,”
*Monatsh. Math. Phys*, vol 38: p. 349-360. - –––, 1944, “Russell's mathematical logic,” in
*The philosophy of Bertrand Russell*, Schilpp (ed.): 123-153. - –––, 1958, “Über eine bisher noch nicht
benützte Erweiterung des finiten Standpunktes,”,
*Dialectica*12: 280–287. - Heck, Richard G., Jr, 1996, “The consistency of predicative
fragments of Frege's Grundgesetze der Arithmetik,”
*Hist. Philos. Logic*17, no. 4: 209–220. - van Heijenoort, 1967,
*From Frege to Gödel. A source book in mathematical logic 1879–1931*, Harvard University Press, Cambridge, Mass. - Hindley, R., 1997,
*Basic Simple Type Theory*Cambridge University Press, Cambridge - Howard, W. A., 1980, “The formulae-as-types notion of
construction,” in
*To H. B. Curry: essays on combinatory logic, lambda calculus and formalism*, Academic Press, London-New York:, 480–490. - Hurkens, A. J. C., 1995, “A simplification of Girard's
paradox. Typed lambda calculi and applications,” in
*Lecture Notes in Comput. Sci.*, 902, Springer, Berlin: 266–278. - Lambek, J., 1980. “From λ-calculus to Cartesian
Closed Categories” in
*To H.B. Curry: Essays on Combinatory Logic, λ-calculus and Formalism*, Seldin and Hindley (eds.), Academic Press. pp. 375–402. - Lambek, J. and Scott, P. J., 1986,
*Introduction to higher order categorical logic*. Cambridge Studies in Advanced Mathematics, 7. Cambridge University Press, Cambridge, (Reprinted 1988). - Lawvere, F. W. and Rosebrugh, R., 2003,
*Sets for mathematics*, Cambridge University Press, Cambridge. - Martin-Löf, P., 1970,
*Notes on constructive mathematics*, Almqvist & Wiksell, Stockholm. - –––, 1971,
*A Theory of Types Preprint*, Stockhom University. - –––, 1998, “An intuitionistic theory of types,”
in
*Twenty-five years of constructive type theory*, Oxford Logic Guides, 36, Oxford Univ. Press, New York: 127–172. - Myhill, J., 1974, “The Undefinability of the Set of Natural Numbers in the Ramified Principia”, in Bertrand Russell's Philosophy, edited by G. Nakhnikian, Duckworth, London: 19–27.
- Miquel, A., 2001,
*Le Calcul des Constructions implicite: syntaxe et sémantique.*Thèse de doctorat. Université Paris 7. - –––, 2003, “A strongly normalising Curry-Howard
correspondence for IZF set theory,” in
*Computer science logic, Lecture Notes in Comput. Sci.*, 2803, Springer, Berlin: 441–454. - Palmgren, Erik, 1998, “On universes in type theory,”
in
*Twenty-five years of constructive type theory*, Oxford Logic Guides, 36, Oxford Univ. Press, New York: 191–204. - Poincaré, 1909, “H. La logique de l'infini”
*Revue de metaphysique et de morale*, 17: 461–482. - Prawitz, D., 1967, “Completeness and Hauptsatz for second
order logic”,
*Theoria*33, 246–258. - –––, 1970, “On the proof theory of mathematical
analysis,” in
*Logic and value (essays dedicated to Thorild Dahlquist on his fiftieth birthday)*, Filosofiska Studier, No. 9, Filosof. Föreningen och Filosof. Inst., Uppsala Univ., Uppsala: 169–180.. - Quine, W., 1940, “Review of Church “A formulation of
the simple theory of types”
*J Symbolic Logic,*vol. 5, p. 114. - Ramsey, F.P., 1926, “The foundations of mathematics,”
*Proc. Lond. Math. Soc.*2, vol. 25, p. 338-384. - Russell, B., 1903,
*The principles of Mathematics*, vol. 1, Cambridge University Press, Cambridge. - –––, 1908, “Mathematical logic as based on the theory
of types”
*American Journal of Mathematics*30, p. 222-262. - –––, 1959,
*My philosophical development*, London and New York: Routledge. - Russell, B. and Whitehead, A., 1910, 1912, 1913,
*Principia Mathematica*, 3 vols, Cambridge: Cambidge University Press, 1910-1913. - Reynolds, J., 1974, “Towards a theory of type
structure,” in
*Programming Symposium Lecture Notes in Comput. Sci.*, Vol. 19, Springer, Berlin, 1974, pp. 408–425. - –––, 1983, “Types, Abstraction and Parametric
Polymorphism”
*IFIP Congress*, Paris, 513–523. - –––, 1984, “Polymorphism is not
set-theoretic. Semantics of data types”
*Lecture Notes in Comput. Sci.*, 173, Springer, Berlin, 145–156. - –––, 1985, “Three approaches to type
structure. Mathematical foundations of software development,”,
in
*Lecture Notes in Comput. Sci.*, 185, Springer, Berlin, 97–138. - Schütte, K., 1960,
*Beweistheorie*Springer-Verlag, Berlin, 1960. - Scott, D., 1993 “A type-theoretic alternative to ISWIM, CUCH, OWHY,
*Theoretical Computer Science*, 121, p. 411-440. - Skolem, T., 1970,
*Selected works in logic*ed. Jens Erik Fenstad, Universitetsforlaget, Oslo. - Tait, W. W., 1967, “Intensional interpretations of
functionals of finite type”
*I. J. Symbolic Logic*32: 198–212. - Takeuti, G., 1955 “On the fundamental conjecture of
GLC”,
*I. J. Math. Soc.*, Japan 7, 249–275. - –––, 1967, “Consistency proofs of subsystems of
classical analysis”
*Ann. of Math*, (2) 86: 299–348. - Tarski, A., 1931, “Sur les ensembles definissables de nombres
reels I”,
*Fund. Math.*vol 17: p. 210-239. - Wiener, N., 1914, “A simplification of the logic of relations,
*Proceedings of the Cambridge Philosophical Society*vol 17: p. 387-390. - Weyl, H., 1946, “Mathematics and Logic,
*American Mathematical Monthly*53: p. 2-13. - Zermelo, E., 1908, Untersuchungen über die Grundlagen der Mengenlehre I,
*Mathematische Annalen*65: p. 261-281.

## Related Entries

Russell, Bertrand |
category theory |
Frege, Gottlob |
Frege, Gottlob: logic, theorem, and foundations for arithmetic |
logic: paraconsistent |
mathematics: inconsistent |
Peano, Giuseppe |
*Principia Mathematica* |
type theory: Church's type theory |
type theory: constructive