The topic of type theory is fundamental both in logic and computerscience. We limit ourselves here to sketch some aspects that areimportant in logic. For the importance of types in computer science, werefer the reader for instance to Reynolds 1983 and 1985.
The theory of types was introduced by Russell in order to cope withsome contradictions he found in his account of set theory and wasintroduced in “Appendix B: The Doctrine of Types”of Russell 1903. This contradiction was obtained by analysing atheorem of Cantor that no mapping
F : X → Pow(X)
(where Pow(X) is the class of subclasses of a classX) can be surjective; that is,F cannot be such thatevery memberb of Pow(X) is equal toF(a) for some elementa ofX. Thiscan be phrased “intuitively” as the fact that there aremore subsets ofX than elements ofX. The proof ofthis fact is so simple and basic that it is worthwhile giving ithere. Consider the following subset ofX:
A = {x ∈X |x ∉F(x)}.
This subset cannot be in the range ofF. For ifA =F(a), for somea, then
a ∈F(a) iff a ∈A iff a ∉F(a)
which is a contradiction. Some remarks are in order. First, the proofdoes not use the law of excluded middle and is thus validintuitionistically. Second, the method that is used, calleddiagonalisation was already present in the work of duBois-Reymond for building real functions growing faster than anyfunction in a given sequence of functions.
Russell analysed what happens if we apply this theorem to the casewhere A is the class of all classes, admitting that there is such aclass. He was then lead to consider the special class of classes thatdo not belong to themselves
(*)R = {w |w ∉w }
We then have
R ∈R iffR ∉R
It seems indeed that Cantor was already aware of the fact that theclass of all sets cannot be considered itself to be a set.
Russell communicated this problem to Frege, and his letter, togetherwith Frege's answer appear in van Heijenoort 1967. It is importantto realise that the formulation (*) does not apply as it is to Frege'ssystem. As Frege himself wrote in his reply to Russell, the expression“a predicate is predicated of itself” is not exact. Fregehad a distinction betweenpredicates (concepts) andobjects. A (first-order) predicate applies to an object butit cannot have a predicate as argument. The exact formulation of theparadox in Frege's system uses the notion of theextension ofa predicateP, which we designate as εP.The extension of a predicate is itself an object. The important axiomV is:
(Axiom V) εP = εQ ≡ ∀x[P(x) ≡Q(x)]
This axiom asserts that the extension ofP is identical tothe extension ofQ if and only ifP andQare materially equivalent. We can then translate Russell's paradox(*) in Frege's system by defining the predicate
R(x) iff ∃P[x = εP ∧ ¬P(x)]
It can then been checked, using Axiom V in a crucial way, that
R(εR) ≡ ¬R(εR)
and we have a contradiction as well. (Notice that for defining thepredicateR, we have used animpredicativeexistential quantification on predicates. It can be shown that thepredicative version of Frege's system is consistent (seeHeck 1996 and for further refinements Ferreira 2002).
It is clear from this account that an idea of types was alreadypresent in Frege's work: there we find a distinction between objects,predicates (or concepts), predicates of predicates, etc. (This pointis stressed in Quine 1940.) This hierarchy is called the“extensional hierarchy” by Russell (1959), and itsnecessity was recognised by Russell as a consequence of hisparadox.
As we saw above, the distinction: objects, predicates, predicate ofpredicates, etc., seems enough to block Russell's paradox (and thiswas recognised by Chwistek and Ramsey). We first describe the typestructure as it is inPrincipia and later in this section wepresent the elegant formulation due to Church 1940 based onλ-calculus. The types can be defined as
For instance, the type of binary relations over individuals is(i,i), the type of binary connectives is(( ),( )), the type of quantifiers over individuals is((i)).
For forming propositions we use this type structure: thusR(a1,…,an)is a proposition ifR is of type(A1,…,An) andai is of typeAi fori = 1,…,n. This restriction makes itimpossible to form a proposition of the formP(P):the type ofP should be of the form (A), andP can only be applied to arguments of typeA, andthus cannot be applied to itself sinceA is not the same as(A).
However simple type theory is not predicative: we can define an objectQ(x,y) of type (i,i)by
∀P[P(x)⊃P(y)]
Assume that we have two individualsa andb suchthatQ(a,b) holds. We can defineP(x) to beQ(x,a). Itis then clear thatP(a) holds, since it isQ(a,a). HenceP(b) holdsas well. We have proved, in an impredicative way, thatQ(a,b) impliesQ(b,a).
Alternative simpler formulations, which retain only the notion ofclasses, classes of classes, etc., were formulated by Gödel andTarski. Actually this simpler version was used by Gödel in his1931 paper on formally undecidable propositions. We have objects oftype 0, for individuals, objects of type 1, for classes ofindividuals, objects of type 2, for classes of classes of individuals,and so on. Functions of two or more arguments, like relations, neednot be included among primitive objects since one can define relationsto be classes of ordered pairs, and ordered pairs to be classes ofclasses. For example, the ordered pair of individualsa, bcan be defined to be {{a},{a,b}} where{x,y} denotes the class whose sole elements arex andy. (Wiener 1914 had suggested a similarreduction of relations to classes.) In this system, all propositionshave the forma(b), wherea is a sign oftypen+1 andb a sign of typen. Thus thissystem is built on the concept of an arbitrary class or subset ofobjects of a given domain and on the fact that the collectionofall subsets of the given domain can form a new domain ofthe next type. Starting from a given domain of individuals, thisprocess is then iterated. As emphasised for instance in Scott 1993, inset theory this process of forming subsets is iterated intothetransfinite.
In these versions of type theory, as in set theory, functions are notprimitive objects, but are represented as functional relation. Theaddition function for instance is represented as a ternary relation byan object of type (i,i,i). An elegantformulation of the simple type theory which extends it by introducingfunctions as primitive objects was given by Church in 1940. It usesthe λ-calculus notation (Barendregt 1997). Since such aformulation is important in computer science, for the connection withcategory theory, and for Martin-Löf type theory, we describe itin some detail. In this formulation, predicates are seen as a specialkind of functions (propositional functions), an idea that goes back toFrege (see for instance Quine 1940). Furthermore, the notion offunction is seen as more primitive than the notion of predicates andrelations, and a function is not defined anymore as a special kind ofrelation. (Oppenheimer and Zalta 2011 presents some arguments againstsuch a primitive representation of functions.) The types of thissystem are defined inductively as follows
We can form in this way the types:
i→o (type of predicates) (i→o) →o (type of predicates of predicates)
which correspond to the types (i) and ((i)) but alsothe new types
i→i (type of functions) (i→i) →i (type of functionals)
It is convenient to write
A1,…,An →B
for
A1 → (A2 →… → (An →B))
In this way
A1,…,An →o
corresponds to the type(A1,…,An).
First-order logic considers only types of the form
i,…,i →i (type of function symbols), and i,…,i →o (type of predicate, relation symbols)
Notice that
A →B →C
stands for
A → (B→C)
(association to the right).
For the terms of this logic, we shall not follow Church's account buta slight variation of it, due to Curry (who had similar ideas beforeChurch's paper appeared) and which is presented in detail inR. Hindley's book on type theory. Like Church, we useλ-calculus, which provides a general notation for functions
M ::=x |MM |λx. M
Here we have used the so-called BNF notation, very convenient incomputing science. This gives a syntactic specification of theλ-terms which, when expanded, means:
The notation for function applicationMN is alittle different than the mathematical notation, which would beM(N). In general,
M1M2M3
stands for
(M1M2)M3
(association to the left). The termλx. M represents the function which toN associatesM[x:=N]. Thisnotation is so convenient that one wonders why it is not widely usedin mathematics. The main equation of λ-calculus is then(β-conversion)
(λx. M)N =M[x:=N]
which expresses the meaning of λx. Mas a function. We have usedM[x:=N] as anotation for the value of the expression that results whenNis substituted for the variablex in the matrixM.One usually sees this equation as a rewrite rule(β-reduction)
(λx. M)N →M[x:=N]
Inuntyped lambda calculus, it may be that such rewriting doesnot terminate. The canonical example is given by the term Δ =λx. xx and the application
Δ Δ → Δ Δ
(Notice the similarity with Russell's paradox.) The idea of Curry isthen to look at types as predicates over lambda terms, writingM: A to express thatM satisfies thepredicate/typeA. The meaning of
N: A→B
is then
∀M, ifM: A thenNM: B
which justifies the following rules
N: A→B M: A NM: B
M: B [x: A] λx. M: A →B
In general one works with judgements of the form
x1:A1,...,xn:An ⊢M: A
wherex1,...,xn are distinctvariables, andM is a term having all free variables amongx1,...,xn. In order to beable to get Church's system, one adds some constants in order to formpropositions. Typically
not:o→o
imply:o→o→o
and:o→o→o
forall: (A→o) →o
The term
λx. ¬(xx)
represents the predicate of predicates that do not apply to themselves.This term does not have a type however, that is, it is not possible tofindA such that
λx. ¬(xx) :(A→o) →o
which is the formal expression of the fact that Russell's paradoxcannot be expressed. Leibniz equality
Q:i →i →o
will be defined as
Q = λx. λy. ∀(λP. imply (Px) (Py))
One usually writes ∀x[M] instead of∀(λx. M), and the definition ofQ can then be rewritten as
Q =λx. λy. ∀P[imply(Px) (Py)]
This example again illustrates that we can formulate impredicative definitionsin simple type theory.
The use of λ-terms and β-reduction is most convenient forrepresenting the complex substitution rules that are needed in simpletype theory. For instance, if we want to substitute the predicateλx. Qax forP in the proposition
imply (Pa) (Pb)
we get
imply ((λx. Qax)a) ((λx. Qax)b)
and, using β-reduction,
imply (Qaa) (Qab)
In summary, simple type theory forbids self-application but not thecircularity present in impredicative definitions.
The λ-calculus formalism also allows for a clearer analysisof Russell's paradox. We can see it as the definition of thepredicate
Rx = ¬(xx)
If we think of β-reduction as the process of unfolding adefinition, we see that there is a problem already with understandingthe definition of R R
RR → ¬(RR)→ ¬(¬(RR)) → …
In some sense, we have a non-wellfounded definition, which is asproblematic as a contradiction (a proposition equivalent to itsnegation). One important theorem, the normalisation theorem, says thatthis cannot happen with simple types: if we haveM: A thenM is normalisable in astrong way (any sequence of reductions starting fromMterminates).
For more information on this topic, we refer to the entry onChurch's simple type theory.
Russell introduced another hierarchy, that was not motivated by anyformal paradoxes expressed in a formal system, but rather by the fearof “circularity” and by informal paradoxes similar to theparadox of the liar. If a man says “I am lying”, then wehave a situation reminiscent of Russell's paradox: a proposition whichis equivalent to its own negation. Another informal such paradoxicalsituation is obtained if we define an integer to be the “leastinteger not definable in less than 100 words”. In order toavoid such informal paradoxes, Russell thought it necessary tointroduce another kind of hierarchy, the so-called “ramifiedhierarchy”. The need for such a hierarchy is hinted in AppendixB of Russell 1903. Interestingly this is connected there to thequestion of the identity of equivalent propositions and of the logicalproduct of a class of propositions. A thorough discussion can be foundin Chapter 10 of Russell 1959. Since this notion of ramifiedhierarchy has been extremely influential in logic and especially prooftheory, we describe it in some details.
In order to further motivate this hierarchy, here is one example dueto Russell. If we say
Napoleon was Corsican.
we do not refer in this sentence to any assemblage of properties. Theproperty “to be Corsican” is said to bepredicative. If we say on the other hand
Napoleon had all the qualities of a greatgeneral
we are referring to a totality of qualities. The property “tohave all qualities of a great general” is said to beimpredicative.
Another example, also coming from Russell, shows how impredicative properties can potentially lead to problems reminiscent of the liar paradox. Suppose that we suggest the definition
A typical Englishman is one who possesses all theproperties possessed by a majority of Englishmen.
It is clear that most Englishmen do not possessall theproperties that most Englishmen possess. Therefore, a typicalEnglishman, according to this definition, should be untypical. Theproblem, according to Russell, is that the word “typical”has been defined by a reference to all properties and has been treatedas itself a property. (It is remarkable that similar problems arisewhen defining the notion ofrandom numbers, cf.Martin-Löf “Notes on constructive mathematics”(1970).) Russell introduced theramified hierarchy in orderto deal with the apparent circularity of such impredicativedefinitions. One should make a distinction between thefirst-order properties, like being Corsican, that do notrefer to the totality of properties, and consider that thesecond-order properties refer only to the totality offirst-order properties. One can then introduce third-orderproperties, that can refer to the totality of second-order property,and so on. This clearly eliminates all circularities connected toimpredicative definitions.
At about the same time, Poincaré carried out a similaranalysis. He stressed the importance of “predicative”classifications, and of not defining elements of a class using aquantification over this class (Poincaré1909). Poincaré used the following example. Assume that we havea collection with elements 0, 1 and an operation + satisfying
x+0 = 0
x+(y+1) =(x+y)+1
Let us say that a property isinductive if it holds of 0 and holds forx+1 if it holds forx.
An impredicative, and potentially “dangerous”, definitionwould be to define an element to be anumber if it satisfiesall inductive properties. It is then easy to show that thisproperty “to be a number” is itself inductive. Indeed,0 is a number since it satisfies all inductive properties,and ifx satisfies all inductive properties then so doesx+1. Similarly it is easy to show thatx+y is anumber ifx,y are numbers. Indeed the propertyQ(z) thatx+z is a number is inductive:Q(0) holds sincex+0=x andifx+z is a number then soisx+(z+1) = (x+z)+1. This wholeargument however is circular since the property “to be anumber” is not predicative and should be treated withsuspicion.
Instead, one should introduce a ramified hierarchy of properties andnumbers. At the beginning, one has onlyfirst-order inductiveproperties, which do not refer in their definitions to a totality ofproperties, and one defines the numbers of order 1 to be the elementssatisfying all first-order inductive properties. One can next considerthe second-order inductive properties, that can refer to thecollection of first-order properties, and the numbers of order 2, thatare the elements satisfying the inductive properties of order 2. Onecan then similarly consider numbers of order 3, and soon. Poincaré emphasizes the fact that a number of order 2 isa fortiori a number of order 1, and more generally, a numberof order n+1 isa fortiori a number of order n. We have thusa sequence of more and more restricted properties: inductiveproperties of order 1, 2, … and a sequence of more and morerestricted collections of objects: numbers of order 1, 2, …Also, the property “to be a number of ordern” is itselfan inductive property of ordern+1.
It does not seem possible to prove thatx+y is anumber of ordern ifx,y are numbers ofordern. On the other hand, it is possible to show that ifxis a number of ordern+1, andy a number ofordern+1 thenx+y is a number ofordern. Indeed, the propertyP(z) that“x+z is a number of ordern” is aninductive property of ordern+1:P(0) holds sincex+0 =x is a numberof ordern+1, and hence of ordern, and ifP(z) holds, that is ifx+z is a number ofordern, then so isx+(z+1) =(x+z)+1, and soP(z+1)holds. Sincey is a number of ordern+1, andP(z) is an inductive property of ordern+1,P(y) holds and sox+y is a number of ordern. This example illustrates well the complexities introducedby the ramified hierarchy.
The complexities are further amplified if one, like Russell as forFrege, defines even basic objects like natural numbers as classes ofclasses. For instance the number 2 is defined as the class of allclasses of individuals having exactly two elements. We again obtainnatural numbers of different orders in the ramified hierarchy.Besides Russell himself, and despite all these complications, Chwistektried to develop arithmetic in a ramified way, and the interest ofsuch an analysis was stressed by Skolem. See Burgess and Hazen 1998for a recent development.
Another mathematical example, often given, of an impredicativedefinition is the definition of least upper bound of a bounded classof real numbers. If we identify a real with the set of rationals thatare less than this real, we see that this least upper bound can bedefined as the union of all elements in this class. Let us identifysubsets of the rationals as predicates. For example, for rationalnumbersq,P(q) holds iffq is amember of the subset identified asP. Now, we define thepredicateLC (a subset of the rationals)to be the least upper bound of classC as:
∀q[LC(q) ↔ ∃P[C(P) ∧P(q)]]
which is impredicative: we have defined a predicateL by anexistential quantification over all predicates. In the ramifiedhierarchy, ifC is a class of first-order classes ofrationals, thenL will be a second-order class ofrationals. One obtains then notone notion or real numbers,but real numbers of different orders 1, 2, … The least upperbound of a collection of reals of order 1 will then be at least oforder 2 in general.
As we saw earlier, yet another example of an impredicative definitionis given by Leibniz' definition of equality. For Leibniz, thepredicate “is equal toa” is true forbiffb satisfies all the predicates satisfied bya.
How should one deal with the complications introduced by the ramifiedhierarchy? Russell showed, in the introduction to the second editiontoPrincipia Mathematica, that these complications can be avoided in somecases. He even thought, in Appendix B of the second edition ofPrincipia Mathematica, that the ramified hierarchy of naturalnumbers of order 1,2,… collapses at order 5. However,Gödel later found a problem in his argument, and indeed, it wasshown by Myhill 1974 that this hierarchy actuallydoes notcollapse at any finite level. A similar problem, discussed by Russellin the introduction to the second edition toPrincipia Mathematica arises in the proof of Cantor's theorem thatthere cannot be any injective functions from the collection of allpredicates to the collection of all objects (the version of Russell'sparadox in Frege's system that we presented in the introduction). Canthis be done in a ramified hierarchy? Russell doubted that this couldbe done within a ramified hierarchy of predicates and this was indeedconfirmed indeed later (Heck 1996).
Because of these problems, Russell and Whitehead introduced in thefirst edition ofPrincipia Mathematica the followingreducibility axiom: the hierarchy of predicates, first-order,second-order, etc., collapses at level 1. This means that for anypredicate of any order, there is a predicate of the first-order levelwhich is equivalent to it. In the case of equality for instance, wepostulate a first-order relation “a=b”which is equivalent to “a satisfies all properties thatb satisfies”. The motivation for this axiom was purelypragmatic. Without it, all basic mathematical notions, like real ornatural numbers are stratified into different orders. Also, despitethe apparent circularity of impredicative definitions, the axiom ofreducibility does not seem to lead to inconsistencies.
As noticed however first by Chwistek, and later by Ramsey, in thepresence of the axiom of reducibility, there is actually no point inintroducing the ramified hierarchy at all! It is much simpler toaccept impredicative definitions from the start. The simple“extensional” hierarchy of individuals, classes, classesof classes, … is then enough. We get in this way the simplersystems formalised in Gödel 1931 or Church 1940 that werepresented above.
The axiom of reducibility draws attention to the problematic status ofimpredicative definitions. To quote Weyl 1946, the axiom ofreducibility “is a bold, an almost fantastic axiom; there islittle justification for it in the real world in which we live, andnone at all in the evidence on which our mind bases itsconstructions”! So far, no contradictions have been found usingthe reducibility axiom. However, as we shall see below,proof-theoretic investigations confirm the extreme strength of such aprinciple.
The idea of the ramified hierarchy has been extremely important inmathematical logic. Russell considered only the finite iteration ofthe hierarchy: first-order, second-order, etc., but from thebeginning, the possibility of extending the ramification transfinitelywas considered. Poincaré (1909) mentions the work of Koenig inthis direction. For the example above of numbers of different order,he also defines a number to be inductive of order ω if it isinductive of all finite orders. He then points out thatx+yis inductive of order ω if bothx andyare. This shows that the introduction of transfinite orders can insome case play the role of the axiom of reducibility. Such transfiniteextension of the ramified hierarchy was analysed further by Gödelwho noticed the key fact that the following form of the reducibilityaxiom is actuallyprovable: when one extends the ramifiedhierarchy of properties over the natural numbers into the transfinitethis hiearchy collapses at level ω1, the leastuncountable ordinal (Gödel 1995, and Prawitz1970). Furthermore, while at all levels < ω1, thecollection of predicates is countable, the collection of predicates atlevel ω1 is of cardinality ω1. Thisfact was a strong motivation behind Gödel's model ofconstructible sets. In this model the collection of all subsets of theset of natural numbers (represented by predicates) is of cardinalityω1 and is similar to the ramified hierarchy. Thismodel satisfies in this way the Continuum Hypothesis, and gives arelative consistency proof of this axiom. (The motivation ofGödel was originally only to build a model where the collectionof all subsets of natural numbers is well-ordered.)
The ramified hierarchy has been also the source of much work in prooftheory. After the discovery by Gentzen that the consistency ofArithmetic could be proved by transfinite induction (over decidablepredicates) along the ordinal ε0, the naturalquestion was to find the corresponding ordinal for the differentlevels of the ramified hierarchy. Schütte (1960) found that forthe first level of the ramified hierarchy, that is if we extendarithmetic by quantifying only over first-order properties, we get asystem of ordinal strengthεε0. For the second level we getthe ordinal strengthεεε0, etc. Werecall that εα denotes the α-thε-ordinal number, an ε-ordinal number being an ordinalβ such that ωβ = β, see Schütte(1960).
Gödel stressed the fact that his approach to the problem of thecontinuum hypothesis was not constructive, since it needs theuncountable ordinal ω1, and it was natural to studythe ramified hierarchy along constructive ordinals. After preliminaryworks of Lorenzen and Wang, Schütte analysed what happens if weproceed in the following more constructive way. Since arithmetic hasfor ordinal strength ε0 we consider first theiteration of the ramified hierarchy up to ε0.Schütte computed the ordinal strength of the resulting system andfound an ordinal strengthu(1)>ε0. Weiterate then ramified hierarchy up to this ordinalu(1) andget a system of ordinal strengthu(2)>u(1) ,etc. Eachu(k) can be computed in terms of theso-called Veblen hierarchy:u(k+1) isφu(k)(0). The limit of this processgives an ordinal called Γ0: if we iterate theramified hierarchy up to the ordinal Γ0 we get asystem of ordinal strength Γ0. Such an ordinal wasobtained independently about the same time by S. Feferman. It has beenclaimed that Γ0 plays for predicative systems a rolesimilar to ε0 for Arithmetic. Recentproof-theoretical works however are concerned with systems havingbigger proof-theoretical ordinals that can be considered predicative(see for instance Palmgren 1995).
Besides these proof theoretic investigations related to the ramifiedhierarchy, much work has been devoted in proof theory to analysing theconsistency of the axiom of reducibility, or, equivalently, theconsistency of impredicative definitions. Following Gentzen'sanalysis of the cut-elimination property in the sequent calculus,Takeuti found an elegant sequent formulation of simple type theory(without ramification) and made the bold conjecture thatcut-elimination should hold for this system. This conjecture seemed atfirst extremely dubious given the circularity of impredicativequantification, which is well reflected in this formalism. The rulefor quantifications is indeed
Γ ⊢ ∀X[A(X)] Γ ⊢A[X:=T]
whereT isany term predicate, which may itselfinvolve a quantification over all predicates. Thus the formulaA[X:=T] may be itself much more complexthan the formulaA(X).
One early result is that cut-elimination for Takeuti's impredicativesystem implies in a finitary way the consistency of second-orderArithmetic. (One shows that this implies the consistency of a suitableform of infinity axiom, see Andrews 2002.) Following work bySchütte, it was later shown by W. Tait and D. Prawitz that indeedthe cut-elimination property holds (the proof of this has to use astronger proof theoretic principle, as it should be according to theincompletness theorem.)
What is important here is that these studies have revealed the extremepower of impredicative quantification or, equivalently, the extremepower of the axiom of reducibility. This confirms in some way theintuitions of Poincaré and Russell. The proof-theoreticstrength of second-order Arithmetic is way above all ramifiedextensions of Arithmetic considered by Schütte. On the otherhand, despite the circularity of impredicative definitions, which ismade so explicit in Takeuti's calculus, no paradoxes have been foundyet in second-order Arithmetic.
Another research direction in proof theory has been to understand howmuch of impredicative quantification can be explained from principlesthat are available in intuitionistic mathematics. The strongest suchprinciples are strong forms of inductive definitions. With suchprinciples, one can explain a limited form a impredicativequantification, called Π11-comprehension,where one uses only one level of impredicative quantification overpredicates. Interestingly, almost all known uses of impredicativequantifications: Leibniz equality, least upper bound, etc., can bedone with Π11-comprehension. This reductionof Π11-comprehension was first achieved byTakeuti in a quite indirect way, and was later simplified by Buchholzand Schütte using the so-called Ω-rule. It can be seen asa constructive explanation of some restricted, but nontrivial, uses ofimpredicative definitions.
Type theory can be used as a foundation for mathematics, and indeed,it was presented as such by Russell in his 1908 paper, which appearedthe same year as Zermelo's paper, presenting set theory as a foundationfor mathematics.
It is clear intuitively how we can explain type theory in set theory:a type is simply interpreted as a set, and function typesA→B can be explained using the set theoretic notion offunction (as a functional relation, i.e. a set of pairs ofelements). The typeA →o corresponds to thepowerset operation.
The other direction is more interesting. How can we explain the notionof sets in term of types? There is an elegant solution, due toA. Miquel, which complements previous works by P. Aczel (1978) andwhich has also the advantage of explaining non necessarilywell-founded sets a la Finsler. One simply interprets asetas apointed graph (where the arrow in the graph representsthe membership relation). This is very conveniently represented intype theory, a pointed graph being simply given by a type A and a pairof elements
a: A,R: A →A →o
We can then define in type theory when two such setsA,a,R andB,b,S areequal: this is the case iff there is a bisimulationT betweenA andB such thatTabholds. A bisimulation is a relation
T: A→B→o
such that wheneverTxy andRxu hold, there existsv such thatTuv andSyv hold, and wheneverTxy andRyv hold,there existsu such thatTuv andRxuhold. We can then define the membership relation: the setrepresentedB,b,S is a member of the setrepresented byA,a,R iff there existsa1 such thatRa1a andA,a1,R andB,b,S are bisimilar.
It can then be checked that all the usual axioms of set theoryextensionality, power set, union, comprehension over bounded formulae(and even antifoundation, so that the membership relation does notneed to be well-founded) hold in this simple model. (A bounded formulais a formula where all quantifications are of the form∀x ∈a … or ∃x∈a …). In this way it can been shown thatChurch's simple type theory is equiconsistent with the bounded versionof Zermelo's set theory.
There are deep connections between type theory and category theory.We limit ourselves to presenting two applications of type theory tocategory theory: the constructions of the free cartesian closedcategory and of the free topos (see the entry on category theory for anexplanation of “cartesian closed” and “topos”).
For building the free cartesian closed category, we extend simple typetheory with the type 1 (unit type) and the product typeA×B, forA,B types. The terms areextended by adding pairing operations and projections and a specialelement of type 1. As in Lambek and Scott 1986, one can then define anotion oftyped conversions between terms, and show that thisrelation is decidable. One can then show (Lambek and Scott 1986) thatthe category with types as objects and as morphisms fromA toB the set of closed terms of typeA →B (with conversion as equality) is the free cartesian closedcategory. This can be used to show that equality between arrows inthis category is decidable.
The theory of types of Church can also be used to build the freetopos. For this we take as objects pairsA,E withA type andE a partial equivalence relation, that isa closed termE: A →A →o which is symmetric and transitive. We take as morphismsbetweenA,E andB,F therelationsR: A→B→o that arefunctional that is such that for anya: A satisfyingEaa there exists one, and only one (moduloF) elementb ofB such thatFbbandRab. For the subobject classifier wetake the pairo,E withE: o→o→o definedas
EMN = and (implyMN) (implyNM)
One can then show that this category forms a topos, indeed the freetopos.
It should be noted that the type theory in Lambek and Scott 1986uses a variation of type theory, introduced by Henkin and refined byP. Andrews (2002) which is to have an extensional equality as theonly logical connective, i.e. a polymorphic constant
eq :A →A →o
and to define all logical connectives from this connective andconstantsT,F :o. For instance, one defines
∀P =df eq (λx. T)P
The equality at typeo is logical equivalence.
One advantage of the intensional formulation is that it allows for adirect notation of proofs based on λ-calculus (Martin-Löf1971 and Coquand 1986).
We have seen the analogy between the operation A → A → o ontypes and the powerset operation on sets. In set theory, the powersetoperation can be iterated transfinitely along the cumulativehierarchy. It is then natural to look for analogous transfiniteversions of type theory.
One such extension of Church's simple type theory is obtained byadding universes (Martin-Löf 1970). Adding a universe is areflection process: we add a typeU whose objectsare the types considered so far. For Church's simple type theory wewill have
o: U,i: U andA→B: U ifA: U,B: U
and, furthermore,A is a type ifA: U. We can then consider types such as
(A: U) →A →A
and functions such as
id =λA.λx. x :(A: U) →A →A
The function id takes as argument a “small” typeA: U and an elementx of typeA, and outputs an element of typeA. More generallyifT(A) is a type under the assumptionA: U, one can form the dependent type
(A: U) →T(A)
ThatM is of this type means thatMA: T(A) wheneverA: U. We get in this way extensions of typetheory whose strength is similar to the one of Zermelo's set theory(Miquel 2001). More powerful form of universes are considered in(Palmgren 1998). Miquel (2003) presents a version of type theory ofstrength equivalent to the one of Zermelo-Fraenkel.
One very strong form of universe is obtained by adding the axiomU: U. This was suggested by P. Martin-Löf in1970. J.Y. Girard showed that the resulting type theory isinconsistent as a logical system (Girard 1972). Though it seems at firstthat one could directly reproduce Russell's paradox using a set of allsets, such a direct paradox is actually not possible due to thedifference between sets and types. Indeed the derivation of acontradiction in such a system is subtle and has been rather indirect(though, as noticed in Miquel 2001, it can now be reduced toRussell's paradox by representing sets as pointed graphs). J.Y. Girardfirst obtained his paradox for a weaker system. This paradox wasrefined later (Coquand 1994 and Hurkens 1995). (The notion of puretype system, introduced in Barendregt 1992, is convenient forgetting a sharp formulation of these paradoxes.) Instead of the axiomU: U one assumes only
(A: U) →T(A) :U
ifT(A) :U[A: U]. Notice the circularity, indeed of thesame kind as the one that is rejected by the ramified hierarchy: wedefine an element of typeU by quantifying over all elementsofU. For instance the type
(A: U) →A →A: U
will be the type of the polymorphic identity function. Despite thiscircularity, J.Y. Girard was able to show normalisation for typesystems with this form of polymorphism. However, the extension ofChurch's simple type theory with polymorphism is inconsistent as alogical system, i.e. all propositions (terms of type o) areprovable.
J.Y. Girard's motivation for considering a type system withpolymorphism was to extend Gödel's Dialectica (Gödel 1958)interpretation to second-order arithmetic. He proved normalisationusing the reducibility method, that had been introduced by Tait (1967)while analysing Gödel 1958. It is quite remarkable that thecircularity inherent in impredicativity does not result innon-normalisable terms. (Girard's argument was then used to show thatcut-elimination terminates in Takeuti's sequent calculus presentedabove.) A similar system was introduced independently by J. Reynolds(1974) while analysing the notion of polymorphism in computerscience.
Martin-Löf's introduction of a type of all types comes from theidentification of the concept of propositions and types, suggested bythe work of Curry and Howard. It is worth recalling here his threemotivating points:
Given (1) and (2) we should have a type of propositions (as in simpletype theory), and given (3) this should also be the type of alltypes. Girard's paradox shows that one cannot have (1),(2) and (3)simultaneously. Martin-Löf's choice was to take away (2),restricting type theory to be predicative (and, indeed, the notion ofuniverse appeared first in type theory as a predicative version of thetype of all types). The alternative choice of taking away (3) isdiscussed in Coquand 1986.
The connections between type theory, set theory and category theorygets a new light through the work on Univalent Foundations (Voevodsky2009) and theAxiom of Univalence. This involves in anessential way the extension of type theory described in the previoussection, in particular dependent types, the view of propositions astypes, and the notion of universe of types. These development are alsorelevant for discussing the notion of structure, the importance ofwhich was for instance emphasized in Russell 1959.
Martin-Löf 1973 introduced a new basic typeIdA(a,b), ifaandb are in the typeA, which can be thought as thetype of equality proofs of the elementa andb. Animportant feature of this new type is that it can be iterated, so thatwe can consider the typeIdIdA(a,b)(p,q)ifp andq are oftypeIdA(a,b). If wethink of a type as a special kind of set, it is natural to conjecturethat such a type of equality proofs is always inhabited for any twoequality proofsp andq. Indeed, intuitively, thereseems to be at most an equality proof between two elementsaandb. Surprisingly, Hofmann and Streicher 1996 designed amodel of dependent type theory where this is not valid, that is amodel where they can be different proofs that two elements areequal. In this model, a type is interpreted by agroupoid andthe typeIdA(a,b) by theset of isomorphisms betweena andb, set which mayhave more than one element. The existence of this model has theconsequence that it cannot be proved in general in type theory that anequality type has at most one element. This groupoid interpretationhas been generalized in the following way, which gives an intuitiveinterpretation of the identity type. Atype is interpreted byatopological space, up to homotopy, and atypeIdA(a,b) isinterpreted by the type ofpaths connectingaandb. (See Awodey et al. 2013 and the HoTT book 2013.)
Voevodsky 2009 introduced the following stratification of types. (Thisstratification was motivated in part by this interpretation of a typeas a topological space, but can be understood directly withoutreference to this interpretation.) We say that a typeA isaproposition if wehaveIdA(a,b) for anyelementa andb ofA (this means that thetypeA has at most one element). We say that atypeA is aset if the typeIdA(a,b) is a propositionfor any elementa andb ofA. We say that atypeA is agroupoid if the typeIdA(a,b) is a set for anyelementa andb ofA. The justification ofthis terminology is that it can be shown, only using the rules of typetheory, that any such type can indeed be seen as a groupoid in theusual categorical sense, where the objects are the elements of thistype, the set of morphisms betweena andb beingrepresented bythesetIdA(a,b). Thecomposition is the proof of transitivity of equality, and the identitymorphism is the proof of reflexivity of equality. The fact that eachmorphism has an inverse corresponds to the fact that identity is asymmetric relation. This stratification can then be extended and wecan define when a type is a 2-groupoid, 3-groupoid and so on. In thisview,type theory appears as a vast generalization ofsettheory, since a set is a particular kind of type.
Voevodsky 2009 introduces also a notion ofequivalencebetween types, notion which generalizes in an uniform way the notionsoflogical equivalence between propositions,bijection between sets,categorical equivalencebetween groupoids, and so on. We say that a mapf:A→B is an equivalence if, for anyelementb inB the type of pairsa,p wherep is of typeIdB(fa,b), is aproposition and is inhabited. This expresses in a strong way that anelement inB is the image of exactly one element inA, and ifA andB are sets, we recover theusual notion of bijection between sets. (In general iff:A→B is an equivalence, then we havea mapB→A, which can be thought of as theinverse off.) It can be shown for instance that theidentity map is always an equivalence. LetEquiv(A,B) be the type of pairsf,p wheref:A→B andp is a proof thatf is an equivalence. Using thefact that the identity map is an equivalence we have an element ofEquiv(A,A) for any typeA. This impliesthat we have a map
IdU(A,B)→Equiv(A,B)
and theAxiom of Univalence states that this map is anequivalence. In particular, we have the implication
Equiv(A,B)→IdU(A,B)
and so if there is an equivalence between two small types then these types are equal.
This Axiom can be seen as a strong form of the extensionalityprinciple. It indeed generalizes the Axiom ofpropositionalextensionality mentionned by Church 1940, which states that twologically equivalent propositions are equal. Surprisingly, it alsoimplies the Axiom offunction extensionality, Axiom 10 inChurch 1940, which states that two pointwise equal functions are equal(Voevodsky 2009). It also directly implies that two isomorphic setsare equal, that two categorically equivalent groupoids are equal, andso one.
This can be used to give a formulation of the notionoftransport of structures (Bourbaki 1957) alongequivalences. For instance, letMA be the type ofmonoid structures on the setA: this is the type oftuplesm,e,p wherem is a binaryoperation onA ande an element ofAandp a proof that these elements satisfy the usual monoidlaws. The rule of substitution of equal by equal takes the form
IdU(A,B)→MA→MB
If there is a bijection betweenA andB they areequal by the Axiom of Univalence, and we can use this implication totransport any monoid structure ofA in a monoid structureofB.
We can also use this framework to refine Russell 1959 discussion onthe notion of structure. For instance, letMonoid be the typeof pairsA,p wherep is an elementofMA. Two such pairsA,pandB,q are isomorphic if there exists a bijectionf fromA toB such thatq is equalto the transport of structure ofp alongf. Aconsequence of the Axiom of Univalence is that two isomorphic elementsof the typeMonoid are equal, and hence shares the sameproperties. Notice that such a general transport of propertiesisnot possible when structures are formulated in a settheoretic framework. Indeed, in a set theoretic framework, it ispossible to formulate properties using the membership relations, forinstance the property that the carrier set of the structure containsthe natural number0, property that is not preserved ingeneral by isomorphisms. Intuitively, the set theoretical descriptionof a structure is not abstract enough since we can talk about the waythis structure is built up. This difference between set theory andtype theory is yet another illustration of the characterization byJ.Reynolds 1983 of a type structure as a “syntactical disciplinefor enforcing level of abstraction”.
How to cite this entry. Preview the PDF version of this entry at theFriends of the SEP Society. Look up this entry topic at theIndiana Philosophy Ontology Project (InPhO). Enhanced bibliography for this entryatPhilPapers, with links to its database.
[Please contact the author with suggestions.]
View this site from another server:
The Stanford Encyclopedia of Philosophy iscopyright © 2015 byThe Metaphysics Research Lab, Center for the Study of Language and Information (CSLI), Stanford University
Library of Congress Catalog Data: ISSN 1095-5054