Movatterモバイル変換


[0]ホーム

URL:


SEP home page
Stanford Encyclopedia of Philosophy

First-order Model Theory

First published Sat Nov 10, 2001; substantive revision Thu Jan 25, 2024

First-order model theory, also known as classical model theory, is abranch of mathematics that deals with the relationships betweendescriptions in first-order languages and the structures that satisfythese descriptions. From one point of view, this is a vibrant area ofmathematical research that brings logical methods (in particular thetheory of definition) to bear on deep problems of classicalmathematics. From another point of view, first-order model theory isthe paradigm for the rest ofmodel theory; it is the area in which many of the broader ideas of model theorywere first worked out.

1. First-order languages and structures

Mathematical model theory carries a heavy load of notation, and HTMLis not the best container for it. In what follows, syntactic objects(languages, theories, sentences) are generally written in roman orgreek letters (for example L, T, φ), and set-theoretic objectssuch as structures and their elements are written in italic(A,a). Two exceptions are that variables are italic(x,y) and that sequences of elements are writtenwith lower case roman letters (a, b).

We recall and refine some definitions from the entries onclassical logic andmodel theory. Asignature is a set of individual constants, predicatesymbols and function symbols; each of the predicate symbols andfunction symbols has anarity (for example it is binary ifits arity is 2). Each signature K gives rise to a first-orderlanguage, by building up formulas from the symbols in the signaturetogether with logical symbols (including =) and punctuation.

If K is a signature, then astructure of signature K, sayA, consists of the following items:

  1. A set called thedomain ofA and writtendom(A); it is usually assumed to be nonempty;
  2. for each individual constantc in K, an elementcA of dom(A);
  3. for each predicate symbolP of arityn, ann-ary relationPA ondom(A);
  4. for each function symbolF of arityn, ann-ary functionFA from dom(A) todom(A).

Theelements ofA are the elements ofdom(A). Likewise thecardinality orpowerofA is the cardinality of its domain. Since we can recoverthe signature K from the first-order language L that it generates, wecan and will refer to structures of signature K asL-structures. We think ofc as a name for theelementcA in the structureA,and likewise with the other symbols.

For example the field of real numbers forms a structureR whose elements are the real numbers, withsignature consisting of the individual constant 0 to name the numberzero, a 1-ary function symbol − for minus, and two 2-aryfunction symbols + and . for plus and times. At first sight wecan’t add a symbol to express 1/x, since all the namedfunctions have to be defined on the whole domain of the structure, andthere is no such real number as 1/0. But on second thoughts this isnot a serious problem; any competent mathematician puts the condition‘x is not zero’ before dividing byx,and so it never matters what the value of 1/0 is, and we canharmlessly take it to be 42. But most model theorists areuncomfortable with any kind of division by zero, so they stick withplus, times and minus.

If L is the first-order language of signature K, thenTarski’s model-theoretic truth definition tells us when a sentence of L is true inA, and when anassignment of elements ofA to variables satisfies a formulaof L inA. Instead of talking of assignments satisfying aformula, model theorists often speak of the set ofn-tuplesof elements ofA that isdefined by a formulaφ(v1,…,vn);the connection is that ann-tuple(a1,…,an) isin the defined set if and only if the assignment taking eachvi toaisatisfies the formula.

If φ is a sentence, we write

A ⊨ φ

to mean that φ is true inA, or in other words,A is a model of φ. Ifφ(v1,…,vn)is a formula with free variables as shown, we write

A ⊨ φ[a]

to mean that then-tuple a is in the set defined by φ.(The entry onclassical logic uses the notation ‘A,s ⊨ φ’, wheres is any assignment to all the variables of L that assigns toeach variablevi free in φ thei-th element in then-tuple a.)

Two L-structures that are models of exactly the same sentences of Lare said to beelementarily equivalent. Elementaryequivalence is an equivalence relation on the class of allL-structures. The set of all the sentences of L that are true in theL-structureA is called thecomplete theory ofA, in symbols Th(A). A theory that is Th(A)for some structureA is said to becomplete. (By thecompleteness theorem for first-order logic, for which see the entry onclassical logic, a theory is complete if and only if it is maximal syntacticallyconsistent.) The two structuresA andB areelementarily equivalent if and only if Th(A) =Th(B).

To continue the example of the fieldR ofreal numbers: It is often not at all obvious whether two givenstructures are or are not elementarily equivalent. One of the greatestachievements of the pre-history of model theory was Tarski’sdescription in 1930 of Th(R) (which hepublished in full only after the war; see his book in the Bibliographybelow). This description implied among other things that thestructures elementarily equivalent toR areexactly the real-closed fields, a class of fields which was alreadyknown to the algebraists in its own right.

When mathematicians introduce a class of structures, they like todefine what they count as the basic maps between these structures. Thebasic maps between structures of the same signature K are calledhomomorphisms, defined as follows. Ahomomorphism fromstructure A to structure B is a functionf fromdom(A) to dom(B) with the property that for everyatomic formulaφ(v1,…,vn)and anyn-tuple a =(a1,…,an) ofelements ofA,

A ⊨ φ [a] ⇒B ⊨ φ [b]

where b is(f(a1),…,f(an)).If we have ‘⇔’ in place of ‘⇒’ inthe quoted condition, we say thatf is anembeddingofA intoB. Since the language includes =, anembedding ofA intoB is always one-to-one, thoughit need not be onto the domain ofB. If it is onto, then theinverse map from dom(B) to dom(A) is also ahomomorphism, and both the embedding and its inverse are said to beisomorphisms. We say that two structures areisomorphic if there is an isomorphism from one to the other.Isomorphism is an equivalence relation on the class of all structuresof a fixed signature K. If two structures are isomorphic then theyshare all model-theoretic properties; in particular they areelementarily equivalent.

IfA andB are structures of signature K withdom(A) a subset of dom(B), and the interpretationsinA of the symbols in K are just the restrictions of theirinterpretations inB, then we say thatA is asubstructure ofB and converselyB is anextension ofA. If moreoverB has someelements that are not inA, we say thatA is aproper substructure ofB andB is anproper extension ofA. IfB is a structureandX is a nonempty subset of dom(B), then there isa unique smallest substructure ofB whose domain contains allofX. It is known as thesubstructure of B generated byX, and we find it by first adding toX all the elementscB wherec are individual constants of K,and then closing off under the functionsFB whereF are function symbols of K.

For example the substructure of the fieldRgenerated by the number 1 consists of 1, 0 (since it is named by theconstant 0), 1+1, 1+1+1 etc., −1, −2 etc., in other wordsthe ring of integers. (There is no need to close off undermultiplication too, since the set of integers is already closed undermultiplication.) If we had included a symbol for 1/x too, thesubstructure generated by 1 would have been the field of rationalnumbers. So the notion of substructure is sensitive to the choice ofsignature.

2. Elementary maps

Let L be a first-order language and letA andB beL-structures. Supposee is a function which takes someelements ofA to elements ofB. We say thate is anelementary map if whenever a sequence ofelementsa1, …,an in the domain ofe satisfy aformulaφ(x1,…,xn)of L inA, their images undere satisfy the sameformula inB; in symbols

A ⊨φ(a1,…,an)⇒B ⊨φ(e(a1),…,e(an)).

We say thate is anelementary embedding ofA intoB ife is an elementary map and itsdomain is the whole domain ofA. As the name implies,elementary embeddings are always embeddings.

If there is an elementary embedding fromA toB thenA andB are elementarily equivalent. On the otherhand an embedding between elementarily equivalent structures, or evenbetween isomorphic structures, need not be elementary. (For example,writingZ for the abelian group of theintegers with signature consisting of 0 and +, the embedding fromZ toZ that takeseach integern to 2n is an embedding, and of courseZ is isomorphic to itself; but thisembedding is not elementary, since 1 satisfies the formula¬∃y(y +y =v1), but 2 doesn’t.)

We say thatA is anelementary substructure ofB, andB is anelementary extension ofA, ifA is a substructure ofB and theinclusion map is an elementary embedding. It’s immediate fromthe definitions that an elementary extension of an elementaryextension ofA is again an elementary extension ofA.

Elementary embeddings are natural maps to consider within first-ordermodel theory. Around 1950 Abraham Robinson was impressed that mapsbetween algebraic structures in general seem hardly ever to beelementary, whereas some important maps (such as embeddings betweentwo algebraically closed fields, or between two real-closed fields)turn out to be elementary. He was also surprised to find that thisfact about algebraically closed fields is another way of stating acelebrated theorem called the Hilbert Nullstellensatz. Theseobservations of Robinson have had a huge effect on the development ofmodel theory. In Robinson’s terminology, a first-order theory ismodel-complete if every embedding between models of thetheory is elementary. This notion has found many uses, and it oftenappears in applications of model theory in algebra.

A notion that is closely related to model-completeness, but should notbe confused with it, is elimination of quantifiers. Suppose L is afirst-order language, T is a theory in L, and Φ is a set offormulas of L. We say that T haselimination of quantifiers downto Φ if for every formulaφ(x1,…,xn)of L there is a formulaψ(x1,…,xn)in Φ such that in every model of T, φ and ψ are satisfiedby exactly the samen-tuples of elements(a1,…,an).(The ‘method of elimination of quantifiers’ discussed insection 2.2 ofTarski’s truth definitions) was a syntactic and pre-model-theoretic method for provingelimination of quantifiers down to a particular set of formulas.) Atheory is said tohave quantifier elimination if it haselimination of quantifiers down to quantifier-free formulas.

The connection between model-completeness and elimination ofquantifiers is as follows. Robinson showed that a theory ismodel-complete if and only if it has elimination of quantifiers downto existential formulas (i.e. formulas that either are quantifier-freeor consist of one or more existential quantifiers followed by aquantifier-free formula). So theories that have quantifier eliminationare model-complete, but the converse need not hold. Still, showingthat a theory is model-complete is sometimes a useful first steptowards showing that it has quantifier elimination.

To return to elementary embeddings: They have a number of propertiesthat make them useful. We have space for four.

The downward Löwenheim-Skolem theorem:
Suppose L is a first-order language which has κ formulas,A is an L-structure and λ is a cardinal which is atleast κ but less than the cardinality ofA. Supposealso thatX is a set of at most λ elements ofA. ThenA has an elementary substructure which hascardinality exactly λ and contains all the elements inX.

There is a proof of this in the entry onclassical logic, using Skolem hulls. Note that λ must be infinite since everyfirst-order language has infinitely many formulas.

The elementary chain theorem:
Suppose that L is a first-order language andA0,A1, … is a sequence (of any length) ofL-structures such that any structure in the sequence is an elementarysubstructure of all the later structures in the sequence. Then thereis a unique smallest L-structureB which contains all thestructures in the sequence as substructures; this structureBis an elementary extension of all the structures in the sequence.

The elementary amalgamation theorem:
Suppose L is a first-order language,A is an L-structure andB,C are two elementary extensions ofA.Then there are an elementary extensionD ofB and anelementary embeddinge ofC intoD suchthat (i) for each elementa ofA,e(a) =a, and (ii) ifc is an element ofC but notofA, thene(c) is not inB.

The elementary amalgamation theorem is a consequence of thecompactness theorem in the next section.

The upward Löwenheim-Skolem theorem:
Suppose L is a first-order language which has κ formulas,A is an L-structure whose cardinality is an infinite cardinalμ, and λ is a cardinal which is at least as great as bothκ and μ. ThenA has an elementary extension whosecardinality is λ.

This also follows from the compactness theorem, as shown in the entryonclassical logic. The name of the theorem is a little unfortunate, since the theoremwas first proved by Tarski, and Skolem didn’t even believe it(because he didn’t believe in uncountable cardinals).

There is another proof using the elementary amalgamation theorem andthe elementary chain theorem. One can show that the structureA has a proper elementary extensionA′. (Thereis a proof of this using the compactness theorem and the diagram lemma— see 3.1 and 3.2 below; another proof is by ultrapowers —see 4.1 below.) Now useA′ and againA′for the structuresB andC in the elementaryamalgamation theorem. ThenD as in the theorem is anelementary extension ofA, and by (ii) in the theorem, itmust contain elements that are not inA′, so that it isa proper elementary extension. Repeat to get a proper elementaryextension ofD, and so on until you have an infiniteelementary chain. Use the elementary chain theorem to find anelementary extension ofA that sits on top of this chain.Keep repeating these moves until you have an elementary extension ofA that has cardinality at least λ. Then if necessaryuse the downward Löwenheim-Skolem theorem to pull the cardinalitydown to exactly λ. This kind of argument is very common infirst-order model theory. By careful choice of the amalgams at thesteps in the construction, we can often ensure that the top structurehas further properties that we might want (such as saturation, see 4.2below).

3. Five grand theorems

The five theorems reported in this section are in some sense thepillars of classical model theory. All of them are theorems aboutfirst-order model theory. A great deal of the work done in the thirdquarter of the twentieth century was devoted to working out theconsequences of these theorems within first-order model theory, andthe extent to which similar theorems hold for languages that are notfirst-order.

3.1 The compactness theorem

If T is a first-order theory, and every finite subset of T has amodel, then T has a model.

There is a proof of this theorem in the entry onclassical logic. The theorem has several useful paraphrases. For example it isequivalent to the following statement:

Suppose T is a first-order theory and φ is a first-order sentence.If T ⊨ φ then there is a finite subset U of T such that U⊨ φ.

(See the entry onmodel theory for the notion ⊨ of model-theoretic consequence. To derive thesecond statement from the first, note that ‘T ⊨φ’ is true if and only if there is no model of the theory T∪ {¬ φ}.)

Anatolii Mal’tsev first gave the compactness theorem in 1938(for first-order logic of any signature), and used it in 1940/1 toprove several theorems about groups; this seems to have been the firstapplication of model theory within classical mathematics. Leon Henkinand Abraham Robinson independently rediscovered the theorem a fewyears later and gave some further applications. The theorem failsbadly for nearly allinfinitary languages.

3.2 The diagram lemma

IfA is an L-structure, then we form thediagram ofA as follows. First add to L a supply of new individualconstants to serve as names for all the elements ofA. (Thisillustrates how in first-order model theory we easily find ourselvesusing uncountable signatures. The ‘symbols’ in thesesignatures are abstract set-theoretic objects, not marks on a page.)Then using L and these new constants, thediagram ofA is the set of all the atomic sentences and negations ofatomic sentences that are true inA.

IfB′ is a model of the diagram ofA, andB isB′ with the new constants removed fromthe signature, then there is an embedding ofA intoB.

Namely, if an element ofA is named by a new constantc, then map that element to the element ofB′namedc. A variant of this lemma is used in the proof of theelementary amalgamation theorem.

3.3 The Lyndon interpolation theorem

This theorem may have the longest pedigree of any theorem of modeltheory, since it generalises the Laws of Distribution for syllogisms,which go back at least to the early Renaissance. The theorem iseasiest to state if we assume that our first-order languages havesymbols ∧, ∨ and ¬, but not → or ⇔. Then anoccurrence of a predicate symbol R in a formula φ is said to bepositive (resp.negative) if it lies within thescope of an even (resp. odd) number of occurrences of ¬.

Suppose L and M are first-order languages, L ∪ M is the smallestfirst-order language containing both L and M, and L ∩ M is thelanguage consisting of all the formulas which are in both L and M.Suppose T is a theory in L, U is a theory in M, and no (L ∪M)-structure is both a model of T and a model of U. Then there is asentence φ of L ∩ M which is true in all models of T and falsein all models of U. (This sentence φ is called theinterpolant.) Moreover every predicate symbol with a positiveoccurrence in φ has a positive occurrence in some sentence of Tand a negative occurrence in some sentence of U, and conversely everypredicate symbol with a negative occurrence in φ has a negativeoccurrence in some sentence of T and a positive occurrence in somesentence of U.

There are several proofs of this theorem, and not all of them aremodel-theoretic. Without the last sentence, the theorem is known asCraig’s interpolation theorem, since William Craig proved thisversion a few years before Roger Lyndon found the full statement in1959. As Craig noted at the time, his interpolation theorem gives aneat proof of Evert Beth’s definability theorem, which runs asfollows.

Suppose that L is a first-order language and M is the first-orderlanguage got by adding to L a new predicate symbol R. Suppose alsothat T is a theory in M. We say that Timplicitly defines Rif it is false that there are two M-structures which are models of T,have the same elements and interpret all the symbols of L in the sameway but interpret the symbol R differently. We say that Tdefines Rexplicitly if there is a formulaφ(x1,…,xn)of L such that in every model of T, the formulas φ andR(x1,…,xn) aresatisfied by exactly the samen-tuples(a1,…,an) ofelements. It is easy to see that if T defines R explicitly then itdefines R implicitly. (This fact is known asPadoa’smethod; Padoa used the failure of implicit definability as a wayof proving the failure of explicit definability.) Beth’s theoremis the converse:

Suppose that L, M, R and T are as above. If T defines R implicitlythen T defines R explicitly.

3.4 The omitting types theorem

This theorem needs some preliminary definitions. Suppose Lis a first-order language, T is a complete theory in L, and Φ is aset of formulas of L which all have the free variablex (andno other free variable). We say that an L-structureArealises Φ if there is an element ofA thatsatisfies all the formulas in Φ; ifA has no suchelement, we say thatAomits Φ. A formulaψ(x) of L, with just the free variablex, iscalled asupport of Φ in T if the consequences of Tinclude the sentence ∃xψ(x) and thesentence ∀x(ψ(x) →φ(x)) whenever φ(x) is a formula in Φ.It is not hard to see that if there is a support of Φ in T thenevery model of T realises Φ. The converse is not always true, butthe omitting types theorem tells us that it is true if we restrictourselves to countable first-order languages:
Suppose L is a first-order language which has countably many formulas.Suppose T is a complete theory in L, and Φ is a set of formulas ofL which all have the free variablex. Finally suppose thatevery model of T with just countably many elements realises Φ.Then there is a support of Φ in T. (In other words, there is afinite reason why the ‘type’ Φ can’t be omittedin any model of T.)

The omitting types theorem goes back to the mid 1950s. It verydefinitely depends on the language being first-order and countable. Ithas several useful generalisations, for examplemodel-theoreticforcing, which is an analogue of the forcing construction inset theory. In fact the games used for model-theoretic forcing (see the entry onlogic and games) can be adapted to prove the omitting types theorem too. There aresimilar but more complicated theorems for uncountable first-orderlanguages; some of these can be paraphrased as omitting types theoremsforinfinitary languages.

3.5 The initial model theorem

A quantifier-free formula is said to be aHorn formula (afterAlfred Horn) if it has one of the three forms

  • ψ,
  • φ1 ∧ … ∧ φn→ ψ,
  • ¬(φ1 ∧ … ∧φn),

where the formulas φ1, …,φn, ψ are all atomic. Auniversal Hornsentence (also known to the computer scientists as aHornclause) is a sentence that consists of universal quantifiersfollowed by a quantifier-free Horn formula; it is said to bestrict if no negation sign occurs in it (i.e. if itdoesn’t come from a quantifier-free Horn formula of the thirdkind).

Let T be a theory consisting of strict universal Horn sentences. ThenT has a modelA with the property that for every modelB of T there is a unique homomorphism fromA toB. (Such a modelA is called aninitialmodel of T. It is unique up to isomorphism.)

This theorem is a generalisation, due to Mal’tsev, of agroup-theoretic construction calledconstruction by generators andrelations. It is the main idea behindalgebraicspecification, which is one approach to the specification ofsystems in computer science. The required behaviour of the system iswritten down as a set of strict universal Horn sentences, and then theinitial model of these sentences is an abstract version of therequired system.

4. Three useful constructions

A construction is a procedure for building a structure. We havealready seen several constructions in the theorems above: for examplethe omitting types construction and the initial model construction.Here are three more.

4.1 Products and reduced products

IfA andB are L-structures, we form theirproductC =A ×B asfollows. The elements ofC are the ordered pairs(a,b) wherea is an element ofAandb is an element ofB. The predicate symbols areinterpreted ‘pointwise’, i.e. so that for example

(a,b) is in PC if and only ifais in PA andb is inPB.

The structuresA andB are called thefactors ofA ×B. In the same way wecan form products of any number of structures. If all the factors of aproduct are the same structureA, the product is called apower ofA. A theorem called theFeferman-Vaughttheorem tells us how to work out the complete theory of theproduct from the complete theories of its factors.

This construction has some variants. We can define an equivalencerelation on the domain of a productC, and then take astructureD whose elements are the equivalence classes; thepredicate symbols are interpreted inD so as to make thenatural map from dom(C) to dom(D) a homomorphism. Inthis case the structureD is called areducedproduct of the factors ofC. It is areducedpower ofA if all the factors are equal toA;in this case thediagonal map fromA toDis the one got by taking each elementa to the equivalenceclass of the element (a,a,…).

Suppose we use a setI to index the factors in a productC. Anultrafilter overI is a setU of subsets ofI with the properties

  • if setsX andY are inU then so istheir intersectionX ∩ Y;
  • ifX is inU andX ⊆ Y ⊆ IthenY is inU;
  • for each subsetX ofI, exactly one ofX and its complementI \ X is inU.

If we have an ultrafilterU overI, then we canconstruct a reduced product fromC by making two elements ofC equivalent if and only if the set of indices at which theyare equal is a set in the ultrafilterU. This is indeed anequivalence relation on the domain ofC, and the resultingreduced product is called anultraproduct of the factors ofC. IfC is a power ofA then thisultraproduct is called anultrapower ofA, and it issometimes writtenU-prodA. A theorem calledŁoś’s theorem describes what sentences aretrue in an ultraproduct. Its most useful consequence is thefollowing:

IfU is an ultrafilter then the diagonal map fromAtoU-prodA is an elementary embedding.

If the ultrafilterU isnonprincipal, i.e. containsno finite sets, then the diagonal map is not onto the domain ofU-prodA, and in factU-prodA isgenerally much larger thanA. So we have a way ofconstructing large elementary extensions. The axiom of choiceguarantees that every infinite set has many nonprincipal ultrafiltersover it. Ultrapowers are an essential tool for handling largecardinals in set theory (see the entry onset theory).

A remarkable (but in practice not terribly useful) theorem of SaharonShelah tells us that a pair of structuresA andBare elementarily equivalent if and only if they have ultrapowers thatare isomorphic to each other.

4.2 Saturation

SupposeA is an L-structure,X is a set of elementsofA,B is an elementary extension ofA andb,c are two elements ofB. Thenbandc are said to havethe same type over X if forevery formulaφ(v1,…,vn+1)of L and everyn-tuple d of elements ofX,

B ⊨ φ[b,d] ⇔B ⊨φ[c,d].

We say thatA issaturated if wheneverX isa set of elements ofA, of cardinality less than that ofA, andB is any elementary extension ofA,we always have that every element ofB has the same type overX as some element already inA.

This rather heavy definition gives little clue how useful saturatedstructures are. If every structure had a saturated elementaryextension, many of the results of model theory would be much easier toprove. Unfortunately the existence of saturated elementary extensionsdepends on features of the surrounding universe of sets. There aretechnical ways around this obstacle, for example using weakenings ofthe notion of saturation. We have two main ways of constructingelementary extensions with some degree of saturation. One is byultrapowers, using cleverly constructed ultrafilters. The other is bytaking unions of elementary chains, generalising the proof we gave forthe upward Löwenheim-Skolem theorem.

The existence of partially saturated elementary extensions of thefieldR of real numbers is the maintechnical fact behind Abraham Robinson’snonstandardanalysis. See Section 4 of the entry onmodel theory for more information on this. Though model theory provided the firststeps in nonstandard analysis, this branch of analysis rapidly becamea subject in its own right, and its links with first-order modeltheory today are rather slim.

4.3 Ehrenfeucht-Mostowski models

LetA be an L-structure,X a set of elements ofA and < a linear ordering ofX (not necessarilydefinable by a first-order formula). We say that (X,<) isanindiscernible sequence inA if for every naturalnumbern, and all elementsa1,…,an,b1,…,bn ofA suchthata1 < … <an andb1 <… <bn, the map taking eachai to the correspondingbi is an elementary map. If T is a theorywith infinite models, then T has models that are the Skolem hulls (seethe entry onclassical logic) of indiscernible sequences. These models are known asEhrenfeucht-Mostowski models, after the two Polish modeltheorists who first carried out this construction in the mid 1950s.These models tend to be the opposite of saturated; we can arrange thatvery few types over sets of elements are represented among theirelements. Some important distinctions between different models of settheory can be expressed in terms of the indiscernible sequences withinthese models; see the entry onset theory.

5. Three successful programmes

Every healthy branch of mathematics needs a set of problems that forma serious challenge for its researchers. We close with a briefintroduction to some of the research programmes that drove first-ordermodel theory forwards in the second half of the twentieth century. Thebook of Marcja and Toffalori in the bibliography gives furtherinformation about these programmes. There are other current programmesbesides these; see for example the handbook edited by Yuri Ershov,which is about model theory when the structures are builtrecursively.

5.1. Categoricity and classification

In 1904 Oswald Veblen described a theory ascategorical if ithas just one model up to isomorphism, i.e. it has a model and all itsmodels are isomorphic to each other. (The name was suggested to him byJohn Dewey, who also suggested the name ‘disjunctive’ forother theories. This pair of terms come from traditional logic asnames of types of sentence.) The depressing news is that there are nocategorical first-order theories with infinite models; we can see thisat once from the upward Löwenheim-Skolem theorem. In fact if T isa first-order theory with infinite models, then the strongest kind ofcategoricity we can hope for in T is that for certain infinitecardinals κ, T has exactly one model of cardinality κ, upto isomorphism. This property of T is calledκ-categoricity.

Now there is a heuristic principle that many people have used, thoughit seems to have no simple formulation. We suggest ‘Few isbeautiful’. The principle says that if a first-order theory Tconstrains its models (of a particular cardinality) to be all similarto each other, this can only be because the models of T have fewirregularities and asymmetries. So there should be a good structuraldescription of these models. One should expect that they are‘good structures’ from the point of view of classicalmathematics. As a first step, one easily sees from the upward anddownward Löwenheim-Skolem theorems that if T isκ-categorical for some κ at least as large as the numberof formulas in the language of T, then T must be a complete theory.From now on, T is a complete theory with infinite models; and forsimplicity we will assume that the language of T is countable.

In 1954 Jerzy Łoś announced that he could only find threekinds of example of theories T that are κ-categorical.Namely:

  • T istotally categorical if it is κ-categorical forevery infinite cardinal κ. A typical example is the completetheory of an infinite-dimensional vector space over a finitefield.
  • T isuncountably categorical (but not totallycategorical) if it is κ-categorical precisely when κ isuncountable. Essentially the only example that Łoś couldfind was the complete theory of an algebraically closed field; this isuncountably categorical by a well-known theorem of Steinitz.
  • T iscountably categorical (but not uncountablycategorical) if it is κ-categorical precisely when κ iscountable. A typical example is the complete theory of a dense linearordering with no first or last element; this is countably categoricalby a well-known theorem of Cantor.

Łoś asked whether there are any other possibilities besidesthese three. (Of course most complete theories are notκ-categorical for any κ.)

This question of Łoś was a tremendous stimulus to research,and it led to a classic paper of Michael Morley in 1965 which showedthat Łoś’s three possibilities are in fact the onlyones. One central idea of Morley’s analysis was that models ofan uncountably categorical theory have the smallest possible number oftypes of element; this led directly to the branch of model theorycalledstability theory, which studies theories that have alimited number of element types. These theories have the remarkableproperty that every infinite indiscernible sequence in any of theirmodels is indiscernible under any linear ordering whatever; so thesesequences are a kind of generalisation of bases of vector spaces.Another idea implicit in Morley’s work, but much clarified bylater work of William Marsh, John Baldwin and Alistair Lachlan, wasthat in any model of an uncountably categorical theory there is acentral core (called astrongly minimal set) which carries adependence relation obeying similar laws to linear dependence invector spaces. In terms of this dependence relation one can define adimension for the model, and what remains of the model outside thecore is so closely tied to the core that the dimension determines themodel up to isomorphism.

Saharon Shelah developed Morley’s ideas with greatresourcefulness and energy. His main aim was to stretch the ‘Fewis beautiful’ idea by showing that there are clear dividinglines between kinds of theory T. On one side of a dividing line aretheories with some good structural property that forces the number ofnonisomorphic models of a given cardinality to be small. On the otherside, every theory has (for example) two models of the samecardinality that are not isomorphic but are extremely hard to tellapart. Shelah coined the nameclassification theory for thisresearch. The text of Lascar listed below is an elegant introductionto this whole programme, from Łoś to Shelah. MeanwhileShelah himself has extended it far beyond first-order logic. Even inthe first-order case, Shelah had to invent new set-theoretictechniques (such as proper forcing) to carry out hisconstructions.

Classification theory has two related though fundamentally distinctaims: to classify the models of a theory (or to show that such aclassification is impossible) by some relatively simple combinatorialinvariants and to classify theories themselves by the presence orabsence of some fundamental structures within their models. With thesecond edition of Classification Theory, Shelah dropped the subtitle“And the number of non-isomorphic models” in order toemphasize the broader goals of the project. In general, a class oftheories may be recognized as classification theoretically robust ifit admits characterizations both in terms of cardinal invariants andwith respect to some absolute condition on formulae relative to thetheory. For example, by definition, a theory T in a language L isstable if there is some cardinal κ ≥ |L| so that for everymodel M of T of cardinality at most κ, the number of 1-typesover M is at most κ. Equivalently, a theory T is stable if noformula has theorder property relative to T. That is, thereis no L-formula φ(x;y) (wherex andy may be tuples of variables) so that for each natural numbern it is consistent with T that there bea1,…,an;b1,…,bn so thatφ(ai,bj)holds just in caseij.

5.2. Geometric model theory

Geometric model theory grew out of Michael Morley’s 1965 paper,but in a different direction from Shelah’s work (though today itmakes regular use of technical tools developed by Shelah in hisclassification programme). Morley had shown that models of anuncountably categorical theory have structural properties that areinteresting in their own right, regardless of the complete theory ofthe structure; so it became the custom to talk ofuncountablycategorical structures, meaning models of uncountably categoricaltheories. (And likewisetotally categorical structures.)Independently Boris Zilber in Siberia and Greg Cherlin in the UnitedStates noticed that any infinite group that is definable in anuncountably categorical structure must have many features in commonwith the algebraic groups studied by algebraic geometers. Zilber inparticular showed that many methods from algebraic geometry generaliseto the model-theoretic case. His secret weapon was Bezout’sTheorem from geometry, which he used to guide him to solutions of verydifficult model theoretic problems; for example his theorem that nototally categorical theory can be axiomatised by a finite number ofaxioms. (It was secret in the sense that it guided his intuition butnever appeared explicitly in his results.) Zilber also noticed animportant difference between the first and second ofŁoś’s examples above. Namely, in a vector space thesubspaces (i.e. the subsets closed under linear dependence) form amodular lattice; but the algebraically closed subsets of analgebraically closed field form a lattice that is not modular.

Partly because of the difficulty of communications between Siberia andthe West, these results of Zilber took some time to digest, and inpart they had to be rediscovered in the West. But when the message didfinally get through, the result was a new branch of model theory whichhas come to be known asgeometric model theory. The programmeis broadly to classify structures according to (a) what groups orfields are interpretable in them (in the sense sketched in the entryonmodel theory) and (b) whether or not the structures have ‘modulargeometries’; and then to use this classification to solveproblems in model theory and geometry. From the mid 1980s the leaderof this research was Ehud Hrushovski. In the early 1990s, using jointwork with Zilber, Hrushovski gave a model-theoretic proof (the firstcomplete proof to be found) of the geometric Mordell-Lang conjecturein all characteristics; this was a conjecture in classical diophantinegeometry. Bouscaren (ed.) 1998 is devoted to Hrushovski’s proofand the necessary background in model theory. Both (a) and (b) arefundamental to Hrushovski’s argument.

5.3. O-minimality

Of the three programmes described here, this is the oldest, since itgrew out of Tarski’s description of the complete theory of thefield of real numbers (which he proved by the method of quantifierelimination). In the course of giving this description, Tarski hadshown that every first-order formula φ(x) in the relevantlanguage, possibly with parameters, is satisfied by exactly the sameassignments as some boolean combination of formulas of the formx< s ort < x wheres,t areconstant terms naming parameters. Another way of saying this isthat

Every set of elements definable by a first-order formula is a finiteunion of open intervals with named endpoints, together with somefinite set of elements.

A linearly ordered structure with this property is said to beo-minimal. (The idea of the name is that o-minimality is ananalogue of ‘strong minimality’, in a form that makessense for structures that carry a linear ordering, whence‘o-’ for ordering.)

In 1982 Lou van den Dries showed that the fact that the field of realnumbers is o-minimal gives a large amount of useful information aboutthe definable sets of higher dimension, such as the family ofdefinable subsets of the real plane. Soon after this, Julia Knight,Anand Pillay and Charles Steinhorn noticed that if a structureA is o-minimal, then so is any structure elementarilyequivalent toA, and that van den Dries’ analysis ofhigher-dimensional definable set applies to all these structures.These results led to much activity on the frontier between modeltheory and function theory. Several old problems from model theory andfunction theory were solved. Alex Wilkie showed that the field of realnumbers with a symbol for exponentiation is o-minimal and has amodel-complete complete theory, and thereby gave a positive answer toTarski’s old problem of whether this structure allows aquantifier elimination, though his method was very far from thesyntactic analysis that Tarski had in mind. (This is one case where weneed to remember the difference between being model-complete andhaving quantifier elimination; see section 2 above. The questionwhether this particular theory has quantifier elimination is muchharder and is closely related to a deep conjecture of number theorycalled Schanuel’s conjecture; see the paper of Macintyre andWilkie.) We now know a wide range of ways of adding interestingfeatures to the field of real numbers in such a way that the resultingstructure is still o-minimal (and hence in some sense mathematicallytractable). Van den Dries has urged that o-minimal structures providea good setting for developing the ‘tame topology’programme of Alexander Grothendieck.

In 2006, Jonathan Pila and Alex Wilkie showed that provided oneremoves the subsets defined using only polynomial inequalities,subsets ofRn definablein an o-minimal expansion of the real field have few rational points.Subsequently, following a strategy first employed by Pila and UmbertoZannier to reprove the Manin-Mumford conjecture various authors haveused this o-minimal counting theorem to solve some important openproblems in diophantine geometry.

Kobi Peterzil and Sergei Starchenko have developed a theory ofo-minimalcomplex analysis. Just as with the classicalapproach to complex analysis, one may interpret the complex numbers asthe set of ordered pairs of real numbers with addition andmultiplication defined by the usual rules involving their real andimaginary parts. Of their results in this area, their algebraicitytheorem, which asserts that if a subset ofCn is complex analytic(meaning that it is closed and is locally defined by the vanishing offinitely many complex analytic functions) and definable in someo-minimal expansion of the real field, then it must be algebraic, thatis defined by the vanishing of polynomial equations, is the moststriking result and has been strong consequences in the study offunctional transcendence and homogeneous dynamics.

All three of these programmes generated new techniques for proof,constructions and classifications. As we should expect, researchershave explored the range of application of each technique. One resultof this has been the emergence of several useful classes offirst-order theories which relate to more than one of the threeprogrammes. For example a central tool of Shelah’sclassification theory was his notion offorking, afar-reaching generalisation of earlier algebraic notions of dependencerelation. The class ofsimple theories is defined by the factthat forking has certain nice properties while the class ofrosytheories is characterised by the existence of a good notion ofindependence coming from a further generalisation of forking calledþ-forking; several natural examples of simple theoriescame to light in geometric model theory, and the complete theories ofo-minimal structures are examples of rosy theories. Other classes oftheories, denoted by increasing obscure abbreviations, such at theclass of NSOP1(“not the stronger order property,one”) theories, have been isolated for which some of the seeminglycharacteristic features of stable theories persist in a modifiedform. For example, the afore mentionedNSOP1 are defined by the absence of a certain treeconfiguration but are also characterized by the presence of a goodtheory of independence, which goes under the name of“Kim-independence”. In parallel with these technicaladvances, first-order model theory continues to grow more closelyinvolved with problems in number theory, functional analysis and otherbranches of pure and, and even applied, mathematics.

In Shelah’s programme of classification theory, a central rôleis played bydividing lines. That is, the class of alltheories should be divided into those having some property and thosewhich do not and this division into these two classes should beauthentic in the sense that the classes should admit variousdefinitions of different characters. For example, the class of stabletheories may be described in many facially dissimilar ways as, forexample, those theories with respect to which every type is definable,those with respect to which no formula has the order property, orthose for which there is some infinite cardinalκ atleast as large as that of the language for which over every model ofsizeκ there are no more thanκ many1-types. Many of these classification theoretic classes aredefined by forbidding the existence of certain combinatorialconfigurations. For example, the class ofdependent orNIP (for “Not the Independence Property”) theories is definedby saying that forno formulaφ(x,y) is it possible to find a modelMand sequences(ai)i=0 and(bS) where theb sequence is indexed bythe subsets of the natural numbers so thatM ⊧φ(ai,bS) if and only ifi∈ S. Around the same time that Shelah isolated theindependence property, Vladimir Vapnik and Alexey Chervonenkisdiscovered the same notion in machine learning theory. Consequences ofthe model theoretic analysis of NIP have since been drawn in theory ofneural networks, extremal combinatorics, the theory of differentialprivacy, and machine learning theory more broadly.

Bibliography

  • Beth, E., 1953, “On Padoa’s method in the theory ofdefinition”,Nederlandse Akademie van Wetenschappen,Proceedings (Series A), 56: 330–339.
  • Bouscaren, E. (ed.), 1998,Model Theory and AlgebraicGeometry: An introduction to E. Hrushovski’s proof of thegeometric Mordell-Lang conjecture (Lecture Notes in Mathematics:Volume 1696), Berlin: Springer-Verlag.
  • Buechler, S., 1996,Essential Stability Theory, Berlin:Springer-Verlag.
  • Chang, C. and Keisler, J., 1990,Model Theory, Amsterdam:North-Holland.
  • Chatzidakis, Z.et al. (eds.), 2008,Model Theorywith Applications to Algebra and Analysis, Volumes 1 and 2,Cambridge: Cambridge University Press.
  • Dries, L. van den, 1998,Tame Topology and O-minimalStructures, Cambridge: Cambridge University Press.
  • Ealy, C. and Onshuus, A., 2007, “Characterizing rosytheories”,Journal of Symbolic Logic, 72:919–940.
  • Ehrig, H. and Mahr, B., 1985,Fundamentals of AlgebraicSpecification I: Equations and Initial Semantics, Berlin:Springer-Verlag.
  • Ershov, Y. (ed.), 1998,Handbook of Recursive Mathematics I,Recursive Model Theory, New York: Elsevier.
  • Hart, B., Lachlan, A. and Valeriote, M., 1996,Algebraic ModelTheory, Dordrecht: Kluwer.
  • Haskell, D., Pillay, A. and Steinhorn, C., 2000,Model Theory,Algebra, and Geometry, Cambridge: Cambridge UniversityPress.
  • Hodges, W., 1993,Model Theory, Cambridge: CambridgeUniversity Press.
  • –––, 1998, “The laws of distribution forsyllogisms”,Notre Dame Journal of Formal Logic, 39:221–230.
  • Karpinski, M. and A. Macintyre, 1997, Polynomial bounds for VCdimension of sigmoidal and general Pfaffian neural networks, 1stAnnual Dagstuhl Seminar on Neural Computing (1994),Journal ofComputer and System Sciences, 54(1[2]): 169–176.
  • Lascar, D., 1986,Stability in Model Theory, Harlow:Longman.
  • Macintyre, A. and Wilkie, A., 1996, “On the decidability ofthe real exponential field”, inKreiseliana: About andaround Georg Kreisel, P. Odifreddi (ed.), Wellesley MA : A. K.Peters, 441–467.
  • Marcja, A. and Toffalori, C., 2003,A Guide to Classical andModern Model Theory, Dordrecht: Kluwer.
  • Marker, D., 2002,Model Theory: An Introduction, NewYork: Springer-Verlag.
  • Morley, M., 1965, “Categoricity in power”,Transactions of the American Mathematical Society, 114:514–538.
  • Peterzil, Y. and S. Starchenko, Sergei, 2009, Complex analyticgeometry and analytic-geometric categories,Journal für diereine und angewandte Mathematik, 626: 39–74.
  • Pillay, A., 1996,Geometric Stability Theory, Oxford:Oxford University Press.
  • Pila, J., 2011, “O-minimality and the André-Oortconjecture forCn”,Annals of Mathematics (2), 173(3): 1779–1840.
  • Pila, J. and Wilkie, A., 2006, “The rational points of adefinable set”,Duke Mathematics Journal, 133(3):591–616.
  • Pila, J. and Zannier, U., 2008, “Rational points in periodicanalytic sets and the Manin-Mumford conjecture”,RendicontiLincei Matematica E Applicazioni, 19(2): 149–162.
  • Poizat, B., 2000,A Course in Model Theory, New York:Springer.
  • Shelah, S., 1990,Classification Theory, Amsterdam:North-Holland.
  • Tarski, A., 1951,A Decision Method for Elementary Algebra andGeometry, Berkeley: University of California Press.
  • Vaught, R., 1974, “Model theory before 1945”, inProceedings of the Tarski Symposium, L. Henkin,etal. (eds.), Providence RI : American Mathematical Society,153–172.
  • Veblen, O., 1904, “A System of Axioms for Geometry”,Transactions of the American Mathematical Society, 5(3):343–384
  • Wagner, F., 2000,Simple Theories, Dordrecht: KluwerAcademic Publishers.
  • Wilkie, A., 1996, “Model completeness results for expansionsof the real field by restricted Pfaffian functions and the exponentialfunction”,Journal of the American MathematicalSociety, 9: 1051–1094.

Other Internet Resources

Copyright © 2024 by
Wilfrid Hodges
Thomas Scanlon<scanlon@math.berkeley.edu>

Open access to the SEP is made possible by a world-wide funding initiative.
The Encyclopedia Now Needs Your Support
Please Read How You Can Help Keep the Encyclopedia Free

Browse

About

Support SEP

Mirror Sites

View this site from another server:

USA (Main Site)Philosophy, Stanford University

The Stanford Encyclopedia of Philosophy iscopyright © 2024 byThe Metaphysics Research Lab, Department of Philosophy, Stanford University

Library of Congress Catalog Data: ISSN 1095-5054


[8]ページ先頭

©2009-2025 Movatter.jp