Model theory began with the study of formal languages and their interpretations, and of the kinds of classification that a particular formal language can make. Mainstream model theory is now a sophisticated branch of mathematics (see the entry on first-order model theory). But in a broader sense, model theory is the study of the interpretation of any language, formal or natural, by means of set-theoretic structures, with Alfred Tarski’s truth definition as a paradigm. In this broader sense, model theory meets philosophy at several points, for example in the theory of logical consequence and in the semantics of natural languages.
Sometimes we write or speak a sentence \(S\) that expresses nothing either true or false, because some crucial information is missing about what the words mean. If we go on to add this information, so that \(S\) comes to express a true or false statement, we are said to interpret \(S\), and the added information is called an interpretation of \(S\). If the interpretation \(I\) happens to make \(S\) state something true, we say that \(I\) is a model of \(S\), or that \(I\) satisfies \(S\), in symbols ‘\(I \vDash S\)’. Another way of saying that \(I\) is a model of \(S\) is to say that \(S\) is true in \(I\), and so we have the notion of model-theoretic truth, which is truth in a particular interpretation. But one should remember that the statement ‘\(S\) is true in \(I\)’ is just a paraphrase of ‘\(S\), when interpreted as in \(I\), is true’; so model-theoretic truth is parasitic on plain ordinary truth, and we can always paraphrase it away.
For example I might say
He is killing all of them,
and offer the interpretation that ‘he’ is Alfonso Arblaster of 35 The Crescent, Beetleford, and that ‘them’ are the pigeons in his loft. This interpretation explains (a) what objects some expressions refer to, and (b) what classes some quantifiers range over. (In this example there is one quantifier: ‘all of them’.) Interpretations that consist of items (a) and (b) appear very often in model theory, and they are known as structures. Particular kinds of model theory use particular kinds of structure; for example mathematical model theory tends to use so-called first-order structures, model theory of modal logics uses Kripke structures, and so on.
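As a rough illustration of items (a) and (b), the following Python sketch treats an interpretation as a dictionary of labelled items and checks whether it is a model of the quoted sentence. The pigeon names, the `KILLINGS` facts and the function `is_model` are assumptions made up for this example; the facts about who kills what belong to the world, not to the interpretation.

```python
# Hypothetical facts about the world: who is killing which pigeon.
# These facts are NOT part of the interpretation.
KILLINGS = {("Alfonso", "pigeon1"), ("Alfonso", "pigeon2")}

def is_model(interpretation, killings):
    """True if 'He is killing all of them' is true under the interpretation."""
    he = interpretation["he"]      # (a) the object that 'he' refers to
    them = interpretation["them"]  # (b) the class that 'all of them' ranges over
    return all((he, x) in killings for x in them)

# Two interpretations of the same sentence, with labels 'he' and 'them'.
I = {"he": "Alfonso", "them": {"pigeon1", "pigeon2"}}
J = {"he": "Alfonso", "them": {"pigeon1", "pigeon2", "pigeon3"}}

print(is_model(I, KILLINGS))  # I makes the sentence true
print(is_model(J, KILLINGS))  # J does not: pigeon3 is not being killed
```

The dictionary keys play the role of the labels that attach each object or class to the right expression in the sentence.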
The structure \(I\) in the previous paragraph involves one fixed object and one fixed class. Since we described the structure today, the class is the class of pigeons in Alfonso’s loft today, not those that will come tomorrow to replace them. If Alfonso Arblaster kills all the pigeons in his loft today, then \(I\) satisfies the quoted sentence today but won’t satisfy it tomorrow, because Alfonso can’t kill the same pigeons twice over. Depending on what you want to use model theory for, you may be happy to evaluate sentences today (the default time), or you may want to record how they are satisfied at one time and not at another. In the latter case you can relativise the notion of model and write ‘\(I \vDash_t S\)’ to mean that \(I\) is a model of \(S\) at time \(t\). The same applies to places, or to anything else that might be picked up by other implicit indexical features in the sentence. For example if you believe in possible worlds, you can index \(\vDash\) by the possible world where the sentence is to be evaluated. Apart from using set theory, model theory is completely agnostic about what kinds of thing exist.
Note that the objects and classes in a structure carry labels that steer them to the right expressions in the sentence. These labels are an essential part of the structure.
If the same class is used to interpret all quantifiers, the class is called the domain or universe of the structure. But sometimes there are quantifiers ranging over different classes. For example if I say
One of those thingummy diseases is killing all the birds.
you will look for an interpretation that assigns a class of diseases to ‘those thingummy diseases’ and a class of birds to ‘the birds’. Interpretations that give two or more classes for different quantifiers to range over are said to be many-sorted, and the classes are sometimes called the sorts.
The ideas above can still be useful if we start with a sentence \(S\) that does say something either true or false without needing further interpretation. (Model theorists say that such a sentence is fully interpreted.) For example we can consider misinterpretations \(I\) of a fully interpreted sentence \(S\). A misinterpretation of \(S\) that makes it true is known as a nonstandard or unintended model of \(S\). The branch of mathematics called nonstandard analysis is based on nonstandard models of mathematical statements about the real or complex number systems; see Section 4 below.
One also talks of model-theoretic semantics of natural languages, which is a way of describing the meanings of natural language sentences, not a way of giving them meanings. The connection between this semantics and model theory is a little indirect. It lies in Tarski’s truth definition of 1933. See the entry on Tarski’s truth definitions for more details.
A sentence \(S\) divides all its possible interpretations into two classes, those that are models of it and those that are not. In this way it defines a class, namely the class of all its models, written \(\Mod(S)\). To take a legal example, the sentence
The first person has transferred the property to the second person, who thereby holds the property for the benefit of the third person.
defines a class of structures which take the form of labelled 4-tuples, as for example (writing the label on the left):
This is a typical model-theoretic definition, defining a class of structures (in this case, the class known to the lawyers as trusts).
We can extend the idea of model-theoretic definition from a single sentence \(S\) to a set \(T\) of sentences; \(\Mod(T)\) is the class of all interpretations that are simultaneously models of all the sentences in \(T\). When a set \(T\) of sentences is used to define a class in this way, mathematicians say that \(T\) is a theory or a set of axioms, and that \(T\) axiomatises the class \(\Mod(T)\).
Take for example the following set of first-order sentences:
\[\begin{align*}& \forall x\forall y\forall z (x + (y + z) = (x + y) + z). \\ & \forall x (x + 0 = x). \\ & \forall x (x + (-x) = 0). \\ & \forall x\forall y (x + y = y + x). \end{align*}\]
Here the labels are the addition symbol ‘+’, the minus symbol ‘\(-\)’ and the constant symbol ‘0’. An interpretation also needs to specify a domain for the quantifiers. With one proviso, the models of this set of sentences are precisely the structures that mathematicians know as abelian groups. The proviso is that in an abelian group \(A\), the domain should contain the interpretation of the symbol 0, and it should be closed under the interpretations of the symbols + and \(-\). In mathematical model theory one builds this condition (or the corresponding conditions for other function and constant symbols) into the definition of a structure.
Each mathematical structure is tied to a particular first-order language. A structure contains interpretations of certain predicate, function and constant symbols; each predicate or function symbol has a fixed arity. The collection \(K\) of these symbols is called the signature of the structure. Symbols in the signature are often called nonlogical constants, and an older name for them is primitives. The first-order language of signature \(K\) is the first-order language built up using the symbols in \(K\), together with the equality sign =, to build up its atomic formulas. (See the entry on classical logic.) If \(K\) is a signature, \(S\) is a sentence of the language of signature \(K\) and \(A\) is a structure whose signature is \(K\), then because the symbols match up, we know that \(A\) makes \(S\) either true or false. So one defines the class of abelian groups to be the class of all those structures of signature \(+\), \(-\), \(0\) which are models of the sentences above. Apart from the fact that it uses a formal first-order language, this is exactly the algebraists’ usual definition of the class of abelian groups; model theory formalises a kind of definition that is extremely common in mathematics.
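To make the definition of a class by axioms concrete, here is a small Python sketch that lists the members of \(\Mod(T)\) whose domain is the two-element set \(\{0, 1\}\), where \(T\) is the set of four axioms above. The function names `is_abelian_group` and `structures` are illustrative assumptions. The closure proviso is built in, because every function table only takes values in the domain.

```python
from itertools import product

def is_abelian_group(domain, plus, neg, zero):
    """Check the four axioms in a structure of signature (+, -, 0)."""
    assoc = all(plus[x, plus[y, z]] == plus[plus[x, y], z]
                for x, y, z in product(domain, repeat=3))
    ident = all(plus[x, zero] == x for x in domain)
    inver = all(plus[x, neg[x]] == zero for x in domain)
    comm = all(plus[x, y] == plus[y, x]
               for x, y in product(domain, repeat=2))
    return assoc and ident and inver and comm

def structures(domain):
    """Yield every structure of signature (+, -, 0) on the given domain."""
    pairs = list(product(domain, repeat=2))
    for plus_values in product(domain, repeat=len(pairs)):
        plus = dict(zip(pairs, plus_values))      # interpretation of +
        for neg_values in product(domain, repeat=len(domain)):
            neg = dict(zip(domain, neg_values))   # interpretation of -
            for zero in domain:                   # interpretation of 0
                yield plus, neg, zero

domain = (0, 1)
models = [s for s in structures(domain) if is_abelian_group(domain, *s)]
print(len(models))  # number of models of the axioms on this domain
```

Of the 128 structures of this signature on \(\{0, 1\}\), the search keeps exactly those that satisfy all four sentences: two copies of the two-element group, one for each choice of the element labelled 0.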
Now the defining axioms for abelian groups have three kinds of symbol (apart from punctuation). First there is the logical symbol = with a fixed meaning. Second there are the nonlogical constants, which get their interpretation by being applied to a particular structure; one should group the quantifier symbols with them, because the structure also determines the domain over which the quantifiers range. And third there are the variables \(x, y\) etc. This three-level pattern of symbols allows us to define classes in a second way. Instead of looking for the interpretations of the nonlogical constants that will make a sentence true, we fix the interpretations of the nonlogical constants by choosing a particular structure \(A\), and we look for assignments of elements of \(A\) to variables which will make a given formula true in \(A\).
For example let \(\mathbb{Z}\) be the additive group of integers. Its elements are the integers (positive, negative and 0), and the symbols \(+\), \(-\), \(0\) have their usual meanings. Consider the formula
\[ v_1 + v_1 = v_2. \]
If we assign the number \(-3\) to \(v_1\) and the number \(-6\) to \(v_2\), the formula works out as true in \(\mathbb{Z}\). We express this by saying that the pair \((-3,-6)\) satisfies this formula in \(\mathbb{Z}\). Likewise \((15,30)\) and \((0,0)\) satisfy it, but \((2,-4)\) and \((3,3)\) don’t. Thus the formula defines a binary relation on the integers, namely the set of pairs of integers that satisfy it. A relation defined in this way in a structure \(A\) is called a first-order definable relation in \(A\). A useful generalisation is to allow the defining formula to use added names for some specific elements of \(A\); these elements are called parameters and the relation is then definable with parameters.
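The satisfaction check for this formula can be sketched in a few lines of Python. The function name `satisfies_formula` and the finite window used to display part of the relation are assumptions for illustration; the relation itself is infinite.

```python
def satisfies_formula(v1, v2):
    """True if the assignment (v1, v2) makes 'v1 + v1 = v2' true in Z."""
    return v1 + v1 == v2

# The five pairs discussed in the text.
pairs = [(-3, -6), (15, 30), (0, 0), (2, -4), (3, 3)]
results = {pair: satisfies_formula(*pair) for pair in pairs}
print(results)

# The part of the defined binary relation that lies in a small window.
window = range(-2, 3)
relation = {(a, b) for a in window for b in window if satisfies_formula(a, b)}
print(sorted(relation))
```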
This second type of definition, defining relations inside a structure rather than classes of structure, also formalises a common mathematical practice. But this time the practice belongs to geometry rather than to algebra. You may recognise the relation in the field of real numbers defined by the formula
\[ v_1^2 + v_2^2 = 1. \]
It’s the circle of radius 1 around the origin in the real plane. Algebraic geometry is full of definitions of this kind.
During the 1940s it occurred to several people (chiefly Anatolii Mal’tsev in Russia, Alfred Tarski in the USA and Abraham Robinson in Britain) that the metatheorems of classical logic could be used to prove mathematical theorems about classes defined in the two ways we have just described. In 1950 both Robinson and Tarski were invited to address the International Congress of Mathematicians at Cambridge Mass. on this new discipline (which as yet had no name; Tarski proposed the name ‘theory of models’ in 1954). The conclusion of Robinson’s address to that Congress is worth quoting:
[The] concrete examples produced in the present paper will have shown that contemporary symbolic logic can produce useful tools – though by no means omnipotent ones – for the development of actual mathematics, more particularly for the development of algebra and, it would appear, of algebraic geometry. This is the realisation of an ambition which was expressed by Leibniz in a letter to Huyghens as long ago as 1679. (Robinson 1952, 694)
In fact Mal’tsev had already made quite deep applications of model theory in group theory several years earlier, but under the political conditions of the time his work in Russia was not yet known in the West. By the end of the twentieth century, Robinson’s hopes had been amply fulfilled; see the entry on first-order model theory.
There are at least two other kinds of definition in model theory besides these two above. The third is known as interpretation (a special case of the interpretations that we began with). Here we start with a structure \(A\), and we build another structure \(B\) whose signature need not be related to that of \(A\), by defining the domain \(X\) of \(B\) and all the labelled relations and functions of \(B\) to be the relations definable in \(A\) by certain formulas with parameters. A further refinement is to find a definable equivalence relation on \(X\) and take the domain of \(B\) to be not \(X\) itself but the set of equivalence classes of this relation. The structure \(B\) built in this way is said to be interpreted in the structure \(A\).
A simple example, again from standard mathematics, is the interpretation of the group \(\mathbb{Z}\) of integers in the structure \(\mathbb{N}\) consisting of the natural numbers 0, 1, 2 etc. with labels for 0, 1 and +. To construct the domain of \(\mathbb{Z}\) we first take the set \(X\) of all ordered pairs of natural numbers (clearly a definable relation in \(\mathbb{N}\)), and on this set \(X\) we define the equivalence relation \(\sim\) by
\[ (a,b) \sim (c,d) \text{ if and only if } a + d = b + c \]
(again definable). The domain of \(\mathbb{Z}\) consists of the equivalence classes of this relation. We define addition on \(\mathbb{Z}\) by
\[ (a,b) + (c,d) = (e,f) \text{ if and only if } a + c + f = b + d + e. \]
The equivalence class of \((a,b)\) becomes the integer \(a - b\).
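The construction can be sketched in Python as follows. The function names are assumptions for this example. The two defining conditions use only addition and equality on natural numbers, as the interpretation requires; `as_integer` steps outside \(\mathbb{N}\) and is included only to display which integer a pair stands for.

```python
def equivalent(p, q):
    """(a, b) ~ (c, d) if and only if a + d = b + c."""
    (a, b), (c, d) = p, q
    return a + d == b + c

def is_sum(p, q, r):
    """(a, b) + (c, d) = (e, f) if and only if a + c + f = b + d + e."""
    (a, b), (c, d), (e, f) = p, q, r
    return a + c + f == b + d + e

def add(p, q):
    """Return one representative of the sum of two equivalence classes."""
    (a, b), (c, d) = p, q
    return (a + c, b + d)

def as_integer(p):
    """The integer a - b named by the class of (a, b); for display only."""
    a, b = p
    return a - b

minus_three = (1, 4)   # stands for the integer -3
three = (5, 2)         # stands for the integer 3
total = add(minus_three, three)

print(total, as_integer(total))
print(equivalent(total, (0, 0)))             # same class as (0, 0)
print(is_sum(minus_three, three, (7, 7)))    # any representative of 0 works
```

The choice of `(a + c, b + d)` as representative is a convenience; the defining condition `is_sum` accepts every pair in the same equivalence class.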
When a structure \(B\) is interpreted in a structure \(A\), every first-order statement about \(B\) can be translated back into a first-order statement about \(A\), and in this way we can read off the complete theory of \(B\) from that of \(A\). In fact if we carry out this construction not just for a single structure \(A\) but for a family of models of a theory \(T\), always using the same defining formulas, then the resulting structures will all be models of a theory \(T'\) that can be read off from \(T\) and the defining formulas. This gives a precise sense to the statement that the theory \(T'\) is interpreted in the theory \(T\). Philosophers of science have sometimes experimented with this notion of interpretation as a way of making precise what it means for one theory to be reducible to another. But realistic examples of reductions between scientific theories seem generally to be much subtler than this simple-minded model-theoretic idea will allow. See the entry on intertheory relations in physics.
The fourth kind of definability is a pair of notions, implicit definability and explicit definability of a particular relation in a theory. See section 3.3 of the entry on first-order model theory.
Unfortunately there used to be a very confused theory about model-theoretic axioms, that also went under the name of implicit definition. By the end of the nineteenth century, mathematical geometry had generally ceased to be a study of space, and it had become the study of classes of structures which satisfy certain ‘geometric’ axioms. Geometric terms like ‘point’, ‘line’ and ‘between’ survived, but only as the primitive symbols in axioms; they no longer had any meaning associated with them. So the old question, whether Euclid’s parallel postulate (as a statement about space) was deducible from Euclid’s other assumptions about space, was no longer interesting to geometers. Instead, geometers showed that if one wrote down an up-to-date version of Euclid’s other assumptions, in the form of a theory \(T\), then it was possible to find models of \(T\) which fail to satisfy the parallel postulate. (See the entry on geometry in the 19th century for the contributions of Lobachevski and Klein to this achievement.) In 1899 David Hilbert published a book in which he constructed such models, using exactly the method of interpretation that we have just described.
Problems arose because of the way that Hilbert and others described what they were doing. The history is complicated, but roughly the following happened. Around the middle of the nineteenth century people noticed, for example, that in an abelian group the minus function is definable in terms of 0 and + (namely: \(-a\) is the element \(b\) such that \(a + b = 0\)). Since this description of minus is in fact one of the axioms defining abelian groups, we can say (using a term taken from J. D. Gergonne, who should not be held responsible for the later use made of it) that the axioms for abelian groups implicitly define minus. In the jargon of the time, one said not that the axioms define the function minus, but that they define the concept minus. Now suppose we switch around and try to define plus in terms of minus and 0. This way round it can’t be done, since one can have two abelian groups with the same 0 and minus but different plus functions. Rather than say this, the nineteenth century mathematicians concluded that the axioms only partially define plus in terms of minus and 0. Having swallowed that much, they went on to say that the axioms together form an implicit definition of the concepts plus, minus and 0 together, and that this implicit definition is only partial but it says about these concepts precisely as much as we need to know.
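The claim that two abelian groups can share 0 and minus but differ in plus can be checked on a small case. The following Python sketch is one possible witness, chosen for this illustration and not taken from the historical sources: it starts from addition modulo 5 and transports it along a relabelling `sigma` that swaps 1 and 4. Because `sigma` commutes with negation modulo 5, the new group has the same minus function and the same 0.

```python
from itertools import product

domain = range(5)
sigma = {0: 0, 1: 4, 4: 1, 2: 2, 3: 3}  # a relabelling that is its own inverse

def minus(a):
    return (-a) % 5

def plus1(a, b):
    """Ordinary addition modulo 5."""
    return (a + b) % 5

def plus2(a, b):
    """Addition modulo 5 transported along sigma."""
    return sigma[(sigma[a] + sigma[b]) % 5]

def is_abelian_group(plus, neg, zero):
    """Check the four abelian group axioms on the fixed domain."""
    assoc = all(plus(x, plus(y, z)) == plus(plus(x, y), z)
                for x, y, z in product(domain, repeat=3))
    ident = all(plus(x, zero) == x for x in domain)
    inver = all(plus(x, neg(x)) == zero for x in domain)
    comm = all(plus(x, y) == plus(y, x)
               for x, y in product(domain, repeat=2))
    return assoc and ident and inver and comm

print(is_abelian_group(plus1, minus, 0))  # first group
print(is_abelian_group(plus2, minus, 0))  # second group, same minus and 0
print(plus1(1, 1), plus2(1, 1))           # the two plus functions disagree
```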
One wonders how it could happen that for fifty years nobody challenged this nonsense. In fact some people did challenge it, notably the geometer Moritz Pasch who in section 12 of his Vorlesungen über Neuere Geometrie (1882) insisted that geometric axioms tell us nothing whatever about the meanings of ‘point’, ‘line’ etc. Instead, he said, the axioms give us relations between the concepts. If one thinks of a structure as a kind of ordered \(n\)-tuple of sets etc., then a class \(\Mod(T)\) becomes an \(n\)-ary relation, and Pasch’s account agrees with ours. But he was unable to spell out the details, and there is some evidence that his contemporaries (and some more recent commentators) thought he was saying that the axioms may not determine the meanings of ‘point’ and ‘line’, but they do determine those of relational terms such as ‘between’ and ‘incident with’! Frege’s demolition of the implicit definition doctrine was masterly, but it came too late to save Hilbert from saying, at the beginning of his Grundlagen der Geometrie, that his axioms give ‘the exact and mathematically adequate description’ of the relations ‘lie’, ‘between’ and ‘congruent’. Fortunately Hilbert’s mathematics speaks for itself, and one can simply bypass these philosophical faux pas. The model-theoretic account that we now take as a correct description of this line of work seems to have surfaced first in the group around Giuseppe Peano in the 1890s, and it reached the English-speaking world through Bertrand Russell’s Principles of Mathematics in 1903.
Suppose \(L\) is a language of signature \(K\), \(T\) is a set of sentences of \(L\) and \(\phi\) is a sentence of \(L\). Then the relation
\[ \Mod(T) \subseteq \Mod(\phi) \]
expresses that every structure of signature \(K\) which is a model of \(T\) is also a model of \(\phi\). This is known as the model-theoretic consequence relation, and it is written for short as
\[ T \vDash \phi \]
The double use of \(\vDash\) is a misfortune. But in the particular case where \(L\) is first-order, the completeness theorem (see the entry on classical logic) tells us that ‘\(T \vDash \phi\)’ holds if and only if there is a proof of \(\phi\) from \(T\), a relation commonly written
\[ T \vdash \phi \]
Since \(\vDash\) and \(\vdash\) express exactly the same relation in this case, model theorists often avoid the double use of \(\vDash\) by using \(\vdash\) for model-theoretic consequence. But since what follows is not confined to first-order languages, safety suggests we stick with \(\vDash\) here.
Before the middle of the nineteenth century, textbooks of logic commonly taught the student how to check the validity of an argument (say in English) by showing that it has one of a number of standard forms, or by paraphrasing it into such a form. The standard forms were syntactic and/or semantic forms of argument in English. The process was hazardous: semantic forms are almost by definition not visible on the surface, and there is no purely syntactic form that guarantees validity of an argument. For this reason most of the old textbooks had a long section on ‘fallacies’, that is, ways in which an invalid argument may seem to be valid.
In 1847 George Boole changed this arrangement. For example, to validate the argument
All monarchs are human beings. No human beings are infallible. Therefore no infallible beings are monarchs.
Boole would interpret the symbols \(P, Q, R\) as names of classes:
\(P\) is the class of all monarchs.
\(Q\) is the class of all human beings.
\(R\) is the class of all infallible beings.
Then he would point out that the original argument paraphrases into a set-theoretic consequence:
\[ (P \subseteq Q), (Q \cap R = 0) \vDash (R \cap P = 0) \]
(This example is from Stanley Jevons, 1869. Boole’s own account is idiosyncratic, but I believe Jevons’ example represents Boole’s intentions accurately.) Today we would write \(\forall x(Px \rightarrow Qx)\) rather than \(P \subseteq Q\), but this is essentially the standard definition of \(P \subseteq Q\), so the difference between us and Boole is slight.
Insofar as they follow Boole, modern textbooks of logic establish that English arguments are valid by reducing them to model-theoretic consequences. Since the class of model-theoretic consequences, at least in first-order logic, has none of the vaguenesses of the old argument forms, textbooks of logic in this style have long since ceased to have a chapter on fallacies.
But there is one warning that survives from the old textbooks: If you formalise your argument in a way that is not a model-theoretic consequence, it doesn’t mean the argument is not valid. It may only mean that you failed to analyse the concepts in the argument deeply enough before you formalised. The old textbooks used to discuss this in a ragbag section called ‘topics’ (i.e. hints for finding arguments that you might have missed). Here is an example from Peter of Spain’s 13th century Summulae Logicales:
‘There is a father. Therefore there is a child.’ …Where does the validity of this argument come from? From the relation.The maxim is: When one of a correlated pair is posited, then so is theother.
Hilbert and Ackermann, possibly the textbook that did most to establish the modern style, discuss in their section III.3 a very similar example: ‘If there is a son, then there is a father’. They point out that any attempt to justify this by using the symbolism
\[ \exists xSx \rightarrow \exists xFx \]
is doomed to failure. “A proof of this statement is possible only if we analyze conceptually the meanings of the two predicates which occur”, as they go on to illustrate. And of course the analysis finds precisely the relation that Peter of Spain referred to.
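A small Python search shows the difference that the analysis makes. In the unanalysed form, \(S\) and \(F\) are unrelated unary predicates, and countermodels are easy to find. In the analysed form, both predicates are defined from a single binary relation. The particular analysis used here (a relation `father_of`, with ‘son’ read as ‘has a father’ and ‘father’ read as ‘is father of someone’) is a deliberately simplified assumption for this sketch, not Hilbert and Ackermann’s own formulation.

```python
from itertools import combinations, product

def subsets(items):
    """All subsets of a collection, as sets."""
    items = list(items)
    return [set(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

domain = [0, 1]

# Unanalysed form: S and F are independent unary predicates.
unanalysed_countermodels = [
    (S, F) for S, F in product(subsets(domain), repeat=2)
    if S and not F  # 'there is a son' true, 'there is a father' false
]

# Analysed form (assumed analysis): one binary relation father_of(y, x).
all_pairs = list(product(domain, repeat=2))
analysed_countermodels = []
for father_of in subsets(all_pairs):
    sons = {x for (_, x) in father_of}     # x such that some y is father of x
    fathers = {y for (y, _) in father_of}  # y such that y is father of some x
    if sons and not fathers:
        analysed_countermodels.append(father_of)

print(len(unanalysed_countermodels))  # countermodels exist
print(len(analysed_countermodels))    # none once the relation is made explicit
```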
On the other hand if your English argument translates into an invalid model-theoretic consequence, a counterexample to the consequence may well give clues about how you can describe a situation that would make the premises of your argument true and the conclusion false. But this is not guaranteed.
One can raise a number of questions about whether the modern textbook procedure does really capture a sensible notion of logical consequence. For example in Boole’s case the set-theoretic consequences that he relies on are all easily provable by formal proofs in first-order logic, not even using any set-theoretic axioms; and by the completeness theorem (see the entry on classical logic) the same is true for first-order logic. But for some other logics it is certainly not true. For instance the model-theoretic consequence relation for some logics of time presupposes some facts about the physical structure of time. Also, as Boole himself pointed out, his translation from an English argument to its set-theoretic form requires us to believe that for every property used in the argument, there is a corresponding class of all the things that have the property. This comes dangerously close to Frege’s inconsistent comprehension axiom!
In 1936 Alfred Tarski proposed a definition of logical consequence for arguments in a fully interpreted formal language. His proposal was that an argument is valid if and only if: under any allowed reinterpretation of its nonlogical symbols, if the premises are true then so is the conclusion. Tarski assumed that the class of allowed reinterpretations could be read off from the semantics of the language, as set out in his truth definition. He left it undetermined what symbols count as nonlogical; in fact he hoped that this freedom would allow one to define different kinds of necessity, perhaps separating ‘logical’ from ‘analytic’. One thing that makes Tarski’s proposal difficult to evaluate is that he completely ignores the question we discussed above, of analysing the concepts to reach all the logical connections between them. The only plausible explanation I can see for this lies in his parenthetical remark about
the necessity of eliminating any defined signs which may possibly occur in the sentences concerned, i.e. of replacing them by primitive signs.
This suggests to me that he wants his primitive signs to be by stipulation unanalysable. But then by stipulation it will be purely accidental if his notion of logical consequence captures everything one would normally count as a logical consequence.
Historians note a resemblance between Tarski’s proposal and one in section 147 of Bernard Bolzano’s Wissenschaftslehre of 1837. Like Tarski, Bolzano defines the validity of a proposition in terms of the truth of a family of related propositions. Unlike Tarski, Bolzano makes his proposal for propositions in the vernacular, not for sentences of a formal language with a precisely defined semantics.
On all of this section, see also the entry on logical consequence.
A sentence \(S\) defines its class \(\Mod(S)\) of models. Given two languages \(L\) and \(L'\), we can compare them by asking whether every class \(\Mod(S)\), with \(S\) a sentence of \(L\), is also a class of the form \(\Mod(S')\) where \(S'\) is a sentence of \(L'\). If the answer is Yes, we say that \(L\) is reducible to \(L'\), or that \(L'\) is at least as expressive as \(L\).
For example if \(L\) is a first-order language with identity, whose signature consists of 1-ary predicate symbols, and \(L'\) is the language whose sentences consist of the four syllogistic forms (All \(A\) are \(B\), Some \(A\) are \(B\), No \(A\) are \(B\), Some \(A\) are not \(B\)) using the same predicate symbols, then \(L'\) is reducible to \(L\), because the syllogistic forms are expressible in first-order logic. (There are some quarrels about which is the right way to express them; see the entry on the traditional square of opposition.) But the first-order language \(L\) is certainly not reducible to the language \(L'\) of syllogisms, since in \(L\) we can write down a sentence saying that exactly three elements satisfy \(Px\), and there is no way of saying this using just the syllogistic forms. Or moving the other way, if we form a third language \(L''\) by adding to \(L\) the quantifier \(Qx\) with the meaning “There are uncountably many elements \(x\) such that …”, then trivially \(L\) is reducible to \(L''\), but the downward Loewenheim-Skolem theorem shows at once that \(L''\) is not reducible to \(L\).
These notions are useful for analysing the strength of database query languages. We can think of the possible states of a database as structures, and a simple Yes/No query becomes a sentence that elicits the answer Yes if the database is a model of it and No otherwise. If one database query language is not reducible to another, then the first can express some query that can’t be expressed in the second.
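The picture of a database state as a structure and a Yes/No query as a sentence can be sketched in Python. The relation names `Employee` and `Manages`, the sample data and the query itself are hypothetical, chosen only to illustrate the idea.

```python
# Two hypothetical database states, each viewed as a structure:
# a domain together with labelled relations on it.
state1 = {
    "domain": {"ann", "bob", "cy"},
    "Employee": {"ann", "bob"},
    "Manages": {("ann", "bob")},
}
state2 = {
    "domain": {"ann", "bob", "cy"},
    "Employee": {"ann", "bob", "cy"},
    "Manages": {("ann", "bob")},
}

def every_employee_is_in_a_management_link(s):
    """The sentence: for all x, if Employee(x) then there is a y
    with Manages(y, x) or Manages(x, y)."""
    return all(
        any((y, x) in s["Manages"] or (x, y) in s["Manages"]
            for y in s["domain"])
        for x in s["Employee"]
    )

def answer(query, state):
    """Yes if the database state is a model of the query, No otherwise."""
    return "Yes" if query(state) else "No"

print(answer(every_employee_is_in_a_management_link, state1))
print(answer(every_employee_is_in_a_management_link, state2))
```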
So we need techniques for comparing the expressive strengths of languages. One of the most powerful techniques available consists of the back-and-forth games of Ehrenfeucht and Fraïssé between the two players Spoiler and Duplicator; see the entry on logic and games for details. Imagine for example that we play the usual first-order back-and-forth game \(G\) between two structures \(A\) and \(B\). The theory of these games establishes that if some first-order sentence \(\phi\) is true in exactly one of \(A\) and \(B\), then there is a number \(n\), calculable from \(\phi\), with the property that Spoiler has a strategy for \(G\) that will guarantee that he wins in at most \(n\) steps. So conversely, to show that first-order logic can’t distinguish between \(A\) and \(B\), it suffices to show that for every finite \(n\), Duplicator has a strategy that will guarantee she doesn’t lose \(G\) in the first \(n\) steps. If we succeed in showing this, it follows that any language which does distinguish between \(A\) and \(B\) is not reducible to the first-order language of the structures \(A\) and \(B\).
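For finite structures the game can be solved by exhaustive search. The Python sketch below is a minimal version for structures with a single binary relation, here finite linear orders; the function names and the choice of example structures are assumptions for illustration. Duplicator survives a position if the pairs chosen so far form a partial isomorphism, and she survives the next round if every move by Spoiler in either structure has some reply that she also survives.

```python
def is_partial_isomorphism(pairs, less_a, less_b):
    """The chosen pairs must respect equality and the order relation."""
    for a1, b1 in pairs:
        for a2, b2 in pairs:
            if (a1 == a2) != (b1 == b2):
                return False
            if ((a1, a2) in less_a) != ((b1, b2) in less_b):
                return False
    return True

def duplicator_wins(dom_a, less_a, dom_b, less_b, rounds, pairs=()):
    """True if Duplicator can avoid losing for the given number of rounds."""
    if not is_partial_isomorphism(pairs, less_a, less_b):
        return False
    if rounds == 0:
        return True
    # Spoiler moves in A; Duplicator needs a good reply in B.
    for a in dom_a:
        if not any(duplicator_wins(dom_a, less_a, dom_b, less_b,
                                   rounds - 1, pairs + ((a, b),))
                   for b in dom_b):
            return False
    # Spoiler moves in B; Duplicator needs a good reply in A.
    for b in dom_b:
        if not any(duplicator_wins(dom_a, less_a, dom_b, less_b,
                                   rounds - 1, pairs + ((a, b),))
                   for a in dom_a):
            return False
    return True

def linear_order(n):
    """The linear order 0 < 1 < ... < n-1 as (domain, set of pairs)."""
    dom = range(n)
    return dom, {(i, j) for i in dom for j in dom if i < j}

# Linear orders with 3 and 4 elements.
print(duplicator_wins(*linear_order(3), *linear_order(4), 2))  # two rounds
print(duplicator_wins(*linear_order(3), *linear_order(4), 3))  # three rounds
```

In this example Duplicator survives two rounds but not three, so no sentence that Spoiler can exploit within two rounds separates the two orders, while a sentence needing three rounds can.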
These back-and-forth games are immensely flexible. For a start, they make just as much sense on finite structures as they do on infinite; many other techniques of classical model theory assume that the structures are infinite. They can also be adapted smoothly to many non-first-order languages.
In 1969 Per Lindström used back-and-forth games to give some abstract characterisations of first-order logic in terms of its expressive power. One of his theorems says that if \(L\) is a language with a signature \(K\), \(L\) is closed under all the first-order syntactic operations, and \(L\) obeys the downward Loewenheim-Skolem theorem for single sentences, and the compactness theorem, then \(L\) is reducible to the first-order language of signature \(K\). These theorems are very attractive; see Chapter XII of Ebbinghaus, Flum and Thomas for a good account. But they have never quite lived up to their promise. It has been hard to find any similar characterisations of other logics. Even for first-order logic it is a little hard to see exactly what the characterisations tell us. But very roughly speaking, they tell us that first-order logic is the unique logic with two properties: (1) we can use it to express arbitrarily complicated things about finite patterns, and (2) it is hopeless for discriminating between one infinite cardinal and another.
These two properties (1) and (2) are just the properties offirst-order logic that allowed Abraham Robinson to build hisnonstandard analysis. The background is that Leibniz, when heinvented differential and integral calculus, used infinitesimals, i.e.numbers that are greater than 0 and smaller than all of 1/2, 1/3, 1/4etc. Unfortunately there are no such real numbers. During thenineteenth century all definitions and proofs in the Leibniz stylewere rewritten to talk of limits instead of infinitesimals. Now let\(\mathbb{R}\) be the structure consisting of the field of realnumbers together with any structural features we care to give namesto: certainly plus and times, maybe the ordering, the set of integers,the functions sin and log, etc. Let \(L\) be the first-order languagewhose signature is that of \(\mathbb{R}\). Because of the expressivestrength of \(L\), we can write down any number of theorems ofcalculus as sentences of \(L\). Because of the expressive weakness of\(L\), there is no way that we can express in \(L\) that\(\mathbb{R}\) has no infinitesimals. In fact Robinson used thecompactness theorem to build a structure \(\mathbb{R}'\) that is amodel of exactly the same sentences of \(L\) as \(\mathbb{R}\), butwhich has infinitesimals. As Robinson showed, we can copyLeibniz’s arguments using the infinitesimals in \(\mathbb{R}'\),and so prove that various theorems of calculus are true in\(\mathbb{R}'\). But these theorems are expressible in \(L\), so theymust also be true in \(\mathbb{R}\).
Since arguments using infinitesimals are usually easier to visualisethan arguments using limits, nonstandard analysis is a helpful toolfor mathematical analysts. Jacques Fleuriot in his Ph.D. thesis (2001)automated the proof theory of nonstandard analysis and used it tomechanise some of the proofs in Newton’sPrincipia.
Tomodel a phenomenon is to construct a formal theory thatdescribes and explains it. In a closely related sense, youmodel a system or structure that you plan to build, bywriting a description of it. These are very different senses of‘model’ from that in model theory: the ‘model’of the phenomenon or the system is not a structure but a theory, oftenin a formal language. TheUnified Modeling Language, UML forshort, is a formal language designed for just this purpose. It’sreported that the Australian Navy once hired a model theorist for ajob ‘modelling hydrodynamic phenomena’. (Pleasedon’t enlighten them!)
A little history will show how the word ‘model’ came to have these two different uses. In late Latin a ‘modellus’ was a measuring device, for example to measure water or milk. By the vagaries of language, the word generated three different words in English: mould, module, model. Often a device that measures out a quantity of a substance also imposes a form on the substance. We see this with a cheese mould, and also with the metal letters (called ‘moduli’ in the early 17th century) that carry ink to paper in printing. So ‘model’ comes to mean an object in hand that expresses the design of some other objects in the world: the artist’s model carries the form that the artist depicts, and Christopher Wren’s ‘module’ of St Paul’s Cathedral serves to guide the builders.
Already by the late 17th century the word ‘model’ could mean an object that shows the form, not of real-world objects, but of mathematical constructs. Leibniz boasted that he didn’t need models in order to do mathematics. Other mathematicians were happy to use plaster or metal models of interesting surfaces. The models of model theory first appeared as abstract versions of this kind of model, with theories in place of the defining equation of a surface. On the other hand one could stay with real-world objects but show their form through a theory rather than a physical copy in hand; ‘modelling’ is building such a theory.
We have a confusing halfway situation when a scientist describes a phenomenon in the world by an equation, for example a differential equation with exponential functions as solutions. Is the model the theory consisting of the equation, or are these exponential functions themselves models of the phenomenon? Examples of this kind, where theory and structures give essentially the same information, provide some support for Patrick Suppes’ claim that “the meaning of the concept of model is the same in mathematics and the empirical sciences” (1969, 12). Several philosophers of science have pursued the idea of using an informal version of model-theoretic models for scientific modelling. Sometimes the models are described as non-linguistic; this might be hard to reconcile with our definition of models in section 1 above.
Cognitive science is one area where the difference between models and modelling tends to become blurred. A central question of cognitive science is how we represent facts or possibilities in our minds. If one formalises these mental representations, they become something like ‘models of phenomena’. But it is a serious hypothesis that in fact our mental representations have a good deal in common with simple set-theoretic structures, so that they are ‘models’ in the model-theoretic sense too. In 1983 two influential works of cognitive science were published, both under the title Mental Models. The first, edited by Dedre Gentner and Albert Stevens, was about people’s ‘conceptualizations’ of the elementary facts of physics; it belongs squarely in the world of ‘modelling of phenomena’. The second, by Philip Johnson-Laird, is largely about reasoning, and makes several appeals to ‘model-theoretic semantics’ in our sense. Researchers in the Johnson-Laird tradition tend to refer to their approach as ‘model theory’, and to see it as allied in some sense to what we have called model theory.
Pictures and diagrams seem at first to hover in the middle ground between theories and models. In practice model theorists often draw themselves pictures of structures, and use the pictures to think about the structures. On the other hand pictures don’t generally carry the labelling that is an essential feature of model-theoretic structures. There is a fast growing body of work on reasoning with diagrams, and the overwhelming tendency of this work is to see pictures and diagrams as a form of language rather than as a form of structure. For example Eric Hammer and Norman Danner (1996) describe a ‘model theory of Venn diagrams’; the Venn diagrams themselves are the syntax, and the model theory is a set-theoretical explanation of their meaning. (A curious counterexample is the horizontal line diagrams of the 12th century Baghdad Jewish scholar Abū l-Barakāt; they represent structures and not propositions, and Abū l-Barakāt uses them to express model-theoretic consequence in syllogisms. Further details are in Hodges 2018 on model-theoretic consequence.)
The model theorist Yuri Gurevich introduced abstract state machines (ASMs) as a way of using model-theoretic ideas for specification in computer science. According to the Abstract State Machine website (see Other Internet Resources below),
any algorithm can be modeled at its natural abstraction level by an appropriate ASM. … ASMs use classical mathematical structures to describe states of a computation; structures are well-understood, precise models.
The book of Börger and Stärk cited below is an authoritative account of ASMs and their uses.
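The idea that a state of a computation is a structure can be illustrated with a toy example. The sketch below is not taken from the ASM literature or from any ASM tool. It is a minimal illustration in Python, assuming that a state is a small structure interpreting two constant symbols `a` and `b`, and that a rule produces a set of updates which are all evaluated against the old state and then applied together. The names `euclid_rule`, `step` and `run` are hypothetical.

```python
from typing import Callable, Dict

# A state is a tiny structure: each constant symbol (a "location")
# is interpreted by its current value.
State = Dict[str, int]
# An update set gives new values for some of the locations.
UpdateSet = Dict[str, int]
Rule = Callable[[State], UpdateSet]


def euclid_rule(state: State) -> UpdateSet:
    """Guarded parallel update for Euclid's algorithm (illustrative only)."""
    a, b = state["a"], state["b"]
    if b != 0:
        # Both right-hand sides are evaluated in the OLD state.
        return {"a": b, "b": a % b}
    return {}  # guard fails: no updates


def step(state: State, rule: Rule) -> State:
    """Compute the update set in the current state, then apply it at once."""
    updates = rule(state)
    new_state = dict(state)
    new_state.update(updates)
    return new_state


def run(state: State, rule: Rule, max_steps: int = 1000) -> State:
    """Iterate steps until the state no longer changes."""
    for _ in range(max_steps):
        next_state = step(state, rule)
        if next_state == state:
            return state
        state = next_state
    raise RuntimeError("no fixed point reached within max_steps")


if __name__ == "__main__":
    print(run({"a": 48, "b": 18}, euclid_rule))
```

Each intermediate dictionary plays the part of a structure in the model-theoretic sense: it fixes what the symbols `a` and `b` refer to at that moment, and a step of the machine moves from one such structure to the next.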
Today you can make your name and fortune by finding a good representation system. There is no reason to expect that every such system will fit neatly into the syntax/semantics framework of model theory, but it will be surprising if model-theoretic ideas don’t continue to make a major contribution in this area.
The sections above considered some of the basic ideas that fed into the creation of model theory, noting some ways in which these ideas appeared either in mathematical model theory or in other disciplines that made use of model theory. None of this is particularly philosophical, except in the broad sense that philosophers work with ideas. But as mathematical model theory has become more familiar to philosophers, it has increasingly become a source of material for philosophical questions. In 2018 two books appeared that directly addressed this philosophical use of model theory, though in very different ways.
In the first book, Button and Walsh 2018, the authors present an invitation to the reader to help create a discipline, ‘philosophy and model theory’, which is gradually coming into existence. (This is partly belied by the large amount of carefully-worked material in the book.) Mathematics in general is a source of fundamental philosophical worries. For example mathematicians refer to entities that we have no causal interaction with (such as the number π or the set of real numbers), and this creates questions about how we can identify these entities to ourselves or each other, and how we can discover facts about them. These problems are not new or peculiar to model theory; but mathematical model theory is the part of mathematics most concerned with ‘reference’ and ‘isomorphism types’ and ‘indiscernibility’, notions which go directly to the philosophical problem areas. The authors give clear analyses of exactly what the issues are in key discussions in these areas.
The second book, Baldwin 2018, presents mathematical model theory of the period from 1970 to today as a source of material for the discipline of philosophy of mathematical practice. This discipline studies the work of particular mathematicians within their historical context, and asks such questions as: Why did this mathematician prefer classifications in terms of X to classifications in terms of Y? Why did this group of mathematical researchers choose to formalise their subject matter using such-and-such a language or set of symbols? How did they decide what to formalise and what to leave unformalised? The discipline is partly historical, but it looks for conceptual justifications of the historical choices made. (See the entries mathematical style and explanation in mathematics.) Baldwin has a long history of work in mathematical model theory, so he can answer questions like those above from personal knowledge. This book gives a rich supply of examples, explained with helpful pictures and remarkably little technical notation.
diagrams: and diagrammatical reasoning | geometry: in the 19th century | logic: classical | logical consequence | logical truth | mathematics, philosophy of | models in science | physics: intertheory relations in | physics: structuralism in | square of opposition | Tarski, Alfred | Tarski, Alfred: truth definitions
The Stanford Encyclopedia of Philosophy is copyright © 2024 by The Metaphysics Research Lab, Department of Philosophy, Stanford University
Library of Congress Catalog Data: ISSN 1095-5054