The Notation in Principia Mathematica

First published Thu Aug 19, 2004; substantive revision Tue Apr 5, 2022

Principia Mathematica [PM] by A.N. Whitehead and Bertrand Russell, published 1910–1913 in three volumes by Cambridge University Press, contains a derivation of large portions of mathematics using notions and principles of symbolic logic. The notation in that work has been superseded by the subsequent development of logic during the 20th century, to the extent that the beginner has trouble reading PM at all. This article provides an introduction to the symbolism of PM, showing how that symbolism can be translated into a more contemporary notation using concepts which should be familiar to anyone who has had a first course in symbolic logic or set theory. This translation is offered as an aid to learning the original notation, which itself is a subject of scholarly dispute, and embodies substantive logical doctrines so that it cannot simply be replaced by contemporary symbolism. Learning the notation, then, is a first step to learning the distinctive logical doctrines of Principia Mathematica.

1. Why Learn the Symbolism in Principia Mathematica?

Principia Mathematica [PM] was written jointly by Alfred North Whitehead and Bertrand Russell over several years, and published in three volumes, which appeared between 1910 and 1913. It presents a system of symbolic logic and then turns to the foundations of mathematics to carry out the logicist project of defining mathematical notions in terms of logical notions and proving the fundamental axioms of mathematics as theorems of logic. While hugely important in the development of logic, philosophy of mathematics and more broadly of “Early Analytic Philosophy”, the work itself is no longer studied for these topics. As a result the very notation of the work has become alien to contemporary students of logic, and that has become a barrier to the study of Principia Mathematica. We include a series of definitions of notions such as transfinite cardinal numbers, well-orderings, rational and real numbers. These are defined differently in the theory of types of PM than in axiomatic set theory.

This entry is intended to assist the student of PM in reading the symbolic portion of the work. What follows is a partial translation of the symbolism into a more contemporary notation, which should be familiar from other articles in this Encyclopedia, and which is quite standard in contemporary textbooks of symbolic logic. No complete algorithm is supplied; rather, various suggestions are intended to help the reader learn the symbolism of PM. Many issues of interpretation would be prejudged by only using contemporary notation, and many details that are unique to PM depend on that notation. It will be seen below, with some of the more contentious aspects of the notation, that doctrines of substance are built into the notation of PM. Replacing the notation with a more modern symbolism would drastically alter the very content of the book.

2. Primitive Symbols of Mathematical Logic (Part I)

Below the reader will find, in the order in which they are introduced in PM, the following symbols, which are briefly described. More detail is provided in what follows:

∗

pronounced “star”; indicates a number, or chapter, as in ∗1, or ∗20.

·

a centered dot (an old British decimal point); indicates a numbered sentence in the order by first digit (all the 0s preceding all the 1s etc.), then second digit, and so on. The first definitions and propositions of ∗1 illustrate this “lexicographical” ordering: 1·01, 1·1, 1·11, 1·2, 1·3, 1·4, 1·5, 1·6, 1·7, 1·71, 1·72.

\(\vdash\)

the assertion-sign; precedes an assertion, either an axiom (i.e., a primitive proposition, which are also annotated “\(\Pp\)”) or a theorem.

\(\Df\)

the definition sign; follows a definition.

\(.\), \(:\), \(:.\), \(::\), etc.

are dots used for delimiting punctuation; in contemporary logic, we use ( ), [ ], \(\{\ \}\), etc.

\(p, q, r\), etc.

are propositional variables.

\(\lor\), \(\supset\), \(\osim\), \(\equiv\) and . , :, :., etc.

are the familiar sentential connectives, corresponding to “or”, “if-then”, “not”, “if and only if”, and “and” respectively. (The dual use of dots for conjunction and punctuation will be explained below.) In the Second Edition of PM, 1925–27, the Sheffer Stroke “\(\mid\)” is the one primitive connective. It means “not both … and ___”.

\(x, y, z\), etc.

are individual variables, which are to be read with “typical ambiguity”, i.e., with their logical types to be filled in (see below).

\(a, b, c\), etc.

are individual constants, and stand for individuals (of the lowest type). These occur only in the Introduction to PM, and not in the official system.

\(xRy, aRb, R(x)\), etc.

are atomic predications, in which the objects named by the variables or constants stand in the relation \(R\) or have the property \(R\). These occur only in the Introduction. “\(a\)” and “\(b\)” occur as constants only in the Second Edition. The predications \(R(x), R(x,y)\), etc., are used only in the Second Edition.

\(\phi\), \(\psi\), \(\chi\), etc.,
and \(f, g\), etc.

are higher-order variables which range over propositional functions, no matter whether those functions are simple or complex.

\(\phi x\), \(\psi x\), \(\phi(x,y)\), etc.

open atomic formulas in which both “\(x\)” and “\(\phi\)” are free. [An alternative interpretation is to view “\(\phi x\)” as a schematic letter standing for a formula in which the variable “\(x\)” is free.]

\(\hat{\phantom{x}}\)

the circumflex; when placed over a variable in an open formula (as in “\(\phi \hat{x}\)”) results in a term for a function. [This matter is controversial. See Landini 1998.] When the circumflected variable precedes a complex variable, the result indicates a class, as in \(\hat{x}\phi x\), which is the class of \(x\) which are \(\phi\), or \(\{ x \mid \phi x\}\) in modern notation.

\(\phi\hat{x}, \psi\hat{x}, \phi(\hat{x},\hat{z}),\) etc.

Terms for propositional functions. Here are examples of such terms which are constants: “\(\hat{x}\) is happy”, “\(\hat{x}\) is bald and \(\hat{x}\) is happy”, “\(4 \lt \hat{x} \lt 6\)”, etc. If we apply, for example, the function “\(\hat{x}\) is bald and \(\hat{x}\) is happy” to the particular individual \(b\), the result is the proposition “\(b\) is bald and \(b\) is happy”.

\(\exists\) and ( )

are the quantifiers “there exists” and “for all” (“every”), respectively. For example, where \(\phi x\) is a simple or complex open formula,

\((\exists x)\phi x\)	asserts	“there exists an \(x\) such that \(\phi x\)”
\((\exists \phi)\phi x\)	asserts	“there exists a propositional function \(\phi\) such that \(\phi x\)”
\((x)\phi x\)	asserts	“every \(x\) is such that \(\phi x\)”
\((\phi)\phi x\)	asserts	“every propositional function \(\phi\) is such that \(\phi x\)”

[These were used by Peano. More recently, \(\forall\) has been added for symmetry with \(\exists\). Some scholars see the quantifiers \((\phi)\) and \((\exists \phi)\) as substitutional.]

\(\phi x \supset_x \psi x\)
\(\phi x \equiv_x \psi x\)

This is notation that is used to abbreviate universally quantified variables. In modern notation, these become \(\forall x(\phi x \supset \psi x)\) and \(\forall x(\phi x \equiv \psi x)\), respectively. See the definitions for this notation at the end of Section 3.1 below.

\(\bang\)

pronounced “shriek”; indicates that a function is predicative, as in \(\phi \bang x\) or \(\phi\bang \hat{x}\). See Section 7.

\(=\)

the identity symbol; expresses identity, which is a defined notion in PM, not primitive as in contemporary logic.

\(\atoi\)

read as “the”; is the inverted iota or description operator and is used in expressions for definite descriptions, such as \((\atoi x)\phi x\) (which is read: the \(x\) such that \(\phi x\)).

[\((\atoi x)\phi x\)]

a definite description in brackets; this is a scope indicator for definite descriptions.

\(E\bang\)

is defined at ∗14·02, in the context \(E\bang(\atoi x)\phi x\), to mean that the description \((\atoi x)\phi x\) is proper, i.e., there is one and only one thing that is \(\phi\).

\(\exists\bang\)

is defined at ∗24·03, in the context \(\exists\bang \alpha\), to mean that the class \(\alpha\) is non-empty, i.e., has a member.
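The “lexicographical” ordering of numbered sentences described above for the centered dot can be illustrated with a small Python sketch. The helper name `pm_key` and the sample list are assumptions made for illustration; nothing like them appears in PM. The idea is simply that chapter numbers compare as integers while the digits after the dot compare as a string, digit by digit.

```python
def pm_key(label: str):
    """Sort key for a PM label such as '1·11' (hypothetical helper)."""
    chapter, _, digits = label.partition("·")
    # Chapters compare as integers; the digits after the dot compare as
    # strings, so '01' precedes '1', which precedes '11', which precedes '2'.
    return (int(chapter), digits)


labels = ["1·2", "1·11", "1·72", "1·01", "1·7", "1·1", "1·71"]
ordered = sorted(labels, key=pm_key)
print(ordered)
```

Sorting the shuffled labels this way recovers the order in which they are listed above for ∗1.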

The evolution of this selection of primitive symbols out of Peano's symbolism is traced in Elkind and Zach (2023).
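To see why the Sheffer stroke mentioned in the list above can serve as the one primitive connective, the following Python sketch defines the other connectives from “not both” and checks them against their truth tables. The definitions used are the standard textbook ones and are an assumption here; they are not claimed to be the exact forms adopted in the Second Edition, and the function names are illustrative.

```python
from itertools import product


def stroke(p: bool, q: bool) -> bool:
    """Sheffer stroke: 'not both p and q'."""
    return not (p and q)


# Standard definitions of the other connectives from the stroke alone.
def neg(p):
    return stroke(p, p)


def disj(p, q):
    return stroke(stroke(p, p), stroke(q, q))


def conj(p, q):
    return stroke(stroke(p, q), stroke(p, q))


def impl(p, q):
    return stroke(p, stroke(q, q))


# Compare each defined connective with its usual truth table.
all_agree = all(
    neg(p) == (not p)
    and disj(p, q) == (p or q)
    and conj(p, q) == (p and q)
    and impl(p, q) == ((not p) or q)
    for p, q in product([False, True], repeat=2)
)
print(all_agree)
```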

3. The Use of Dots in PM

An immediate obstacle to reading PM is the unfamiliar use of dots for punctuation, instead of the more common parentheses and brackets. The system is precise, and can be learned with just a little practice. The use of dots for punctuation is not unique to PM. Originating with Peano, it was later used in works by Alonzo Church, W.V.O. Quine, and others, but it has now largely disappeared. Alan Turing made a study of the use of dots from a computational point of view in 1942, presumably in his spare time after a day’s work at Bletchley Park breaking the codes of the Enigma Machine. Turing suggests that the use of juxtaposition to indicate conjunction is similar to the use of juxtaposition in arithmetic to indicate multiplication:

In most systems there is some operation which is described simply by juxtaposition, without any special operator. In Church’s system this is the application of a function to its argument; in Russell’s it is conjunction and in algebra it is multiplication. (Turing 1942, 151)

In his earlier work, such as The Principles of Mathematics, from 1903, Russell followed Peano’s practice of indicating conjunction by simply juxtaposing formulas. Thus the conjunction of \(p\) and \(q\) was written \(p q\). Russell began to use the punctuation dot for conjunction by 1905. The use of dots for punctuation in logic is now only of historical interest, although some textbooks use a raised dot \(p \cdot q\) for conjunction. Below we will explain the dual use of dots for punctuation and conjunction in PM.

The best way to learn to use the dot notation is to look at a few samples which are translated to formulae using parentheses, and thus to get the feel for it. What follows is an explanation as presented in PM, pages 9–10, followed by a number of examples which illustrate each of its clauses:

The use of dots. Dots on the line of the symbols have two uses, one to bracket off propositions, the other to indicate the logical product of two propositions. Dots immediately preceded or followed by “\(\lor\)” or “\(\supset\)” or “\(\equiv\)” or “\(\vdash\)”, or by “\((x)\)”, “\((x,y)\)”, “\((x,y,z)\)” … or “\((\exists x)\)”, “\((\exists x,y)\)”, “\((\exists x,y,z)\)” … or “\([(\atoi x)(\phi x)]\)” or “\([R‘y]\)” or analogous expressions, serve to bracket off a proposition; dots occurring otherwise serve to mark a logical product. The general principle is that a larger number of dots indicates an outside bracket, a smaller number indicates an inside bracket. The exact rule as to the scope of the bracket indicated by dots is arrived at by dividing the occurrences of dots into three groups which we will name I, II, and III. Group I consists of dots adjoining a sign of implication \((\supset)\) or equivalence \((\equiv)\) or of disjunction \((\lor)\) or of equality by definition \((=\Df)\). Group II consists of dots following brackets indicative of an apparent variable, such as \((x)\) or \((x,y)\) or \((\exists x)\) or \((\exists x,y)\) or \([(\atoi x)(\phi x)]\) or analogous expressions. Group III consists of dots which stand between propositions in order to indicate a logical product. Group I is of greater force than Group II, and Group II than Group III. The scope of the bracket indicated by any collection of dots extends backwards or forwards beyond any smaller number of dots, or any equal number from a group of less force, until we reach either the end of the asserted proposition or a greater number of dots or an equal number belonging to a group of equal or superior force. Dots indicating a logical product have a scope which works both backwards and forwards; other dots only work away from the adjacent sign of disjunction, implication, or equivalence, or forward from the adjacent symbol of one of the other kinds enumerated in Group II. Some examples will serve to illustrate the use of dots.
(PM, 9–10)

For a deeper discussion of this passage on dot notation, see the supplement on:

The Use of Dots for Punctuation and For Conjunction.

3.1 Some Basic Examples

Consider the following series of extended examples, in which we examine propositions in PM and then discuss how to translate them step by step into modern notation. (Symbols below are sometimes used as names for themselves, thus avoiding some otherwise needed quotation marks. Russell is often accused of confusing use and mention, so there may well be some danger in this practice.)

Example 1

\[\tag*{∗1·2}{\vdash} \colon p \lor p \ldot {\supset} \ldot p \quad\Pp\]

This is the second assertion of “star” 1. It is in fact an axiom or “Primitive Proposition” as indicated by the “\(\Pp\)”. That this is an assertion (axiom or theorem) and not a definition is indicated by the use of “\(\vdash\)”. (By contrast, a definition would omit the assertion sign but conclude with a “\(\Df\)” sign.) Now the first step in the process of translating ∗1·2 into modern notation is to note the colon. Recall, from the above quoted passage, that “a larger number of dots indicates an outside bracket, a smaller number indicates an inside bracket”. Thus, the colon here (which consists of a larger number of dots than the single dots occurring on the line in ∗1·2) represents an outside bracket. The brackets “[” and “]” represent the colon in ∗1·2. The scope of the colon thus extends past any smaller number of dots (i.e., one dot) to the end of the formula. Since formulas are read from left to right, the expression “past” means “to the right of”.

So, the first step is to translate ∗1·2 to:

\[\vdash[ p \lor p \ldot {\supset} \ldot p]\]

Next, the dots around the “\(\supset\)” are represented in modern notation by the parentheses around the antecedent and consequent. Recall, in the above passage, we find “… dots only work away from the adjacent sign of disjunction, implication, or equivalence …”. Thus, the next step in the translation process is to move to the formula: \[\vdash [(p \lor p) \supset(p)]\]

Finally, standard modern conventions allow us to delete the outer brackets and the parentheses around single letters, yielding:

\[\vdash(p \lor p) \supset p\]

Our next example involves conjunction:

Example 2

\[ \tag*{∗3·01} p \sdot q \ldot {=} \ldot \osim(\osim p \lor \osim q) \quad\Df\]

The dual use of dots to “indicate” conjunction and punctuation can be understood through a careful examination of the details of the paragraph on the use of dots from pages 9 to 11 of PM. Try reading the dots as punctuation first, and then, if that won’t work, those dots must indicate conjunction.

∗3·01 defines the use of dots to indicate conjunction. (That first dot, when read as punctuation, would extend until an equal number of dots, namely the dot before the = sign, yielding the incoherent expression: “\(p ( q) =_\mathit{df} ( \osim (\osim p \lor \osim q) )\)”. It must, therefore, indicate a conjunction.) The dots around a sign of equality by definition \(=_\mathit{df}\) are in Group I, and so the parentheses that replace them extend to the ends of the expression:

\[(p \: .\: q) =_\mathit{df} ( \osim (\osim p \lor \osim q) )\] Then, we delete the outer parentheses on the right and left as unnecessary for interpreting the formula, so we have: \[p . q =_\mathit{df} \osim (\osim p \lor \osim q)\] or, in a modern notation: \[ p \amp q =_\mathit{df} \osim (\osim p \lor \osim q) \]

Notice that the scope of the negation sign “\(\osim\)” in ∗3·01 is not indicated with dots, even in the PM system, but rather uses parentheses.

Example 3

\[\tag*{∗9·01} \osim \{(x) \sdot \phi x\} \ldot {=} \ldot (\exists x) \sdot \osim \phi x \quad\Df\]

We apply the rule “dots only work away from the adjacent sign of disjunction, implication, or equivalence, or forward from the adjacent symbol of one of the other kinds enumerated in Group II” (where Group II includes “\((\exists x)\)”). In this case the first dot extends to the punctuation symbol \(\}\), which is allowed optionally to replace dots. No such punctuation after the quantifier (or after the negation) occurs in the modern equivalent, which would be: \[\osim (x)\phi x =_\mathit{df} (\exists x)\osim \phi x\] or \[\osim \forall x\phi x =_\mathit{df} \exists x\osim \phi x\]

The ranking of connectives in terms of relative “force”, or scope, is a standard convention in contemporary logic. If there are no explicit parentheses to indicate the scope of a connective, those which have precedence in the ranking are presumed to be the principal connective, and so on for subformulas. Thus, instead of formulating the following De Morgan’s law as the cumbersome:

\[[(\osim p) \lor (\osim q)] \equiv[\osim (p \amp q)]\]

we nowadays write it as:

\[\osim p \lor \osim q \equiv\osim (p \amp q)\]

This simpler formulation follows from the convention that \(\equiv\) has wider scope than \(\lor\) and \(\amp\), and the latter have wider scope than \(\osim\). Indeed parentheses are often unneeded around \(\equiv\), given a further convention on which \(\equiv\) has wider scope than \(\supset\). Thus, the formula \(p \supset q \equiv \osim p \lor q\) becomes unambiguous. We might represent these conventions by listing the connectives in groups with those with widest scope at the top:

\[\begin{array}{c}\equiv \\\supset \\\amp, \lor \\\osim \end{array}\]
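The contemporary ranking just listed can be turned into a small printer that drops every parenthesis the convention makes redundant. This is a sketch under stated assumptions: formulas are represented as nested Python tuples, the names `RANK` and `show` are hypothetical, and a subformula whose main connective has the same rank as its parent is always parenthesized to stay on the safe side.

```python
# Wider scope = higher rank, following the table above.
RANK = {"≡": 4, "⊃": 3, "&": 2, "∨": 2, "~": 1}


def show(formula) -> str:
    """Render a formula, omitting parentheses made redundant by RANK."""
    if isinstance(formula, str):          # a propositional letter
        return formula
    op = formula[0]

    def child(sub) -> str:
        text = show(sub)
        # Parentheses are dropped only for letters and for subformulas
        # whose main connective has strictly narrower scope than `op`.
        if isinstance(sub, str) or RANK[sub[0]] < RANK[op]:
            return text
        return f"({text})"

    if op == "~":
        return "~" + child(formula[1])
    return f"{child(formula[1])} {op} {child(formula[2])}"


de_morgan = ("≡", ("∨", ("~", "p"), ("~", "q")), ("~", ("&", "p", "q")))
conditional = ("≡", ("⊃", "p", "q"), ("∨", ("~", "p"), "q"))
print(show(de_morgan))    # ~p ∨ ~q ≡ ~(p & q)
print(show(conditional))  # p ⊃ q ≡ ~p ∨ q
```

Both outputs agree with the parenthesis-free formulas given in the text for De Morgan’s law and for the conditional.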

For Whitehead and Russell, however, the symbols \(\supset\), \(\equiv\), \(\lor\) and \(\ldots =\ldots \Df\), in Group I, are of equal force. Group II consists of the variable binding expressions, quantifiers and scope indicators for definite descriptions, and Group III consists of conjunctions. Negation is below all of these. So the ranking in PM would be:

\[\begin{array}{c}\supset, \equiv, \lor \text{ and } \ldots = \ldots \quad\Df \\(x), (x,y) \ldots (\exists x), (\exists x,y) \ldots [(\atoi x)\phi x] \\p \sdot q \\\osim \end{array}\]

This is what Whitehead and Russell seem to mean when they say “Group I is of greater force than Group II, and Group II than Group III.” Consider the following:

Example 4

\[\tag*{∗3·12} {\vdash} \colon \osim p \ldot {\lor} \ldot \osim q \ldot {\lor} \ldot p \sdot q\]

This theorem illustrates how to read multiple uses of the same number of dots within one formula. Grouping “associates to the left” both for dots and for a series of disjunctions, following the convention of reading from left to right and the definition:

\[\tag*{∗2·33} p \vee q \vee r \ldot {=} \ldot (p \vee q) \vee r \quad\Df\]

In ∗3·12, the first two dots around the \(\lor\) simply “work away” from the connective. The second “extends” until it meets with the next of the same number (the third single dot). That third dot, and the fourth, “work away” from the second \(\lor\), and the final dot indicates a conjunction with least force. The result, formulated with all possible punctuation for maximum explicitness, is:

\[\{[(\osim p) \lor (\osim q)] \lor (p \amp q)\}\]

If we employ all the standard conventions for dropping parentheses, this becomes:

\[(\osim p \lor \osim q) \lor (p \amp q)\]

This illustrates the passage in the above quotation which says “The scope of the bracket indicated by any collection of dots extends backwards or forwards beyond any smaller number of dots, or any equal number from a group of less force, until we reach either the end of the asserted proposition or a greater number of dots or an equal number belonging to a group of equal or superior force.”

Before we look at a wider range of examples, a detailed example involving quantified variables will prove to be instructive. Whitehead and Russell follow Peano’s practice of expressing universally quantified conditionals (such as “All \(\phi\)s are \(\psi\)s”) with the bound variable subscripted under the conditional sign. Similarly with universally quantified biconditionals (“All and only \(\phi\)s are \(\psi\)s”). That is, the expressions “\(\phi x \supset_x \psi x\)” and “\(\phi x \equiv_x \psi x\)” are defined as follows:

\[\tag*{∗10·02} \phi x \supset_x \psi x \ldot {=} \ldot (x) \ldot \phi x \supset \psi x \quad\Df\] \[\tag*{∗10·03} \phi x \equiv_x \psi x \ldot {=} \ldot (x) \ldot \phi x \equiv \psi x \quad\Df \]

and correspond to the following more modern formulas, respectively:

\[\forall x(\phi x \supset \psi x)\] \[\forall x(\phi x \equiv \psi x)\]

As an exercise the reader might be inclined to formulate a rigorous algorithm for converting PM into a particular contemporary symbolism (with conventions for dropping parentheses), but the best way to learn the system is to look over a few more examples of translations, and then simply begin to read formulae directly.
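Short of the rigorous algorithm just mentioned, the interplay of dot counts and groups can be sketched in Python for a restricted fragment. Everything here is an assumption made for illustration: the formula is tokenized by hand, only Group I connectives and conjunction dots (Group III) are handled, negated letters are treated as atoms, the dots following the assertion sign are omitted, and the principal connective is simply taken to be the one with the most adjacent dots (Group I winning ties against conjunction, and the rightmost winning remaining ties so that grouping associates to the left). It is a simplification of the rule quoted from PM, not a faithful implementation of it.

```python
from typing import List, Tuple, Union

# A connective token is (symbol, dots on its left, dots on its right).
# Conjunction dots are written ("&", n, n). Atoms like "p" or "~p" are strings.
Connective = Tuple[str, int, int]
Token = Union[str, Connective]

# Group I connectives outrank conjunction (Group III) at an equal dot count.
GROUP_RANK = {"∨": 1, "⊃": 1, "≡": 1, "&": 0}


def strength(tok: Connective) -> Tuple[int, int]:
    symbol, left, right = tok
    return (max(left, right), GROUP_RANK[symbol])


def translate(tokens: List[Token]) -> str:
    """Fully parenthesize an alternating atom/connective token list."""
    if len(tokens) == 1:
        return tokens[0]
    # Principal connective: most dots, then higher group; remaining ties go
    # to the rightmost candidate, so grouping associates to the left.
    main = max(range(1, len(tokens), 2),
               key=lambda i: (strength(tokens[i]), i))

    def wrap(part: List[Token]) -> str:
        text = translate(part)
        return text if len(part) == 1 else f"({text})"

    return f"{wrap(tokens[:main])} {tokens[main][0]} {wrap(tokens[main + 1:])}"


# ∗3·12:  ~p .∨. ~q .∨. p . q
STAR_3_12 = ["~p", ("∨", 1, 1), "~q", ("∨", 1, 1), "p", ("&", 1, 1), "q"]
# ∗1·6:   q ⊃ r .⊃: p ∨ q .⊃. p ∨ r
STAR_1_6 = ["q", ("⊃", 0, 0), "r", ("⊃", 1, 2), "p", ("∨", 0, 0), "q",
            ("⊃", 1, 1), "p", ("∨", 0, 0), "r"]

print(translate(STAR_3_12))  # (~p ∨ ~q) ∨ (p & q)
print(translate(STAR_1_6))   # (q ⊃ r) ⊃ ((p ∨ q) ⊃ (p ∨ r))
```

The first output matches the translation of Example 4. Quantifier dots (Group II) and the optional braces of ∗9·01 are outside the scope of this sketch.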

3.2 More Examples

In the examples below, each formula number is followed first by Principia notation and then its modern translation. Notice that in ∗1·5 parentheses are used for punctuation in addition to dots. (Primitive Propositions ∗1·2, ∗1·3, ∗1·4, ∗1·5, and ∗1·6 together constitute the axioms for propositional logic in PM.) Proposition ∗1·5 was shown to be redundant by Paul Bernays in 1926. It can be derived from appropriate instances of the others and the rule of modus ponens.

∗1·3	\({\vdash} \colon q \ldot {\supset} \ldot p \lor q \quad\Pp\) \(q \supset p \lor q\)
∗1·4	\({\vdash} \colon p \lor q \ldot {\supset} \ldot q \lor p\quad\Pp\) \(p \lor q \supset q \lor p\)
∗1·5	\({\vdash} \colon p \lor (q \lor r ) \ldot {\supset} \ldot q\lor (p \lor r ) \quad\Pp\) \(p \lor (q \lor r ) \supset q \lor (p \lor r )\)
∗1·6	\({\vdash} \colondot q \supset r \ldot {\supset} \colon p \lor q\ldot {\supset} \ldot p \lor r \quad\Pp\) \((q \supset r ) \supset(p \lor q \supset p \lor r )\)
∗2·03	\({\vdash} \colon p \supset \osim q \ldot {\supset} \ldot q\supset\osim p\) \((p \supset\osim q) \supset(q \supset\osim p)\)
∗3·3	\({\vdash} \colondot p \sdot q \ldot {\supset} \ldot r \colon{\supset} \colon p \ldot {\supset} \ldot q \supset r\) \([(p \amp q) \supset r] \supset [p \supset(q \supset r)]\)
∗4·15	\({\vdash} \colondot p \sdot q \ldot {\supset} \ldot \osim r\colon {\equiv} \colon q \sdot r \ldot {\supset} \ldot \osim p\) \([(p \amp q )\supset\osim r ] \equiv [(q \amp r )\supset\osim p ]\)

∗5·71	\({\vdash} \colondot q \supset\osim r \ldot {\supset} \colon p\lor q \sdot r \ldot {\equiv} \ldot p \sdot r\) \((q \supset\osim r) \supset \{ [(p \lor q) \amp r ] \equiv (p \amp r)\}\)
∗9·04	\(p \ldot {\lor} \ldot (x) \ldot \phi x \colon {=} \ldot (x)\ldot \phi x \lor p \quad\Df\) \(p \lor \forall x\phi x =_\mathit{df} \forall x(\phi x \lor p)\)

∗9·521	\({\vdash} \colons (\exists x) \ldot \phi x \ldot {\supset}\ldot q \colon {\supset} \colondot (\exists x) \ldot \phi x \ldot{\lor} \ldot r \colon {\supset} \ldot q \lor r\) \([(\exists x\phi x) \supset q] \supset [((\exists x\phi x) \lor r)\supset (q \lor r)]\)
∗10·55	\({\vdash} \colondot (\exists x) \ldot \phi x \sdot \psi x\colon \phi x \supset_x \psi x \colon {\equiv} \colon (\exists x)\ldot \phi x \colon \phi x \supset_x \psi x\) \(\exists x(\phi x \amp \psi x) \amp \forall x(\phi x \supset \psi x)\equiv \exists x\phi x \amp \forall x(\phi x \supset \psi x)\)

Notice that there are two uses of double dots “:” in ∗10·55 to indicate conjunctions.
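As a check on the modern translations in the list above, the propositional formulas can be tested for validity by brute force over truth values. The sketch below is a plain truth-table test in Python; the dictionary keys and helper names are illustrative assumptions, and the check says nothing about how the theorems are derived in PM.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    return (not a) or b


# Modern translations of the primitive propositions and two theorems.
FORMULAS = {
    "*1.2": lambda p, q, r: implies(p or p, p),
    "*1.3": lambda p, q, r: implies(q, p or q),
    "*1.4": lambda p, q, r: implies(p or q, q or p),
    "*1.5": lambda p, q, r: implies(p or (q or r), q or (p or r)),
    "*1.6": lambda p, q, r: implies(implies(q, r), implies(p or q, p or r)),
    "*4.15": lambda p, q, r: implies(p and q, not r) == implies(q and r, not p),
    "*5.71": lambda p, q, r: implies(implies(q, not r),
                                     ((p or q) and r) == (p and r)),
}


def is_tautology(formula) -> bool:
    """True if the formula holds under all eight assignments to p, q, r."""
    return all(formula(*values) for values in product([False, True], repeat=3))


results = {name: is_tautology(f) for name, f in FORMULAS.items()}
print(results)
```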

4. Propositional Functions

There are two kinds of functions in PM. Propositional functions such as “\(\hat{x}\) is a natural number” are to be distinguished from the more familiar mathematical functions, which are called “descriptive functions” (PM, Chapter \(\ast\)31). Descriptive functions are defined using relations and definite descriptions. Examples of descriptive functions are \(x + y\) and “the successor of \(n\)”.

Focusing on propositional functions, Whitehead and Russell distinguish between expressions with a free variable (such as “\(x\) is hurt”) and names of functions (such as “\(\hat{x}\) is hurt”) (PM, 14–15). The propositions which result from the formula by assigning allowable values to the free variable “\(x\)” are said to be the “ambiguous values” of the function. Expressions using the circumflex notation, such as \(\phi \hat{x}\), only occur in the introductory material in the technical sections of PM and not in the technical sections themselves (with the exception of the sections on the theory of classes), prompting some scholars to say that such expressions do not really occur in the formal system of PM. This issue is distinct from that surrounding the interpretation of such symbols. Are they “term-forming operators” which turn an open formula into a name for a function, or simply a syntactic device, a placeholder, for indicating the variable for which a substitution can be made in an open formula? If they are to be treated as term-forming operators, the modern notation for \(\phi \hat{x}\) would be \(\lambda x\phi x\). The \(\lambda\)-notation has the advantage of clearly revealing that the variable \(x\) is bound by the term-forming operator \(\lambda\), which takes a predicate \(\phi\) and yields a term \(\lambda x\phi x\) (which in some logics is a singular term that can occur in the subject position of a sentence, while in other logics is a complex predicative expression). Unlike \(\lambda\)-notation, the PM notation using the circumflex cannot indicate scope. The function expression “\(\phi(\hat{x},\hat{y})\)” is ambiguous between \(\lambda x\lambda y\phi xy\) and \(\lambda y\lambda x\phi xy\), without some further convention. Indeed, Whitehead and Russell specified this convention for relations in extension (on p. 200 in the introductory material of ∗21, in terms of the order of the variables), but the ambiguity is brought out most clearly by using \(\lambda\) notation: the first denotes the relation of being an \(x\) and \(y\) such that \(\phi xy\) and the second denotes the converse relation of being a \(y\) and \(x\) such that \(\phi xy\).
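The difference between \(\lambda x\lambda y\phi xy\) and \(\lambda y\lambda x\phi xy\) can be made concrete with Python’s own `lambda`. In this sketch the choice of “less than” for \(\phi\) is purely an assumption for illustration; the point is only that the two bindings yield converse relations when applied to the same arguments in the same order.

```python
# A sample two-place propositional function; "less than" is an arbitrary
# stand-in for φ, chosen only because it is not symmetric.
def phi(x, y):
    return x < y


# λxλy φxy: the first argument supplied fills the x-place.
relation = lambda x: lambda y: phi(x, y)
# λyλx φxy: the first argument supplied fills the y-place instead.
converse = lambda y: lambda x: phi(x, y)

print(relation(1)(2))  # phi(1, 2)
print(converse(1)(2))  # phi(2, 1)
```

Applied to 1 and then 2, the first term evaluates \(\phi\) at (1, 2) and the second at (2, 1), so they disagree whenever \(\phi\) is not symmetric.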

5. The Missing Notation for Types and Orders

This section explains notation that is not in Principia Mathematica. Except for some notation for “relative” types in \(\ast 63\), and again in early parts of Volume II, there are famously no symbols for types in Principia Mathematica! Sentences are generally to be taken as “typically ambiguous” and so standing for expressions of a whole range of types, and so, just as there are no individual or predicate constants, there are no particular functions of any specific type. So not only does one not see how to symbolize the argument:

All men are mortal
Socrates is a man
Therefore, Socrates is mortal

but also there is no indication of the logical type of the function “\(\hat{x}\) is mortal”. The project of PM is to reduce mathematics to logic, and part of the view of logic behind this project is that logical truths are all completely general. The derivation of truths of mathematics from definitions and truths of logic will thus not involve any particular constants other than those introduced by definition from purely logical notions. As a result no notation is included in PM for describing those types. Those of us who wish to consider PM as a logic which can be applied must supplement it with some indication of types.

Readers should note that the explanation of types outlined below is not going to correspond with the statements about types in the text of PM. Alonzo Church [1976] developed a simple, rational reconstruction of the notation for both the simple and ramified theory of types as implied by the text of PM. (There are alternative, equivalent notations for the theory of types.) The full theory can be seen as a development of the simple theory of types.

5.1 Simple Types

A definition of the simple types can be given as follows:

\(\iota\) (Greek iota) is the type for an individual.
Where \(\tau_1,\ldots,\tau_n\) are any types, then \(\ulcorner(\tau_1,\ldots,\tau_n)\urcorner\) is the type of a propositional function whose arguments are of types \(\tau_1,\ldots,\tau_n\), respectively.
\(\ulcorner\)( )\(\urcorner\) is the type of propositions.

Here are some intuitive ways to understand the definition of type. Suppose that “Socrates” names an individual. (We are here ignoring Russell’s considered opinion that such ordinary individuals are in fact classes of classes of sense data, and so of a much higher type.) Then the individual constant “Socrates” would be of type \(\iota\). A monadic propositional function which takes individuals as arguments is of type \((\iota)\). Suppose that “is mortal” is a predicate expressing a property of individuals. The function “\(\hat{x}\) is mortal” will be of the type \((\iota)\). A two-place or binary relation between individuals is of type \((\iota,\iota)\). Thus, a relation expression like “parent of” and the function “\(\hat{x}\) is a parent of \(\hat{z}\)” will be of type \((\iota,\iota)\).

Propositional functions of type \((\iota)\) are often called “first order”; hence the name “first order logic” for the familiar logic where the variables only range over arguments of first order functions. Monadic functions of arguments of type \(\tau\) are of type \((\tau)\), and so functions of such functions are of type \(((\tau))\). “Second order logic” will have variables for the arguments of such functions (as well as variables for individuals). Binary relations between functions of type \(\tau\) are of type \((\tau,\tau)\), and so on, for relations having more than two arguments. Mixed types are defined by the above. A relation between an individual and a proposition (such as “\(\hat{x}\) believes that \(\hat{P}\)”) will be of type \((\iota,(\ ))\).
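The recursive definition of simple types given above lends itself to a small data-structure sketch. The class names `Individual` and `Function` and the constants below are hypothetical, introduced only to mirror the three clauses of the definition; the type of propositions is modelled as a function type with no arguments.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Individual:
    """The type ι of individuals."""

    def __str__(self) -> str:
        return "ι"


@dataclass(frozen=True)
class Function:
    """The type (τ1,...,τn) of a propositional function; () if n = 0."""

    args: Tuple["SimpleType", ...] = ()

    def __str__(self) -> str:
        return "(" + ",".join(str(a) for a in self.args) + ")"


SimpleType = Union[Individual, Function]

IOTA = Individual()
PROPOSITION = Function()                  # the type ( ) of propositions
IS_MORTAL = Function((IOTA,))             # "x̂ is mortal"
PARENT_OF = Function((IOTA, IOTA))        # "x̂ is a parent of ẑ"
BELIEVES = Function((IOTA, PROPOSITION))  # "x̂ believes that P̂"

for t in (IOTA, IS_MORTAL, PARENT_OF, BELIEVES, Function((IS_MORTAL,))):
    print(t)
```

The printed forms correspond to the types \(\iota\), \((\iota)\), \((\iota,\iota)\), \((\iota,(\ ))\) and \(((\iota))\) discussed in the text.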

5.2 Ramified Types

To construct a notation for the full ramified theory of types of PM, another piece of information must be encoded in the symbols. Church calls the resulting system one of r-types. The key idea of ramified types is that any function defined using quantification over functions of some given type has to be of a higher “order” than those functions. To use Russell’s example:

\(\hat{x}\) has all the qualities that great generals have

is a function true of persons (i.e., individuals), and from the point of view of simple type theory, it has the same simple logical type as particular qualities of individuals (such as bravery and decisiveness). However, in ramified type theory, the above function will be of a higher order than those particular qualities of individuals, since unlike those particular qualities, it involves a quantification over those qualities. So, whereas the expression “\(\hat{x}\) is brave” denotes a function of r-type \((\iota)/1\), the expression “\(\hat{x}\) has all the qualities that great generals have” will have r-type \((\iota)/2\). In these r-types, the number after the “/” indicates the level of the function. The order of the functions will be defined and computed given the following definitions.

Church defines the r-types as follows:

\(\iota\) (Greek iota) is the r-type for anindividual.
Where \(\tau_1,\ldots,\tau_m\) are any r-types and \( n \) is apositive integer, \(\ulcorner(\tau_1,\ldots,\tau_m)/n\urcorner\) is anr-type; this is the r-type of a \(m\)-ary propositional function oflevel \(n\), which has arguments of r-types\(\tau_1,\ldots,\tau_m\).

Theorder of an entity is defined as follows (here we nolonger follow Church, for he defines orders for variables, i.e.,expressions, instead of orders for the things the variables rangeover):

the order of an individual (of r-type \(\iota)\) is 0,
the order of a function of r-type \((\tau_1,\ldots,\tau_m)/n\) is\(n+N\), where \(N\) is the greatest of the order of the arguments\(\tau_1,\ldots,\tau_m\).

These two definitions are supplemented with a principle whichidentifies the levels of particular defined functions, namely, thatthe level of a defined function should be one higher than the highestorder entity having a name or variable that appears in the definitionof that function.

To see how these definitions and principles can be used to compute theorder of the function “\(\hat{x}\) has all the qualities thatgreat generals have”, note that the function can be representedas follows, where “\(x, y\)” are variables ranging overindividuals of r-type \(\iota\) (order 0),“GreatGeneral\((y)\)” is a predicate denoting apropositional function of r-type \((\iota)/1\) (and so of order 1),and “\(\phi\)” is a variable ranging over propositionalfunctions of r-type \((\iota)/1\) (and so of order 1) such asgreat general,bravery,leadership,skill,foresight, etc.:

\[(\phi)\{[(y)(\textrm{GreatGeneral}(y) \supset \phi(y))] \supset \phi \hat{x} \}\]

We first note that given the above principle, the r-type of this function is \((\iota)/2\); the level is 2 because the level of the r-type of this function has to be one higher than the highest order of any entity named (or in the range of a variable used) in the definition. In this case, the denotation of GreatGeneral, and the range of the variable “\(\phi\)”, is of order 1, and no other expression names or ranges over an entity of higher order. Thus, the level of the function named above is defined to be 2. Finally, we compute the order of the function denoted above as it was defined: the sum of the level plus the greatest of the orders of the arguments of the above function. Since the only arguments in the above function are individuals (of order 0), the order of our function is just 2.

Quantifying over functions of r-type \((\tau)/n\) of order \(k\) in a definition of a new function yields a function of r-type \((\tau)/(n+1)\), and so a function of order one higher, \(k+1\). Two kinds of functions, then, can be of the second order: (1) functions of first-order functions of individuals, of r-type \(((\iota)/1)/1\), and (2) functions of r-type \((\iota)/2\), such as our example “\(\hat{x}\) has all the qualities that great generals have”. This latter will be a function true of individuals such as Napoleon, but of a higher order than simple functions such as “\(\hat{x}\) is brave”, which are of r-type \((\iota)/1\).
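The two definitions above (r-types and orders) can be checked mechanically. The following is a minimal sketch in Python, not anything found in PM or Church; the class names `Individual` and `FunctionType` and the helper `order` are hypothetical names introduced only for illustration, and the sketch assumes every function type has at least one argument.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Individual:
    """The r-type iota, the type of individuals (illustrative name)."""


@dataclass(frozen=True)
class FunctionType:
    """The r-type (tau_1, ..., tau_m)/n of an m-ary function of level n."""
    args: Tuple["RType", ...]
    level: int


RType = Union[Individual, FunctionType]


def order(t: RType) -> int:
    """Order of an entity of r-type t: 0 for individuals, otherwise n + N,
    where N is the greatest order among the argument types."""
    if isinstance(t, Individual):
        return 0
    return t.level + max(order(a) for a in t.args)


IOTA = Individual()
brave = FunctionType((IOTA,), 1)                   # (iota)/1
all_general_qualities = FunctionType((IOTA,), 2)   # (iota)/2
function_of_qualities = FunctionType((brave,), 1)  # ((iota)/1)/1

print(order(brave), order(all_general_qualities), order(function_of_qualities))
```

On this sketch both \((\iota)/2\) and \(((\iota)/1)/1\) come out as order 2, matching the two kinds of second-order function distinguished above, while \((\iota)/1\) comes out as order 1.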

Logicians today use a different notion of “order”. Today, first-order logic is a logic with only variables for individuals. Second-order logic is a logic with variables for both individuals and properties of individuals. Third-order logic is a logic with variables for individuals, properties of individuals, and properties of properties of individuals. And so forth. By contrast, Church would call these logics, respectively, the logic of functions of the types \((\iota)/1\) and \((\iota,\ldots,\iota)/1\), the logic of functions of the types \(((\iota)/1)/1\) and \(((\iota,\ldots,\iota)/1,\ldots,(\iota,\ldots,\iota)/1)/1\), and the logic of functions of the types \((((\iota)/1)/1)/1\) etc. (i.e., the level-one functions of the functions of the preceding type). Given Church’s definitions, these are logics of first-, second- and third-order functions, respectively, thus coinciding with the modern terminology of “\(n\)th-order logic”.

6. Variables

As mentioned previously, there are no individual or predicate constants in the formal system of PM, only variables. The Introduction, however, makes use of the example “\(a\) standing in the relation \(R\) to \(b\)” in a discussion of atomic facts (PM, 43). Although “\(R\)” is later used as a variable that ranges over relations in extension, and “\(a,b,c,\ldots\)” are individual variables, let us temporarily add them to the system as predicate and individual constants, respectively, in order to discuss the use of variables in PM.

PM makes special use of the distinction between “real”, or free, variables and “apparent”, or bound, variables. Since “\(x\)” is a variable, “\(xRy\)” will be an atomic formula in our extended language, with “\(x\)” and “\(y\)” real variables. When such formulae are combined with the propositional connectives \(\osim\), \(\lor\), etc., the result is a matrix. For example, “\(aRx \ldot {\lor} \ldot xRy\)” would be a matrix.

As we saw earlier, there are also variables which range over functions: “\(\phi\), \(\psi\), \(\ldots,f, g\)”, etc. The expression “\(\phi x\)” thus contains two variables and stands for a proposition, in particular, the result of applying the function \(\phi\) to the individual \(x\).

Theorems are stated with real variables, which gives them a special significance with regard to the theory. For example,

\[\tag*{∗10·1} \vdash \colon (x) \ldot \phi x \ldot {\supset} \ldot \phi y \quad\Pp\]

is a fundamental axiom of the quantificational theory of PM. In this Primitive Proposition the variables “\(\phi\)” and “\(y\)” are real (free), and the “\(x\)” is apparent (bound). As there are no constants in the system, this is the closest that PM comes to a rule of universal instantiation.

Whitehead and Russell interpret “\((x) \sdot \phi x\)” as “the proposition which asserts all the values for \(\phi \hat{x}\)” (PM 41). The use of the word “all” has special significance within the theory of types. They present the vicious circle principle, which underlies the theory of types:

… generally, given any set of objects such that, if we suppose the set to have a total, it will contain members which presuppose this total, then such a set cannot have a total. By saying that the set has “no total”, we mean, primarily, that no significant statement can be made about “all its members”. (PM, 37)

Specifically, then, a quantified expression, since it talks about (presupposes) “all” the members of a totality, must express a member of a different, higher, logical type than those members in order to observe the vicious circle principle. Thus, when interpreting a bound variable, we must assume that it ranges over a specific type of entity, and so types must be assigned to the other entities represented by expressions in the formula, in accordance with the theory of types.

A question arises, however, once one realizes that the statements of primitive propositions and theorems in PM such as ∗10·1 are taken to be “typically ambiguous” (i.e., ambiguous with respect to type). These statements are actually schematic and represent all the possible specific assertions which can be derived from them by interpreting types appropriately. But if statements like ∗10·1 are schemata and yet have bound variables, how do we assign types to the entities over which the bound variables range? The answer is to first decide which type of thing the free variables in the statement range over. For example, assuming that the variable \(y\) in ∗10·1 ranges over individuals (of type \(\iota\)), then the variable \(\phi\) must range over functions of type \((\iota)/n\), for some \(n\). Then the bound variable \(x\) will also range over individuals. If, however, we assume that the variable \(y\) in ∗10·1 ranges over functions of type \((\iota)/1\), then the variable \(\phi\) must range over functions of type \(((\iota)/1)/m\), for some \(m\). In this case, the bound variable \(x\) will range over functions of type \((\iota)/1\).

So \(y\) and \(\phi\) are called “real” variables in ∗10·1 not only because they are free but also because they can range over any type. Whitehead and Russell frequently say that real variables are taken to ambiguously denote “any” of their instances, while bound variables (which also ambiguously denote) range over “all” of their instances (within a legitimate totality, i.e. type).

7. Predicative Functions and Identity

The exclamation mark “!” following a variable for a function and preceding the argument, as in “\(f\bang\hat{x}\)”, “\(\phi \bang x\)”, “\(\phi\bang\hat{x}\)”, indicates that the function is predicative, that is, of the lowest order which can apply to its arguments. In Church’s notation, this means that predicative functions are all of the first level, with types of the form \((\ldots)/1\). As a result, predicative functions will be of order one more than the highest order of any of their arguments. This analysis is based on quotations like the following, in the Introduction to PM:

We will define a function of one variable as predicative when it is of the next order above that of its argument, i.e., of the lowest order compatible with its having that argument. (PM, 53)

Unfortunately in the summary of ∗12, we find “A predicative function is one which contains no apparent variables, i.e., is a matrix” [PM, 167]. Reconciling this statement with that definition in the Introduction is a problem for scholars.

To see the shriek notation in action, consider the following definition of identity:

\[\tag*{∗13·01} x = y \ldot {=} \colon (\phi) \colon \phi \bang x \ldot {\supset} \ldot \phi \bang y \quad\Df \]

That is, \(x\) is identical with \(y\) if and only if \(y\) has every predicative function \(\phi\) which is possessed by \(x\). (Of course the second occurrence of “=” indicates a definition, and does not independently have meaning. It is the first occurrence, relating individuals \(x\) and \(y\), which is defined.)

To see how this definition reduces to the more familiar definition of identity (on which objects are identical iff they share the same properties), we need the Axiom of Reducibility. The Axiom of Reducibility states that for any function there is an equivalent function (i.e., one true of all the same arguments) which is predicative:

Axiom of Reducibility: \[\tag*{∗12·1} \vdash \colon (\exists f) \colon \phi x \ldot {\equiv_x} \ldot f\bang x \quad\Pp\]

To see how this axiom implies the more familiar definition of identity, note that the more familiar definition of identity is:

\[ x = y \ldot {=} \colon (\phi) \colon \phi x \ldot {\supset} \ldot \phi y \quad\Df\]

for \(\phi\) of “any” type. (Note that this differs from ∗13·01 in that the shriek no longer appears.) Now to prove this, assume both ∗13·01 and the Axiom of Reducibility, and suppose, for proof by reductio, that \(x = y\), and \(\phi x\), and not \(\phi y\), for some function \(\phi\) of arbitrary type. Then, the Axiom of Reducibility ∗12·1 guarantees that there will be a predicative function \(\psi \bang\), which is coextensive with \(\phi\), such that \(\psi \bang x\) but not \(\psi \bang y\). This contradicts ∗13·01, which, given \(x = y\), requires \(y\) to have every predicative function possessed by \(x\).

8. Definite Descriptions

The inverted Greek letter iota “\(\atoi\)” is used in PM, always followed by a variable, to begin a definite description. \((\atoi x) \phi x\) is read as “the \(x\) such that \(x\) is \(\phi\)”, or more simply, as “the \(\phi\)”. Such expressions may occur in subject position, as in \(\psi(\atoi x) \phi x\), read as “the \(\phi\) is \(\psi\)”. The formal part of Russell’s famous “theory of definite descriptions” consists of a definition of all formulas “…\(\psi(\atoi x) \phi x\)…” in which a description occurs. To distinguish the portion \(\psi\) from the rest of a larger sentence (indicated by the ellipses above) in which the expression \(\psi(\atoi x) \phi x\) occurs, the scope of the description is indicated by repeating the definite description within brackets:

\[[(\atoi x) \phi x] \sdot \psi(\atoi x) \phi x\]

The notion of scope is meant to explain a distinction which Russell famously discusses in “On Denoting” (1905). Russell says that the sentence “The present King of France is not bald” is ambiguous between two readings: (1) the reading where it says of the present King of France that he is not bald, and (2) the reading which denies that the present King of France is bald. The former reading requires that there be a unique King of France on the list of things that are not bald, whereas the latter simply says that there is not a unique King of France that appears on the list of bald things. Russell says the latter, but not the former, can be true in a circumstance in which there is no King of France. Russell analyzes this difference as a matter of the scope of the definite description, though as we shall see, some modern logicians tend to think of this situation as a matter of the scope of the negation sign. Thus, Russell introduces a method for indicating the scope of the definite description.

To see how Russell’s method of scope works for this case, we must understand the definition which introduces definite descriptions (i.e., the inverted iota operator). Whitehead and Russell define:

\[\tag*{∗14·01} [(\atoi x) \phi x] \sdot \psi(\atoi x) \phi x \ldot {=} \colon(\exists b) \colon \phi x \ldot {\equiv_x} \ldot x=b \colon \psi b\quad\Df \]

This kind of definition is called a contextual definition, which is to be contrasted with an explicit definition. An explicit definition of the definite description would have to look something like the following:

\[(\atoi x)(\phi x) = \colon \ldots \quad\Df\]

which would allow the definite description to be replaced in any context by whichever defining expression fills in the ellipsis. By contrast, ∗14·01 shows how a sentence in which there is an occurrence of a description \((\atoi x)(\phi x)\) in a context \(\psi\) can be replaced by some other sentence (involving \(\phi\) and \(\psi\)) which is equivalent. To develop an instance of this definition, start with the following example:

Example.
The present King of France is bald.

Using \(PKFx\) to represent the propositional function of being a present King of France and \(B\) to represent the propositional function of being bald, Whitehead and Russell would represent the above claim as:

\[[(\atoi x)(PKFx)] \sdot B(\atoi x)(PKFx)\]

which by ∗14·01 means:

\[(\exists b) \colon PKFx \ldot {\equiv_x} \ldot x=b \colon Bb\]

In words, there is one and only one \(b\) which is a present King of France and \(b\) is bald. In modern symbols, using \(b\) non-standardly, as a variable, this becomes:

\[(\exists b)[\forall x(PKFx \equiv x=b) \amp Bb]\]

Now we return to the example which shows how the scope of the description makes a difference:

Example.
The present King of France is not bald.

There are two options for representing this sentence.

\[[(\atoi x)(PKFx)] \sdot \osim B(\atoi x)(PKFx)\]

and

\[\osim [(\atoi x)(PKFx)] \sdot B(\atoi x)(PKFx)\]

In the first, the description has “wide” scope, and in the second, the description has “narrow” scope. Russell says that the description has “primary occurrence” in the former, and “secondary occurrence” in the latter. Given the definition ∗14·01, the two PM formulas immediately above become expanded into primitive notation as:

\[\begin{align}(\exists b) \colon PKFx \equiv_x x=b \colon \osim Bb\\\osim (\exists b) \colon PKFx \equiv_x x=b \colon Bb\end{align}\]

In modern notation these become:

\[\begin{align}\exists x[\forall y(PKFy \equiv y=x) \amp \osim Bx]\\\osim \exists x[\forall y(PKFy \equiv y=x) \amp Bx]\end{align}\]

The former says that there is one and only one object which is a present King of France and this object is not bald; i.e., there is exactly one present King of France and he is not bald. This reading is false, given that there is no present King of France. The latter says it is not the case that there is exactly one thing that is a present King of France and that object is bald. This reading is true because there is not even one present King of France.

Although Whitehead and Russell take the descriptions in these examples to be the expressions which have scope, the above readings in both expanded PM notation and in modern notation suggest why some modern logicians take the difference in readings here to be a matter of the scope of the negation sign.
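The difference between the two readings can be checked on a small finite model. The following is a minimal sketch, assuming a made-up three-element domain in which nothing is a present King of France; the function `the_phi_is_psi` and the names `domain`, `is_pkf` and `is_bald` are hypothetical and serve only to evaluate the modern-notation expansion of ∗14·01.

```python
def the_phi_is_psi(domain, phi, psi):
    """Evaluate (exists b)[(for all x)(phi x iff x = b) and psi b] over a
    finite domain, i.e. the expansion of 'the phi is psi' given by *14.01."""
    return any(
        all(phi(x) == (x == b) for x in domain) and psi(b)
        for b in domain
    )


# Illustrative model (an assumption): three objects, none a King of France.
domain = ["a", "b", "c"]
is_pkf = lambda x: False
is_bald = lambda x: x == "c"

# Wide scope (primary occurrence): the King of France is such that he is not bald.
wide = the_phi_is_psi(domain, is_pkf, lambda x: not is_bald(x))

# Narrow scope (secondary occurrence): it is not the case that the King is bald.
narrow = not the_phi_is_psi(domain, is_pkf, is_bald)

print(wide, narrow)
```

In this model the wide-scope reading is false and the narrow-scope reading is true. If the model is changed so that exactly one object satisfies `is_pkf`, the two readings receive the same truth value, which is why the ambiguity only matters when the description fails to denote uniquely.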

9. Classes, Relations, and Functions

The circumflex “ˆ” over a variable preceding a formula is used to indicate a class, thus \(\hat{x} \psi x\) is the class of things \(x\) which are such that \(\psi x\). In modern notation we represent this class as \(\{x \mid \psi x\}\), which is read: the class of \(x\) which are such that \(x\) has \(\psi\). Recall that “\(\phi \hat{x}\)”, with the circumflex over a variable after the predicate variable, expresses the propositional function of being an \(x\) such that \(\phi x\). In the type theory of PM, the class \(\hat{x} \phi x\) has the same logical type as the function \(\phi \hat{x}\). This makes it appropriate to use the following contextual definition, which allows one to eliminate the class term \(\hat{x} \psi x\) from occurrences in the context \(f\):

\[\tag*{∗20·01} f\{ \hat{z}(\psi z)\} \ldot {=} \colon (\exists \phi) \colon \phi \bang x \ldot {\equiv_x} \ldot \psi x \colon f \{ \phi\bang \hat{z}\} \quad\Df\]

or in modern notation:

\[f\{z \mid \psi z\} =_\mathit{df} \exists \phi[\forall x(\phi x \equiv \psi x) \amp f(\lambda x \phi x)]\]

where \(\phi\) is a predicative function of \(x\).

Note that \(f\) has to be interpreted as a higher-order function which is predicated of the function \(\phi \bang \hat{z}\). In the modern notation used above, the language has to be a typed language in which \(\lambda\) expressions are allowed in argument position. As was pointed out later (Chwistek 1924, Gödel 1944, and Carnap 1947) there should be scope indicators for class expressions just as there are for definite descriptions. (The possibility of scope ambiguities in propositions about classes is mentioned in the final sentences of the Introduction (PM I, 84).) Chwistek, for example, proposed copying the notation for definite descriptions, thus replacing ∗20·01 with:

\[[\hat{z}(\psi z)] \sdot f\{ \hat{z}(\psi z)\} \ldot {=} \colon (\exists \phi) \colon \phi \bang x \ldot {\equiv_x} \ldot \psi x \colon f \{ \phi\bang \hat{z} \}\]

Contemporary formalizations of set theory make use of something like these contextual definitions, when they require an “existence” theorem of the form \(\exists x\forall y(y \in x \equiv \ldots y\ldots)\), in order to justify the introduction of a singular term \(\{y \mid \ldots y\ldots \}\). See Suppes (1960). Given the law of extensionality, it follows from \(\exists x\forall y(y \in x \equiv \ldots y\ldots)\) that there is a unique such set. The relation of membership in classes \(\in\) is defined in PM by first defining a similar relationship between objects and propositional functions:

\[\tag*{∗20·02} x \in (\phi\bang \hat{z}) \ldot {=} \ldot \phi \bang x \quad\Df \]

or, in modern notation:

\[x \in \lambda z\phi z =_\mathit{df} \phi x\]

∗20·01 and ∗20·02 together are then used to define the more familiar notion of membership in a class. The formal expression “\(y \in \{ \hat{z}(\phi z)\}\)” can now be seen as a context in which the class term occurs; it is then eliminated by the contextual definition ∗20·01. (Exercise)

In PM there is a class of all classes, Cls, defined as: \[\tag*{∗20·03} \Cls = \hat{\alpha} \{ (\exists \phi ). \alpha = \hat{z} (\phi ! z) \} \quad\Df \]

PM also has Greek letters for classes: \(\alpha, \beta, \gamma\), etc. These will appear as real (free) variables, apparent (bound) variables and in abstracts for propositional functions true of classes, as in \(\phi \hat{\alpha}\). Only definitions of the bound Greek variables appear in the body of the text; the others are informally defined in the Introduction:

\[\tag*{∗20·07} (\alpha) \sdot f \alpha \ldot {=} \ldot (\phi) \sdot f \{ \hat{z}(\phi\bang z)\} \quad\Df\]

or, in modern notation,

\[\forall \alpha\, f\alpha =_\mathit{df} \forall \phi f\{z\mid\phi z\}\]

where \(\phi\) is a predicative function.

Thus universally quantified class variables are defined in terms of quantifiers ranging over predicative functions. Likewise for existential quantification:

\[\tag*{∗20·071} (\exists \alpha) \sdot f \alpha \ldot {=} \ldot (\exists \phi) \sdot f \{ \hat{z}(\phi\bang z)\} \quad\Df\]

or, in modern notation,

\[\exists \alpha\, f\alpha =_\mathit{df} \exists \phi f\{z\mid\phi z\}\]

where \(\phi\) is a predicative function.

Expressions with a Greek variable to the left of \(\in\) are defined:\[\tag*{∗20·081}\alpha \in \psi\bang \hat{\alpha} \ldot {=} \ldot \psi \bang \alpha \quad\Df\]

These definitions do not cover all possible occurrences of Greek variables. In the Introduction to PM, further definitions of \(f \alpha\) and \(f \hat{\alpha}\) are proposed, but it is remarked that the definitions are in some way peculiar and they do not appear in the body of the work. The definition considered for \(f \hat{\alpha}\) is:

\[f \hat{\alpha} \ldot {=} \ldot (\exists \psi) \sdot \hat{\phi} \bang x \equiv_x \psi \bang x \sdot f \{ \psi\bang \hat{z} \}\]

or, in modern notation,

\[\lambda \alpha\, f\alpha =_\mathit{df} \lambda \phi f\{x \mid \phi x\}\]

That is, \(f \hat{\alpha}\) is an expression naming the function which takes a function \(\phi\) to a proposition which asserts \(f\) of the class of \(\phi\)s. (The modern notation shows that in the proposed definition of \(f \hat{\alpha}\) in PM notation, we shouldn’t expect \(\alpha\) in the definiens, since it is really a bound variable in \(f \hat{\alpha}\); similarly, we shouldn’t expect \(\phi\) in the definiendum because it is a bound variable in the definiens.) One might also expect definitions like ∗20·07 and ∗20·071 to hold for cases in which the Roman letter “\(z\)” is replaced by a Greek letter. The definitions in PM are thus not complete, but it is possible to guess at how they would be extended to cover all occurrences of Greek letters. This would complete the project of the “no-classes” theory of classes by showing how all talk of classes can be reduced to the theory of propositional functions.

10. Concluding Mathematical Logic

Although students of philosophy usually read no further than ∗20 in PM, this is in fact the point where the “construction” of mathematics really begins. ∗21 presents the “General Theory of Relations” (the theory of relations in extension; in contemporary logic these are treated as sets of ordered pairs, following Wiener). \(\hat{x} \hat{y} \psi(x,y)\) is the relation between \(x\) and \(y\) which obtains when \(\psi(x, y)\) is true. In modern notation we represent this as the set of ordered pairs \(\{\langle x, y \rangle \mid \psi(x, y)\}\), which is read: the set of ordered pairs \(\langle x, y \rangle\) which are such that \(x\) bears the relation \(\psi\) to \(y\).

The following contextual definition (∗21·01) allows one to eliminate the relation term \(\hat{x} \hat{y}\psi (x, y)\) from occurrences in the context \(f\):

\[ f \{ \hat{x} \hat{y} \psi ( x, y )\} \ldot {=} \colondot(\exists \phi) \colon \phi \bang ( x, y ) \ldot {\equiv_{x,y}} \ldot \psi( x,y ) \colon f \{ \phi\bang (\hat{u}, \hat{v} )\} \quad\Df\]

or in modern notation:

\[f \{\langle x, y \rangle \mid \psi( x, y )\} =_\mathit{df} \exists \phi[\forall xy (\phi(x, y) \equiv \psi( x, y) ) \amp f ( \lambda u \lambda v \phi(u,v))]\]

where \(\phi\) is a predicative function of \(u\) and \(v\).

Principia does not analyze relations (or mathematical functions) in terms of sets of ordered pairs, but rather takes the notion of propositional function as primitive and defines relations and functions in terms of them. The upper case letters \({R}, {S}\) and \({T}\), etc., are used after ∗21 to stand for these “relations in extension”, and are distinguished from propositional functions by being written between the arguments. Thus it is \(\psi(x,y)\) with arguments after the propositional function symbol, but \(xRy\). From ∗21 functions “\(\phi\) and \(\psi\)”, etc., disappear and only relations in extension, \({R}\), \({S}\) and \({T}\), etc., appear in the pages of Principia. While propositional functions might be “intensional”, that is, two functions may be true of the same objects yet not be identical, no distinct relations in extension are true of all the same objects. The logic of Principia is thus “extensional”, from page 200 in Volume I through to the end in Volume III.

∗22 on the “Calculus of Classes” presents the elementary set theory of intersections, unions and the empty set which is often all the set theory used in elementary mathematics of other sorts. The student looking for the set theory of Principia to compare it with, say, the Zermelo-Fraenkel system, will have to look at various numbers later in the text. The Axiom of Choice is defined at ∗88 as the “Multiplicative Axiom” and a version of the Axiom of Infinity appears at ∗120 in Volume II as “Infin ax”. The set theory of Principia comes closest to Zermelo’s axioms of 1908 among the various familiar axiom systems, which means that it lacks the Axiom of Foundation and Axiom of Replacement of the now standard Zermelo-Fraenkel axioms of set theory. The system of Principia differs importantly from Zermelo’s in that it is formulated in the simple theory of types. As a result, for example, there are no quantifiers ranging over all sets, and there is a set of all things (for each type).

∗30 on “Descriptive Functions” provides Whitehead and Russell’s analysis of mathematical functions in terms of relations and definite descriptions. Frege had used the notion of function, in the mathematical sense, as a basic notion in his logical system. Thus a Fregean “concept” is a function from objects as arguments to one of the two “truth values” as its values. A concept yields the value “True” for each object to which the concept applies, and “False” for all others. Russell, from well before discovering his theory of descriptions, had preferred to analyze functions in terms of the relation between each argument and value, and the notion of “uniqueness”. With modern symbolism, his view would be expressed as follows. For each function \(\lambda x f(x)\), there will be some relation (in extension) \(R\), such that the value of the function for an argument \(a\), that is \(f(a)\), will be the unique individual which bears the relation \(R\) to \(a\). The result is that there are no function symbols in Principia. As Whitehead and Russell say, the familiar mathematical expressions such as “\(\sin \pi/2\)” will be analyzed with a relation and a definite description, as a “descriptive function”. The “descriptive function”, \(R‘y\) (the \(R\) of \(y\)), is defined as follows:

\[\tag*{∗30·01} R‘y = (\atoi x)xRy \quad\Df\]

For instance, if \(xRy\) is the relation ‘\(x\) is the father of \(y\)’, then \(R‘y\) is the function which maps an individual \(y\) to the father of \(y\) (if he exists). Note that here the left argument \(x\) corresponds to the value of the function, whereas the right argument \(y\) of \(R\) is the argument, or input, to the function \(R‘y\). Likewise if \(xSy\) is the relation of a number to its successor, \(n\) to \(n + 1\), then \(S‘y\) would be the function which maps \(y\) to the number that it succeeds (its predecessor), rather than the “successor function” which maps a number to its successor. This is the reverse of the order that is now commonly used when relating functions and relations. Nowadays we reduce functions to a binary relation between the argument in the first place and value in the second place. This may lead to some confusion in the definitions of notions such as the domain and range of a relation below.
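The direction of the descriptive function can be illustrated with relations modelled as finite sets of ordered pairs, which is the modern treatment rather than PM’s own. The following is a minimal sketch; the function name `descriptive` and the sample relations are hypothetical, and `None` stands in for the case where the description \((\atoi x)(xRy)\) fails to denote.

```python
def descriptive(R, y):
    """R'y: the unique x such that (x, y) is in R.
    Returns None when no x, or more than one x, bears R to y."""
    xs = {x for (x, z) in R if z == y}
    return next(iter(xs)) if len(xs) == 1 else None


# Illustrative data (an assumption): x is the father of y.
father_of = {("Abraham", "Isaac"), ("Isaac", "Jacob"), ("Isaac", "Esau")}

# x S y: the relation of a number x to its successor y = x + 1.
S = {(n, n + 1) for n in range(5)}

print(descriptive(father_of, "Jacob"))    # the father of Jacob
print(descriptive(father_of, "Abraham"))  # no father recorded in the relation
print(descriptive(S, 3))                  # the number that 3 succeeds
```

Here `descriptive(S, 3)` yields the predecessor of 3 rather than its successor, since the value of the function occupies the left place of the relation and the argument the right place.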

We conclude this section by presenting a number of prominent examples from the remainder of Volume I, with their intuitive meaning, location in PM, definition in PM, and a modern version. (Some of these numbers are theorems rather than definitions.) Note, however, that the modern formulations will sometimes logically differ from the original version in PM, such as by treating relations as classes of ordered pairs, etc. More prominent is the practice in PM of defining notions as relations, or higher order relations between relations, rather than as functions determined by those relations. In his account of the logic of Principia, W.V. Quine (1951) objects to the complexity and even redundancy of much of this symbolism in comparison with axiomatic set theory. These formulas can be worked out, however, with a step by step application of the definitions.

For each formula number, we present the information in the following format:

PM Symbol

(Intuitive Meaning) [Location]
PM Definition
Modern Version

\(\ast 22 \) Calculus of Classes

\(\alpha \subset \beta\)	(\(\alpha\) is a subset of \(\beta\)) [∗22·01] \(x\in \alpha \ldot {\supset_x} \ldot x\in \beta\) \(\alpha \subseteq \beta\)
\(\alpha \cap \beta\)	(the intersection of \(\alpha\) and \(\beta\)) [∗22·02] \(\hat{x} (x \in \alpha \sdot x \in \beta)\) \(\alpha \cap \beta\)
\(\alpha \cup \beta\)	(the union of \(\alpha\) and \(\beta\)) [∗22·03] \(\hat{x} (x \in \alpha \lor x \in \beta)\) \(\alpha \cup \beta\)
\(-\alpha\)	(the complement of \(\alpha\)) [∗22·04] \(\hat{x} (x\osim \in \alpha)\) [i.e., \(\hat{x} \osim (x \in\alpha)\) by ∗20·06] \(\{x \mid x \not\in \alpha \}\)
\(\alpha - \beta\)	(\(\alpha\) minus \(\beta\)) [∗22·05] \(\alpha \cap -\beta\) \(\{x \mid x\in \alpha \amp x\not\in \beta \}\)

\(\ast 23 \) Calculus of Relations

\(R \: \subset \! \! \! · \: S\)	(\(R\) is a subrelation of \(S\)) [∗23·01] \(xRy \: . \supset_{x,y} \: . xSy \) \(\forall x \forall y (x R y \: \supset \: x S y)\)
\(R \: \dot{\cap} \: S\)	(the intersection of \(R\) and \(S\)) [∗23·02] \(\hat{x}\hat{y} (x R y \: . \: x S y)\) \(\{\langle x, y \rangle \mid Rxy \; \amp \; Sxy \}\)
\( \dot{-} R \)	(the negation of \(R\)) [∗23·04] \( \dot{-} R = \hat{x} \hat{y} \{ \sim (xRy) \} \) \(\{\langle x, y \rangle \mid \sim Rxy \}\)

\(\ast 24 \) The Existence of Classes

\(\mathrm{V}\)	(the universal class) [∗24·01] \(\hat{x} (x = x)\) \(\mathrm{V}\) or \(\{x \mid x = x\}\)
\(\Lambda\)	(the empty class) [∗24·02] \(-\mathrm{V}\) \(\varnothing\)
\(\exists! \alpha \)	(the class \( \alpha \) exists) [∗24·03] \((\exists x ). x \in \alpha \) \(\exists x ( x \in \alpha )\)

\(\ast 25 \) The Existence of Relations

\(\dot{\exists}! R\)	(the relation \(R\) exists) [∗25·03] \((\exists x,y) .\: xRy\) \( \exists x \exists y Rxy\)

\(\ast 30 \) Descriptive Functions

\(R‘y\)	(the \(R\) of \(y\)) (a descriptive function) [∗30·01] \((\atoi x)(xRy)\) \(R‘y\) is the (possibly partial) function \(f_R\) where \(f_R (y) = x\) if \(xRy\) and this \(x\) is unique, and which otherwise is undefined.

\(\ast 31 \) Converses of Relations

Cnv	(the relation between a relation and its converse) [∗31·01] \(\hat{Q} \hat{P} \{ xQy . \equiv_{x,y} . yPx \}\) \(\{ \langle Q , P \rangle \: \mid \: \forall x \forall y (Qxy \equiv Pyx) \: \} \)
\(\breve{R}\)	(the converse of \( R \)) [∗31·02] \(\hat{x} \hat{z} (zRx)\) \(\{\langle x,z\rangle \mid Rzx\}\)

\(\ast 32 \) Referents and Relata of a Given Term

\(\overrightarrow{R}‘y\)	(the \(R\)-predecessors of \(y\)) [∗32·01] \(\hat{x} (xRy)\) \(\{x \mid Rxy \}\)
\(\overleftarrow{R}‘x\)	(the \(R\)-successors of \(x\)) [∗32·02] \(\hat{z} (xRz)\) \(\{z \mid Rxz \}\)

\(\ast 33 \) Domains and Fields of Relations

\( \text{D} ‘R\)	(the domain of \(R\)) [∗33·01] \(\hat{x} \{ (\exists y) \sdot xRy \}\) \(\{x \mid \exists y Rxy \}\) also: \(\mathcal{D}‘R\)
\( \backd ‘R\)	(the converse domain (range) of \(R\)) [∗33·02] \(\hat{z} \{(\exists x) \sdot xR z \}\) \(\{z \mid \exists x Rxz \}\) also: \(\mathcal{R}‘R\)
\(C‘R\)	(the field of \(R\)) [∗33·03] \(\hat{x} \{(\exists y): xRy \ldot {\lor} \ldot yRx\}\) \(\{x \mid \exists y (Rxy \lor Ryx)\}\) also: \(\mathcal{F}‘R\)

\(\ast 34 \) The Relative Product of Two Relations

\(R\mid S\)	(the relative product of \(R\) and \(S\)) [∗34·01] \(\hat{x} \hat{z} \{(\exists y) \sdot xRy \sdot ySz \}\) \(R \circ S\) or \(\{\langle x,z\rangle \mid \exists y(Rxy \amp Syz)\}\)

\(\ast 35 \) Limited Domains and Converse Domains

\(\alpha \upharpoonleft R\)	(the restriction of the domain of \(R\) to \(\alpha\)) [∗35·01] \(\hat{x} \hat{y}[ x\in \alpha \sdot xRy]\) \(\{\langle x,y \rangle \mid x \in \alpha \amp Rxy \}\)
\(R \restriction \beta\)	(the restriction of the range of \(R\) to \(\beta\)) [∗35·02] \(\hat{x} \hat{y}[xRy \sdot y \in \beta]\) \(\{\langle x,y \rangle \mid Rxy \amp y \in \beta\}\)
\(\alpha \uparrow \beta\)	(the relation of members of \(\alpha\) to members of \(\beta\)) [∗35·02] \(\hat{x} \hat{y}[x \in \alpha \sdot y \in \beta]\) \(\{\langle x,y\rangle \mid x \in \alpha \amp y\in \beta \}\), the Cartesian product of \(\alpha\) and \(\beta \).

\(\ast 36 \) Relations with Limited Fields

\(P \restriction \! \! \! \downharpoonright \alpha\)	(the restriction of \(P\) to \(\alpha\)) [∗36·01] \(\alpha \upharpoonleft P \restriction \alpha\) \(\{\langle x,y \rangle \mid x \in \alpha \amp y \in \alpha \amp Pxy\}\)

\(\ast 37 \) Plural Descriptive Functions

\(R ‘‘\beta\)	(the terms which have the relation \(R\) to members of \(\beta\)) [∗37·01] \(\hat{x} \{(\exists y) \sdot y\in \beta \sdot x Ry\}\) \(\{x \mid \exists y(y\in \beta \amp Rxy)\}\)
\( R_{\in}\)	(the relation of \(\alpha\) to \(\beta \) when \(\alpha\) is the class of terms which have \(R \) to members of \(\beta\)) [∗37·02] \(\hat{\alpha} \hat{\beta} ( \alpha = R ‘‘ \beta )\) \(\{\langle \alpha, \beta \rangle \mid \alpha = \{x \mid \exists y ( y \in \beta \amp Rxy ) \} \}\)

\(\ast 38\) Double descriptive functions. PM uses a metalinguistic variable “\( \venus \)” that can be replaced by any of a range of relations between individuals, classes, or relations, that are treated as operations on their arguments. The operation of intersection can be represented as a higher order function of its first argument. Thus \(\cap \beta `\alpha = \alpha \cap \beta\).

\(\venus \: y\)

(the relation of \( x \: \venus \: y \) to \(x \) for any \( x\)) [∗38·02]
\(\hat{u} \hat{x} ( u = x \: \venus\: y)\)
\(\{\langle u,x \rangle \mid u = x \: \venus \: y \}\)

This notion will be used later. An instance using the notion of relative product is:

\(\mid R\)	(the relation of one power of \(R\) to the next) [∗38·02] \(\hat{P}\hat{S}(P = R \mid S )\) \(\{ \langle P, S \rangle \mid P = R \circ S \}\)
\(\alpha \: \venus_{\! \! \! ,,} \: y\)	(the class of values of \(x \: \venus \: y \: \) when \(x\) is an \( \alpha \)) [∗38·03] \( \venus \: y \) “ \(\alpha \) \(\{u \mid \exists x ( x \in \alpha \: \amp \: u = x \: \venus y \: )\: \}\)
\(s ‘ \kappa\)	(the sum or union of the \(\kappa\)s) [∗40·02] \(\hat{x} \{ ( \exists \alpha ). \: \alpha \in \kappa \; . \; x \in\alpha \}\) \(\cup \kappa\), or \(\{ x \mid \exists \beta ( \beta \in \kappa \amp x \in \beta ) \}\)
\(\dot{s} ‘ \lambda \)	(the sum of the relations in \(\lambda\)) [∗41·02] \(\hat{x}\hat{y} \{ ( \exists R ). \: R \in \lambda \; . \; x R y \}\) \(\{ \langle x, y \rangle \mid \exists R \: (R \in \lambda \; \amp \;Rxy ) \}\)

11. Prolegomena to Cardinal Arithmetic (Part II)

Contemporary philosophers would consider the transition to mathematics to begin with the theory of sets (or proper classes which are too large to be sets), but in PM that is also a part of Mathematical Logic. The Prolegomena to Arithmetic thus begins with the definitions in terms of logic of explicitly arithmetical notions, the cardinal numbers 1 and 2.

\(I \)	(the relation of identity) [∗50·01] \( I = \hat{x}\hat{y} (x = y) \) \(\{ \langle x , y \rangle \mid x = y \}\)
\(J \)	(the relation of diversity) [∗50·02] \( J = \dot{-} I \) \(\{ \langle x , y \rangle \mid x \neq y \}\)
\(\iota ‘ x\)	(the unit class of \(x\)) as defined by theorem [∗51·1] from definition [∗51·01] \(\hat{y} (y = x)\) \(\{ y \mid y = x \}\) (the singleton \(\{x\}\))
\(\mathbf{1}\)	(the cardinal number 1) [∗52·01] \(\hat{\alpha} \{ (\exists x) \sdot \alpha = \iota‘x \}\) \(\{ \alpha \mid \exists x \; ( \alpha = \{x \} ) \}\) (the class of all singletons) The variable \( x \) is typically ambiguous here, so there will be a distinct number 1 for each type that \( x \) can assume. This applies to 2 as well, and to all the natural numbers, as we will see below.
\(\mathbf{2}\)	(the cardinal number 2) [∗54·02] \(\hat{\alpha} \{ (\exists x,y) \sdot x \neq y \sdot \alpha =\iota‘x \cup \iota‘y \}\) \(\{ \alpha \mid \exists y \exists z( y \neq z \amp \alpha = \{y \}\cup \{z\} ) \}\) (the class of all pairs)
\(x \downarrow y\)	(the ordinal couple of \(x\) and \(y\)) [∗55·01] \(\iota‘x \uparrow \iota‘y\) \(\langle x, y \rangle\) (the ordered pair \(\langle x,y\rangle\))

The paperback abridged edition of Principia Mathematica to ∗56 only goes this far, so the remaining definitions have only been available to those with access to the full three volumes of PM. Russell did not make the decision to end the 1962 abridged version at this point, but the choice is understandable. It is here that contemporary set theory begins to look even more different from PM. Set theory follows Norbert Wiener (1914) by representing relations as sets of ordered pairs, which are themselves defined as sets. (Wiener’s proposal of \( \langle x, y \rangle =_\mathit{df} \{\{ \{x \}, \emptyset \}, \{ \{y \} \} \} \) has generally been replaced by Kuratowski’s simpler \(\{ \{x \}, \{ x,y \} \}\).) The remainder of PM examines the structure of relations that lead to the mathematics of natural and real numbers, and the portion of the theory of transfinite sets that can be carried out in the theory of types. This looks very different from the development of these notions in axiomatic set theory.
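
The two set-theoretic definitions of the ordered pair mentioned above can be written out with nested `frozenset` values. This is only an illustration of the two formulas, assuming hashable Python values for \(x\) and \(y\); the function names are hypothetical, and Python sets carry none of PM's type distinctions.

```python
def kuratowski_pair(x, y):
    """Kuratowski's pair {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})

def wiener_pair(x, y):
    """Wiener's pair {{{x}, 0}, {{y}}}, where 0 is the empty set."""
    empty = frozenset()
    return frozenset({frozenset({frozenset({x}), empty}),
                      frozenset({frozenset({y})})})
```

What matters for either definition is that the pair is sensitive to order, so that swapping the two terms yields a different set whenever the terms differ.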

\(\Cl\ ` \alpha\)	(the class of subclasses of \(\alpha\)) [∗60·12] \(\hat{\beta} (\beta \subset \alpha)\) \(\wp{\alpha}\), the power set of \(\alpha\), \(\{x \mid x\subseteq \alpha\}\)
\(\Cl\ \ex\ ` \alpha\)	(the class of existent subclasses of \(\alpha\)) [∗60·13] \( \hat{\beta} (\beta \subset \alpha \; . \; \exists ! \beta )\) \(\{x \mid x \subseteq \alpha \; \amp \; x \neq \emptyset\}\)
\(\Rl\ ` P\)	(the class of sub-relations of \(P\)) [∗61·12] \(\hat{R} \{ R \subset \! \! \! \! \cdot \; P \}\) \(\{ R \mid \forall x \forall y ( \langle x, y \rangle \in R \supset\langle x,y \rangle \: \in P) \}\)
\(\in\)	(the relation of membership in a class) [∗62·01] \(\hat{x} \hat{\alpha} ( x \in \alpha)\) \(\{ \langle x,y \rangle \mid \: x \in y \}\)

\(\ast 63\) Relative Types of Classes. The theory of types in PM allows for expressions relating classes of different types. The gap between set theory and the theory of classes in PM comes from the lack of a cumulative theory of classes of any type. The PM system allows definitions of relations between, say, individuals and classes of individuals. These are needed in the account of real numbers in terms of classes of classes of ratios in Volume III.

\(t‘x\)	(the type of which \(x\) is a member) [∗63·01] \(\iota ` x \cup - \iota ` x\) \(\{ x \} \cup \{ y \mid y \not \in \{ x \} \}\)
\(t_0‘\alpha\)	(the type in which \(\alpha\) is contained) [∗63·02] \(\alpha \cup - \alpha\) \(\alpha \cup \{ x \mid x \not \in \alpha \}\)
\(t_1‘ \kappa\)	(the type next below that in which \(\kappa\) is contained) [∗63·03] \(t_0 `s`\kappa\) \(\cup \: \{ \: \cup \{ \alpha \mid \alpha \in \kappa \} \: , \: \{\beta \mid \beta \not \in \cup \{ \alpha \mid \alpha \in \kappa \} \}\: \}\)
\(t_{11}‘ \alpha\)	(the type of pairs of classes of types \(t_1 ‘ \alpha\)) [∗64·022] \(t ‘ ( t_1 ‘ \alpha \uparrow t_1 ‘ \alpha )\) The type of a pair of classes of a given type will be the same as that of classes of those classes. This definition is in order as it stands, but would be very complex to write in contemporary notation. We leave it as an open problem for readers to devise a concise formula.
\(\alpha \rightarrow \beta\)	(the relations with referents in \( \alpha \) and relata in \( \beta \)) (from \( \alpha \) onto \(\beta \)) [∗70·01] \(\hat{R} (\overrightarrow{R}“\backd ‘R \subset \alpha\sdot \overleftarrow{R}“D‘R \subset \beta )\) \( \{ R \mid \forall x \forall y \: ( Rxy \supset [ \: \{z \mid Rzy \}\in \alpha \: \amp \: \{u \mid Rxu \} \in \beta \: ] ) \: \} \) Since 1 is the class of singleton classes, \((1 \rightarrow 1) \) will be the class of one to one (surjective) relations.
\(\alpha \mathbin{\overline{\mathrm{sm}}} \beta\)	(the class of similarity relations between \(\alpha\) and \(\beta\)) [∗73·03] \( \{ R \mid R \in 1 \rightarrow 1 \: .\: \alpha = D‘R \: .\:\beta = \backd ‘R \} \) \(\{f \mid f : \alpha \stackrel{1-1}{\longrightarrow} \beta\}\)
\(\mathrm{sm}\)	(the relation of similarity) [∗73·02] \(\hat{\alpha} \hat{\beta}(\exists! \alpha\mathbin{\overline{\mathrm{sm}}} \beta)\) \(\alpha \approx \beta\)
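
For finite classes the similarity relations between \(\alpha\) and \(\beta\) can simply be enumerated. The sketch below assumes finite Python sets and relations as sets of pairs; the function names are hypothetical, and the brute-force enumeration is chosen for clarity rather than efficiency.

```python
from itertools import permutations

def is_one_one(R):
    """R is in 1 -> 1: no two pairs share a first term or a second term."""
    return len({x for (x, _) in R}) == len(R) == len({y for (_, y) in R})

def similarity_relations(alpha, beta):
    """All one-one relations with domain alpha and converse domain beta."""
    alpha, beta = list(alpha), list(beta)
    if len(alpha) != len(beta):
        return []
    return [set(zip(alpha, image)) for image in permutations(beta)]

def similar(alpha, beta):
    """alpha sm beta: at least one similarity relation exists."""
    return len(similarity_relations(alpha, beta)) > 0
```

Two classes are similar exactly when the class of similarity relations between them is non-empty, which is what \(\exists!\) expresses in [∗73·02].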

\(\ast 80\) Selections. A selection function for a class \(\kappa\) is a function \(f\) mapping each element \(x\) of \(\kappa\) to a member of \(x\). These are denoted by \( \in_{\Delta}`\kappa\). The cardinal number of the product of two classes \(\alpha \times \beta\) is the cardinal number of the class of all pairs of members selected from \(\alpha\) and \(\beta\), so the guarantee that such selections exist is called the Multiplicative Axiom in PM. This is now known as the Axiom of Choice, which had been identified as an assumption used in proofs in set theory by Ernst Zermelo in 1904. In PM it is defined as asserting that if a class \(\kappa\) is a set of mutually exclusive, non-empty classes, then there exists a class \(\mu\) which contains exactly one member of each element of \(\kappa\).

\(\in_{\Delta}`\kappa\)	(the selective relations for \(\in\)) [∗80·01] \((1 \rightarrow Cls) \; \cap \; Rl` \in \; \cap \;\overleftarrow{\backd}`\kappa\) \(\{ f \mid \forall \alpha (\alpha \in \kappa \supset f(\alpha) \in\alpha ) \}\)
\(\Cls ^2 \ \excl\)	(class of mutually exclusive classes) [∗84·01] \(\hat{\kappa} ( \alpha , \beta \in \kappa \; .\; \alpha \neq \beta \;. \supset_{\alpha , \beta} . \; \alpha \cap \beta = \Lambda )\) \(\{ \kappa \mid \forall \alpha \forall \beta (\alpha , \beta \in\kappa \; \amp \; \alpha \neq \beta \supset \alpha \cap \beta =\emptyset)\}\)
\(\Cls\ \ex ^{2} \ \excl\)	(class of mutually exclusive non-empty classes) [∗84·03] \(\Cls^2 \; \excl \; - \overleftarrow{\in} ` \Lambda\) \(\{ \kappa \mid \; \forall \alpha (\alpha \in \kappa \supset \alpha\neq \emptyset) \; \amp\) \(\; \forall \alpha \forall \beta \: [\alpha \in \kappa \: \amp \: \beta \in \kappa \supset (\alpha = \beta\vee \alpha \cap \beta = \emptyset) ] \}\)
Mult ax	(the Multiplicative Axiom) [∗88·03] \(\kappa \: \epsilon \: \mathrm{Cls \; ex^2 excl} \: .\supset_{\kappa} \: : (\exists \mu) : \alpha \: \epsilon \: \kappa \:. \supset_{\alpha} . \: \mu \cap \alpha \: \epsilon \: 1\) \(\forall \kappa \{ [ \; \forall \alpha (\alpha \in \kappa \supset\alpha \neq \emptyset ) \; \amp\) \(\; \forall \alpha \forall \beta \:(\alpha \in \kappa \: \amp \: \beta \in \kappa \supset (\alpha = \beta\vee \alpha \cap \beta = \emptyset)) \:] \;\supset\) \(\; \exists \mu\forall \alpha \exists x \: (\alpha \in \kappa \supset \mu \cap \alpha= \{x \} ) \}\)
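
For a finite family the class \(\mu\) required by the Multiplicative Axiom can be constructed outright, so no axiom is needed; the axiom matters only for infinite families. The sketch below is a finite illustration under that assumption, with hypothetical function names and a family of `frozenset` values standing in for \(\kappa\).

```python
def is_cls_ex2_excl(kappa):
    """kappa is a class of mutually exclusive, non-empty classes."""
    members = list(kappa)
    non_empty = all(len(a) > 0 for a in members)
    exclusive = all(a == b or not (a & b) for a in members for b in members)
    return non_empty and exclusive

def selected_class(kappa):
    """A class mu sharing exactly one member with each member of kappa.

    Picks an arbitrary member of each class; this is correct only when
    kappa satisfies is_cls_ex2_excl.
    """
    return {next(iter(a)) for a in kappa}

# Hypothetical family of three mutually exclusive, non-empty classes.
kappa = {frozenset({1, 2}), frozenset({3}), frozenset({4, 5, 6})}
mu = selected_class(kappa)
```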

\(\ast 90\) Inductive Relations. The concluding section of Volume I presents a generalization of the structure of the natural numbers that underlies the principle of mathematical induction.

\(R_*\)	(the ancestral of \(R)\) [∗90·01] \(\hat{x} \hat{y} \{ x \in C‘ R \colon \breve{R}“\mu\subset \mu \sdot x \in \mu \ldot {\supset_{\mu}} \ldot y \in \mu \}\) \(\{ \langle x, y \rangle \mid x \in \mathcal{F}`R \; \amp\) \(\;\forall \mu [ ( \forall z \forall w [( z \in \mu \; \amp \; Rzw )\supset w \in \mu ] \; \amp \; x \in \mu ) \supset y \in \mu ] \}\) Now written \(R^*\), this follows Frege’s definition: \(y\) is in all the \(R\)-hereditary classes that contain \(x\).
\(R_{\text{ts}}\)	(the relation between \( R \) and the series of its powers \( R^n\) for \(n \gt 0\), i.e., \( R (= R^1) \), \(R^2\), \(R^3\), etc.) [∗91·02] \(( \: \mid R)_{\ast}\) \(\{ \langle P,S \rangle \mid P = R^n \: \amp \: S = R^{n+1} \} \)
Pot \(‘ R\)	(the positive powers, i.e., Potentia, of \(R\)) [∗91·03] \(\overrightarrow{R}_{\text{ts}} ‘ R\) \(\{ S \: \mid \; \exists n >0 \: ( S = R^n ) \} \)
\(R_{\text{po}}\)	(the union of the positive powers of \(R\)) [∗91·05] \(\dot{s}‘ \text{Pot} ‘ R\) \( \{ \langle x, y \rangle \mid \exists S \: \exists n \gt 0 \:( S =R^n \: \amp \: Sxy ) \} \)
\(xB‘P\)	(\(x\) begins the relation \(P\)) [∗93·01] \(x \in D ‘ P - \backd ‘P\) \(\{ x \mid \exists y \: Pxy \; \amp \sim \exists z Pzx \}\)
\(x \:\) min\(_P ` \alpha\)	(\(x\) is a minimal member of \( \alpha \) with respect to \(P\)) [∗93·02] \(x \in \alpha \cap C`P - \breve{P} `` \alpha\) \(x \in \alpha \; \amp \; x \in \mathcal{F}P \; \amp \; \sim \exists z\: (Pzx \: \amp \: z \in \alpha )\)
\(\stackrel{\leftrightarrow}{R} `x\)	(the family of \(R\), ancestry and posterity) [∗97·01] \(\overrightarrow{R}`x \cup (\iota ` x \cap C ` R)\cup\overleftarrow{R}` x\) \(\{ y \mid Ryx \; \vee \; ( y = x \: \amp \: x \in \mathcal{F} `R) \;\vee \; Rxy \} \)
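
On a finite relation both \(R_{\text{po}}\) and the ancestral \(R_*\) can be computed by iterating the relative product until nothing new appears. This is a sketch under the assumption that relations are finite sets of pairs; the function names and the `successor` example are hypothetical.

```python
def relative_product(R, S):
    """R|S: the pairs (x, z) such that xRy and ySz for some y."""
    return {(x, z) for (x, y) in R for (w, z) in S if y == w}

def field(R):
    """C'R: the terms that occur on either side of R."""
    return {x for (x, _) in R} | {y for (_, y) in R}

def r_po(R):
    """R_po: the union of the positive powers R, R|R, R|R|R, ..."""
    closure = set(R)
    while True:
        extra = relative_product(closure, R) - closure
        if not extra:
            return closure
        closure |= extra

def ancestral(R):
    """R_*: R_po together with identity on the field of R."""
    return r_po(R) | {(x, x) for x in field(R)}

# Hypothetical example: successor on 0..4, i.e. 0 -> 1 -> 2 -> 3 -> 4.
successor = {(n, n + 1) for n in range(4)}
```

The difference between the two is visible on this example: every number bears the ancestral of successor to itself, but no number bears \(R_{\text{po}}\) to itself.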

12. Cardinal Arithmetic (Part III)

With \(\ast 100\) at the start of Volume II, Principia Mathematica finally begins developing the theory of cardinal numbers with the Frege-Russell Definition of numbers as classes of equinumerous classes.

\(\mathrm{N_c}\)	(the relation between a class and its cardinal number) [∗100·01] \(\overrightarrow{\mathrm{sm}}\) \(\{ \langle \alpha, \beta \rangle \mid \beta = \{ \gamma \mid \gamma\approx \alpha \} \} \)
\(\mathrm{NC}\)	(the Cardinal Numbers) [∗100·02] \( \mathrm{D} ‘ \mathrm{N_c} \) \( \{ \alpha \mid \exists \beta ( \alpha = \{ \gamma \mid \gamma\approx \beta \} ) \: \}\)
\(\mathbf{0}\)	(the cardinal number 0) [∗101·01] \(0 = \mathrm{N_c}‘\Lambda\) \(\{\varnothing\}\) The class of all classes equinumerous with the empty set is just the singleton containing the empty set.
\(\text{N}_0 \text{c}‘\alpha\)	(the homogeneous cardinal of \(\alpha\)) [∗103·01] \(\text{Nc}‘ \alpha \; \cap \; t‘\alpha\) \(\{ \beta \mid \beta \approx \alpha \}\) for \(\beta\) of the same type as \(\alpha\)
\(\text{N}_0\text{C}\)	(the Homogeneous Cardinals) [∗103·02] \(\text{D}‘ \text{N}_0 \text{c} \) \(\{ \alpha \mid \exists \beta (\alpha \; \text{is the homogeneous cardinal of}\; \beta ) \}\)
\(\alpha + \beta\)	(the arithmetic sum of \(\alpha\) and \(\beta\)) [∗110·01] \(\downarrow (\Lambda \cap \beta)“\iota“\alpha \cup(\Lambda \cap \alpha) \downarrow“\iota“\beta \) This is the union of \(\alpha\) and \(\beta\) after they are made disjoint by pairing the unit class of each element of \(\alpha\) with an empty class on the right, and the unit class of each element of \(\beta\) with an empty class on the left. The classes \(\alpha\) and \(\beta\) are intersected with the empty class, \(\Lambda\), to adjust the type of the elements of the sum. \(\{ \langle \{ a \} , \emptyset \rangle \mid a \in \alpha \} \cup \{ \langle \emptyset , \{b \} \rangle \mid b \in \beta \}\)
\(\mu +_c \nu\)	(the cardinal sum of \(\mu\) and \(\nu\)) [∗110·02] \(\hat{\xi}\{(\exists \alpha,\beta) \sdot \mu = \mathrm{N_0c}‘\alpha \sdot \nu = \mathrm{N_0c}‘\beta\sdot\xi\,\mathrm{sm}(\alpha + \beta)\}\) Cardinal addition is the arithmetic sum of homogeneous cardinals: \( \{\gamma \mid \exists \alpha \exists \beta \: [ \gamma \approx (\alpha + \beta ) ] \: \} \) when \(\alpha \) and \( \beta \) are homogeneous cardinals.

The reader can now appreciate why this elementary theorem is not proved until page 83 of Volume II of PM:

\[\tag*{∗110·643} 1 +_c 1 = 2\]

Whitehead and Russell remark that “The above proposition is occasionally useful. It is used at least three times, in…”. This joke reminds us that the theory of natural numbers, so central to Frege’s works, appears in PM as only a special case of a general theory of cardinal and ordinal numbers and even more general classes of isomorphic structures.
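
The modern reading of the arithmetic sum [∗110·01] is a tagged disjoint union, and a finite sketch shows why it is needed for \(1 +_c 1 = 2\): the sum of a unit class with itself must have two members, whereas the plain union has one. The code assumes finite Python sets and ignores PM's type distinctions; `arithmetic_sum` is a hypothetical name.

```python
EMPTY = frozenset()

def arithmetic_sum(alpha, beta):
    """alpha + beta: pairs <{a}, 0> for a in alpha and <0, {b}> for b in beta."""
    left = {(frozenset({a}), EMPTY) for a in alpha}
    right = {(EMPTY, frozenset({b})) for b in beta}
    return left | right

# Summing a unit class with itself: the two copies stay distinct.
one_plus_one = arithmetic_sum({"x"}, {"x"})
```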

\(\beta \times \alpha\)	(the product of classes) [∗113·02] \(s` \alpha \downarrow_{\! \! \!,,} ``\beta\) \(\{ \langle x , y \rangle \mid x \in \beta \; \amp \; y \in \alpha\}\)
\(\mu \times_{\text{c}} \nu\)	(the product of homogeneous cardinal numbers) [∗113·03] \(\hat{\xi} \{ (\exists \alpha, \beta) . \: \mu = N_0\text{c}` \alpha\: . \: \nu = N_0\text{c}` \beta \; . \; \xi \: \text{sm} \: (\alpha\times \beta)\}\) If \(\mu = \bar{\bar{\alpha}} \; \amp \; \nu = \bar{\bar{\beta}}\) then \(\mu \times_{\text{c}} \nu = \{ \gamma \mid \gamma \approx ( \alpha \times\beta) \}\)
\(\alpha\) exp \( \beta \)	(the exponentiation of classes) [∗116·01] Prod ‘ \( \alpha \downarrow_{\! \! \! ,,} \)‘‘ \(\beta \) \(\{ f \mid \mathcal{D} f = \beta \; \amp \;\mathcal{R} f \subseteq \alpha \}\)
\(\mu^{\nu}\)	(the exponentiation of cardinal numbers) [∗116·02] \(\hat{\gamma} \{ (\exists \alpha, \beta) . \: \mu = N_0\text{c}`\alpha \: . \: \nu = \mathrm{N}_0\text{c}` \beta \; . \; \gamma\) sm\((\alpha\) exp \(\beta)\}\) \(\{ \gamma \mid \exists \alpha \exists \beta \: ( \mu =\bar{\bar{\alpha}} \; \amp \; \nu = \bar{\bar{\beta}} \; \amp \;\gamma \approx \alpha^{\beta} )\}\)

The following theorem, that the cardinality of the power set of \(\alpha\) is 2 raised to the power of the cardinality of \(\alpha\), \(\; \bar{\bar{ \wp{\alpha} }} = 2^{\bar{\bar{\alpha}}}\), is called “Cantor’s Proposition”, and is said to be “very useful” (PM II, 140):

\[\tag*{∗116·72} \text{Nc}‘\text{Cl}‘\alpha = 2^{\text{Nc}‘\alpha} \]
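
For finite classes Cantor's Proposition can be confirmed by enumeration. The sketch below assumes finite Python sets and uses ordinary integers in place of PM's cardinals; `subclasses` is a hypothetical name for the finite analogue of \(\text{Cl}‘\alpha\).

```python
from itertools import combinations

def subclasses(alpha):
    """Cl'alpha: every subclass of alpha, as a set of frozensets."""
    items = list(alpha)
    return {frozenset(c)
            for size in range(len(items) + 1)
            for c in combinations(items, size)}
```

Counting `subclasses(alpha)` for a class of \(n\) members gives \(2^n\), and since \(2^n \gt n\) for every finite \(n\), the same count illustrates the finite case of Cantor's Theorem.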

Next comes the notion of greater than for arbitrary cardinals, finite and infinite. The cardinal number of \( \alpha \) is greater than the cardinal number of \( \beta \) just in case there is a subset of \( \alpha \) that is equinumerous with \( \beta \), but there is no subset of \(\beta \) that is equinumerous with \(\alpha \). Cantor’s famous “diagonal argument” shows that the cardinal number \( \aleph_c \) of the class of real numbers is greater than \( \aleph_0 \), the cardinal number of the class of natural numbers.

\(\mu \gt \nu \)

(greater than) [∗117·01]
\( (\exists \alpha, \beta ) \: . \: \mu = \text {N}_0\text{c}`\alpha\: . \: \nu = \mathrm{N}_0\text{c}` \beta \: . \: \exists ! \:\text{Cl} ‘ \alpha \: \cap \: \text{Nc} ‘ \beta \: . \:\sim \exists ! \: \text{Cl} ‘ \beta \: \cap \: \text{Nc} ‘\alpha \)
\( \exists \delta (\delta \in \wp{\alpha} \; \amp \; \delta \in\bar{\bar{\beta}} ) \; \amp \; \sim \exists \gamma (\gamma \in\wp{\beta} \; \amp \; \gamma \in \bar{\bar{\alpha}} ) \)

The more familiar result, Cantor’s Theorem, is that the power set of \(\alpha\) is strictly larger than \(\alpha\): \(2^{\bar{\bar{\alpha}}}\gt \bar{\bar{ \alpha }}\).

\[\tag*{∗117·661} \mu \in \text{N}_0\text{C} \; . \: \supset \: . \; 2^{\mu} \gt \mu \]

NC induct	(the Inductive Cardinals) [∗120·01] \(\hat{\alpha}\{\alpha({+_c}1)_* 0\}\) \(\{x \mid 0 S^* x\}\) The inductive cardinals are the “natural numbers”, that is, 0 and all those cardinal numbers that are related to 0 by the ancestral of the “successor relation” \(S\), where \(xSy\) just in case \(y= x +1\).
Infin ax	(the Axiom of Infinity) [∗120·03] \(\alpha \in \text{NC induct}\:\sdot \supset_{\alpha} \sdot \:\exists!\alpha\) \(\forall \alpha (\alpha \in \{x \mid 0S^* x\} \supset \alpha \neq\varnothing)\)

The Axiom of Infinity asserts that all inductive cardinals are non-empty. (Recall that 0 = \(\{ \varnothing \}\), and so 0 is not empty.) The Axiom of Infinity is not a “primitive proposition” but is instead to be listed as an “hypothesis” where used, that is, as the antecedent of a conditional, where the consequent will be said to depend on the axiom. Technically it is not an axiom of PM, as [∗120·03] is a definition, so this is just further notation in PM!

Prog	(Progressions, or \(\omega\) orderings) [∗122·01] \((1 \rightarrow 1) \cap \hat{R} (D`R = \overleftarrow{R_{\ast}}‘‘ B ‘R)\) \(\{ R \mid R\) is isomorphic to the ancestral of a relation for which every subset of the domain has a first element \(\}\)

“By a ‘progression’ we mean a series which is like the series of the inductive cardinals in order of magnitude (assuming that all the inductive cardinals exist) i.e. a series whose terms can be called \(1_R, 2_R, 3_R, \ldots \nu_R, \ldots \). It is not convenient to define a progression as a series which is ordinally similar to that of the inductive cardinals, both because this definition only applies if we assume the axiom of infinity, and because we have in any case to show that (assuming the axiom of infinity) the series of inductive cardinals has certain properties, which can be used to afford a direct definition of progressions.” (PM II, 245)

\(\aleph_0\)

(the smallest of Cantor’s transfinite cardinals) [∗123·01]
\(D‘‘\)Prog
\(\bar{\bar{\omega}}\)

13. Relation Arithmetic (Part IV)

The notion of relation number generalizes the notion of an ordinal number, which applies to well-orderings, to arbitrary relations. Just as a cardinal number is defined in PM as a class of equinumerous classes, an arbitrary relation number is a class of ordinally similar relations.

\(S^{;}Q\)	(\(S\) is a correlator of \(Q\)) [∗150·01] \(S \mid Q \mid \breve{S}\) \(S \circ Q \circ S^{-1}\)
\(P \: \overline{\text{smor}} \: Q\)	(the class of similarities between \(P\) and \(Q\)) [∗151·01] \(\hat{S} \{ S \in 1 \rightarrow 1 \; . \; C‘Q = \backd ‘S\; . \; P = S^{;} Q \}\) \(\{ f \mid \mathcal{F} P \stackrel{1-1}{\longrightarrow} \mathcal{F}Q \; \amp \; \forall x \forall y [ (x \in \mathcal{D} f ) \supset Pxy\equiv Q f(x) f(y)]\}\)
\(P \;\) smor \(\:Q\)	(\(P\) is ordinally similar to \(Q\)) [∗151·02] \(\{ \langle P, Q \rangle \mid \exists ! \; P \:\overline{\text{smor}} \: Q \} \) \(P \cong Q\) (\(P\) is isomorphic to \(Q\)).
Nr\(`P\)	(the relation number of \(P\)) [∗152·01] \(\overrightarrow{\text{smor}} ‘ P\) \(\{ Q \mid P \cong Q\}\)
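
The expression \(S^{;}Q = S \mid Q \mid \breve{S}\) transports the relation \(Q\) along the correlator \(S\). The sketch below computes it for finite relations given as sets of pairs; the function names and the sample relations `Q` and `S` are assumptions made for illustration.

```python
def relative_product(R, S):
    """R|S: the pairs (x, z) such that xRy and ySz for some y."""
    return {(x, z) for (x, y) in R for (w, z) in S if y == w}

def converse(R):
    """Cnv'R: the pairs of R reversed."""
    return {(y, x) for (x, y) in R}

def correlate(S, Q):
    """S;Q = S | Q | Cnv'S."""
    return relative_product(relative_product(S, Q), converse(S))

# Hypothetical example: Q orders 1, 2, 3; S correlates letters with numbers.
Q = {(1, 2), (2, 3), (1, 3)}
S = {("a", 1), ("b", 2), ("c", 3)}
P = correlate(S, Q)
```

Here `S` is one-one and its converse domain is the field of `Q`, so `S` is a similarity between `P` and `Q` in the sense of [∗151·01], and `P` orders the letters just as `Q` orders the numbers.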

\(\ast 170\) The relation of first differences orders classes on the basis of an ordering of their members. The method is a variation on the notion of lexicographic ordering of classes, as in the alphabetical ordering of words in a dictionary. See Fraenkel (1968). PM uses two versions of the notion.

\(P_{\text{cl}}\)

(ordering of classes by first differences of \(P\)) [∗170·01]
\(\hat{\alpha}\hat{\beta} \{ \alpha, \beta \in \text{Cl}`C`P \; . \;\exists ! \; \alpha - \beta - \breve{P}`` ( \beta - \alpha ) \}\)
For \(\prec\) an ordering of individuals, \(\alpha \prec_{\text{cl}}\beta\) is
\(\{ \langle \alpha , \beta \rangle \mid \alpha\) , \(\beta \subseteq\mathcal{F}( \prec ) \; \amp \; \alpha \not\subseteq \beta \; \amp \;\forall x \forall y (x \in \beta \; \amp \; y \not \in \alpha \;\supset \; y \prec x )\: \} \)

This is explained in the Summary of \(\ast 170\): “\(\alpha\) and \(\beta\) each pick out terms from \(C`P\), and these terms have an order conferred by \(P\); we suppose that the earlier terms selected by \(\alpha\) and \(\beta\) are perhaps the same, but sooner or later, if \(\alpha \neq \beta\), we must come to terms which belong to one but not the other. We assume that the earliest terms of this sort belong to \(\alpha\), not to \(\beta\); in this case, \(\alpha\) has to \( \beta\) the relation \(P_{\text{cl}}\). That is, where \(\alpha\) and \(\beta\) begin to differ, it is terms of \(\alpha\) that we come to, not terms of \(\beta\). We do not assume that there is a first term which belongs to \(\alpha\) but not \(\beta \), since this would introduce undesirable restrictions in case \(P\) is not well-ordered.” (PM II, 399)

\(P_{\text{lc}}\)

(converse ordering of classes by first differences of \(P\)) [∗170·02]
Cnv \(` (\breve{P})_{\text{cl}}\)
\( \{ \langle \alpha, \beta \rangle \mid \beta \succ_{\text{cl}} \alpha \} \), where \(\succ\) is the converse of \(\prec\)

“Thus \(\alpha P_{\text{lc}} \beta\) means, roughly speaking, that \(\beta - \alpha\) goes on longer than \( \alpha - \beta\), just as \(\alpha P_{\text{cl}} \beta\) means that \( \alpha - \beta\) begins sooner. Thus if \(P\) is the relation of earlier and later in time, and \(\alpha\) and \(\beta\) are the times when \(A\) and \(B\) respectively are out of bed, “\(\alpha P_{\text{cl}}\beta\)” will mean that \(A\) gets up earlier than \(B\) and “\(\alpha P_{\text{lc}} \beta\)” will mean that \(B\) goes to bed later than \(A\).” (PM II, 401)

14. Series (Part V)

“Series” in PM are linear orderings. Volume II concludes halfway through this part, with Volume III beginning at \(\ast250\) and the theory of well-orderings. These concepts are defined in the now standard way. This section is only alien to the modern reader because of the notation.

trans \( P \)	(\(P\) is a transitive relation) [∗201·1] \(P^2 \subset \! \! \! \! \cdot \;\; P\) \(\forall x \forall y \forall z (Pxy \: \amp \: Pyz \: \supset \:Pxz)\)
connex \(P\)	(\(P\) is connected) [∗202·1] \(x \in C` P \; . \supset_x . \; \stackrel{\leftrightarrow}{P} ` x =C`P\) \(\forall x \forall y [(x, y \in \mathcal{F} P ) \: \amp \: x \neq y\supset Pxy \vee Pyx ]\)
Ser	(series) [∗204·01] Rl \(\: ‘ J \cap\) trans \(\cap\) connex \(\{ P \mid \; \forall x \forall y (Pxy \supset x \neq y) \; \amp \;P\) is transitive \(\amp \; P\) is connected \(\}\) or \(P\) is a linear ordering
sect	(sections) [∗211·01] sect \( ‘ P = \hat{\alpha} ( \alpha \subset C ‘ P \:. \: P“ \alpha \subset \alpha) \) \(\{ \alpha \mid \alpha \subset \mathcal{F}P \: \amp \: \forall x [\exists y ( y \in \alpha \: \amp \: Pxy ) \supset x \in \alpha \: ] \:\} \)

“The theory of the modes of separation of a series into two classes, one of which wholly precedes the other, and which together make up the whole series, is of fundamental importance. \( \ldots \) Any class which can be the first of such a pair we call a section of our series.” (PM II, 603)
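
The three conditions that make a relation a series, and the definition of a section, can be tested mechanically on a finite relation. The sketch assumes relations as finite sets of pairs and follows the modern readings of [∗201·1], [∗202·1], [∗204·01] and [∗211·01] given above; the function names and the `less_than` example are hypothetical.

```python
from itertools import combinations

def field(P):
    """C'P: the terms that occur on either side of P."""
    return {x for (x, _) in P} | {y for (_, y) in P}

def is_transitive(P):
    return all((x, z) in P for (x, y) in P for (w, z) in P if y == w)

def is_connected(P):
    f = field(P)
    return all(x == y or (x, y) in P or (y, x) in P for x in f for y in f)

def is_series(P):
    """P is contained in diversity, transitive and connected."""
    contained_in_diversity = all(x != y for (x, y) in P)
    return contained_in_diversity and is_transitive(P) and is_connected(P)

def sections(P):
    """sect'P: the subclasses alpha of C'P with P''alpha contained in alpha."""
    items = list(field(P))
    result = set()
    for size in range(len(items) + 1):
        for c in combinations(items, size):
            alpha = set(c)
            predecessors = {x for (x, y) in P if y in alpha}  # P''alpha
            if predecessors <= alpha:
                result.add(frozenset(alpha))
    return result

# Hypothetical example: "less than" on 1, 2, 3.
less_than = {(x, y) for x in range(1, 4) for y in range(1, 4) if x < y}
```

For this three-term series the sections are the four initial stretches: the empty class, \(\{1\}\), \(\{1, 2\}\) and \(\{1, 2, 3\}\).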

\(\varsigma ‘ P\)

(the series of segments of \(P\)) [∗212·01]
\(P_{\text{lc}} \restriction \! \! \! \downharpoonright\)D\(`P_{\in}\)

The Summary of \(\ast 211\) explains the definition as follows: “The members of D\( “P_{\epsilon} \) are called the segments of the series generated by \(P\). In a series in which every sub-class has a maximum or a sequent [immediate successor (cf. \(\ast 206\))], \( \text{D} “P_{\epsilon}= \overrightarrow{P}“C‘P \) \((∗211·38)\), i.e. the predecessors of a class are always the predecessors of a single term, namely the maximum of the class if it exists, or the sequent if no maximum exists. \( \ldots \) Thus in general the series of segments will be larger than the original series. For example, if our original series is of the type of the series of rationals in order of magnitude, the series of segments is of the type of the series of real numbers, i.e. the type of the continuum.” (PM II, 603)

We have no need for a special notation for the series of sections,since, in virtue of ∗211·13, it is \(\varsigma ‘P_{\ast} \ldots \). (PM II, 628)

Volume III begins with \(\ast 250\) on well-orderings. An ordinal number is then defined as a class of ordinally similar well-orderings.

Bord	(well ordered relations, Bene ordinata) [∗250·01] \(\hat{P} \{\) Cl ex \(‘C ‘P \subset \backd ‘\text{min}_{P} \}\) \(\{ P \mid \forall \alpha \; [ \: (\alpha \subseteq \mathcal{F}P \:\amp \: \alpha \neq \emptyset )\supset \exists x (x \in \alpha \amp\forall z (z \in \alpha \supset \sim Pzx )\: )\: ] \: \}\)
\(\Omega\)	(the well ordered series) [∗250·02] Ser \(\; \cap \;\) Bord \(\Omega\) is the class of well ordered linear orderings.
NO	(Ordinal Numbers) [∗251·01] Nr “ \(\Omega\) The ordinal numbers are classes of isomorphic well-ordered linear orderings.
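
The definition of Bord says that every non-empty subclass of the field has a minimal member in the sense of [∗93·02]. For a finite relation this can be checked exhaustively, as in the following sketch; it assumes relations as finite sets of pairs, and the function names are hypothetical.

```python
from itertools import combinations

def field(P):
    """C'P: the terms that occur on either side of P."""
    return {x for (x, _) in P} | {y for (_, y) in P}

def minimal_members(P, alpha):
    """min_P'alpha: members of alpha in C'P with no P-predecessor in alpha."""
    return {x for x in alpha & field(P)
            if not any((z, x) in P for z in alpha)}

def is_bord(P):
    """Every non-empty subclass of C'P has a minimal member."""
    items = list(field(P))
    for size in range(1, len(items) + 1):
        for c in combinations(items, size):
            if not minimal_members(P, set(c)):
                return False
    return True

# Hypothetical examples: "less than" on 0..3, and a two-term cycle.
less_than = {(x, y) for x in range(4) for y in range(4) if x < y}
cycle = {(1, 2), (2, 1)}
```

The cycle fails the test because the class \(\{1, 2\}\) has no minimal member: each term has a predecessor in the class.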

“Zermelo’s Theorem”, that the Multiplicative Axiom (Axiom of Choice) implies that every set can be well-ordered, is derived in \(\ast 258\). This was first proved in Zermelo (1904).

\[\tag*{∗258·32 } \mu \sim \in \: 1 \: . \; \exists ! \in_{\Delta}‘\text{Cl ex} ‘ \mu \:. \supset . \: \mu \: \epsilon \: C“ \Omega \]

15. Quantity (Part VI)

The last section of PM studies the rational numbers and real numbers. They are constructed from relations between entities, such as being longer than, or being heavier than, that might be measured with a ruler or balance scale. Contemporary measurement theory studies the relations among entities in order to determine which scales, or systems of independently characterized numbers, might be assigned to them as representing the “quantities” of various properties, such as length or weight, that they possess. Notice that the real numbers are not constructed as classes of rational numbers, but are of a uniform type as “Dedekind cuts” in a series of classes of ratios. In PM, as in contemporary mathematics, because the class (segment) of rational numbers \( \{ r \mid r^2 \leq 2 \} \) will have no rational number as a least upper bound, that class itself will be identified with the irrational number \( \sqrt{2}\). The rational number \( 1/2 \) is identified with its (lower) segment of rationals, \( \{ r \mid r \lt 1/ 2 \} \).

\(U\)	(greater than for inductive cardinals) [∗300·01] \((+_c 1)_{\text{po}} \; \upharpoonright \! \! \! \downharpoonright \;( \text{NC induct } - \iota ` \Lambda)\) \(\{ \langle n, m \rangle \mid \; n \gt m \}\)
Prm	(relative primes) [∗302·01] \(\hat{\rho}\hat{\sigma} \{ \rho , \sigma \: \epsilon \: \text{NC induct} \; :\) \(\;\rho = \xi \times_c \tau \: . \: \sigma = \eta\times_c \tau . \supset_{\xi, \eta, \tau} . \tau = 1\}\) \(r \; \text{and} \; s \; \text{are}\) relatively prime iff \(\forall j \: \forall l \: \forall k \: [ ( r = j \times k \; \amp \;s = l \times k) \supset k = 1]\)
\((\rho , \sigma) \text{Prm}_{\tau} (\mu , \nu )\)	(\(\rho / \sigma \; \text{is}\; \mu / \nu\) in its lowest terms and \(\tau\) is the highest common factor of \(\mu \;\text{and} \; \nu\)) [∗302·02] \(\rho \: \text{Prm} \: \sigma \; . \; \tau \in \text{NC induct} -\iota ` 0 \; . \; \mu = \rho \times_{\text{c}} \tau \; . \) \(\; \nu =\sigma \times_{\text{c}} \tau\) The ratio of \(r/s\) is \(m /n\) in its lowest terms with \(k\) as its highest common factor \(=_{\text{df}}\) \(r\) and \(s\) are relatively prime and \( m = r \times k \; \amp \; n = k \times s\)
\((\rho , \sigma) \text{Prm}(\mu , \nu )\)	(the ratio \(\rho / \sigma \; \text{is}\; \mu / \nu \; \text{in its}\) lowest terms) [∗302·03] \((\exists \tau) . \: (\rho, \sigma ) \text{Prm}_{\tau} (\mu , \nu)\) The ratio of \(r/s\) is \(m /n\) in its lowest terms \(=_{\text{df}}\exists k (\text{The ratio of}\; r/s \; \text{ is} \; m /n \; \text{in its lowest terms with}\; k\) as its highest common factor.)
\(\mu / \nu\)	(the ratio of relations \( \mu \) and \( \nu \)) [∗303·01] \(\hat{R} \hat{S} \{ (\exists \rho, \sigma) . (\rho, \sigma)\text{Prm}(\mu, \nu) . \: \dot{\exists} ! \; R^{\sigma} \: \dot{\cap}\: S^{\rho} \}\) \(\{ \langle R, S \rangle \mid \exists r \exists s \:( r/s \) is \(m/n \) in lowest terms and \(\exists x \exists y (R^s xy \amp S^r xy))\}\)

“A distance on a line is a one-one relation whose converse domain (and its domain too) is the whole line. If we call two such distances \(R\) and \(S\), we may say that they have the ratio \(\mu / \nu\) if, starting from some point \(x\), \(\nu\) repetitions of \(R\) take us to the same point \(y\) as we reach by \(\mu\;\) repetitions of \(S\), i.e., if \(xR^{\nu} y \: . \: xS^{\mu} y\).” (PM III, 260)
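
The informal condition in this passage, \(xR^{\nu} y \: . \: xS^{\mu} y\), can be tried out on a finite stretch of a line. The sketch below is an illustration under several assumptions: relations are finite sets of pairs, the "line" is the integers 0 to 12, and `R` and `S` are steps of 2 and 3 units. It tests only the quoted condition and omits the reduction to lowest terms that the official definition [∗303·01] builds in; all names are hypothetical.

```python
def relative_product(R, S):
    """R|S: the pairs (x, z) such that xRy and ySz for some y."""
    return {(x, z) for (x, y) in R for (w, z) in S if y == w}

def power(R, n):
    """R^n for n >= 1: R composed with itself n times."""
    result = set(R)
    for _ in range(n - 1):
        result = relative_product(result, R)
    return result

def have_ratio(R, S, mu, nu):
    """Some x and y satisfy both x R^nu y and x S^mu y."""
    return len(power(R, nu) & power(S, mu)) > 0

# Hypothetical distances on the points 0..12: steps of 2 and steps of 3.
R = {(x, x + 2) for x in range(11)}
S = {(x, x + 3) for x in range(10)}
```

Three steps of 2 and two steps of 3 both cover 6 units, so `R` and `S` stand in the ratio 2/3 on this reading, while the ratios 1/1 and 3/2 fail.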

Rat def	(definite ratios) [∗303·05] \(\hat{X} \{ (\exists \mu, \nu) . \: \mu , \nu \: \epsilon \:\text{D}`U \: \cap \; \backd `U . \: X = (\mu / \nu) \upharpoonright\!\!\! \downharpoonright \; t_{11} ` \mu \}\) The class of ratios restricted to members of a given type. Notice that the following definitions will not depend on the Axiom of Infinity.
\(X \lt_r Y\)	(less than between ratios) [∗304·01] \(( \exists \mu, \nu, \rho , \sigma).\) \(\: \mu, \nu, \rho , \sigma\: \epsilon \: \text{Nc induct} \: . \sigma \neq 0 .\) \(\; \mu\times_c \sigma \lt \nu \times_c \rho \: . \; X = \mu / \nu \: . \; Y= \rho / \sigma\) \(X \lt Y =_{\text{df}} \exists j \: \exists k \: \exists m \: \exists n\) \(\: ( j \times m \lt k \times n \; \amp \; X = j/k \; \amp \; Y =n/m )\)
\(H\)	(the relation less than between definite ratios) [∗304·02] \(\hat{X} \hat{Y} \{ X,Y \: \in \: \text{Rat def} \: . \: X \lt_r Y\}\) \(\{ \langle r, s \rangle \mid \; r \: \text{is rational} \; \amp \; s\: \text{is rational} \; \amp \; r \lt s \}\) “H” is a capital eta “\(\eta\)”, Cantor’s symbol for rational numbers.
\(\Theta\)	(real numbers) [∗310·01] \((\varsigma ‘ H ) \upharpoonright \!\!\! \downharpoonright \; (- \iota ‘ \Lambda - \iota ‘\text{D}‘ H)\)

“The series of real numbers other than 0 and infinity” (PM III, 316) are the series of the segments of rational numbers other than the empty class and the whole series.

16. Conclusion

This summary cites about 110 of the definitions in PM. The last eight pages (667–674) of Volume I of the second edition (1925) consist of a complete list of 498 definitions from all three volumes. Correspondence in the Bertrand Russell Archives confirms that this was compiled by Dorothy Wrinch. Her list can be used to trace every one of the other defined expressions of PM back to the notation discussed in this entry.

Bibliography

Boolos, G., Burgess, J., and Jeffrey, R., 2007, Computability and Logic, 5th edition, Cambridge: Cambridge University Press.
Carnap, R., 1947, Meaning and Necessity, Chicago: University of Chicago Press.
Church, A., 1976, “Comparison of Russell’s Resolution of the Semantical Antinomies with That of Tarski”, Journal of Symbolic Logic, 41: 747–60.
Chwistek, L., 1924, “The Theory of Constructive Types”, Annales de la Société Polonaise de Mathématique (Rocznik Polskiego Towarzystwa Matematycznego), II: 9–48.
Curry, H.B., 1937, “On the use of Dots as Brackets in Logical Expressions”, Journal of Symbolic Logic, 2: 26–28.
Elkind, Landon D.C., and Zach, R., 2023, “The Genealogy of \( \vee \)”, Review of Symbolic Logic, 16(3): 862–899.
Feys, R. and Fitch, F.B., 1969, Dictionary of Symbols of Mathematical Logic, Amsterdam: North Holland.
Fraenkel, A.A., 1968, Abstract Set Theory, Amsterdam: North Holland.
Gödel, K., 1944, “Russell’s Mathematical Logic”, in P.A. Schilpp, ed., The Philosophy of Bertrand Russell, LaSalle: Open Court, 125–153.
Krivine, J-L., 1971, Introduction to Axiomatic Set Theory, Dordrecht: D. Reidel.
Landini, G., 1998, Russell’s Hidden Substitutional Theory, New York and Oxford: Oxford University Press.
Linsky, B., 1999, Russell’s Metaphysical Logic, Stanford: CSLI Publications.
–––, 2009, “From Descriptive Functions to Sets of Ordered Pairs”, in Reduction – Abstraction – Analysis, A. Hieke and H. Leitgeb (eds.), Ontos: Munich, 259–272.
–––, 2011, The Evolution of Principia Mathematica: Bertrand Russell’s Manuscripts and Notes for the Second Edition, Cambridge: Cambridge University Press.
Quine, W.V.O., 1951, “Whitehead and the Rise of Modern Logic”, The Philosophy of Alfred North Whitehead, ed. P.A. Schilpp, 2nd edition, New York: Tudor Publishing, 127–163.
Russell, B., 1905, “On Denoting”, Mind (N.S.), 14: 530–538.
Suppes, P., 1960, Axiomatic Set Theory, Amsterdam: North Holland.
Turing, A.M., 1942, “The Use of Dots as Brackets in Church’s System”, Journal of Symbolic Logic, 7: 146–156.
Whitehead, A.N. and B. Russell, [PM], Principia Mathematica, Cambridge: Cambridge University Press, 1910–13, 2nd edition, 1925–27.
Whitehead, A.N. and B. Russell, 1962, Principia Mathematica to ∗56, Cambridge: Cambridge University Press.
Zermelo, E., 1904, “Proof that every set can be well-ordered”, in From Frege to Gödel, J. van Heijenoort (ed.), Cambridge, Mass: Harvard University Press, 1967, 139–141.

Other Internet Resources

Principia Mathematica, first edition (1910–13), reproduced in the University of Michigan Historical Math Collection.
Russell’s “On Denoting”, from the reprint in Logic and Knowledge (R. Marsh, ed., 1956) of the original article in Mind 1905, typed into HTML by Cosma Shalizi (Center for the Study of Complex Systems, U. Michigan).

Acknowledgments

The author would like to thank Gregory Landini, Dick Schmitt, Franz Fritsche, Rafal Urbaniak, Adam Trybus, Pawel Manczyk, Kenneth Blackwell, Dirk Schlimm, and Paolo Argolo for corrections to this entry. Axel Boldt studied the most recent revisions and found numerous mathematical errors. Among other insights, he pointed out the use of double dots for conjunction at [∗10·55] and the oddity involved in the PM notions of the domain and range of a function.

Library of Congress Catalog Data: ISSN 1095-5054
