Principia Mathematica

First published Tue May 21, 1996; substantive revision Wed Jun 23, 2021

This entry briefly describes the history and significance of Alfred North Whitehead and Bertrand Russell’s monumental but little-read classic of symbolic logic, Principia Mathematica (PM), first published in 1910–1913. The content of PM is described in a section-by-section synopsis, stated in modernized logical notation and described following the introductory notes from each of the three volumes. The original notation is presented in a companion article of this Encyclopedia, The Notation of Principia Mathematica. The content of PM is described so as to facilitate a comparison with Gottlob Frege’s Basic Laws of Arithmetic, which was subject to Russell’s Paradox. To avoid the paradox Whitehead and Russell introduced a complex system now called “the ramified theory of types”. After the introduction of a theory of sets, or “classes”, early in the first volume, however, the system of PM can be compared with both Frege and the early development of set theory and found to contain rival accounts, free of contradiction, but differing from the now standard theories in as yet understudied ways.

1. Overview

Principia Mathematica, the landmark work in formal logic written by Alfred North Whitehead and Bertrand Russell, was first published in three volumes in 1910, 1912 and 1913. A second edition appeared in 1925 (Volume I) and 1927 (Volumes II and III). In 1962 an abbreviated issue (containing only the first 56 chapters) appeared in paperback.

Written as a defense of logicism (the thesis that mathematics is in some significant sense reducible to logic), the book was instrumental in developing and popularizing modern mathematical logic. It also served as a major impetus for research in the foundations of mathematics throughout the twentieth century. Along with Aristotle’s Organon and Gottlob Frege’s Basic Laws of Arithmetic, it remains one of the most influential books on logic ever written.

This entry includes a presentation of the main definitions and theorems used in the development of the logicist project in PM. The entry indicates a path through the whole work, presenting the basic results proved in Principia Mathematica (PM) in a somewhat more contemporary notation, so as to make it easy to compare the system of Whitehead and Russell with that of Frege, the other most prominent advocate of logicism in the foundations of mathematics. The aim of that program, as described by Russell in the opening lines of the preface to his 1903 book The Principles of Mathematics, was to define mathematical notions in terms of logical notions, and to derive mathematical principles, so defined, from logical principles alone:

The present work has two main objects. One of these, the proof that all pure mathematics deals exclusively with concepts definable in terms of a very small number of fundamental concepts, and that all its propositions are deducible from a very small number of fundamental logical principles, is undertaken in Parts II–VII of this work, and will be established by strict symbolic reasoning in Volume II.… The other object of this work, which occupies Part I., is the explanation of the fundamental concepts which mathematics accepts as indefinable. This is a purely philosophical task…. (1903: xv)

Though Frege’s system was subject to Russell’s Paradox, subsequent examination of his system shows how much of the development of arithmetic is possible independently of the paradoxical elements of the system. In particular, recent interest in Frege’s system has led to the isolation of what is called “Frege’s Theorem”, which is provable in a consistent fragment of Frege’s original system and from which arithmetic, as formalized in Peano’s Postulates, can be derived. See the entry Frege’s theorem and foundations for arithmetic, which presents this aspect of Frege’s system in contemporary notation.

Russell had written The Principles of Mathematics (PoM), which presents the basic elements of his logicist program, before discovering Frege’s similar work in Foundations of Arithmetic and Basic Laws of Arithmetic, in June of 1902. As he describes in the Preface, Russell intended a formal presentation of his account in a “Volume II” of PoM. In 1903 he enlisted Alfred North Whitehead to join him in the writing of this second volume, but soon the project turned into a new work, Principia Mathematica, a massive three-volume work that was not to be published until 1910 (Volume I), 1912 (Volume II) and 1913 (Volume III).

The system of PM differed significantly from Frege’s system, in large part because of the introduction of the theory of types, whose purpose was to avoid, in a principled fashion, the paradox that had affected Frege. A second important difference from Frege’s system is that PM is based on a logic of relations of various numbers of arguments, whereas Frege’s system was based on the notions of function and object, with even his distinctively logical concepts being seen as functions (from a number of objects to the truth values T and F, which are also objects in Frege’s system). So, it might be said that PM is based on a theory of ramified types of relations, in contrast to Frege’s second order predicate calculus with concepts. The most important step is to define set expressions in terms of higher-order functions. Thus the paradoxical “Russell set”, the set of all sets which are not members of themselves, \(\{ x \mid x \notin x\}\), is defined by an expression involving functions that will violate the theory of types. The expression for the offending class is ruled out on the basis of the theory of types, as is its seemingly innocuous complement, the set of all sets that are members of themselves, \(\{ x \mid x \in x\}\). In contemporary set theory \(\{ x \mid x \notin x\}\) is the universe of sets, which is not itself a set, and because no set is an element of itself, \(\{ x \mid x \in x\}\) is just the empty set. An additional cost of this method is that while for Frege sets are objects of the lowest type, there will be sets in the PM theory in a simple theory of types, which distinguishes individuals and sets of individuals and sets of sets of individuals, etc. Even to derive a hierarchy of sets in the simple theory, the axiom of reducibility is needed to guarantee that more complex “impredicative” definitions pick out sets of the same simple type. Thus the “least upper bound” of a closed interval of real numbers will identify a member of that set of a higher order in the ramified theory. That this least upper bound will be of the same simple type requires the axiom of reducibility.

The cost of adopting the theory of types to avoid the paradox extends to difficulties in constructing the natural numbers. While Russell follows Frege in many important details, in particular in using Frege’s notion of the ancestral of the successor relation to define the natural numbers, other parts of the construction are importantly different. Frege was able to define the successor of a number by using the set of its predecessors. The number 2 is the set containing 0 and 1, and thus it has two members. They will, however, be of different types in the hierarchy of simple types, and so the whole set of natural numbers cannot be defined within the theory of simple types. Since each step from 0 to 1, to 2, etc., raises the simple types from 0 to 1 to 2, there will be no simple type of all the natural numbers, so defined. Instead PM adopts the axiom of infinity, which assures the existence of an infinite number of individuals, allowing for the construction of the natural numbers for each type above a lower bound of 3 or so (as numbers will be sets of equinumerous sets of individuals…).
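
The way a type restriction blocks the paradox can be sketched as follows. The type superscripts are an illustrative convention in the style of a simple theory of types, assumed here for exposition, and are not the notation of PM itself. Writing \(R\) for the Russell set, unrestricted formation of set expressions gives

\[ R = \{ x \mid x \notin x\} \quad\Longrightarrow\quad (R \in R \leftrightarrow R \notin R), \]

which is contradictory. If instead every variable carries a type index and membership is well formed only between adjacent types,

\[ x^{n} \in y^{n+1}, \]

then neither \(x \in x\) nor \(x \notin x\) counts as a formula, so neither \(\{ x \mid x \notin x\}\) nor \(\{ x \mid x \in x\}\) can be written down.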

With this turn to the ramified theory of types, along with the extra axioms of reducibility and infinity, it is possible for PM to define a version of Frege’s construction of the natural numbers so that the “Peano axioms” can be proved from logic alone. This takes up to section ∗120, well into Volume II. At this point the alternative to “Frege’s Theorem” is completed, in the sense that we are presented with a consistent development of the natural numbers, based on a theory of higher-order logic with a number of additional axioms. Philosophers soon followed Ludwig Wittgenstein (1922) and disputed the idea that these additional axioms, the axioms of reducibility and infinity, are really logical truths, and so denied that the logicist program of reducing arithmetic to logic was any more successful than Frege’s attempt had been.

The survey of PM will proceed through the remainder of Volume II and through Volume III, where the theories of rational and real numbers are developed. The contrast intended here is not with Frege’s theories of rational and real numbers, which are present in Grundgesetze but are not seen as a natural extension of the theory of natural numbers. Instead the contemporary account of natural numbers and real numbers is seen as an elementary extension of axiomatic Zermelo-Fraenkel set theory. A contemporary textbook in axiomatic set theory, such as Enderton (1977) or Suppes (1960), shows how to construct rational numbers (and negative integers) as pairs of natural numbers; thus 3/4 is constructed as the pair \(\langle 3, 4\rangle\), with the operations of addition and multiplication defined as operations on pairs; thus \(1/2 + 1/3 = 10/12 = 5/6\). These positive rational numbers are extended to the whole set by adding negative integers, and then real numbers are defined as Dedekind cuts in the rational numbers, i.e., the set of partitions of sets of rational numbers. The arithmetic of real numbers is then defined for these constructions, and so with sets of real numbers the whole of analysis can be reduced to arithmetic. PM, however, avoids this “arithmetization” of analysis, and instead defines rational, real and in fact a huge class of “relation numbers” as sets of isomorphic sets of relations. Russell says later that he regrets that the theory of relation numbers was not picked up by later set theorists, even though this was some of his most original work in PM. The brief summary of these later topics that we include below can therefore be seen as a summary of the interesting consequences of taking a different route to the definition of natural numbers based on a logic of relations and properties, rather than the set theory of contemporary foundations of mathematics. This entry is thus aimed at an explication of the unusual order of presentation of these results, in comparison with both Frege and contemporary set theory, and at illustrating those aspects of the theory of relations that are not investigated by contemporary researchers.
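
The textbook construction of positive rationals as pairs, mentioned above, can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the function names (`add`, `mul`, `equivalent`, `lowest_terms`) are hypothetical and are not taken from Enderton or Suppes, and only pairs of positive integers are handled.

```python
from math import gcd

# A positive rational a/b is represented as the pair (a, b) of positive integers.
# All names below are illustrative assumptions, not textbook notation.

def add(p, q):
    """Add two pairs: a/b + c/d = (a*d + c*b) / (b*d)."""
    (a, b), (c, d) = p, q
    return (a * d + c * b, b * d)

def mul(p, q):
    """Multiply two pairs: (a/b) * (c/d) = (a*c) / (b*d)."""
    (a, b), (c, d) = p, q
    return (a * c, b * d)

def equivalent(p, q):
    """Two pairs name the same rational when a*d == c*b."""
    (a, b), (c, d) = p, q
    return a * d == c * b

def lowest_terms(p):
    """Pick the canonical representative of a pair's equivalence class."""
    a, b = p
    g = gcd(a, b)
    return (a // g, b // g)

half, third = (1, 2), (1, 3)
total = add(half, third)
print(total)                          # (5, 6)
print(equivalent(total, (10, 12)))    # True
print(lowest_terms((10, 12)))         # (5, 6)
```

Because many pairs stand for the same rational, the sketch compares pairs with `equivalent` rather than with plain equality; a set-theoretic treatment would typically work with the equivalence classes themselves.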

2. History of and Significance of Principia Mathematica

2.1 History of Principia Mathematica

Logicism is the view that (some or all of) mathematics can be reduced to (formal) logic. It is often explained as a two-part thesis. First, it consists of the claim that all mathematical truths can be translated into logical truths or, in other words, that the vocabulary of mathematics constitutes a proper subset of the vocabulary of logic. Second, it consists of the claim that all mathematical proofs can be recast as logical proofs or, in other words, that the theorems of mathematics constitute a proper subset of the theorems of logic. As Russell writes, it is the logicist’s goal “to show that all pure mathematics follows from purely logical premises and uses only concepts definable in logical terms” (1959: 74).

The logicist thesis appears to have been first advocated in the late seventeenth century by Gottfried Leibniz. Later, the idea was defended in much greater detail by Gottlob Frege. During the critical movement of the 1820s, mathematicians such as Bernard Bolzano, Niels Abel, Louis Cauchy, and Karl Weierstrass succeeded in eliminating much of the vagueness and many of the contradictions present in the mathematics of their day. By the mid- to late-1800s, William Hamilton had gone on to introduce ordered couples of reals as the first step in supplying a logical basis for the complex numbers, and Karl Weierstrass, Richard Dedekind, and Georg Cantor had all developed methods for founding the irrationals in terms of the rationals. Using work done by H.G. Grassmann and Richard Dedekind, Giuseppe Peano had then gone on to develop a theory of the rationals based on his now famous axioms for the natural numbers. By Frege’s day, it was thus generally recognized that large parts of mathematics could be derived from a relatively small set of primitive notions.

Even so, it was not until 1879, when Frege developed the necessary logical apparatus, that logicism could finally be said to have become technically plausible. After another five years’ work, Frege arrived at the definitions necessary for logicising arithmetic, and during the 1890s he worked on many of the essential derivations. However, with the discovery of paradoxes such as Russell’s paradox at the turn of the century, it appeared that additional resources would need to be developed if logicism were to succeed.

By 1902, both Whitehead and Russell had reached this same conclusion. Both men were in the initial stages of preparing second volumes to their earlier books on related topics: Whitehead’s 1898 A Treatise on Universal Algebra and Russell’s 1903 The Principles of Mathematics. Since their research overlapped considerably, they began collaborating on what would eventually become Principia Mathematica. By agreement, Russell worked primarily on the philosophical parts of the project, including the book’s philosophically rich Introduction, the theory of descriptions, and the no-class theory (in which set or class terms become meaningful only when placed in well-defined contexts), all of which can still be read fruitfully even by non-specialists. The two men then collaborated on the technical derivations. As Russell writes,

As for the mathematical problems, Whitehead invented most of the notation, except in so far as it was taken over from Peano; I did most of the work concerned with series and Whitehead did most of the rest. But this only applies to first drafts. Every part was done three times over. When one of us had produced a first draft, he would send it to the other, who would usually modify it considerably. After which, the one who had made the first draft would put it into final form. There is hardly a line in all the three volumes which is not a joint product. (1959: 74)

Initially, it was thought that the project might take a year to complete. Unfortunately, after almost a decade of difficult work on the part of the two men, Cambridge University Press concluded that publishing Principia would result in an estimated loss of 600 pounds. Although the press agreed to assume half this amount and the Royal Society agreed to donate another 200 pounds, this still left a 100-pound deficit. Only by each contributing 50 pounds were the authors able to see their work through to publication. (Whitehead, Russell, & James 1910)

Publication involved the enormous job of type-setting all three volumes by hand. In 1911, the printing of the second volume was interrupted when Whitehead discovered a difficulty with the symbolism. The result was the insertion (on roman numeral pages) of a long “Prefatory Statement of Symbolic Conventions” at the beginning of Volume II.

The initial print run of 750 copies of Volume I and 500 copies of each of Volumes II and III from Cambridge University Press had been sold by 1922, when Rudolf Carnap wrote to Russell asking for a copy. Russell responded by sending Carnap a 35-page handwritten summary of the definitions and some important theorems in the work (Linsky 2011: 14–15). As no plates were available for a second printing, Russell began the work of preparing a second edition that appeared in 1925–27. The first volume was reset along with a new introduction and three appendices, and Volume II was reset as well. Volume III was reproduced by a photographic process, and so the page numbers from the first edition are the same in this volume. Principia Mathematica is still in print with Cambridge University Press.

As with many works in mathematics, the later progress of the field of symbolic logic led to numerous improvements. Work in the school of logic started by David Hilbert at Göttingen and in the Polish school of logicians led by S. Leśniewski and his most famous student, Alfred Tarski, began with correcting what they saw as defects and gaps in PM. See Kahle (2013) and Wolenski (2013). The criticisms were immediate, begun by Chwistek (1912) soon after the first volume had been published. A series of important new presentations of mathematical logic, in particular Hilbert and Ackermann (1928), Hilbert and Bernays (1934), and Kleene (1952), were adopted as textbooks by successive generations of logicians. As pointed out in Urquhart (2013), this led to a slow decline in the number of references to PM in technical work in logic, as well as its gradual replacement by other texts for the Introduction to Symbolic Logic courses that soon became a staple offering of university departments of philosophy. By the 1950s PM was no longer used as a textbook, even in graduate courses. PM’s influence, then, was enormous from 1910 to 1950, with it now having the status of a recognized classic that is unfamiliar to students of logic, and even unreadable because of its superseded notation. This entry, together with the entry on the notation in Principia Mathematica, is intended to make the contributions of this monumental work available, and to enable further research on some of the ideas hidden in those three long volumes.

2.2 Significance of Principia Mathematica

Achieving Principia’s main goal proved to be a challenge. An initial response among mathematicians and logicians in Germany and Poland was to decry the decline in standards of formal rigor set by Frege. This complaint was voiced by Frege himself, in a letter to Philip Jourdain in 1912:

…I do not understand the English language well enough to be able to say definitely that Russell’s theory (Principia Mathematica I, 54ff) agrees with my theory of functions of the first, second, etc. levels. It does seem so. But I do not understand all of it. It is not quite clear to me what Russell intends with his designation \(\phi \bang \hat{x}\). I never know for sure whether he is speaking of a sign or of its content. (Frege 1980: 78)

This claim that the notion of “propositional function” is subject to use-mention confusions has persisted to this day. This entry will present a modernized version of the syntax of PM, combined with an account of the notation for types in the works of Alonzo Church (1974, 1976). Modern theories of types allow for a coherent syntax for higher-order languages which many find adequate to meet these objections. The complaint about the formulation of the syntax of PM was repeated, and a further difficulty was expressed, by Gödel (1944) in his influential survey of PM:

It is to be regretted that this first comprehensive and thorough-going presentation of a mathematical logic and the derivation of mathematics from it [is] so greatly lacking in formal precision in the foundations (contained in ∗1–∗21 of Principia) that it presents in this respect a considerable step backwards as compared with Frege. What is missing, above all, is a precise statement of the syntax of the formalism. Syntactical considerations are omitted even in cases where they are necessary for the cogency of the proofs, in particular in connection with the “incomplete symbols”. These are introduced not by explicit definition, but by rules describing how sentences containing them are to be translated into sentences not containing them. To be sure, however, that (or for what expressions) this translation is possible and uniquely determined and that (or to what extent) the rules of inference apply to the new kind of expressions, it is necessary to have a survey of all possible expressions, and this can be furnished only by syntactical considerations. (Gödel 1944 [1951: 126])

The issue with respect to defined expressions, including the “incomplete symbols” for classes and definite descriptions which are explained below, is still problematic for interpreting PM. The difficulty is that certain defined expressions, such as the notation for definite descriptions, class abstracts and even the identity symbol ‘\(=\)’, are not specified in the initial description of the syntax of the theory, nor are they shown to be validly used as instances of the axioms with their apparent syntax. The method of “contextual definition” used in PM is difficult to formulate rigorously and is not used in contemporary logical theories. The modern presentation of PM in this entry includes the symbols for descriptions and classes, thus differing from the completely rigorous presentations of Church (1976), for example, who avoids both definite descriptions and class expressions, and takes identity as an undefined primitive.

Despite these reactions to the rigor of the presentation, PM nevertheless was studied carefully by those interested in the new symbolic logic, including David Hilbert and those in his school in Göttingen (see Ewald & Sieg 2013: 3 and Chwistek 1912). Primarily at issue were the kinds of assumptions Whitehead and Russell needed to complete their project. Although Principia succeeded in providing detailed derivations of many major theorems in finite and transfinite arithmetic, set theory, and elementary measure theory, three axioms in particular were arguably non-logical in character: the axioms of infinity, reducibility and the “multiplicative axiom” or Axiom of Choice. The axiom of infinity in effect states that there exists an infinite number of objects. Arguably it makes the kind of assumption generally thought to be empirical rather than logical in nature. The multiplicative axiom, later added to Zermelo’s axioms as the Axiom of Choice, asserts the existence of a certain set containing one element from each member of a given set. Russell objected that without a rule guiding the choice, such an axiom was not a logical principle. The axiom of reducibility was introduced as a means of overcoming the not completely satisfactory effects of the theory of types, the mechanism Russell and Whitehead used to restrict the notion of a well-formed expression, thereby avoiding Russell’s paradox. Although the axiom was technically feasible, many critics concluded that it was simply too ad hoc to be justified philosophically. Initially at least, Leon Chwistek (1912) believed that it led to a contradiction. Kanamori sums up the sentiment of many readers:

In traumatic reaction to his paradox Russell had built a complex system of orders and types only to collapse it with his Axiom of Reducibility, a fearful symmetry imposed by an artful dodger. (2009: 411)

In the minds of many, the issue of whether mathematics could be reduced to logic, or whether it could be reduced only to set theory, thus remained open.

In response, Whitehead and Russell argued that both axioms were defensible on inductive grounds. As they tell us in the Introduction to the first volume of Principia,

self-evidence is never more than a part of the reason for accepting an axiom, and is never indispensable. The reason for accepting an axiom, as for accepting any other proposition, is always largely inductive, namely that many propositions which are nearly indubitable can be deduced from it, and that no equally plausible way is known by which these propositions could be true if the axiom were false, and nothing which is probably false can be deduced from it. If the axiom is apparently self-evident, that only means, practically, that it is nearly indubitable; for things have been thought to be self-evident and have yet turned out to be false. And if the axiom itself is nearly indubitable, that merely adds to the inductive evidence derived from the fact that its consequences are nearly indubitable: it does not provide new evidence of a radically different kind. Infallibility is never attainable, and therefore some element of doubt should always attach to every axiom and to all its consequences. In formal logic, the element of doubt is less than in most sciences, but it is not absent, as appears from the fact that the paradoxes followed from premisses which were not previously known to require limitations. (1910: 62 [1925: 59])

Whitehead and Russell were also disappointed by the book’s largely indifferent reception on the part of many working mathematicians. As Russell writes,

Both Whitehead and I were disappointed that Principia Mathematica was only viewed from a philosophical standpoint. People were interested in what was said about the contradictions and in the question whether ordinary mathematics had been validly deduced from purely logical premisses, but they were not interested in the mathematical techniques developed in the course of the work.… Even those who were working on exactly the same subjects did not think it worth while to find out what Principia Mathematica had to say on them. I will give two illustrations: Mathematische Annalen published about ten years after the publication of Principia a long article giving some of the results which (unknown to the author) we had worked out in Part IV of our book. This article fell into certain inaccuracies which we had avoided, but contained nothing valid which we had not already published. The author was obviously totally unaware that he had been anticipated. The second example occurred when I was a colleague of Reichenbach at the University of California. He told me that he had invented an extension of mathematical induction which he called ‘transfinite induction’. I told him that this subject was fully treated in the third volume of the Principia. When I saw him a week later, he told me that he had verified this. (1959: 86)

Despite such concerns, PM proved to be remarkably influential in at least three ways. First, it popularized modern mathematical logic to an extent undreamt of by its authors. By using a notation more accessible than that used by Frege, Whitehead and Russell managed to convey the remarkable expressive power of modern predicate logic in a way that previous writers had been unable to achieve. Second, by exhibiting so clearly the deductive power of the new logic, Whitehead and Russell were able to show how powerful the idea of a modern formal system could be, thus opening up new work in what soon was to be called metalogic. Third, Principia Mathematica re-affirmed clear and interesting connections between logicism and two of the main branches of traditional philosophy, namely metaphysics and epistemology, thereby initiating new and interesting work in both of these areas.

As a result, not only did Principia introduce a wide range of philosophically rich notions (including propositional function, logical construction, and type theory), it also set the stage for the discovery of crucial metatheoretic results (including those of Kurt Gödel, Alonzo Church, Alan Turing and others). Just as importantly, it initiated a tradition of common technical work in fields as diverse as philosophy, mathematics, linguistics, economics and computer science.

Today a lack of agreement remains over the ultimate philosophical contribution of Principia, with some authors holding that, with the appropriate modifications, logicism remains a feasible project. Others hold that the philosophical and technical underpinnings of the project remain too weak or too confused to be of great use to the logicist. (For more detailed discussion, readers should consult Quine 1963, 1966a, 1966b; Landini 1998, 2011; Linsky 1999, 2011; Hale and Wright 2001; Burgess 2005; Hintikka 2009; and Gandon 2012.)

There is also a lack of agreement over the importance of the second edition of the book, which appeared in 1925 (Volume I) and 1927 (Volumes II and III, which were directly reprinted from the first edition). The revisions were done by Russell, although Whitehead was given the opportunity to advise. In addition to the correction of minor errors throughout the original text, changes to the new edition included a new Introduction and three new appendices. (The appendices discuss the theory of quantification, mathematical induction and the axiom of reducibility, and the principle of extensionality respectively.) The book itself was reset more compactly, making page references to the first edition obsolete. Russell continued to make corrections as late as 1949 for the 1950 printing, the year he and Whitehead’s widow finally began to receive royalties.

Today there is still debate over the ultimate value, or even the correct interpretation, of some of the revisions, revisions that were motivated in large part by the work of some of Russell’s brightest students, including Ludwig Wittgenstein and Frank Ramsey. Appendix B has been notoriously problematic. The appendix purports to show how mathematical induction can be justified without use of the axiom of reducibility; but as Alasdair Urquhart reports,

The first indication that something was seriously wrong appeared in Gödel’s well known essay of 1944, “Russell’s Mathematical Logic”. There, Gödel points out that line (3) of the demonstration of Russell’s proposition ∗89·16 is an elementary logical blunder, while the crucial ∗89·12 also appears to be highly questionable. It still remained to be seen whether anything of Russell’s proof could be salvaged, in spite of the errors, but John Myhill provided strong evidence of a negative verdict by providing a model-theoretic proof in 1974 that no such proof as Russell’s can be given in the ramified theory of types without the axiom of reducibility. (Urquhart 2012)

Linsky (2011) provides a discussion, both of the Appendix itself and of the suggestion that by 1925 Russell may have been out of touch with recent developments in the quickly changing field of mathematical logic. He also addresses the suggestion, made by some commentators, that Whitehead may have been opposed to the revisions, or at least indifferent to them, concluding that both charges are likely without foundation. (Whitehead’s own comments, published in 1926 in Mind, shed little light on the issue.)

3. Contents of Principia Mathematica

Principia Mathematica originally appeared in three volumes.

Title page of the first edition of Principia Mathematica, Volume I (1910)

Cover of the first paperback issue of Principia Mathematica to ∗56 (1962)

Together, the three volumes are divided into six parts. The commentary that follows will go through the sections in order, indicating in the early parts where a reader can skip ahead to study the unique features of the development of mathematics in the PM system as contrasted with that of Frege and contemporary set theory.

3.1 Volume I

Volume I is divided into a lengthy Introduction in three sections, followed by two major Parts, I (divided into Sections A–E) and II (also divided into Sections A–E):

Preliminary Explanations of Ideas and Notations
The Theory of Logical Types
Incomplete Symbols
Part I: Mathematical Logic
- A. The Theory of Deduction ∗1–∗5
- B. Theory of Apparent Variables ∗9–∗14
- C. Classes and Relations ∗20–∗25
- D. Logic of Relations ∗30–∗38
- E. Products and Sums of Classes ∗40–∗43
Part II: Prolegomena to Cardinal Arithmetic
- A. Unit Classes and Couples ∗50–∗56
- B. Sub-Classes, Sub-Relations, and Relative Types ∗60–∗65
- C. One-Many, Many-One and One-One Relations ∗70–∗74
- D. Selections ∗80–∗88
- E. Inductive Relations ∗90–∗97

3.2 Volume II

Volume II begins with a preliminary section on notational conventions followed by Parts III (divided into Sections A–C), IV (divided into Sections A–D), and the first half of Part V (Sections A–C):

Prefatory Statement of Symbolic Conventions
Part III: Cardinal Arithmetic
- A. Definition and Logical Properties of Cardinal Numbers ∗100–∗106
- B. Addition, Multiplication and Exponentiation ∗110–∗117
- C. Finite and Infinite ∗118–∗126
Part IV: Relation-Arithmetic
- A. Ordinal Similarity and Relation-Numbers ∗150–∗155
- B. Addition of Relations, and the Product of Two Relations ∗160–∗166
- C. The Principle of First Differences, and the Multiplication and Exponentiation of Relations ∗170–∗177
- D. Arithmetic of Relation-Numbers ∗180–∗186
Part V: Series
- A. General Theory of Series ∗200–∗208
- B. On Sections, Segments, Stretches, and Derivatives ∗210–∗217
- C. On Convergence, and the Limits of Functions ∗230–∗234

3.3 Volume III

Volume III contains the remainder of Part V (Sections D–F) and concludes with Part VI (divided into Sections A–D):

Part V: Series (continued)
- D. Well-Ordered Series ∗250–∗259
- E. Finite and Infinite Series and Ordinals ∗260–∗265
- F. Compact Series, Rational Series, and Continuous Series ∗270–∗276
Part VI: Quantity
- A. Generalization of Number ∗300–∗314
- B. Vector-Families ∗330–∗337
- C. Measurement ∗350–∗359
- D. Cyclic Families ∗370–∗375

A fourth volume on geometry was begun but never completed (Russell 1959: 99).

Overall, the three volumes not only represent a major leap forward with regard to modern logic, they are also rich in early twentieth-century mathematical developments. To give one example, Whitehead and Russell were the first to define a series as a set of terms having the properties of being asymmetrical, transitive and connected (1912 [1927: 497]). To give another, it is in Principia that we find the first detailed development of a generalized version of Cantor’s transfinite ordinals, which the authors call “relation-numbers”. The resulting “relation-arithmetic” in turn led to significant improvements in our understanding of the general notion of structure (1912: Part IV).

As T.S. Eliot points out, the book also did a great deal to promote clarity in the use of ordinary language in the early part of the twentieth century:

how much the work of logicians has done to make of English a language in which it is possible to think clearly and exactly on any subject. The Principia Mathematica are perhaps a greater contribution to our language than they are to mathematics. (1927: 291)

The book is also not without some self-deprecating humour. As Blackwell points out (2011: 158, 160), the authors twice poke fun at the length and tedium of the project’s many logical derivations. In Volume I, the authors explain that one cannot list all the non-intensional functions of \(\phi \bang \hat{z}\) “because life is too short” (1910 [1925: 73]); and in Volume III, after over 1,800 pages of dense symbolism, the authors end Part VI, Section D, on Cyclic Families, with the comment,

We have given proofs rather shortly in this Section, particularly in the case of purely arithmetical lemmas, of which the proofs are perfectly straightforward, but tedious if written out at length. (1913 [1927: 461])

Evidence that the humour originates more with Russell than with Whitehead is perhaps found in not dissimilar remarks that appear in Russell’s other writings. Russell’s comment when discussing the axiom of choice, to the effect that given a collection of sets, it is possible to “pick out a representative arbitrarily from each of them, as is done in a General Election” (1959: 92), is perhaps a case in point.

Readers today (i.e., those who have learned logic in the last few decades of the twentieth century or later) will find the book’s notation somewhat antiquated. Readers wanting assistance are advised to consult the entry on the notation in Principia Mathematica. Even so, the book remains one of the great scientific documents of the twentieth century.

4. Volume I

4.1 Part I: Mathematical Logic

4.1.1 Propositional Logic in PM

The system of propositional logic of PM can be seen as a system of sentential logic consisting of a language and rules of inference. PM contains the first presentation of symbolic logic that deals with propositional logic as a separate theory. Frege had involved quantification from the beginning, while Peano’s system was interpretable as about propositions and classes, with some different principles holding for each interpretation. The propositional logic of PM is unusual for modern readers, for various reasons having to do with its origins in Russell’s earlier work on logic. One is that the axioms of propositional logic are not stated using only the primitive connectives of the logic, which are \(\lnot\) and \(\lor\), but instead use only \(\lor\) and \(\supset\), the latter being a defined connective.

In this section we will use A, B, etc. as meta-linguistic variables for formulas. The formulas constructed from atomic propositions with the connectives are said to express elementary propositions, to distinguish them from propositions involving quantifiers and propositional functions. The system is organized axiomatically. The axioms, called “primitive propositions” or “Pp”, are presented with the characteristic ‘\(\supset\)’ of material implication, which is defined with \(\lnot\) and \(\lor\). The connectives \(\amp\) and \(\equiv\) are also defined, but not needed in the statements of the axioms. This peculiarity has its origins in Russell’s view from 1903 that

The propositional calculus is characterized by the fact that all its propositions have as hypothesis and as consequent the assertion of a material implication. (1903: 13)

All of the “primitive propositions” of The Principles of Mathematics (PoM) are stated with only material implication as a primitive connective. The connectives \(\amp\), \(\lor\) and \(\equiv\) are defined as might be expected. The notion of negation, expressed by \(\lnot\), is defined using a notion of quantification over propositions (\(\lnot A\) means that A implies all propositions). By 1906 Russell had decided to use \(\lnot\) as a primitive connective, and no longer used propositional quantifiers, allowing \(\supset\) to be defined, while the primitive propositions were still stated with \(\supset\) and \(\lor\). That the system of propositional logic in PM was the result of an evolution of changes in choices of primitives is mirrored in the choice of theorems that are proved in the first chapters. While most are proved because they will be used later in PM, some remain simply as remnants of the earlier systems. In particular PM contains several theorems that were primitive propositions in earlier systems, though not used in what follows. In fact one primitive proposition of PoM, known as “Peirce’s Law” (\([(p \supset q )\supset p ] \supset p\)), appears to have been proved in an early version of PM as ∗2·7 but then deleted (and its number not reassigned to another theorem) simply to save space (see Linsky 2016).

The notion of truth-functional semantics for propositional logic, using the familiar truth tables, and the notion of completeness of an axiom system were not developed until soon after the publication of PM, by Bernays (1926). As a result there is no attempt in PM to find a short list of axioms that will be complete, and so at later stages of the work there is no simple appeal to “tautological consequences” which might be easily justified by semantic considerations.

The language of propositional logic in PM has a vocabulary consisting of:

Atomic proposition variables: p, q, r, \(p_1\), … (There are no proposition constants.)
Sentential connectives. Primitive: \(\lnot\) and \(\lor\). Defined: \(\supset\), \(\amp\), \(\equiv\).
Punctuation: \((\), \()\), \([\), \(]\), \(\{\), \(\}\), etc.

The well formed formulas (wffs) are defined as follows:

Atomic proposition variables are wffs.
If A and B are wffs then so are: \(\lnot A\) and \(A \lor B\)

The other familiar connectives are defined:

Definitions \[\begin{align}\tag*{∗1·01} A \supset B & \eqdf \lnot A \lor B\\\tag*{∗3·01} A \amp B & \eqdf \lnot(\lnot A \lor \lnot B)\\\tag*{∗4·01} A \equiv B & \eqdf (A \supset B) \amp (B \supset A)\\ \end{align} \]
Axioms \[\begin{align}\tag*{Pp ∗1·2} (p \lor p ) & \supset p\\\tag*{Pp ∗1·3} q & \supset (p \lor q )\\\tag*{Pp ∗1·4} (p \lor q ) & \supset (q \lor p)\\\tag*{Pp ∗1·5} [p \lor ( q \lor r )] & \supset [q\lor ( p \lor r )]\\\tag*{Pp ∗1·6} ( q \supset r ) & \supset [ (p \lor q) \supset (p \lor r ) ]\\ \end{align}\]

In 1926 Paul Bernays showed that this list could be reduced by one, as the fourth axiom (∗1·5) can be proved from the others.
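The truth-table method postdates PM, so the following is a modern semantic check and not anything found in the text itself. It is a minimal sketch, assuming the classical two-valued reading of \(\lnot\) and \(\lor\) and the definition ∗1·01 of \(\supset\); the names `implies`, `AXIOMS` and `is_tautology` are introduced here only for illustration. It confirms that each of the five primitive propositions, and Peirce’s Law, is truth-functionally valid. It says nothing about derivability, so it does not bear on Bernays’s result that ∗1·5 is redundant.

```python
from itertools import product

def implies(a, b):
    # *1.01: A ⊃ B is defined as ¬A ∨ B
    return (not a) or b

# The five primitive propositions, read as truth functions of p, q, r.
AXIOMS = {
    "*1.2": lambda p, q, r: implies(p or p, p),
    "*1.3": lambda p, q, r: implies(q, p or q),
    "*1.4": lambda p, q, r: implies(p or q, q or p),
    "*1.5": lambda p, q, r: implies(p or (q or r), q or (p or r)),
    "*1.6": lambda p, q, r: implies(implies(q, r), implies(p or q, p or r)),
}

def is_tautology(formula):
    # True when the formula holds under all eight assignments to p, q, r.
    return all(formula(p, q, r) for p, q, r in product([False, True], repeat=3))

axiom_results = {name: is_tautology(f) for name, f in AXIOMS.items()}

# Peirce's Law, [(p ⊃ q) ⊃ p] ⊃ p, the primitive proposition of PoM mentioned above.
peirce_is_valid = is_tautology(lambda p, q, r: implies(implies(implies(p, q), p), p))
```

A formula such as \((p \lor q) \supset p\) fails the same check, which is the sense in which the axioms are sound but not arbitrary.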

Rules of inference:
- Modus ponens (∗1·1): From \(\vdash A \supset B\) and \(\vdash A\), derive \(\vdash B\)
- Substitution: From \(\vdash A\) derive \(\vdash A'\) where \(A'\) is the result of substituting some formula B uniformly for any atomic proposition variable that occurs in A.

There is no explicit statement of a rule of substitution in PM. The free variables in the propositional logic of PM may be interpreted as schematic letters, in which case the system will require a rule of substitution of formulas. Alternatively, they may be interpreted as real variables ranging over propositions, in which case instances would be derived by instantiation from generalizations over all propositions. The announcement in the Introduction that propositions are not necessary in what follows and so will be avoided suggests the schematic interpretation of the variables. We follow the variable interpretation in this article, however, in part to allow our notation to follow PM, with p’s and q’s rather than a new vocabulary of schematic letters A, B, etc. This interpretation of the letters as variables will also assist in the presentation of quantificational logic in PM below.

As is standard for an axiomatic formulation of logic, a derivation of a formula of sentential logic in PM will consist of an instance of one of the five axioms, the result of a substitution in a preceding line, or the application of modus ponens to two preceding lines. Theorems of PM will be proved in order, allowing the use of (instances of) preceding theorems as lines in later derivations.

The resulting system is complete, in the sense that all and only truth-functionally valid sentences are derivable in the system. This is so despite the seeming defects of the system by modern standards, including the redundancy of one of the axioms, the use of defined symbols in expressions to which the rules of inference apply, and the use of defined symbols in the axioms. The derivations in ∗2 to ∗5 are abbreviated, but with an indication on the side of each line of what justifies it, and how any abbreviation can be undone. Theorems are proved primarily as needed in later numbers, but some were axioms, or important theorems, of earlier versions of propositional logic, going back to The Principles of Mathematics. Aside from historical interest in their actual choices, however, the system of PM can be viewed as based on any standard system of propositional logic.

4.1.2 The “Ramified” Theory of Types

The theory of types in the initial chapters of PM is ramified, so that within a given type, of propositions, or of functions of individuals, or of functions of functions of individuals, there will be finer subdivisions. This ramification is necessary for the application of the logic of PM to what are called “epistemological” paradoxes in the Introduction to PM. The most prominent of these is the (propositional) Liar paradox created by the proposition that all propositions of a certain sort, say asserted by Epimenides, are false, when that very proposition is of that sort, that is, when it is the only proposition that Epimenides asserted. The solution in the ramified theory of types requires that a proposition about a sort of first level propositions, say that they are all false, will itself be of the next order.

The paradoxes of the theory of sets are resolved by reducing assertions about sets to assertions about propositional functions. The restriction that a function of one type cannot apply to a function of the same type is enough to block the paradoxes. Thus the distinction between individuals, functions of individuals, and functions of such functions, categorized by what came to be called the “simple theory of types”, is enough for the purposes of reducing mathematics to classes, and so to logic. The idea that the full theory of types was not needed to resolve the mathematical or set theoretical paradoxes was proposed by Chwistek (1921) and Ramsey (1931), and led to the later introduction of the terms “ramified theory of types” and “simple theory of types” that will be used in this entry.

In the Introduction to PM terminology is introduced for the two ways that variables may appear in formulas. The “apparent variables” are bound variables, whereas “real variables” are free variables. The proper interpretation of higher-order variables in PM is the subject of contemporary dispute among scholars of PM. Landini (1998) and Linsky (1999) offer two rival accounts. Landini holds that higher-order free variables should be interpreted as schematic letters, replaceable by formulas, and that the bound variables are to be interpreted “substitutionally”. The logic of the theory of types in PM can be seen as an extension of a standard first order logic developed in ∗10. Then the more distinctive notions of PM that depend on the theory of types can be explained. These include the Axiom of Reducibility, in ∗12, which underlies the so-called ramification of the theory of types, the division into orders of predicates true of a single type of argument. The Axiom of Reducibility asserts that for an arbitrary function of any order there is an equivalent predicative function, that is, one true of exactly the same range of arguments. This will be explained below. Identity is defined, in ∗13, with a version of Leibniz’ notion of the identity of indiscernibles that is consistent with the theory of types. In place of Leibniz’ notion that x and y are identical just in case they share the same properties, in PM x and y are identical if and only if they share the same predicative functions. Then, using the notion of identity so defined, PM presents Russell’s theory of definite descriptions, precisely as it was defined in “On Denoting” (1905). This article will use the notation for “r-types” due to Alonzo Church (1976), which is explained in the accompanying article “The Notation of Principia Mathematica” in this Encyclopedia.

Although PM does not single out first order logic from the whole ramified theory of types, the actual deductive apparatus on the page looks exactly like a system of first order logic, and the complications of the logic of higher types can be expressed with an additional apparatus of type indices. In what follows we will use the system of r-types in Church (1976) for type indices, and the use of lambda operators for propositional functions.

Church’s (1976) formulation of the logic of PM with r-types

The language of the higher-order quantificational logic of PM is called ramified type theory, and the system of types, following Church (1976), will be called r-types. Note that there are two kinds of variables, but each is assigned an r-type. Individual variables behave as a special case of propositional function variables.

(argument) variables: \(x_{\mathbf{\tau}}\), \(y_{\mathbf{\tau}}\), \(z_{\mathbf{\tau}}\), … for each type \(\tau\)
n-place propositional function variables: \(\phi_{\tau}^{n}, \psi_{\tau}^{n}\), … (\(n \geq 1\)), where \(\tau\) is a type symbol. (\(R^n, S^n, \ldots\) (\(n \geq 1\)) for relations in extension.) \(\chi\) is used for a higher-order function of functions, as in \(\chi (\phi)\), and \(\Phi\) for the next order, as in \(\Phi (\chi)\)
connectives: \(\lnot\), \(\lor\)
punctuation: \((\), \()\), \([\), \(]\), \(\{\), \(\}\), etc.
the quantifier symbols: \(\forall\) and \(\exists\).
the lambda symbol: \(\lambda\)

The system of symbols for r-types and the assignment of r-types to variables for different entities (individuals and functions) is as follows:

\(\iota\) is the r-type for an individual.
Where \(\tau_1 , \ldots, \tau_m\) are any r-types, then \((\tau_1 , \ldots, \tau_m) / n\) is the r-type of a propositional function of level n; this is the r-type of any m-ary propositional function of level n, which has arguments of r-types \(\tau_1 , \ldots, \tau_m\), respectively.

The order of an entity is defined as follows:

the order of an individual (of r-type \(\iota\)) is 0
the order of a function of r-type \( (\tau_1, \ldots, \tau_m ) / n \) is \(n+N\), where N is the greatest of the orders of the arguments \(\tau_1, \ldots, \tau_m\)
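The two clauses above amount to a recursive computation on r-types, which can be sketched as follows. This is only an illustration under stated assumptions: the class names `Individual` and `FunctionType` and the function `order` are introduced here and are not Church’s notation, and the treatment of a function with no arguments (the r-type \(()/n\) of a proposition) as having order n is an assumption made so that the maximum over an empty list of arguments is defined.

```python
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class Individual:
    """The r-type ι of individuals."""

@dataclass(frozen=True)
class FunctionType:
    """The r-type (τ1, ..., τm)/n of an m-ary propositional function of level n."""
    args: Tuple["RType", ...]
    level: int

RType = Union[Individual, FunctionType]

def order(t):
    # An individual has order 0.
    if isinstance(t, Individual):
        return 0
    # A function of r-type (τ1, ..., τm)/n has order n + N, where N is the
    # greatest order among its argument types (taken as 0 when there are none;
    # that convention is an assumption of this sketch).
    greatest = max((order(a) for a in t.args), default=0)
    return t.level + greatest

iota = Individual()
first_level = FunctionType((iota,), 1)         # (ι)/1: predicative function of individuals
second_level = FunctionType((iota,), 2)        # (ι)/2: same arguments, higher level
of_functions = FunctionType((first_level,), 1) # ((ι)/1)/1: function of functions
mixed = FunctionType((second_level, iota), 1)  # ((ι)/2, ι)/1
```

On this sketch, `first_level` has order 1, while `second_level` and `of_functions` both have order 2 despite taking arguments of different r-types, which is the sense in which order cuts across the hierarchy of argument types.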

There are no predicate or individual names in this language. There are, however, complex terms for propositional functions, defined together with formulas (with the usual notion of bound and free variables):

Let the expressions \(\phi_{\tau}\) be variables ranging over propositional functions of type \(\tau\). We read \(x_{\tau}\) as a metalinguistic variable ranging over variables of r-type \(\tau\). The subscript \(\tau\) will be indicated only with the initial quantifier which governs the variable.

We then can define the well formed formulas (wffs) and terms of quantificational logic as follows:

Variables (for individuals and propositional functions) are terms.
If \(\phi_{\tau}^{n}\) is an n-place propositional function variable of r-type \((\tau_1, \ldots, \tau_n) /k\) and \(x_{i}\), for \(1 \leq i \leq n\), are terms of r-types \(\tau_1, \ldots, \tau_n\), respectively, then \(\phi (x_1, \ldots, x_n)\) is a wff.
(The variables \(x_i\) are called “argument” variables. They will include individual variables of r-type \(\iota\), but also variables of higher types. The variable \(\phi\) can occur as a predicate in \(\phi(x)\) and as an argument in \(\Psi(\phi)\), and cannot be of type \(\iota\) to occur in a wff.)
If x is a variable of type \( \tau \) and A is a wff then \(\lambda x A\) is a term of r-type \((\tau)/n\) (where n is one more than the highest order of any bound variable in A and at least as high as the order of any free variable in A).
If x is a variable of type \(\tau\) and A is a wff in which x occurs free then \(\forall x A\) and \(\exists x A\) are wffs.
If A and B are wffs, then so are \(\lnot A\), \(A \amp B\), \(A \lor B\), \(A \supset B\), and \(A \equiv B\).
The conventional precedence ordering of connectives will allow for fewer punctuation signs to indicate scope of connectives, thus \(A \lor B \supset C\) is read as \(( A \lor B) \supset C\)

The comprehension principle for a system of higher-order logic, or set theory, states which formulas express a property or set. Within a type theory this allows for what looks like an “unrestricted” comprehension principle, in that for every well formed expression A with a free variable, x, there is a property which is satisfied by precisely the entities satisfying the formula. It is the restrictions of types that block the paradoxes, as the problematic formulas “is not a member of itself” and “does not apply to itself” are ruled out by the system of types. The comprehension principle then is characterized by an infinite set of sentences of the form of:

Comprehension:

\[\exists \phi \forall x_{\tau} [ \phi (x) \equiv A], \quad (\phi \textrm{ not free in } A)\]

where \(\phi\) is a functional variable of r-type \((\tau)/n\) and x is a variable of r-type \(\tau\), and the bound variables of A are all of order less than the order of \(\phi\) and the free variables of A are all of order not greater than the order of \(\phi\).

As presented here Church’s seemingly straightforward comprehension principle, with its restrictions on the types of variables, is for Quine a glaring manifestation of the confusion of use and mention of language that he sees infecting PM:

…there is a characteristic give and take between sign and object: the propositional function gets its order from the abstractive expression, and the order of the variable is the order of the values. Exposition is eased by allowing the word ‘order’ a double sense, attributing orders at once to the notations and, in parallel, to their objects. (Quine 1963: 245)

The offense comes from attributing orders (r-types) to propositional functions on the basis of the variables with which they are defined, but also to the functions themselves, as simply values of bound higher-order variables. In response, the defender of type theory must say that any semantic interpretation of the notion of propositional function will have to attribute to functions these distinctions that are marked in linguistic expressions of some of them, and in particular, the variables involved in their definition.

What follows in PM up to ∗12 is a presentation of quantificational logic in the ramified theory of types. The complications are due to the decision of the authors (surely on Russell’s insistence) to add a new section ∗9, rather than allowing the earlier theory of propositional logic to be incorporated directly into a quantificational logic as is done in contemporary logic. This shows the extent to which the earlier theory is indeed a theory of propositions, not an account of a fragment of quantificational logic allowing open sentences containing free variables.

Quantificational Logic in PM

Section ∗10 formulates quantificational logic as it is currently formulated, namely the axioms and theorems of propositional logic are assumed to hold for all formulas, and not just the elementary propositions of ∗1–∗5. It appears that Russell became concerned about this assumption, and so a new section ∗9 was introduced to derive the principles of quantification theory from elementary propositions alone. While of interest to scholars of PM, the upshot is the same for later uses of quantificational logic in PM.

Again, the reader interested in what distinguishes the logicist project in PM can skip this section, although passing attention may be paid to the system of higher-order logic that is used, as based here on the ramified theory of types.

The extension to functions of more than one variable is obvious, and below, some applications will employ this extension.

The existential quantifier and the other familiar connectives \(\supset\), \(\amp\) and \(\equiv\) are defined as for propositional logic. (In what follows A is now an arbitrary (possibly quantificational) formula):

\[\tag*{∗10·01} \exists x A \eqdf \lnot \forall x \lnot A\]

Axioms of ∗10: All instances of propositional theorems where wffs are uniformly substituted for propositional variables.

The system of PM uses a rule of Universal Generalization and an Axiomwhich amounts to a rule of Instantiation.

\[\tag*{∗10·1} \vdash \forall x_{\tau} A \supset A'\]

where \(A'\) is like A except for having a term y of type \(\tau\) substituted for \(x_{\tau}\) in A.

(Note: The notion of suitable “substitution” is much morecomplicated for logic of higher types than it is for first orderlogic. In part this is because of the application to an argument oflambda expressions for a propositional function, e.g., \([\lambda x\phi (x)] (\nu)\) where \(\nu\) may be a complex term involvingvariables and quantifiers in other lambda expressions.)

\[\tag*{∗10·11} \textrm{If } \vdash A \textrm{ then } \vdash \forall x_{\tau} A'\]

where \(A'\) is like A except for having the variable \(x_{\tau}\) substituted for a free variable y of type \(\tau\) in A.

Other quantifier principles, which govern the move of a quantifier from the inside of a formula to governing the entire formula, so-called “quantifier confinement principles”, are also derived as theorems in ∗10. Some that are often used in later numbers are:

\[\begin{align}\tag*{∗10·12} \forall x_{\tau} ( A \lor \phi (x ) )& \supset (A \lor \forall x_{\tau} \phi (x ))\\\tag*{∗10·21} \forall x_{\tau} [ A \supset \phi (x ) ]& \equiv [A \supset \forall x_{\tau} \phi (x)]\\ \end{align}\]

The introduction to ∗10 in PM begins with:

The chief purpose of the propositions of this number [∗10] is to extend to formal implications (i.e. to propositions of the form \(\forall x (\phi x \supset \psi x)\)) as many as possible of the propositions proved previously for material implications, i.e. for propositions of the form \(p \supset q\). (notation updated)

In other words, this section introduces the logic of quantification in a way that is familiar to contemporary logic. The propositional logic of the preceding sections is interpreted as true only of elementary, first order propositions, and is then extended to higher-order logic by showing how sentences can be presented in “prenex form”, that is, with quantifiers in initial position preceding a quantifier-free matrix. These theorems are familiar now as “quantifier confinement” theorems, of the form of:

\[\tag*{∗10·23} \forall x_{\tau} [ \phi (x ) \supset A ] \equiv [ \exists x_{\tau} \phi (x)] \supset A \]
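The confinement theorems ∗10·12, ∗10·21 and ∗10·23 can be spot-checked over small finite domains. The following is a sketch only, and a check in finite models is of course not a derivation in PM. It assumes that A contains no free occurrence of x (so A is represented by a single truth value), that \(\phi\) is represented by its extension on the domain, and that the right-hand side of ∗10·23 is read as \((\exists x \phi (x)) \supset A\); the function names are hypothetical.

```python
from itertools import product

def implies(a, b):
    # Material implication, as defined in *1.01.
    return (not a) or b

def confinement_holds(domain_size):
    """Check *10.12, *10.21 and *10.23 for every phi and A on a finite domain."""
    domain = range(domain_size)
    # phi ranges over every assignment of truth values to the domain elements.
    for phi in product([False, True], repeat=domain_size):
        for A in (False, True):
            all_phi = all(phi[x] for x in domain)
            some_phi = any(phi[x] for x in domain)
            # *10.12: ∀x (A ∨ φx) ⊃ (A ∨ ∀x φx)
            t12 = implies(all(A or phi[x] for x in domain), A or all_phi)
            # *10.21: ∀x (A ⊃ φx) ≡ (A ⊃ ∀x φx)
            t21 = all(implies(A, phi[x]) for x in domain) == implies(A, all_phi)
            # *10.23: ∀x (φx ⊃ A) ≡ ((∃x φx) ⊃ A)
            t23 = all(implies(phi[x], A) for x in domain) == implies(some_phi, A)
            if not (t12 and t21 and t23):
                return False
    return True

checked_sizes = {n: confinement_holds(n) for n in range(1, 5)}
```

The restriction that x not occur free in A matters: if A were allowed to vary with x, the single Boolean used for A here would no longer represent it faithfully.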

The Axiom of Reducibility

Given that the system of PM contains a ramified theory of types, however, the move to discussion of classes for the remainder of the work after ∗20 requires a further axiom, the axiom of reducibility, in order to allow a simple theory of types of classes. Consider the fundamental notion from the theory of real numbers of the least upper bound (l.u.b.) of a bounded class of real numbers. Consider the class of all real numbers whose square is less than or equal to 2, i.e., \(\{ x \mid x^2 \leq 2\}\). A class of reals S has an upper bound if and only if \(\exists r \forall s ( s \in S \supset s \leq r)\). If a bounded class S of real numbers has members of some r-type \(\tau\), then the least upper bound must belong to an r-type \(\tau / 1\) because of the quantifier in the definition ranging over the elements s of S. We say that the definition of the least upper bound is “impredicative” because it involves quantification over a totality to which the least upper bound is intended to belong. The theory of real numbers, however, requires that sometimes the least upper bound of a class is a member of that class; in this case, the least upper bound of S, namely \(\sqrt{2}\), is an element of S.

The resolution of this in the system of PM is to adopt an axiom which guarantees that any class defined in terms of another class will be of the same type. Thus impredicative definitions of classes are allowed, and do not introduce a class of a higher type. This is accomplished by adopting the Axiom of Reducibility, in ∗12, which guarantees that for any function \(\phi\), there will be a co-extensive predicative function. More precisely, the Axiom of Reducibility asserts that for any function of any number of arguments of an arbitrary level, there is an equivalent function of level 1, i.e., one true of the same entities:

Axiom of Reducibility,

\[ \tag*{Pp ∗12·1} \forall \psi \exists \phi \forall x_{\tau} [\psi (x) \equiv \phi \bang(x) ] \]

where \(\phi \bang\) is a predicative function.

The exclamation mark “\(\bang\)” is used in PM to indicate predicative functions. In Church’s system of r-types this is expressed by saying that the variable x is of r-type \(\tau\) and \(\phi\) is of r-type \((\tau)/1\) and \(\psi\) is of r-type \((\tau) /n\). In other words, \(\phi\) is of the lowest order compatible with its arguments. This notion of predicative functions is taken from the Introduction. In ∗12 Whitehead and Russell propose a narrower conception of predicative function, by which \(\phi\) must be a matrix, or function in the definition of which no quantifiers at all appear. See the accompanying entry on the notation in Principia Mathematica.

It has seemed to some, beginning with Chwistek (1912) and continuing through Copi (1950), that the Axiom of Reducibility is technically faulty, leading to an inconsistency, or at least redundancy, in the system of PM. Ramsey (1931) early on argued that the supposed contradiction in fact demonstrated that certain predicative functions are indefinable. Church (1976) confirms this assessment, and uses the presentation of r-types we describe here to show rigorously the limitations on what functions are definable in the system of PM.

The interaction of this Axiom with the theory of classes in PM will be explained below in connection with ∗20 on classes.

Identity in PM

Contemporary logic follows Frege in treating identity, represented by \(=\), as a logical notion. In PM the notion of identity is defined following Leibniz as indiscernibility, namely indiscernible objects are identical. That is, \(\forall \phi ( \phi x \equiv \phi y) \supset x = y\). But since the axiom of reducibility guarantees that if there is any type of function on which x and y differ, they will differ on some predicative function, PM uses the following definition of identity:

\[\tag*{∗13·01}x_{\tau} = y_{\tau} \eqdf \forall \phi [ \phi \bang (x) \supset \phi \bang(y) ],\]

for \(\phi \bang\) a predicative function.

In contemporary systems of logic an axiom or rule of inference allows that if \(x = y\), then for any predicate \(\phi\), \(\phi x \equiv \phi y\). In other words, identicals are indiscernible. The given definition of identity only suffices if it is not possible that entities x and y which share all predicative properties can nevertheless be distinguished by some property of a higher order. The axiom of reducibility guarantees that x and y sharing all predicative properties will entail their sharing properties of any given higher order, and so that whenever \(x = y\) by the definition of identity, x and y are indiscernible.
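One feature of ∗13·01 that may look surprising is that the definiens uses \(\supset\) and not \(\equiv\). The following toy model illustrates why nothing is lost. It is a sketch under strong simplifying assumptions: the domain is finite, every subset of the domain is treated as the extension of a predicative function, and ramification is ignored entirely; the function names are introduced only for this illustration. Because the range of \(\phi\) is then closed under complement, the one-directional condition already coincides with full indiscernibility, and with numerical identity.

```python
from itertools import product

def implies(a, b):
    return (not a) or b

def predicates(n):
    # Every subset of an n-element domain, given as a tuple of truth values.
    # Treating all of them as predicative functions is an assumption of the sketch.
    return list(product([False, True], repeat=n))

def identical(x, y, n):
    # *13.01 read over the toy model: every predicate true of x is true of y.
    return all(implies(phi[x], phi[y]) for phi in predicates(n))

def indiscernible(x, y, n):
    # The two-directional Leibnizian condition: x and y agree on every predicate.
    return all(phi[x] == phi[y] for phi in predicates(n))

N = 3
agreement = all(
    identical(x, y, N) == indiscernible(x, y, N) == (x == y)
    for x in range(N)
    for y in range(N)
)
```

In PM itself the corresponding closure of predicative functions under negation does the same work; the sketch does not model the role of the axiom of reducibility.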

In Appendix B to the second edition of PM, which was written by Russell, there is a technical discussion of the consequences of abandoning the axiom of reducibility. A faulty proof is proposed to show that the principle of Induction can be derived without using the axiom of reducibility in a modified theory of types (see Linsky 2011). As Russell points out, however, it is not possible to define real numbers using “Dedekindian” classes of rational numbers without assuming the axiom of reducibility. (The thesis that every class of reals with an upper bound has a real number as its least upper bound, discussed above, would not be provable.) As a result, Russell says “analysis would collapse”. In all of this discussion, however, Russell does not indicate what would replace the definition of Identity in ∗13, which so crucially depends on the axiom of reducibility.

Definite Descriptions

Russell presented his theory of definite descriptions in “On Denoting” (1905) and it has probably been the most widely discussed application of the logic of PM. The role of the theory of definite descriptions in PM, however, is exhausted by its use in ∗30 to define what are called “Descriptive functions”. In contemporary logic it is routine to show how the notion of a “functional relation” can be used to justify the introduction of function symbols into a language with only n-place predicates. The theory of definite descriptions is essential for this argument. After ∗30 there are only a handful of occurrences of description operators in PM. What is perhaps Russell’s most valuable contribution to philosophical logic and the philosophy of language is, here, only a device used for a technical, though programmatically important, purpose. The technical purpose, however, does indicate an important distinction between the logicism of Frege and Russell. Frege’s logic is based on the notion of concept, which is a case of a function from objects to truth values. Russell’s logic can be seen as further reducing the mathematical notion of function to his logical notion of propositional function. Some logicians firmly in the tradition of mathematical logic do not find this to be an advance, but it does indicate a significant difference between the approaches of Frege and Russell (see Linsky 2009).

Definite descriptions are expressions of the form “the \(\phi\)” which occur in the position of terms apparently as the arguments of functions. Russell’s example from “On Denoting” (1905) is the expression “The present King of France” which apparently occurs as an argument to the function “is bald” in the sentence “The present King of France is bald”. In general the expression “the \(\phi\) is \(\psi\)” is defined as equivalent to the expression “There is exactly one \(\phi\) and it is \(\psi\)”:

Contextual definition of Definite Descriptions

\[ \tag*{∗14·01} \psi ( \imath x \phi (x) ) \eqdf \exists x \forall y \{ [ \phi ( y) \equiv y = x ] \amp \psi (x)\}\]

The use of the expression \(\eqdf\), which makes it appear that both flanking expressions are terms, disguises the fact that in this case of a “contextual definition” what occurs on each side are formulas, the right hand side replacing the left hand side, thus “eliminating” the definite description.

To distinguish the two readings of the expression “The present King of France is not bald”, according to the “scope” of the description (with respect to negation), PM uses a “scope indicator” \( [ \imath x \phi ( x )]\) before the formula from which the description is to be eliminated by the definition above. Symbolizing “The present King of France” as \(\imath x K(x)\) and “x is bald” as \( B(x)\), the two readings will be symbolized as:

\[[\imath x K(x)] \lnot B(\imath x K(x)),\]

which, eliminating the description by definition, becomes:

\[ \exists x \forall y \{ [K(y) \equiv y = x ] \amp \lnot B(x) \} \]

which is the reading on which there is exactly one present King of France and he is not bald, and:

\[\lnot [\imath x K(x)]B(\imath x K(x)),\]

which, eliminating the description by definition, becomes:

\[ \lnot ( \exists x \forall y \{ [K(y) \equiv y = x ] \amp B(x) \} )\]

The latter is the reading on which it is not the case that there is one and only one present King of France and he is bald. That may be true if there is not exactly one present King of France, as is actually the case, as France has no King. In such a case the description is not “proper”, which is expressed with a special symbol in PM, \(E\bang\), defined as:

proper description

\[\tag*{∗14·02}E\bang (\imath x \phi (x)) \eqdf \exists x \forall y [ \phi (y) \equiv y = x ]\]

In theorem ∗14·3 we find one of the rare occurrences of bound variables ranging over propositions p and q and of functions that are not predicative. (Suppose that p and q are of some r-type \(()/n\) and f, written \(\Phi\) in the theorem below, is a function of those propositions; f might then have r-type \((()/n)/m\) for \(m, n > 1\).) Here we also see an occurrence of a formula \(\imath x \phi (x)\) in subject position expressing a proposition as an argument of such a function. These expressions do not figure in theorems later in PM and only occasionally in the introductory material of some sections. Theorem ∗14·3 asserts that in truth-functional contexts the scope of a (proper) description does not affect the truth value of a proposition in which it occurs:

\[\tag*{∗14·3}\begin{align}\{ \forall p \forall q [ ( p \equiv q ) \supset (\Phi(p) \equiv \Phi(q) )] \amp E\bang ( \imath x \phi(x) ) \} \supset \\\{ \Phi [\imath x \phi (x) ]\chi(\imath x \phi (x) )\equiv [\imath x \phi (x)] \Phi(\chi(\imath x \phi (x ))) \}\end{align}\]

This theorem is another indication of the way in which the philosophical basis of PM, with its propositional functions that are intensional, is left behind as the mathematical content of PM is introduced with the definition of classes in the next sections.
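The two scope readings, and the agreement between them that ∗14·3 asserts for proper descriptions, can be checked on a small finite model. The following is only a sketch: the three-element domain and the predicates `king`, `author` and `bald` are illustrative assumptions, not anything found in PM.

```python
# Toy model: the individuals and predicates are illustrative assumptions.
DOMAIN = {"a", "b", "c"}

def is_proper(phi):
    """E!(ix phi(x)) as in *14.02: some x is the one and only phi."""
    return any(all(phi(y) == (y == x) for y in DOMAIN) for x in DOMAIN)

def holds_of_the(phi, psi):
    """psi(ix phi(x)) as in *14.01: some x is the one and only phi, and psi(x)."""
    return any(
        all(phi(y) == (y == x) for y in DOMAIN) and psi(x) for x in DOMAIN
    )

def king(x):        # improper description: nothing in DOMAIN is a king
    return False

def author(x):      # proper description: exactly one author
    return x == "a"

def bald(x):
    return x in {"a", "b"}

def not_bald(x):
    return not bald(x)

# Wide scope [ix K(x)] ~B(ix K(x)) versus narrow scope ~[ix K(x)] B(ix K(x)).
wide_king = holds_of_the(king, not_bald)
narrow_king = not holds_of_the(king, bald)

# The same two readings for a proper description.
wide_author = holds_of_the(author, not_bald)
narrow_author = not holds_of_the(author, bald)
```

With the improper description the two readings come apart (the wide reading is false, the narrow one true), while with the proper description they receive the same truth value.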

The “No-Classes” Theory of Classes

The theory of sets (classes) in PM is based on a number of contextual definitions, similar in some ways to the theory of descriptions. In what follows we will occasionally use the expression “class” for the PM notion, to remind the reader of the differences between this and an axiomatic theory of sets, such as ZF. The term is not meant to indicate that these are “proper classes” in the sense used in ZF or NBG class theory, where it indicates an expression that does not define a set, such as \(\{ x \mid x = x \}\), which comprises the whole universe \(\rV\) and so is too “large” to be a set.

The basic definition eliminates terms for classes from contexts in which they occur, just as the theory of definite descriptions eliminates descriptions occurring in the positions of terms:

Contextual definition of classes

\[ \phi \{x \mid \psi (x) \} \eqdf \exists \chi \left [ \begin{split}\forall x [ \chi \bang (x) \equiv \psi (x) ] \\{} \amp \phi (\lambda x \chi (x))\end{split}\right] \tag*{∗20·01}\]

for \(\chi \bang\) a predicative function

In other words, an expression seeming to attribute the property \(\phi\) to a class \(\{x \mid \psi (x) \}\) is true if and only if there is some predicative property \(\chi\), which is co-extensive with \(\psi\), which really has the property \(\phi\).

The notion of membership (\(\in\)), which is the one non-logical relation symbol of ZF, is defined in the PM system:

Definition of \(\in\)

\[\tag*{∗20·02}x \in \phi \eqdf \phi \bang (x)\]

for \(\phi\) a predicative function.

The principal role of this “no-classes” theory of classes, as it is called, is to show how the theory of types resolves the paradoxes that had afflicted the naive theory of classes in The Principles of Mathematics and were seen by Russell to afflict Frege’s theory. After these foundational sections, all the individual variables that appear in PM should be seen as ranging over classes (and, as will be explained below, the relation symbols are to be interpreted as ranging over relations in extension). The paradoxes appear in different forms, as seen in the Introduction to PM, but the resolution of the paradox of “the class of all classes which do not belong to themselves”, which appears in Russell’s initial letter to Frege, will be used as our example. This class, which leads directly to a contradiction, would appear in contemporary notation as \(\{ x \mid x \notin x \}\). The paradox arises when one asks whether that class is a member of itself or not. The expression that it is a member of itself, \(\{ x \mid x \notin x \} \in \{ x \mid x \notin x \}\), will have two class expressions to be eliminated by the first definition, and then several uses of the relation symbol \(\in\) which will also be eliminated. In the end there will be an expression \(\lnot (\phi_{\tau} \in \psi_{\tau})\), which is not legitimate, since this is not well-formed for any \({\tau}\). A function must be of a higher order than its arguments.

The effect of these two definitions is to demonstrate that classes fall into a simple theory of types, and while subject to these type restrictions, all of the inferences involving class expressions observe classical quantification theory as stated in ∗10 above. The definitions of existential and universal quantification are simple. Note that Russell uses Greek letters (\(\alpha\), \(\beta\), …) to range over classes:

Definition of quantification over “all classes”

\[\tag*{∗20·07}\forall \alpha \chi (\alpha) \eqdf \forall \phi \chi ( \{ x \mid \phi \bang(x) \})\]

for \(\phi \bang\) a predicative function.

Definition of quantification over “some classes”

\[\tag*{∗20·071}\exists \alpha \chi ( \alpha ) \eqdf \exists \phi \chi \{ x \mid \phi \bang(x) \}\]

for \(\phi \bang\) a predicative function.

The definition of \(\in\) is extended to classes without change:

Definition of membership of a class in a function

\[ \tag*{∗20·081}\alpha \in \psi \eqdf \psi \bang ( \alpha)\]

for \(\psi \bang\) a predicative function.

The remainder of ∗20 consists of theorems proving that the theorems of quantificational logic developed in ∗10 apply as well to expressions about classes, with the “Greek” variables \(\alpha, \beta, \ldots\) in the place of individual variables \(x, y, \ldots\). Because formulas with Greek variables look and behave the same as formulas with individual variables with respect to quantificational logic, it is possible to overlook the interaction of the theory of classes with the theory of types. As Gödel points out in the passage quoted above (Gödel 1944 [1951: 126]), the “contextual definitions” of class variables \(\alpha\), \(\beta\), etc., do not specify the elimination of class abstracts from all possible contexts, and in particular those that talk about classes. Linsky (2004) argues that PM has no notation for classes of propositional functions to distinguish them from classes of classes, although one could be added. This is another indication of the turn in PM after the initial sections (up to ∗21) to an extensional system of classes and relations.

In effect the class variables can be seen as propositional function variables, restricted to r-types in which only predicative functions appear, in arguments as well, leading to what might be seen as “hereditarily predicative functions”. In other words, the class variables can be replaced with propositional function variables in which the r-type of the function, and of all the arguments, is of the form \((\beta_1, \beta_2, \ldots, \beta_m )/1\) and the same applies to \(\beta_1, \beta_2, \ldots, \beta_m\) as well. This means that variables and terms for classes will obey the simple theory of types. These can be contrasted with r-types by presenting an alternative system of simple types or “s-types”.

Church’s (1974) “Simple” Theory of Types

\(\iota\) is the s-type for an individual.
Where \(\tau_1, \ldots, \tau_m\) are any s-types, then \((\tau_1, \ldots, \tau_m)\) is the s-type of an m-ary propositional function which has arguments of s-types \(\tau_1, \ldots, \tau_m\), respectively.

The order of an entity in the system of s-types is defined as follows:

the order of an individual (of s-type \(\iota\)) is 0
the order of a function of s-type \((\tau_1, \ldots, \tau_m)\) is \(n+1\) where n is the greatest of the orders of the arguments \(\tau_1, \ldots, \tau_m\)
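The recursive definition of order can be sketched in a few lines. The encoding here is an assumption made only for illustration: the s-type of individuals is the string `"i"`, and the s-type of an m-ary function is a non-empty tuple of the s-types of its arguments.

```python
IOTA = "i"  # hypothetical encoding of the s-type of individuals

def order(s_type):
    """Order of an s-type: 0 for individuals, else 1 + the greatest argument order."""
    if s_type == IOTA:
        return 0
    return 1 + max(order(arg) for arg in s_type)

class_of_individuals = (IOTA,)               # s-type (i)
binary_relation = (IOTA, IOTA)               # s-type (i, i)
class_of_classes = (class_of_individuals,)   # s-type ((i))
mixed = (class_of_classes, IOTA)             # s-type (((i)), i)
```

On this encoding a class of individuals and a binary relation between individuals both have order 1, a class of such classes has order 2, and `mixed` has order 3 because its highest argument has order 2.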

Church’s notion of “order” is not quite the one that is familiar from talk of “first order logic” and “second order logic”. First order logic will have bound variables of order 0 only, and second order logic also quantifies over variables of order 1; thus the familiar notion of “order” is one more than the highest order of any of the bindable variables in the s-type system.

It should be noted that every s-type is also an r-type, namely one that is hereditarily predicative. Thus it might seem that the expressions of the theory of classes are all simply a special case of formulas of the full system of the ramified theory of types. This will be true of the assignment of types to variables, but it must be remembered that the entire formula \(\phi \{x \mid \psi (x) \}\) about a class is by definition

\[\exists \chi [ \forall x [ \chi \bang (x) \equiv \psi (x)] \amp \phi (\lambda x \chi (x))].\]

All we have discussed so far is the relative types of \(\phi\) and \(\chi\). The Axiom of Reducibility guarantees that there is a predicative \(\chi\) co-extensive with any \(\psi\) in the defining condition of a class. To justify use of the class term \(\{x \mid \psi (x) \}\) one must then just show that there is some function that has the higher-order property \(\phi\). This is the step comparable to the proof that a definite description is proper, i.e., true of exactly one thing, that justifies using that description as a singular term.

Comparison of the Classes of PM with Axiomatic Set Theory

It is widely thought that the system of PM offers a very different approach to the solution of the paradoxes than that of axiomatic set theory as formulated in the Zermelo-Fraenkel system ZF. While the theory of types is thought of as a desperate attempt to save the logicist program by artificially introducing types in order to resolve the paradoxes, axiomatic set theory seems to simply postulate sets as entities and adopts axioms in a first order language with “\(\in\)” for membership as its one non-logical symbol. This view has been forcefully expressed by Quine:

Whatever the inconveniences of type theory, contradictions such as [the Russell paradox] show clearly enough that the previous naive logic needs reforming.…There have been other proposals to the same end—one of them coeval with the theory of types. [Quine cites Zermelo 1908.] But a striking circumstance is that none of these proposals, type theory included, has any intuitive foundation. None has the backing of common sense. Common sense is bankrupt, for it wound up in contradiction. (Quine 1951: 153)

However, both the view that type theory has intuitive support and the view that type theory and axiomatic set theory are based on the same intuitions date back to Gödel in 1933, referring to set theory as the “theory of aggregates”:

At least hitherto only one solution which meets these two requirements [of avoiding the paradoxes while retaining mathematics and the theory of aggregates] has been found.…This solution consists in the theory of [simple] types.…It may seem as if another solution were afforded by the system of axioms for the theory of aggregates, as presented by Zermelo, Fraenkel and von Neumann; but it turns out that this system of axioms is nothing else but a natural generalization of the theory of types, or rather, it is what becomes of the theory of types if certain superfluous restrictions are removed. (Gödel 1933 [1995: 45–46])

The two “restrictions” that Gödel intends are the restriction that types are not cumulative and that the levels of types are limited to the natural numbers 0, 1, … n, …. Gödel suggests that one adopt a cumulative system of types in which a given type includes functions of all lower types (or orders), and the types extend beyond \(\omega\), \(\omega\) + 1, … \(\omega^{\omega}\), …, through all the ordinals. Such a “natural generalization” of the theory of types, he asserts, amounts to the same as Zermelo-Fraenkel set theory (ZF). Gödel’s claim is spelled out by George Boolos (1971) as the “iterative conception” of sets, which can be expressed formally. If one thinks of sets as built up in stages, with each stage adding all sets of members of the last stage, and the process extending endlessly, then the axioms of ZF set theory are indeed provable from the axioms of the theory of the “iterative conception” of sets. In turn the “iterative conception” relies on a strong intuition, contrary to what Quine says. It is the same intuition that underlies the hierarchy of types.

Following Boolos’ presentation of the “iterative conception of set” it seems that axiomatic set theory and PM do not differ widely, and express similar intuitive notions of set that provide the same solution to the paradoxes.

Strictly as presented in PM, however, the no-classes theory differs significantly from ZF. The sentences of the PM theory are expressed in the theory of types, as opposed to the first order theory of ZF. ZF and PM cannot simply be compared in terms of their theorems. Not only are there different axioms in the two theories, but the very languages in which they are expressed differ in logical power. If we follow Gödel and Boolos, however, the two are seen to rest on the same intuitive basis, and the theories are seen as the same, barring certain “superfluous restrictions” on the theory of PM.

Relations in PM

∗21 extends the notion of class, which is the extension of a one place propositional function, to the comparable notion of a “Relation” for functions of two arguments with the analogous contextual definition.

Contextual definition of a relation in extension

\[\phi \{x ; y \mid \psi (x,y) \} \eqdf \exists \chi \left[\begin{split} \forall x \forall y [ \chi \bang (x,y) \equiv \psi (x,y)] \\{} \amp \phi ( \lambda x \lambda y \: \chi (x,y))\end{split}\right ]\tag*{∗21·01}\]

(Note: The use of this unusual notation \(\phi \{x; y \mid \psi (x,y)\}\) in this one definition is meant to avoid the implication that a relation is interpreted as a set of ordered pairs, that would be represented by the contemporary notation \(\phi \{\langle x,y \rangle \mid \psi (x,y) \}\). The PM notation for propositional functions, as in \(\phi \hat{x}\), uses a caret over the variable where we would write \(\lambda x \phi(x)\). The PM notation for a class is \(\hat{x} \phi (x)\). A two-place propositional function is identified with variables also with carets: \(\phi (\hat{x} \hat{y})\) and the corresponding relation \(\hat{x} \hat{y} \phi (x,y)\). This notation does not identify relations as classes of ordered pairs, and that is how our blend of PM and contemporary notation in \(\phi \{x ; y \mid \psi (x,y) \}\) is to be taken.)

The introduction of Greek letters for classes in ∗20 and the use of “Roman letters” R, S, … in ∗21 for relations marks a change in the notation used in PM. After ∗21 the letters \(\phi, \psi, \ldots\) rarely appear. As Quine remarks in his study of the logic of Whitehead and Russell, it would seem that after a certain point the body of PM makes use of extensional higher-order logic in a simple theory of types:

In any case there are no specific attributes [propositional functions] that can be proved in Principia to be true of just the same things and yet to differ from one another. The theory of attributes receives no application, therefore, for which the theory of classes would not have served. Once classes have been introduced, attributes are scarcely mentioned again in the course of the three volumes. (Quine 1951: 148)

Quine here hints at the view of PM that is widely shared among mathematical logicians, who see the ramified theory of types, with its accompanying Axiom of Reducibility, as a digression taking logic into a realm of obscure intensional notions, when instead logic, even if expressed in a theory of types, is extensional and is comparable to axiomatic set theory presented with a simple hierarchy of sets of individuals, sets of sets of individuals, and so on.

It is certainly true that the remainder of PM is devoted to the theory of individuals, classes, and relations (in extension) between those entities. Thus the ontology of these later portions is a hierarchy of predicative functions arranged in a simple theory of types. This has led one interpreter, Gregory Landini (1998), to argue that only predicative functions are values of bound variables in PM. What we have interpreted as variables ranging over possibly non-predicative propositional functions, \(\phi\), \(\psi\), …, are for Landini only schematic letters, and are not bindable variables. The only bound variables in PM, he asserts, range over predicative functions. This is a strong version of a view that others such as Kanamori (2009) have expressed, going back to Ramsey (1931), namely that the introduction of the Axiom of Reducibility has the effect of undoing the ramification of the theory of types, at least for a theory of classes, and so a higher-order logic used for the foundations of mathematics ought to have only a simple type structure.

Our interpretation of this change in attention to classes and relations indicated by the shift in notation is that it indicates the extent to which the solution to the paradoxes, which required a ramified theory of (possibly intensional) propositional functions, may have superseded a logic based on an unproblematic notion of class and mathematical functions and relations between them, that appeared in the body of The Principles of Mathematics before Russell’s attention was drawn to the paradoxes. In the summary of the later sections of PM that follows below, it will appear that in fact the symbolic development follows very closely that of PoM from ten years earlier. While we do not know much about the order in which sections of PM were composed, it will appear from this change of attention from propositional functions to classes and relations, that the later parts are in fact an earlier stratum in the conceptual development of the project that started out as a symbolic “Volume II” to follow PoM.

To remind the reader of the change from talking of propositional functions to relations in extension, two further notational alterations are introduced. Greek letters such as \(\alpha\), \(\beta\), etc., will be used as variables ranging over classes as well. The individual variables which are ambiguous with respect to type, “typically ambiguous”, will now also range over classes. A function \(\phi\) of two variables x and y is indicated with the arguments in parentheses after the function variable: \(\phi(x, y)\). A two place relation R holding between x and y is written \(x \relR y\), with the R in “infix” position. The obvious limitation of this notation is that it is not readily extended to three place relations, adding a third variable, say z. We will follow the practice in PM and write \(x \relR y\) for binary relations. PM only requires binary relations for most of the three volumes, although the projected volume IV on geometry would need a notation for “x is between y and z”, as can be seen from Henry Sheffer’s unpublished notes from Russell’s lectures on geometry from 1910 at Cambridge. There he uses the notation \(x\rels{B}(y,z)\) which blends the two styles.

The Algebra of Classes

The notions of the subset relation and the intersection and union of sets are defined in PM exactly as they are now (albeit with different terminology). The complement of a set and the universal class \(\rV\) are not allowed in axiomatic set theory, where they are rejected as “proper classes”. In PM, as they are only sets of entities of a given type \(\tau\), they form a set of the next higher type, \((\tau)/1\). The complement of a set of a given type is the set of all entities (of that type) that are not in the set. Each empty set will be the complement of the universal set (of a given type \(\tau\)) and so there will be an empty set of each type \(\tau\).

\[\begin{align}\alpha \subseteq \beta & \eqdf \forall x (x \in \alpha \supset x \in \beta) \tag*{∗22·01}\\\alpha \cap \beta & \eqdf \{x \mid ( x \in \alpha \amp x \in \beta ) \} \tag*{∗22·02}\\\alpha \cup \beta & \eqdf \{ x \mid (x \in \alpha \lor x \in \beta ) \} \tag*{∗22·03}\\\end{align}\]

The type subscript \(\tau\) is added below as a reminder that the notions of universal set \(\rV\) and complement are each with regard to a given type (and so an empty set \(\emptyset\) will recur in each type).

\[\begin{align}- \alpha &\eqdf \{ x_{\tau} \mid \lnot (x \in \alpha )\}\tag*{∗22·04}\\{\alpha - \beta} & \eqdf {\alpha \cap {- \beta}}\tag*{∗22·05}\\\end{align}\]

The Universal Class and the Empty Class

\[\tag*{∗24·01}\rV_{\tau + 1} \eqdf \{ x_{\tau} \mid (x = x) \}\]

The subscript on ‘\(\rV\)’ indicates that the universe of classes of a given (simple) type \(\tau\) will be a member of the next type. There is no class of all classes of whatever type. This is in common with axiomatic set theory, which holds that there is no set of all sets.

\[\tag*{∗24·02}\emptyset_{\tau} \eqdf - \rV_{\tau}\]
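The type-relative character of the universal class, the complement and the empty class can be illustrated with finite sets. This is a rough sketch under stated assumptions: the three individuals are invented for the example, each type is modelled as a Python `frozenset`, and `powerset` and `complement` are hypothetical helper names.

```python
from itertools import chain, combinations

def powerset(universe):
    """All classes of members of `universe`, each as a frozenset."""
    items = list(universe)
    subsets = chain.from_iterable(
        combinations(items, n) for n in range(len(items) + 1)
    )
    return frozenset(frozenset(s) for s in subsets)

def complement(alpha, universe):
    """-alpha, taken relative to the universal class of alpha's own type."""
    return universe - alpha

INDIVIDUALS = frozenset({"a", "b", "c"})   # universal class of individuals
CLASSES = powerset(INDIVIDUALS)            # universal class one type up

alpha = frozenset({"a"})
not_alpha = complement(alpha, INDIVIDUALS)
empty_class_of_individuals = complement(INDIVIDUALS, INDIVIDUALS)
empty_class_of_classes = complement(CLASSES, CLASSES)
```

Here `INDIVIDUALS` is itself a member of `CLASSES`, mirroring the remark that the universal class of one type belongs to the next type. One limitation of the sketch is that Python treats the two empty frozensets as equal, whereas PM distinguishes an empty class for each type.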

Mathematical functions in PM

The logic of PM is based on propositions, propositional functions and relations in extension, unlike Frege’s, which deals with objects, in particular truth values, and functions, with the special case of concepts, which are functions from objects to truth values. PM reduces mathematical functions to “functional relations” in a way that is familiar from elementary courses. If there is a binary relation which has a unique second argument for each first argument, i.e.,

\[\forall x \exists y [x \relR y \amp \forall z (x \relR z \supset z = y)]\]

then one can introduce a new function symbol \(f_R\), such that

\[\forall x \forall y (x\relR y \equiv f_R(x) = y).\]

Similarly, if for an \(n+1\) place relation R and each \(x_1\), …, \(x_n\) there is a unique y such that \(R(x_1, \ldots, x_n, y)\), then one can introduce an n-place function g mapping \(x_1, \ldots, x_n\) to y. In PM the expressions for mathematical functions are definite descriptions, referring to the last argument of a relation as the “value” of the function described by that relation. We will use the expression \(f_R\) to refer to the functional term referring to the function derived from a relation R. PM uses the explicit definite description “the R of y” (written \( R`y \) ) where we would use the functional expression \(f_R\). The definition of a monadic functional term then is:

\[\tag*{∗30·01} f_Ry \eqdf (\imath x)(x \relR y)\]

with the general form for an n-place functional term g derived from an \(n+1\) place relation S (following Russell’s notation in lectures):

\[g_S(x_1, \ldots, x_n) \eqdf (\imath y )(x_1 S x_2, \ldots, x_n, y)\]

(The diligent reader will find that this presentation does not follow PM exactly. The example “the father of”, based on a relation R expressing “x is the father of y”, would make “the R of y” actually refer to the unique x which is the father of y, and so what has been explained above is appropriate to the converse of that relation, \(\relbR\). The practice of reading the argument of a relational function as the x and the value as the y is so well established that we have taken a liberty with the actual definitions in PM.)

Recall that from this point on in PM, the relations are to be considered as “relations in extension” and so it is easy to see how one can treat the relations as sets of ordered \(n+1\)-tuples of which the last member is unique given the first n arguments. In particular, a monadic function f can be seen in the familiar way as a set of ordered pairs (of \(\langle x, f_R(x) \rangle\)) for each argument x in the domain of the function.
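The passage from a functional relation to a function symbol can be sketched over a finite domain, with a relation in extension modelled as a Python set of pairs. The names `is_functional` and `function_from`, and the sample relations, are assumptions introduced only for this illustration.

```python
def is_functional(relation, domain):
    """Check: for every x in domain there is exactly one y with x R y."""
    return all(
        len({y for (u, y) in relation if u == x}) == 1 for x in domain
    )

def function_from(relation):
    """Return f_R with f_R(x) = y iff x R y; meaningful only if R is functional."""
    table = dict(relation)
    return lambda x: table[x]

DOMAIN = {1, 2, 3}
NEXT = {(1, 2), (2, 3), (3, 1)}                    # functional on DOMAIN
NOT_FUNCTIONAL = {(1, 2), (1, 3), (2, 3), (3, 1)}  # 1 has two images

f_next = function_from(NEXT)
graph = {(x, f_next(x)) for x in DOMAIN}
```

The set `graph` of pairs of argument and value coincides with the relation `NEXT` itself, which is the sense in which a monadic function can be identified with a set of ordered pairs.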

Given the treatment of “relations” as “relations in extension” it is no accident that the development of the logic of relations in ∗30–∗38 looks familiar to contemporary logicians, with even some of the notation from PM surviving into contemporary usage. A series of notions are defined in a way quite familiar to the modern treatment of relations as sets of n-tuples:

The Converse of a Relation

\[\tag*{∗31·02} {\relbR} = \{\lambda x \lambda y (y \relR x)\}\]

or, in terms of pairs:

\[{\relbR} = \{ \langle x, y \rangle \mid (y \relR x) \}\]

Domains, Ranges and Fields of Relations

The notions of the domain, range, and field of a relation are also given a contemporary definition (and so also the notions of the domain, range and field of a function).

\[\begin{align}\tag*{∗33·11} \Domain \; R &\eqdf \{x \mid \exists y ( x \relR y ) \}\\\tag*{∗33·111} \Range \; R &\eqdf \{y \mid \exists x (x \relR y) \}\\\tag*{∗33·112} Field \; R &\eqdf \{x \mid \exists y (x \relR y \vee y \relR x ) \} \\\end{align}\]
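Read in terms of pairs, the converse, domain, range and field of a relation are one-line set comprehensions. The sketch below assumes a relation is given as a finite Python set of pairs; the relation `parent_of` and its members are invented for the example.

```python
def converse(relation):
    """{<x, y> | y R x}"""
    return {(y, x) for (x, y) in relation}

def domain(relation):
    """{x | exists y (x R y)}"""
    return {x for (x, _) in relation}

def range_(relation):
    """{y | exists x (x R y)}"""
    return {y for (_, y) in relation}

def field(relation):
    """{x | exists y (x R y or y R x)}"""
    return domain(relation) | range_(relation)

parent_of = {("ann", "bob"), ("bob", "cy")}
child_of = converse(parent_of)
```

As expected, the domain of the converse is the range of the original relation, and both relations share the same field.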

Note that it is possible that a relation can have its domain in one type and range in another. This adds complications in the theory of cardinal numbers when a relation of similarity (equinumerosity) holds between classes of different types. (See the discussion of ∗100 below.)

The Product of Two Relations

The composition of relations R and S is called their relative product and uses a different symbol \(R \mid S\) where we write \(R \circ S\):

\[\tag*{∗34·01} R \circ S \eqdf \lambda x \lambda z \{ \exists y ( x \relR y \amp y \relS z ) \}\]

Restricted Relations

The restriction of (the range of) a relation R to a particular class \(\beta\) is given this definition, with a symbol that is now used instead for the restriction of the domain:

\[\tag*{∗35·02} R \upharpoonright \beta \eqdf \lambda x \lambda y (x \relR y \amp y \in \beta)\]

In his survey of PM, Quine (1951: 155) complains that the last 100 pages of Part I are occupied with proving theorems relating redundant definitions of the same notions. Thus PM defines the notions of domain and range and then introduces notions that again define the same classes, which are proved to be equivalent. PM defines the notation ‘\(R\pmdq\beta\)’, to be read as “the terms which have the relation R to members of \(\beta\)”, and uses the example:

If \(\beta\) is the class of great men, and R is the relation of wife to husband, \(R\pmdq\beta\) will mean “wives of great men”. (PM, 278)

In contemporary logic with the notation of set theory used above, there is no need for a special symbol for this notion, as it is written as:

\[\tag*{∗37·01} R\pmsq \pmsq \beta \eqdf \{ x \mid \exists y (y \in \beta \amp x \relR y) \}\]
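The relative product, the restriction of the range, and the class of terms bearing R to members of \(\beta\) can all be computed directly when relations are finite sets of pairs. The function names and the sample relations below are assumptions made for illustration, loosely echoing PM’s “wives of great men” example.

```python
def relative_product(r, s):
    """{<x, z> | exists y (x R y and y S z)}"""
    return {(x, z) for (x, y) in r for (w, z) in s if y == w}

def restrict_range(r, beta):
    """{<x, y> | x R y and y in beta}"""
    return {(x, y) for (x, y) in r if y in beta}

def terms_related_to(r, beta):
    """{x | exists y (y in beta and x R y)}"""
    return {x for (x, y) in r if y in beta}

parent_of = {("ann", "bob"), ("bob", "cy"), ("ann", "dee")}
grandparent_of = relative_product(parent_of, parent_of)

wife_of = {("w1", "h1"), ("w2", "h2"), ("w3", "h3")}
great_men = {"h1", "h3"}
wives_of_great_men = terms_related_to(wife_of, great_men)
```

The last class is just the domain of the restricted relation, which gives a small instance of the redundancy among definitions that Quine describes.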

Products and Sums of Classes of Classes

\[\tag*{∗40·01} {} \cap \alpha \eqdf \{ x \mid \forall \beta (\beta \in \alpha \supset x \in \beta) \}\]

This is the intersection of \(\alpha\).

\[\tag*{∗40·02} {} \cup \alpha \eqdf \{ x \mid \exists \beta (\beta \in \alpha \amp x \in \beta) \}\]

This is the union of \(\alpha\).
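Both definitions can be transcribed for a finite class of classes. In this sketch the universe of the relevant type has to be supplied explicitly for the intersection, since the defining condition quantifies over every x of that type; the function names and the sample data are assumptions.

```python
def union_of(alpha):
    """{x | some beta in alpha has x in beta}"""
    return {x for beta in alpha for x in beta}

def intersection_of(alpha, universe):
    """{x | every beta in alpha has x in beta}, with x ranging over universe."""
    return {x for x in universe if all(x in beta for beta in alpha)}

UNIVERSE = {1, 2, 3, 4}
alpha = {frozenset({1, 2}), frozenset({2, 3})}

total = union_of(alpha)
common = intersection_of(alpha, UNIVERSE)
vacuous = intersection_of(set(), UNIVERSE)
```

When \(\alpha\) is empty the condition in ∗40·01 holds vacuously of every x, so the intersection is the whole universe of the type. That is unproblematic in a typed setting but has no counterpart as a set in ZF.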

4.2 Part II: Prolegomena to Cardinal Arithmetic

The Cardinal Number 1

\[\tag*{∗52·01} 1 \eqdf \{ \alpha \mid \exists x ( \alpha = \{x \} ) \}\]

So the cardinal number 1 is the class of all singletons. There will be a different number 1 for each type of x. Frege, by contrast, defines the natural number 1 as the extension of a certain concept, namely being identical with the number 0, which itself is the extension of the (empty) concept of not being self identical. In axiomatic set theory the natural numbers are particular finite ordinals, in particular the series in which 0 is the empty set \(\emptyset\), 1 is \(\{0 \}\), 2 is \(\{0, 1 \}\), and so on. These are known as the von Neumann ordinals.

Pairs

\[\tag*{∗54·02} 2 \eqdf \{ \alpha \mid \exists x \exists y (x \neq y \amp \alpha = \{y \} \cup \{ x \} ) \}\]

Similarly, the number 2 is the class of all pairs, rather than a particular pair. In the type theory of PM there will be distinct couples for the types of y and x. When they are of the same type the couple is called “homogeneous”. Even with homogeneous pairs there will be distinct classes of pairs for each type, and thus a different number 2 for each type. The same notion applies to relations.
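The definitions of 1 and 2 as classes of classes can be made concrete over a finite universe. The following is a sketch only: the universe of three individuals is an assumption, and classes are modelled as frozensets so that they can themselves be members of classes.

```python
def number_one(universe):
    """{alpha | exists x (alpha = {x})}, with x ranging over universe."""
    return {frozenset({x}) for x in universe}

def number_two(universe):
    """{alpha | exists x, y (x != y and alpha = {y} u {x})}."""
    return {frozenset({x, y}) for x in universe for y in universe if x != y}

individuals = {"a", "b", "c"}

one_of_individuals = number_one(individuals)        # classes of individuals
two_of_individuals = number_two(individuals)

one_of_classes = number_one(one_of_individuals)     # classes of classes
```

The two versions of the number 1 have no members in common, since one contains singletons of individuals and the other singletons of classes. This is the finite shadow of the remark that there is a different number 1 for each type.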

Ordered Pairs

The notion of an ordered pair, called an “ordinal couple”, is defined as:

\[\tag*{∗55·01} \langle x, y \rangle \eqdf \textrm{ the extension of } \lambda x \lambda y (x \in \{x \} \amp y \in \{y\})\]

The idea is that the order of the relation \(\lambda x \lambda y (x \in \{x\} \amp y \in \{y\})\) determines the first and second element of the ordered pair. It is a relation in extension, which is the analogue of a property in extension or class. A relation in extension has a distinction between the first and second elements due to the order of the defining relation. The closest in contemporary language would be:

\[\phi \langle x, y \rangle \eqdf \exists \psi \forall u \forall v ( \psi (u, v) \equiv \lambda x \lambda y [ x \in \{x\} \amp y \in \{y\} ] (u, v) \amp \phi (\psi) ) \]

Given the definition of extensions of relations this is the version of the no-classes theory for relations. After attending Russell’s classes the year before, and having several discussions, Norbert Wiener (1914) proposed the following definition (in modern notation):

\[\langle x, y \rangle \eqdf \{\{\{ x\}, \emptyset \}, \{\{ y\}\}\} \textrm{ where } \emptyset \textrm{ is the empty set.}\]

Wiener’s accomplishment was to capture the ordering of the pair, which in PM is captured by the ordering of the arguments of relations, with the unordered notion of set membership.
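That Wiener’s construction really does encode order using membership alone can be checked by building the pair and recovering its two components. In this sketch sets are modelled as Python frozensets, and the helper names `wiener_pair`, `first` and `second` are assumptions introduced for the illustration.

```python
EMPTY = frozenset()

def wiener_pair(x, y):
    """<x, y> = {{{x}, 0}, {{y}}} in modern notation."""
    return frozenset({
        frozenset({frozenset({x}), EMPTY}),
        frozenset({frozenset({y})}),
    })

def first(pair):
    """Recover x: it sits in the member of the pair that also contains 0."""
    for member in pair:
        if EMPTY in member:
            (singleton,) = member - {EMPTY}
            (x,) = singleton
            return x

def second(pair):
    """Recover y: it sits in the member of the pair that does not contain 0."""
    for member in pair:
        if EMPTY not in member:
            (singleton,) = member
            (y,) = singleton
            return y

p = wiener_pair(1, 2)
q = wiener_pair(2, 1)
```

The pairs `p` and `q` are distinct sets even though they are built from the same two objects, and each component can be read back from the set structure without any appeal to the order of arguments.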

The end of PM to ∗56

The paperback abridged edition of PM to ∗56 only goes this far, so the remaining definitions have only been available to those with access to the full three volumes of PM.

Relative Types

This section presents a discussion of relations between individuals of distinct types, introducing a notation for types, \(t\pmsq x\) for the type to which x belongs. This section is little used in Volume I. The special consequences for this notion when dealing with relative types of cardinal numbers are the topic of the Preface to Volume II, which was added after the first volume was already in print. The delay due to working out these details partially explains the gap between the publication of Volume I in 1910 and that of the remaining Volumes II and III in 1912 and 1913. Section ∗65 (On the Typical Definition of Ambiguous Symbols) is a discussion of typical ambiguity, the ambiguity of variables with respect to type.

\[\tag*{∗70·01} f: \alpha \rightarrow \beta \eqdf\]

The functions f from \(\alpha\) onto \(\beta\), that is, those with \(\Domain \; f = \alpha\) and \(\Range \; f = \beta\).

Similarity of Classes

\[\tag*{∗73·01} \alpha \approx \beta \eqdf (\exists f) f : \alpha \stackrel{1-1}{\longrightarrow} \beta.\]

There is a one-one function mapping \(\alpha\) onto \(\beta\) (similarity of \(\alpha\) and \(\beta\)). Contemporary discussions say that \(\alpha\) and \(\beta\) are equinumerous. Difficulties arise with respect to the definition of cardinal numbers when the relation of similarity they involve is one that has a domain and range in different types. See ∗100 below.

The main result in this chapter is a proof of the Cantor-Bernstein theorem, that if a set \(\alpha\) is similar to a subset \(\gamma\) of another set \(\beta\) and \(\beta\) is similar to a subset \(\delta\) of \(\alpha\) then \(\alpha\) and \(\beta\) are themselves similar:

\[ \forall \alpha \forall \beta \forall \gamma \forall \delta \left[\left (\begin{split}\alpha \approx \gamma & {}\amp \beta \approx \delta \\&{} \amp \gamma \subseteq \beta \\&{} \amp \delta \subseteq \alpha \\\end{split}\right) \supset \alpha \approx \beta \right] \tag*{∗73·88}\]

The proof here explicitly follows the proof by Ernst Zermelo from 1908. Whitehead and Russell call this the “Schröder-Bernstein” theorem.

The Axiom of Choice (Multiplicative Axiom)

The Multiplicative Axiom, or “Axiom” of Choice, is not an axiom of PM, what is termed a “primitive proposition”, but is instead a defined expression that is added as an hypothesis to theorems for which it is used. This reflects the emerging awareness at the time of the role of the Axiom of Choice in various proofs, in particular, Zermelo’s proof that every class can be well-ordered.

\[{}\begin{aligned}&\textrm{Multiplicative}\\&\textrm{Axiom}\end{aligned} \eqdf \forall \alpha \left\{ \begin{split}&\forall \beta (\beta \in \alpha \supset \beta \neq \emptyset ) \amp {}\\&\;\;\forall \beta \forall \delta \left [\left(\begin{split}\beta & \in \alpha \amp {}\\ \delta & \in \alpha \amp {}\\\beta & \neq \delta\end{split}\right) \supset (\beta \cap \delta = \emptyset) \right] \supset \\&\;\;\;\; \exists \beta \forall \delta \exists \gamma\left[\begin{split}& \delta \in \alpha \supset \\& \delta \cap {} \beta = \{\gamma\} \end{split}\right]\end{split}\right\}\tag*{∗88·03}\]

If \(\alpha\) is a class of mutually exclusive, non-empty classes, then there is a (“choice”) set \(\beta\) such that the intersection of \(\beta\) with each member \(\delta\) of \(\alpha\) is a singleton containing a unique (chosen) member of \(\delta\).
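The content of the statement can be illustrated on a finite family, where no axiom is needed because a choice set can be written down explicitly. This is a sketch under that assumption; the function names are hypothetical, and picking `min(delta)` is just one arbitrary selection rule.

```python
def satisfies_hypothesis(alpha):
    """alpha is a family of non-empty, mutually exclusive classes."""
    members = list(alpha)
    nonempty = all(len(delta) > 0 for delta in members)
    disjoint = all(
        members[i].isdisjoint(members[j])
        for i in range(len(members))
        for j in range(i + 1, len(members))
    )
    return nonempty and disjoint


def is_choice_set(beta, alpha):
    """beta meets every member of alpha in exactly one element."""
    return all(len(beta & delta) == 1 for delta in alpha)


def pick_choice_set(alpha):
    """Finite case only: select the least element of each member of alpha."""
    return frozenset(min(delta) for delta in alpha)


alpha = [frozenset({1, 2}), frozenset({3}), frozenset({4, 5, 6})]
beta = pick_choice_set(alpha)
```

The axiom becomes a substantive hypothesis only for infinite families, where no selection rule such as `min` need be definable.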

\(\Rast\) The Ancestral Relation

\[\quad\Rast \eqdf\left \{\begin{split}& \langle x, y \rangle \mid (\exists u x\relR u \lor \exists u uRx ) \amp {} \\& \forall \alpha \left [ \left[\begin{split}& x \in \alpha \amp {}\\&\forall z \forall w (z \in \alpha \amp zRw \supset w \in \alpha ) \end{split}\right] \supset y \in \alpha \right] \end{split}\right \}\tag*{∗90·01}\]

This follows Frege’s definition, namely, that y is in all the R-hereditary classes that contain x, with the added condition that x is in the field of R.
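On a finite relation the ancestral can be computed directly. The sketch below assumes a relation is a Python set of ordered pairs; rather than quantifying over every class \(\alpha\), it builds the smallest R-hereditary class containing x, which is contained in every other such class, so membership in it is equivalent to membership in all of them. The function names are illustrative.

```python
def field(R):
    """The field of R: everything that bears R to something or has R borne to it."""
    return {x for x, _ in R} | {y for _, y in R}


def hereditary_closure(R, x):
    """Smallest class containing x and closed under passing from z to w when zRw."""
    alpha = {x}
    changed = True
    while changed:
        changed = False
        for z, w in R:
            if z in alpha and w not in alpha:
                alpha.add(w)
                changed = True
    return alpha


def ancestral(R):
    """All pairs (x, y) with x in the field of R and y in every R-hereditary class containing x."""
    return {(x, y) for x in field(R) for y in hereditary_closure(R, x)}


R = {(0, 1), (1, 2), (2, 3)}
R_star = ancestral(R)
```

Note that, as in ∗90·01, every member of the field bears the ancestral to itself, so `(3, 3)` is in `R_star` even though 3 bears R to nothing.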

The Powers of a Relation

The “powers” of a relation \(R\) (\(\textrm{Pot}\pmsq R\)) are the relations R, \(R^2\), \(R^3\), … where

\[{R^2 \eqdf R \circ R},\quad {R^{n+1} \eqdf R \circ R^n},\quad \ldots\]

These definitions begin with ∗91·03, using the notion of the ancestral of a higher-order relation between relations defined beginning with R.
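For a finite relation the powers can be computed by repeated composition. This is a minimal sketch, assuming relations are sets of ordered pairs and reading \(R \circ S\) as "x bears R to some y which bears S to z"; the helper names `compose` and `power` are not PM's.

```python
def compose(R, S):
    """Relative product: pairs (x, z) such that xRy and ySz for some y."""
    return {(x, z) for (x, y1) in R for (y2, z) in S if y1 == y2}


def power(R, n):
    """R^n for n >= 1, using R^(n+1) = R composed with R^n."""
    result = set(R)
    for _ in range(n - 1):
        result = compose(R, result)
    return result


# Immediate successor on 0..5; its n-th power is the relation "n steps later".
S = {(i, i + 1) for i in range(5)}
S_squared = power(S, 2)
```

On a finite field the powers eventually become empty, as `power(S, 6)` does here.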

The main result of this section is another proof of the Cantor-Bernstein theorem: “This proof is essentially the same as Bernstein’s published originally by Borel [1898: Note 1, pp. 102–7]” (PM I, 589). In this proof the one-one relation between the sets \(\alpha\) and \(\beta\) is constructed from the powers of two relations: R, which maps \(\alpha\) into \(\beta\), and S, which maps \(\beta\) into \(\alpha\). The one-one mapping is constructed in stages. First all of \(\alpha\) is mapped into \(\beta\) by R. Those elements in \(\beta\) not in the range of R are mapped into \(\alpha\) by S. But some elements in the range of S will have already been mapped by R. They need to be shifted to a new image in \(\beta\), again by R. This process is iterated through all of the powers of R, and then it is shown that the resulting relation is one to one from \(\alpha\) onto \(\beta\). See Hinkis (2013) for a history of the many different proofs of this theorem.

5. Volume II

5.1 Prefatory Statement of Symbolic Conventions

The writing of this preface delayed the publication of the second volume of PM, as Whitehead and Russell struggled over the complications it raised. The difficulties arise from the typical ambiguity of terms and formulas of the theory of types. Every constant, such as those for the numbers \(0,1, \ldots, \aleph_0\), will have a definition relative to each type. Without assuming the Axiom of Infinity for individuals, there is no guarantee that a given constant designates a non-empty class in a given type. The preface introduces the notion of “formal numbers”, which are to be interpreted as belonging to a type that makes them not identical with the \(\emptyset\) for that type. Volume II begins with Part III, “Cardinal Arithmetic”. The notions of cardinal numbers are developed in full generality, extending to infinite cardinals. Consequently the theory of natural numbers, which are called “Inductive Cardinals” in PM, is introduced with a series of definitions of special cases of notions that are first introduced in a general form applying to any numbers or classes. For example, addition of natural numbers, as in the famous proof that \(1 + 1 = 2\) in ∗110·643, is proved for the special case of the addition of classes that applies to cardinal numbers, ‘\(+_c\)’. The Summary to section A introduces the notion of homogeneous cardinals, which are classes of similar classes whose members are all of the same type. It is possible to define similarity between two classes \(\alpha\) and \(\beta\) of distinct types, say \(\tau\) and \(\tau '\), and cardinals are classified as descending or ascending according as the domain of the relevant similarity relation is of a higher or of a lower type than the range, respectively. The theory of cardinal numbers is straightforward with homogeneous cardinals; however, the exceptions must be kept in mind, as is evidenced in ∗100.

5.2 Part III: Cardinal Arithmetic

Definition of cardinal numbers

\[\tag*{∗100·01} \rN_c \eqdf \{ x \mid \forall y (y \in x \leftrightarrow \forall z \forall w (z,w \in y \leftrightarrow z \approx w)) \}\]

Cardinal Numbers are classes of equinumerous (similar) classes. We can add a notion of the number of a class to allow for a direct comparison with Frege:

\[\# \{x \mid \phi ( x) \} \eqdf \{ y \mid y \approx \{x \mid \phi (x) \} \}.\]

(In set theory of course this is too large to be a set, and so is just a “proper class”.)

Hume’s Principle in PM

Hume’s Principle is described in Frege (1884: §63) as asserting that the content of the proposition that “the number which belongs to the concept F is the same as the number which belongs to the concept G” is equivalent to “the concept F is similar to the concept G”. In terms of classes this becomes \(\alpha \approx \beta \equiv \# \alpha = \#\beta\). Hume’s Principle is the focus of much of the discussion of “Neo-Logicism”, the doctrine that Frege’s construction of the numbers can be built on a consistent foundation (see the entry on Frege’s theorem).

Only one direction of this equivalence is provable in PM:

\[\tag*{∗100·321}\alpha \approx \beta \supset \# \alpha = \# \beta\]

The failure of the other direction, the implication from right to left, is due to the possibility that \(\alpha\) and \(\beta\) are of different types, so that any similarity relation between them will have its domain and range in different types. Suppose there are \(\aleph_0\) individuals, and consider two higher types containing classes of even larger, but distinct, cardinalities: say \(\alpha\) in some high type has cardinality \(\aleph_1\) and \(\beta\) in an even higher type has cardinality \(\aleph_2\). There are no sets of individuals similar to \(\alpha\) or \(\beta\), so no similarity relation with a domain in \(\alpha\) or \(\beta\) will hold with respect to any set in the type of individuals. Suppose that \(\#\) is defined in terms of such a descending relation. Then \(\# \alpha = \{ \Lambda \} = 0\) and \(\# \beta = \{\Lambda \} = 0\), so \(\# \alpha = \# \beta\), yet \(\alpha \not\approx\beta\) on any similarity relation of whatever domain and range, because their cardinalities differ. Whitehead and Russell assert that the case of \(\alpha\) and \(\beta\) being in different types is the only way to construct an exception to this direction of Hume’s Principle, and offer as a restricted version:

\[\tag*{∗100·34}\exists \gamma [\gamma \in (\alpha \cap \beta) ] \supset ( \alpha \approx \beta \equiv \# \alpha = \# \beta )\]

The antecedent guarantees that \(\alpha\) and \(\beta\) are of the same type, and so the cardinal numbers involved are homogeneous cardinals. (Landini (2016) argues that this section of PM is confused.)

0 Defined

\[\tag*{∗101·1} 0 \eqdf \# \emptyset\]

The class of all classes equinumerous with the empty set is just the singleton containing the empty set, so \(0 = \{ \emptyset \}\).

The Arithmetical Sum of Classes and Cardinals

\[\tag*{∗110·01} \alpha + \beta \eqdf [\{ \beta \cap \emptyset \} \times \{\alpha \}] \cup [ \{ \alpha \cap \emptyset \} \times \{ \beta \} ] \]

(If \( \alpha, \beta \neq \emptyset \), otherwise \( \alpha +\emptyset = \alpha, \emptyset + \beta = \beta \).) This qualification is hidden in PM by the use of expressions for functional relations that are sometimes undefined. The arithmetic sum of \(\alpha\) and \(\beta\) is the union of \(\alpha\) and \(\beta\) after they are made disjoint by pairing the elements of \(\beta\) with elements of \(\{ \alpha \}\) and the elements of \(\alpha\) with the elements of \(\{ \beta \}\). (The Cartesian product of \( \gamma \) and \( \delta \), \( \gamma \times \delta\), is \( \{ \langle x, y \rangle | x \in \gamma \; \amp \; y \in\delta \} \).) The classes \(\alpha\) and \(\beta\) are intersected with the empty class, \(\emptyset\), to adjust the type of the elements of the sum. It is more recognizable to contemporary set theory by the equivalent definition (subject to the same exception):

\[\alpha + \beta \eqdf \{ \langle \emptyset, x \rangle | x \in \alpha \} \cup \{ \langle y, \emptyset \rangle | y \in \beta \} \]
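The contemporary form of the definition is the familiar tagged disjoint union. The following sketch models it for finite classes, assuming Python sets for classes and using the empty `frozenset` as the tag \(\emptyset\); the name `arithmetical_sum` is illustrative, and the sketch assumes that neither class contains the tag itself as a member.

```python
EMPTY = frozenset()  # stands in for the empty class used as a tag


def arithmetical_sum(alpha, beta):
    """Union of alpha and beta after tagging, so shared members are counted twice."""
    from_alpha = {(EMPTY, x) for x in alpha}
    from_beta = {(y, EMPTY) for y in beta}
    return from_alpha | from_beta


alpha = {1, 2}
beta = {2, 3}
total = arithmetical_sum(alpha, beta)
```

The ordinary union of these two classes has three members, because 2 is shared, while the arithmetical sum has four; this is what makes the sum suitable for defining cardinal addition.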

The cardinal sum of y and x

\[\tag*{∗110·02}{}\quad y +_c x = \{ z \mid \exists \alpha \exists \beta [( y = \# \alpha \amp x = \# \beta) \amp z \approx \alpha + \beta ] \} \]

\(y +_c x\) expresses the cardinal addition of cardinals y and x. It is the arithmetical sum of “homogeneous cardinals”, cardinals of a uniform type, to which \(\alpha\) and \(\beta\) are related by \(\rN_0 c\) (itself defined at [∗103·01]). The notation for the homogeneous cardinal of \(\alpha\) is \(\rN_0c\pmsq\alpha\), which we might write as \(\#_0\) in an extension of our contemporary notation replacing \(\#\) above.

The reader can now appreciate the notorious fact that \(1 + 1 = 2\), the most elementary truth of arithmetic, is not proved until page 83 of Volume II of Principia Mathematica, and even then, almost as an afterthought:

\[\tag*{∗110·643} 1 +_c 1 = 2\]

Whitehead and Russell remark that “The above proposition is occasionally useful. It is used at least three times, in…”. This witticism reminds us that the theory of natural numbers, so central to Frege’s works, appears in PM as only a special case of a general theory of cardinal and ordinal numbers and even more general classes of isomorphic structures.

Exponentiation

Exponentiation for cardinals is defined in such a way that it coincides with Cantor’s notion that the cardinality of the powerset of a class \(\alpha\) is 2 raised to the power of the cardinality of \(\alpha\):

\[\tag*{∗116·72}\lvert\lvert \wp \alpha \rvert\rvert = 2^{\lvert\lvert \alpha \rvert\rvert }\]

Greater and Less

Cantor’s Theorem:

\[\tag*{∗117·661} 2^{y} > y\]

This is Cantor’s theorem that if a set \(\alpha\) has a cardinal number y then the cardinal number \(2^{y}\) of the powerset of \(\alpha\) is greater than y.

The Natural Numbers

The most direct comparison with Frege’s development of the natural numbers comes with the notion of Inductive Cardinal, by which PM means the natural numbers 0, 1, 2, …, and the theory of these numbers including the principle of induction. Although the numbers 0 and 1, as well as addition of natural numbers \(+_c\), have been defined earlier, they are defined as cardinal numbers, and addition will apply to all cardinal numbers, finite and transfinite. For the finite natural numbers, special notions need to be defined first. For the proof of the Peano Postulates it is necessary not only to define 0, but also the notion of successor. For Frege the notion of (weak) predecessor of a number is defined, thus 0 and 1 are the predecessors of 1, while 0, 1 and 2 are the predecessors of 2, etc. The successor of n is then defined by counting the predecessors of a number: in terms of the definition of number, it is the number of the class of predecessors. This definition would not work for PM, where each number would be of a higher type, as it is defined as a set containing that number. There will in fact be natural numbers for each type, thus a set of all pairs of individuals of type 0, a set of pairs of sets of type 1, etc., for each type. There is no one type, however, at which there are all of the natural numbers (sets of equinumerous sets of that type) without an assumption that there are infinitely many members of some one type.

The solution in PM is to guarantee that for each finite set of n individuals of type 0, there will be some object not in that set, which can be included in the set defining the successor. That such a new individual can be found is guaranteed by the Axiom of Infinity, which in effect asserts the existence of distinct individuals of any finite number. It is interesting to note that the “Axiom” of Infinity is not a primitive proposition of the logic of PM. Instead it is an additional hypothesis, to be used as an antecedent to mathematical assertions upon which it depends. The issue of whether the system of PM succeeds as logicism is thus not settled by noting that an Axiom of Infinity has to be assumed, but by determining whether that “Axiom” is derivable from logical principles alone.

In axiomatic set theory the “Axiom of Infinity” guarantees the existence of a particular set, the ordinal \(\omega\):

\[\exists x [ \emptyset \in x \amp \forall y ( y \in x \supset y \cup \{ y\} \in x ) ]\]

The inductive cardinals (natural numbers) are defined as the numbers bearing the ancestral of the \(+_1\) relation to 0. Given that the \(+_1\) relation is the PM account of successor, this is the same definition as for Frege.

Inductive Cardinals N

\[\tag*{∗120·01}\rN \eqdf \{x\mid 0 \relSast x \}\]

The inductive cardinals N are the familiar natural numbers, namely 0 and all those cardinal numbers that are related to 0 by the ancestral of the “successor relation” S, where \(x\relS y\) just in case \(y = x +1\).

\[\tag*{∗120·03}\textrm{Axiom of Infinity} \eqdf \forall y ( y \in \{x\mid 0\relSast x \} \supset y \neq \emptyset )\]

This Axiom of Infinity asserts that all inductive cardinals are non-empty. (Recall that \(0 = \{\emptyset \}\), and so 0 is not empty.) The axiom is not a “primitive proposition” but is instead to be listed as an “hypothesis” where used, that is, as the antecedent of a conditional, where the consequent will be said to depend on the axiom. Technically it is not an axiom of PM, as ∗120·03 is a definition, so this is just further defined notation in PM!

Whitehead and Russell do carry out the step of the logicist program of deriving Peano’s Postulates based on the prior definitions of the notions of Natural number, 0, and successor, as Russell describes the project later, in (1919). This is in fact what is done in ∗120 “Inductive Cardinals” of PM, but is not described as such, either there or in introductory material. The results are not proved separately, but as they appear in a development of various results about natural numbers. Indeed some, such as ∗120·31, can only be seen to be versions of a Peano axiom with a bit of work.

0 is a natural number. \[ \tag*{∗120·12} 0 \in \rN \]
The successor of any number is a number. \[ \tag*{∗120·121} n \in \rN \supset n +_c 1 \in \rN \]
No two numbers have the same successor (assuming the Axiom of Infinity). \[ \tag*{∗120·31}\ \textrm{Axiom of Infinity} \supset (n +_{c} 1 = m +_{c} 1 \supset n = m) \]
Given the way that the successor operation is defined, it is not a matter of logic that there is an extra individual to add to a set of size n to give one of size \(n +_{c} 1\). This is guaranteed by adding the Axiom of Infinity as an hypothesis to the theorem.
0 is not the successor of any number. \[ \tag*{∗120·124} n +_{c} 1 \neq 0 \]
Any property \(\phi\) which belongs to 0, and belongs to the successor of m provided that it belongs to m, belongs to all natural numbers n. \[ \tag*{∗120·13} \forall n \{ [ n \in \rN \; \amp \; \forall m( \phi m \supset \phi (m +_c 1)) \; \amp \; \phi \: 0] \supset \phi n \} \]
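The dependence of the third postulate on the Axiom of Infinity can be seen in a small finite model. The sketch below assumes a single type with just two individuals, and uses a simplified successor (add one new individual to each member of n) in place of PM's definition through \(+_c\); the names `cardinal` and `successor` are illustrative only.

```python
from itertools import combinations


def cardinal(n, individuals):
    """The class of all n-membered classes of individuals of one fixed type."""
    return frozenset(frozenset(c) for c in combinations(sorted(individuals), n))


def successor(number, individuals):
    """Simplified n +_c 1: extend each member of n by one individual not already in it."""
    return frozenset(
        gamma | {x} for gamma in number for x in individuals if x not in gamma
    )


individuals = {"a", "b"}          # assumption: a universe with only two individuals
zero = frozenset({frozenset()})   # 0 is the class whose only member is the empty class
one = successor(zero, individuals)
two = successor(one, individuals)
three = successor(two, individuals)   # no third individual is available
four = successor(three, individuals)
```

In this model `three` and `four` are both the empty class, so 2 and 3 are distinct numbers with the same successor. The hypothesis in ∗120·31 rules out exactly this situation by requiring every inductive cardinal to be non-empty.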

In contemporary set theory the notion of the successor of a number is defined directly for the ordinals as \(s(x) = x \cup \{x \} \) rather than by adding 1, and addition is defined using the familiar recursive definition:

\[\begin{align}x + 0 & = x\\x + s(y) & = s (x + y)\end{align}\]

The use of recursive definitions is justified by a theorem proving that they describe a unique function. The induction axiom is justified by showing that any class that contains 0, and for any number n contains \(s(n)\), will contain all of the numbers in \(\omega\). The existence of \(\omega\) is guaranteed by the ZF axiom of infinity.
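The set-theoretic successor and the recursive definition of addition can be modelled directly for the finite ordinals. This is a minimal sketch, assuming hereditarily finite sets are represented by nested `frozenset` values; the helpers `numeral` and `pred` are conveniences introduced for the illustration.

```python
ZERO = frozenset()


def succ(x):
    """s(x) = x together with {x}."""
    return x | frozenset({x})


def numeral(n):
    """The finite ordinal n, built by applying succ n times to the empty set."""
    x = ZERO
    for _ in range(n):
        x = succ(x)
    return x


def pred(x):
    """For a non-zero finite ordinal, the predecessor is its largest member."""
    return max(x, key=len)


def add(x, y):
    """x + 0 = x and x + s(y) = s(x + y)."""
    if y == ZERO:
        return x
    return succ(add(x, pred(y)))


two = add(numeral(1), numeral(1))
```

Here `two` is the set \(\{\emptyset, \{\emptyset\}\}\), so \(1 + 1 = 2\) is obtained in a few lines, in contrast with the long route through classes of similar classes in PM.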

At this point, after 225 pages in Volume II, the reader will see how to compare the logicist reduction of arithmetic in PM with rival accounts of Frege and of contemporary set theory.

Frege completes his development of the natural numbers at page 68 of Volume II of his Basic Laws of Arithmetic published in 1903, which follows the 250 pages of Volume I that had been published in 1893. So both Frege and the authors of PM took great pains to prove more advanced theorems only after a chain of closely argued lemmas based on their own formalized symbolic logic.

Frege ends his deductions of the laws of arithmetic with results about the notions of 0, Successor, and the principle of Induction which include the Peano Axioms. He does not consider arithmetical functions, such as addition or multiplication, and thus does not define the successor of a number n as the result of adding 1 to n.

Frege’s account is streamlined in other ways as well. He had always considered the analysis of simple identity sentences to be important to his logicism, dating back to the Begriffsschrift (1879), on through his Foundations of Arithmetic (1884) and his “On Sense and Reference” (1892), and even in the appearance of the example “\(2^2 = 2 + 2\)” in the foreword to the Basic Laws of Arithmetic. Indeed the analysis of identity sentences is the starting point of his introduction of the theory of sense and reference in 1892, yet Frege does not diverge from his project enough to show how such an identity would be proved. So Whitehead and Russell might well have wanted to include their proof of \(1+1 = 2\) at ∗110·643 as a reminder of the analysis of mathematical equations in PM using the “descriptive functions” of ∗30 and then the account of definite descriptions in ∗14.

Frege also does not construct the general theory of the arithmetic of cardinal and ordinal numbers that occupies PM for much of Part III. Indeed, after the theorems on Arithmetic in §54 which concludes Part II of Basic Laws of Arithmetic, Frege jumps directly to the topic of Real Numbers for the remainder of Volume II. Judging simply by the number of theorems that had to be proved in leading to the account of Peano Arithmetic, PM does not differ wildly from Frege’s earlier attempt.

Admittedly the system of PM is an indirect and cumbersome system to develop if the theory of Arithmetic were the only goal in mind. Firstly, however, the system of the ramified theory of types is independently interesting for the foundations of logic that it provides. After ∗20 the theory of classes, and the development of general notions of arithmetic for relations which follows, does present the arithmetic of the natural numbers as a special case which can be generalized to the arithmetic of ordinal and cardinal numbers, all in a logic with a simple hierarchy of types. The survey below of what follows in Volume II and Volume III shows the particular way of developing the theory of rational and then real numbers that PM follows. The results in set theory will seem primitive, as the results are dated to around 1908, at just the point when axiomatic set theory began its extraordinary development. Whitehead and Russell were not active contributors to set theory and so PM should not be studied for later technical results that may have been anticipated here. Russell summarized results in PM from the current state of the study of infinite cardinals and ordinals in a paper he gave in Paris called “On the Axioms of the Infinite and Transfinite” (Russell 1911). There is, however, one result concerning two notions of infinity that appears to originate with PM.

Dedekind Infinity

In PM a class is a finite “inductive” class if and only if it can be put into a one to one correspondence with the Natural Numbers less than or equal to some natural number \(n\). It will be infinite if and only if it is not inductive.

A class is Dedekind Infinite (Reflexive) if and only if it can be put in a one to one correspondence with a proper subset of itself.

The key theorem in this section is:

\[\tag*{∗124·57}\textrm{If } y \textrm{ is not inductive then } 2^{2^y} \textrm{ is reflexive.}\]

The Inductive and Reflexive notions of infinity coincide if one assumes the Axiom of Choice. This result does not assume the Axiom of Choice.

George Boolos (1994: 27) describes the details of this argument and quotes J.R. Littlewood as saying:

He [Russell] has a secret craving to have proved some straight mathematical theorem. As a matter of fact there is one: “\(2^{2^{\alpha}} > \aleph_0\) if \(\alpha\) is infinite”. Perfectly good mathematics.

As use of the Axiom of Choice is explicitly indicated, and many results do not use it, the unique contribution to set theory of PM may be in its indication of what can be proved without assuming Choice.

5.3 Part IV: Relation-Arithmetic

Relation Arithmetic is the study of the generalization of cardinal and ordinal numbers to classes of similar classes where similarity is based on an arbitrary relation. A relation P is similar to a relation Q if there is a one to one relation S (a correlator) relating the domain of P to the domain of Q so that if \(x\relP y\) for some x and y then if \(x\relS w\) and \(w\relQ z\) then \(z \relbS y\). The mapping S is an isomorphism between the relations P and Q. A relation number will then be a class of relations that are similar to each other. Relation Arithmetic then generalizes the notions of cardinal arithmetic, such as sum and product, to arbitrary relation numbers. Russell himself expressed regret that the material in Part IV was not more carefully studied by his contemporaries (Russell 1959: 86). If a series \( \relP \) is well-ordered, then the class of relations ordinally similar to \( \relP \) will be an ordinal number. The sums of ordinal numbers are studied in Tarski (1956), but there has been little interest in the more general notion of Relation Arithmetic presented in these sections of PM. See Solomon (1989).

Ordinal Similarity

∗151·01 P and Q are similar ordinally \[ \eqdf \exists S [S: \Domain \; P \stackrel{1-1}{\longrightarrow}\Domain \; Q \amp P = S \circ Q \circ {\relbS} ] \]

P and Q are similar ordinally, written \(P\smor Q\), just in case there is a one to one mapping \( \relS \) of the domain of P into the domain of Q such that if \(x\relP y\), \(x \relS z\), and \( z \relQ w \) then \(w \relbS y\).

The sum of series P and Q, \(P \oplus Q\), is defined as:

\[\begin{align} \tag*{∗160·01} &\ P \oplus Q \eqdf\\ &\quad\{ \langle x, y \rangle | \; x\relP y \lor x \relQ y \lor[ \exists z (z\relP x \lor x\relP z)\; \amp \; \exists z (z\relQ y\lor y\relQ z ) ] \} \end{align}\]

As it is put in the headnote to ∗160:

…we may regard the sum of P and Q as a relation which holds between x and y when either x precedes y in the P series, or x precedes y in the Q series, or x belongs to the P-series and y belongs to the Q-series.
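The three clauses of the headnote translate directly into a construction on finite relations. The sketch assumes relations are sets of ordered pairs with disjoint fields; `field` and `series_sum` are illustrative names, not PM's.

```python
def field(R):
    """All terms occurring on either side of R."""
    return {x for x, _ in R} | {y for _, y in R}


def series_sum(P, Q):
    """x precedes y in P, or in Q, or x is in the field of P and y in the field of Q."""
    across = {(x, y) for x in field(P) for y in field(Q)}
    return set(P) | set(Q) | across


P = {(1, 2)}          # the series 1, 2
Q = {("a", "b")}      # the series a, b
P_plus_Q = series_sum(P, Q)   # the series 1, 2, a, b
```

The construction is not symmetric: `series_sum(Q, P)` places every term of Q before every term of P, which is why sums of series, unlike sums of classes, depend on the order of the summands.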

The product of series P and Q, \(Q \otimes P\), orders the couples formed from a member of the field of P and a member of the field of Q, as follows. (This should not be confused with the more familiar notion of relative product which was defined in ∗34.)

As it is put in the headnote to ∗166:

The product \( \relQ \otimes P \) is … a relation which has for its field all the couples that can be formed by choosing the referent in \(C ‘ P \) and the relatum in \(C ‘ Q \). These couples are arranged by \( \relQ \otimes \relP \) on the following principle: If the relatum of the one couple has the relation \( \relQ \) to the relatum of the other, we put the one before the other, and if the relata of the two couples are equal while the referent of the one has the relation \( \relP \) to the referent of the other, we put the one before the other.

\[\begin{align}\tag*{∗166·112}&\ \langle x, z \rangle Q \otimes P \langle y, w \rangle {}\\&\quad\equiv [ ( x,y \in Field \; P \; \amp \; z,w \in Field \; Q ) \; \amp \; zQw ] \lor ( z = w \; \amp \; xPy )\end{align}\]
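The ordering principle of ∗166·112 can be tried out on small finite series. In this sketch, relations are sets of ordered pairs and a couple \(\langle x, z \rangle\) is a Python tuple with x from the field of P and z from the field of Q; the function names are hypothetical.

```python
def field(R):
    """All terms occurring on either side of R."""
    return {x for x, _ in R} | {y for _, y in R}


def series_product(Q, P):
    """Order couples first by their second term under Q, then by their first under P."""
    couples = [(x, z) for x in field(P) for z in field(Q)]
    return {
        ((x, z), (y, w))
        for (x, z) in couples
        for (y, w) in couples
        if (z, w) in Q or (z == w and (x, y) in P)
    }


P = {(0, 1)}                                   # a series of two terms
Q = {("a", "b"), ("b", "c"), ("a", "c")}       # a series of three terms
Q_times_P = series_product(Q, P)
```

The field of the product has six couples, and the product relates fifteen pairs of them, which is the number of pairs in a series of six terms.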

(Notice that while the sum of two binary relations is a binary relation, their product is a relation between pairs. This is the generalization from classes to relations of the fact that the product of the cardinal numbers of two classes is the cardinal number of the class of ordered pairs of elements taken one from each.) It is possible to prove results that show the differences between products and sums of relations and of numbers. The product of relations is associative:

\[\tag*{∗166·42} (P \otimes Q) \otimes R \text{ is ordinally similar to } P \otimes (Q \otimes R)\]

The product distributes over the sum in one direction:

\[\tag*{∗166·45} (Q \oplus R) \otimes P = (Q \otimes P) \oplus (R \otimes P)\]

However, it does not hold in general that:

\[P \otimes (Q \oplus R) = (P \otimes Q) \oplus (P \otimes R).\]

For the purposes of defining rational and real numbers as relations between relations, it is necessary to define the ordering relation between individuals related by the relation, that is, in the domain or range (the field) of the relations. This is described in the summary of ∗170 and a theorem as follows:

…\(\alpha\) is said to precede \(\beta\)…if we consider the two classes \(\alpha - \beta\) and \(\beta - \alpha\), there are members of \(\alpha - \beta\) which are not preceded by any members of \(\beta - \alpha\). (Vol. II, 1912: 411 and 1927: 399)

∗170·01 \(\alpha \) precedes \( \beta \) in the relation \(P \) \[\alpha P_{lc}\beta \equiv \exists x \{ x \in ( \alpha - \beta) \amp \lnot [ \exists y(y \in \beta - \alpha \amp y\relP x)] \}\]
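The condition in ∗170·01 is easy to evaluate for finite classes. A minimal sketch, assuming classes are Python sets and P is a set of ordered pairs; the name `precedes_lc` simply echoes the subscript in \(P_{lc}\).

```python
def precedes_lc(alpha, beta, P):
    """Some member of alpha - beta is preceded by no member of beta - alpha."""
    return any(
        not any((y, x) in P for y in beta - alpha)
        for x in alpha - beta
    )


# P is "less than" on 0..3.
P = {(x, y) for x in range(4) for y in range(4) if x < y}
alpha = {0, 3}
beta = {1, 2}
```

With these values \(\alpha\) precedes \(\beta\), because 0 is preceded by nothing, while \(\beta\) does not precede \(\alpha\), because both 1 and 2 are preceded by 0.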

The notions of the sum and product of relation numbers are defined as the relation number of the sum and product of the relations, with adjustments made so that the types of the relations are uniform, and the numbers contain disjoint relations, as was seen in the definition of sum for cardinal numbers in ∗110 above. If x and y are relation numbers, their sum is \(x \dot{+} y\) and the product is \(x \dot{\times} y\).

Properties of the sum and product of relation numbers are proved that follow directly from the corresponding properties of the sums and products of relations. Among the many theorems is the fact that the sum of relation numbers is associative:

\[\tag*{∗180·56}( y \dot{+} x ) \dot{+} \rho = y \dot{+} ( x \dot{+} \rho)\]

The distribution of product of relation numbers over sum for relationnumbers holds in one form:

\[\tag*{∗184·42} ( x \dot{+} \rho) \dot{\times} y = (x \dot{\times} y) \dot{+} ( \rho \dot{\times} y )\]

5.4 Part V: Series

A series (linear ordering) is defined as a relation that is irreflexive \(\forall x (\lnot x\relR x)\), transitive \(\forall x \forall y \forall z (x\relR y \amp y\relR z \supset x\relR z)\), and connected \(\forall x \forall y (x \neq y \supset (x\relR y \lor y\relR x))\) (∗204·01). (These properties are restricted to a specific domain for each relation. Thus a connected relation will hold between any two distinct members of a given domain.) This is now called a (strict) linear ordering of a given set.
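The three defining properties can be checked for a finite relation over a given domain. This is a sketch under the assumption that a relation is a set of ordered pairs; connectedness is tested only for distinct members, since an irreflexive relation cannot relate a member to itself. The name `is_series` is illustrative.

```python
def is_series(R, domain):
    """True if R is irreflexive, transitive and connected on the given domain."""
    domain = list(domain)
    irreflexive = all((x, x) not in R for x in domain)
    transitive = all(
        (x, z) in R for (x, y1) in R for (y2, z) in R if y1 == y2
    )
    connected = all(
        (x, y) in R or (y, x) in R
        for x in domain
        for y in domain
        if x != y
    )
    return irreflexive and transitive and connected


less_than = {(x, y) for x in range(3) for y in range(3) if x < y}
at_most = less_than | {(x, x) for x in range(3)}
```

Here `less_than` is a series on 0, 1, 2, while `at_most` fails because it is not irreflexive, and the empty relation on a two-membered domain fails because it is not connected.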

Sequents

Thus sequents of \(\alpha\) are its immediate successors. If \(\alpha\) has a maximum, the sequents are the immediate successors of the maximum; but if \(\alpha\) has no maximum, there will be no one term of \(\alpha\) which is immediately succeeded by a sequent of \(\alpha\); in this case, if \(\alpha\) has a single sequent, the sequent is the “upper limit” of \(\alpha\). (PM Vol. II, “Summary of ∗205”, 1912: 577 or 1927: 559)

Dedekindian Relations

We call a relation “Dedekindian” when it is such that every class [bounded from above] has either a maximum or a sequent with respect to it. (PM Vol. II, “Summary of ∗214”, 1912: 684 or 1927: 659)

In other words, a relation \(R\), such as \( \pmlt \), is Dedekindian when every class \(\alpha \) has either a maximum or a sequent with respect to \( R \). This is the standard definition by which every segment with an upper bound has a least upper bound. That least upper bound will either be the maximum of the set or the least individual greater than all members of the set.

6. Volume III

6.1 Part V: Series (continued)

Elementary properties of well ordered series.

At ∗250·51 we find a proof that the Axiom of Choice follows from the Well-Ordering Principle, that is, that every set can be well-ordered.

The series of Ordinals

The “Burali-Forti” paradox is described in the Introduction to PM as one of the contradictions that can be resolved by the theory of types:

It can be shown that every well-ordered series has an ordinal number…and that the series of all ordinals (in order of magnitude) is well-ordered. It follows that the series of all ordinals has an ordinal number, \(\Omega\) say. But in that case the series of all ordinals including \(\Omega\) has the ordinal number \(\Omega + 1\), which must be greater than \(\Omega\). Hence \(\Omega\) is not the ordinal number of all ordinals. (PM Vol. I, 1925: 61 and 1910: 63)

In ∗256 we find the resolution of the Burali-Forti contradiction in observing the relative types of classes of ordinals. Ordinal numbers, as classes of isomorphic series, will be of a higher type than their members. The purported “ordinal number of all ordinal numbers” \(\Omega\) will be restricted to a type above the type of the ordinals that are its members. Just as there is no class of all classes (of some type or other) there is also no ordinal of all ordinal numbers.

Theorem ∗256·56 demonstrates that “in higher types there are greater ordinals than any to be found in lower types” (PM Vol. III, 75).

Zermelo’s Theorem

∗258·326 Assuming the Axiom of Choice, every set can be well-ordered.

This proof of Zermelo’s theorem follows Zermelo’s “new proof” of Zermelo (1908). Together with ∗250·51 this shows that the Axiom of Choice is equivalent to the Well Ordering principle.

The Transfinite Ancestral Relation

The discussion of “transfinitely hereditary” properties in ∗257 constitutes the discussion of “transfinite induction” that Russell pointed out to Reichenbach in the discussion reported above from Russell (1959: 86).

Finite Ordinals

It is shown in ∗262 that every infinite well ordered series consists of a series (well ordered set) of progressions (\(\omega\) orderings).

The series of Alephs

A result of Hausdorff (1906), that \(\omega_1\) (the first uncountable ordinal) is not the limit of a progression of smaller ordinals, is shown to follow if one assumes the Axiom of Choice (∗265·49). It is then conjectured that this cannot be shown without relying on the Axiom of Choice. See Grattan-Guinness (2000: 403) for a discussion of Hausdorff’s influence on the content of PM.

Dorothy Wrinch, who had been a member of Russell’s circle of unofficial students during the war, published in 1919 an article on Dedekindian series of ordinal numbers. She describes the result as investigating “necessary and sufficient conditions that \(P^Q\) should be Dedekindian or semi-Dedekindian when P and Q are well ordered series” (Wrinch 1919: 219). This study follows up the result in ∗124 that is described in Boolos (1994), as an investigation of the arithmetic of ordinals without assuming the Axiom of Choice. Wrinch’s paper not only follows the notation of PM, but also makes use of theorems up to the end of section V on Series, with numbers following the dot, and so could easily be added as ∗277. Russell intended to found a school of “mathematical philosophy”, and of course succeeded in attracting Ludwig Wittgenstein to the foundational issues in PM, but there is no other indication of logicians attempting to set their results in the framework of PM.

6.2 Part VI: Quantity

The later portions of PM should thus be studied for a hint of how the real numbers and the use of mathematics in measurement can be developed with this rival foundation on the theory of relations. Gandon (2008, 2012) argues that the application of mathematics is better explained using this logicist account than by rivals.

In ∗300–∗314, positive and negative integers, ratios and real numbers are all defined. The goal of the section is to begin the study of how these numbers are used in the measurement of quantities in geometry and physics.

Ratios

\[ R n/m S \eqdf \forall x \forall y ( x\rels{R^n} y \supset x\rels{S^m}y) \]

for n, m relatively prime.

Relations R and S stand in the ratio of n to m when, for all x and y, if \(x\rels{R^n} y\) then \(x\rels{S^m} y\), where \(R^n\) is the n-th power of R and \(S^m\) is the m-th power of S (see ∗91). The ratio of two relations is represented by numbers n and m that are relatively prime, that is, such that

\[\forall j \forall k \forall l [(n = j \times l \amp m = k \times l )\supset l = 1].\]
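The definition of a ratio can be tried out on small finite relations. The following is only an illustrative sketch, not PM's own construction: relations are modelled as Python sets of pairs, and the function names `compose`, `power` and `in_ratio`, as well as the "+2" and "+3" example relations on the domain 0..12, are assumptions introduced here. It checks the displayed condition that \(x\rels{R^n} y\) implies \(x\rels{S^m} y\), after confirming that n and m are relatively prime.

```python
from math import gcd

def compose(r, s):
    """Relative product: x (r|s) z iff some y has x r y and y s z."""
    return {(x, z) for (x, y) in r for (w, z) in s if y == w}

def power(r, n):
    """n-th power of relation r for n >= 1, by repeated relative product."""
    result = set(r)
    for _ in range(n - 1):
        result = compose(result, r)
    return result

def in_ratio(r, s, n, m):
    """Check the displayed condition: x r^n y implies x s^m y, with n, m coprime."""
    if gcd(n, m) != 1:
        raise ValueError("n and m must be relatively prime")
    return power(r, n) <= power(s, m)

# Hypothetical example: "+2" and "+3" steps on the finite domain 0..12.
domain = range(13)
plus2 = {(x, x + 2) for x in domain if x + 2 in domain}
plus3 = {(x, x + 3) for x in domain if x + 3 in domain}

print(in_ratio(plus2, plus3, 3, 2))  # three "+2" steps against two "+3" steps
print(in_ratio(plus2, plus3, 2, 3))  # two "+2" steps against three "+3" steps
```

In this example three "+2" steps and two "+3" steps both amount to "+6", so the first check succeeds and the second fails. A finite domain is used only to make the check computable; the relations PM has in mind are not so restricted.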

Ratios thus are relations between relations. Rational Numbers, in keeping with the generalized notions of all numbers in PM, will be classes of similar ratios, and thus the work of developing the theory of rational, and then real numbers, is carried out in terms of ratios and relations between ratios.

Real Numbers

The real numbers \(\Theta\) are defined as “the series of segments of the series of ratios”, or the set of Dedekind cuts of the sequence of rational numbers, in their standard ordering. Technically \(\Theta\) is a relation in extension. The individuals that are related in the ratios, and thus at the basis of the series of rational numbers, must be infinite in number for the PM version of the real numbers to have the structure we expect. In the introductory material to the section Whitehead and Russell point out that, while the construction of the reals thus requires the Axiom of Infinity, they add it explicitly as a hypothesis to the theorems where it is needed, and try as much as possible to derive the results that do not rely on it without making that assumption.
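The idea of a real number as a segment of the series of ratios can be illustrated with a small sketch. This is an assumption-laden simplification rather than PM's definition of \(\Theta\): a segment is modelled as a membership test on Python `Fraction` values, the names `rational_segment`, `sqrt2_segment` and `included_on` are hypothetical, and the ordering of segments by inclusion is only spot-checked on a finite sample of ratios.

```python
from fractions import Fraction

def rational_segment(r):
    """Segment of ratios lying strictly below the ratio r."""
    return lambda q: q < r

def sqrt2_segment(q):
    """A segment with no rational limit: ratios q with q < 0 or q*q < 2."""
    return q < 0 or q * q < 2

def included_on(sample, a, b):
    """True if every sampled ratio in segment a is also in segment b."""
    return all(b(q) for q in sample if a(q))

# Finite sample of ratios from -2 to 2 in steps of 1/10 (illustrative only).
sample = [Fraction(k, 10) for k in range(-20, 21)]
half = rational_segment(Fraction(1, 2))

print(included_on(sample, half, sqrt2_segment))  # is "below 1/2" inside "below sqrt 2"?
print(included_on(sample, sqrt2_segment, half))  # and the reverse inclusion?
```

On this sample the first inclusion holds and the second does not, which mirrors the ordering of the two cuts. Inclusion over all ratios cannot be settled by enumeration, so the sample check is a demonstration and not a proof.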

The theory of real numbers in PM is closer to that of Frege than the “arithmetizing” accounts of Dedekind or Cantor. Dedekind postulated that irrational numbers are to be “created” to fill out the gaps in the series of rational numbers that are marked by Dedekind cuts (Dedekind 1872 [1901: 15]), while Cantor (1883: §9, para. 8 [1996: 899]) identified real numbers with the limits of sequences of rational numbers.

Frege, and PM, however, see real numbers as abstracted from the similarity of relations with a certain structure. See Gandon (2012) for a fine-grained comparison of the Frege and Russell constructions.

It is interesting to note that the overall structure and contents of sections in PM and Basic Laws are similar. They both share a first section on the symbolic logic that they will use, then a series of definitions and theorems about the notions of number and concepts used in arithmetic, with a concluding section on real numbers. While the range of the mathematics that is to be reduced to logic in the two works is the same, Frege restricts his work more directly to natural and real numbers, while PM includes the theory of classes and the arithmetic of relations and infinite sets. All of these topics are now handled in the initial chapters of a contemporary textbook in axiomatic set theory; as Urquhart (2013) suggests, this may be the inevitable fate of even the most groundbreaking mathematical works.

The addition and multiplication of real numbers are defined in the next sections as they are for other numbers, by taking disjoint instances of each number similarity class and performing the corresponding operation on them. Many results assume the Axiom of Infinity as a hypothesis. The operations are symbolized as \(+_s\) and \(\times_s\).

Measurement—i.e. the application of ratios and real numbers to magnitudes—will be dealt with in Section C; for the present, we shall confine ourselves to those properties of magnitude which are presupposed by measurement.…
We conceive of a magnitude as a vector, i.e. as an operation, i.e. as a descriptive function in the sense of ∗30. Thus for example, we shall so define our terms that 1 gramme would not be a magnitude, but the difference between 2 grammes and 1 gramme would be a magnitude, i.e. the relation “+1 gramme” would be a magnitude. On the other hand a centimetre and a second would both be magnitudes according to our definition, because distances in space and time are vectors.…
We demand of a vector (1) that it shall be a one-one relation, (2) that it be capable of indefinite repetition, i.e. that if the vector takes us from a to b, there shall always be a point c such that the vector takes us from b to c. (PM, Vol. III, “Summary of Section B”, 1913: 339 and 1927: 339)

The kinds of quantity addressed in this section and the next are all

vector families that is, classes of one-one relations all having the same converse domain, and all having the domain contained in the converse domain. (PM Vol. III, 350)
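The three conditions in the quoted description of a vector family can be checked mechanically on finite examples. The sketch below is a hedged illustration under stated assumptions: relations are modelled as sets of pairs, the helper names (`domain`, `converse_domain`, `is_one_one`, `is_vector_family`) are introduced here and are not PM's, and the example family of rotations on a 12-point dial is hypothetical.

```python
def domain(r):
    """Set of first terms of the pairs in relation r."""
    return {x for x, _ in r}

def converse_domain(r):
    """Set of second terms of the pairs in relation r."""
    return {y for _, y in r}

def is_one_one(r):
    """Each x is related to exactly one y, and each y to exactly one x."""
    return len(domain(r)) == len(r) == len(converse_domain(r))

def is_vector_family(family):
    """Check the quoted conditions on a non-empty collection of relations."""
    family = [set(r) for r in family]
    if not family:
        raise ValueError("expected at least one relation")
    shared = converse_domain(family[0])
    return all(
        is_one_one(r) and converse_domain(r) == shared and domain(r) <= shared
        for r in family
    )

# Hypothetical example: rotations by k steps on a 12-point dial.
points = range(12)
rotations = [{(x, (x + k) % 12) for x in points} for k in (1, 2, 5)]
print(is_vector_family(rotations))
```

Rotations are used because, on a finite domain, a one-one relation whose domain lies inside its converse domain must map that domain onto itself; open families such as "+1 gramme" need an infinite field and cannot be exhibited this way.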

6.2.1 Measurement

Measurement in PM is based on the relations between objects that are the basis for the operation of measurement, such as that one object is heavier than another, or longer than another. Quantities are then equivalence classes of objects which have the same relations to others. Operations are defined on quantities, so that:

…that is to say, two-thirds of half a pound of cheese ought to be \((2/3 \times_s 1/2)\) of a pound of cheese, and similarly in every other case. (PM, Volume III, 407)

PM concludes with a seemingly dangling excursion into the special case of the measurement of “Cyclic Families”. For “such cases as the angles at a point, or the elliptic straight line, we require a theory of measurement applicable to families which are not open” (PM, Volume III, 457). The angles at a point will be measured from 0 to 360 degrees, and then start over again at 0 to measure an object rotating around a point. The many ratios that represent the rotations are represented by a “principal ratio”, which is used to assign the measurement of degrees.
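The role of a single representative for the many ratios that describe the same rotation can be sketched with ordinary modular arithmetic. This is an analogy under stated assumptions, not PM's definition of a principal ratio: rotations are given as fractions of a full turn, and the function names `principal_turn` and `degrees` are hypothetical.

```python
from fractions import Fraction

def principal_turn(turns):
    """Reduce a number of full turns to its representative in [0, 1)."""
    return Fraction(turns) % 1

def degrees(turns):
    """Degree measure, from 0 up to but not including 360."""
    return principal_turn(turns) * 360

# A quarter turn, a turn and a quarter, and three quarters of a turn backwards.
for t in (Fraction(1, 4), Fraction(5, 4), Fraction(-3, 4)):
    print(t, "->", degrees(t))
```

All three inputs leave the rotating object in the same position, so they receive the same measure of 90 degrees, which is the sense in which the measurement starts over again at 0 after a full turn.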

6.2.2 There is no “Conclusion” at the end of PM

PM ends abruptly with the proof of a theorem (∗375·32) concerning cyclic families, without any concluding remarks or hint of what is to come later. The thought is that further mathematics, including the Volume IV that Whitehead was to write on geometry, would have to be developed piecemeal. First the notions of a given branch of mathematics would have to be defined in terms of earlier notions, such as classes of relations with a given structure, and then the important basic results in that field would be proved one by one, in the style of the work so far. Establishing logicism would be an ongoing project, as open-ended as mathematics itself.

Bibliography

Primary Literature

Russell, Bertrand, [PoM] 1903, The Principles of Mathematics, Cambridge: Cambridge University Press. [PoM available online]
–––, 1905, “On Denoting”, Mind, 14(4): 479–493. doi:10.1093/mind/XIV.4.479
–––, 1911, “On the Axioms of the Infinite and of the Transfinite”, printed in Logical and Philosophical Papers 1909–1913: The Collected Papers of Bertrand Russell, Vol. 6, John G. Slater (ed.), London and New York: Routledge, 1992, 41–53.
–––, 1919, Introduction to Mathematical Philosophy, London: George Allen & Unwin.
–––, 1948, “Whitehead and Principia Mathematica”, Mind, 57(226): 137–138. doi:10.1093/mind/LVII.226.137
–––, 1959, My Philosophical Development, London: George Allen and Unwin, and New York: Simon and Schuster; reprinted London: Routledge, 1993. (Page numbers are to the 1959 edition.)
–––, 1967, 1968, 1969, The Autobiography of Bertrand Russell, 3 vols., London: George Allen and Unwin; Boston: Little Brown and Company (Vols 1 and 2), New York: Simon and Schuster (Volume 3).
Whitehead, Alfred North, 1898, A Treatise on Universal Algebra, Cambridge: Cambridge University Press. [Whitehead 1898 available online]
–––, 1906, “On Mathematical Concepts of the Material World”, Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 205(387–401): 465–525. doi:10.1098/rsta.1906.0014
–––, 1926, “Notes: Principia Mathematica”, Mind, 35(137): 130. doi:10.1093/mind/XXXV.137.130-a
Whitehead, Alfred North, Bertrand Russell, and M.R. James, 1910, Contract for the First Edition of Principia Mathematica, reprinted in “Illustrations: Manuscripts Relating to Principia Mathematica”, Russell: The Journal of Bertrand Russell Studies, 31(1): 82. doi:10.15173/russell.v31i1.2199
Whitehead, Alfred North and Bertrand Russell, 1910, 1912, 1913, Principia Mathematica, 3 volumes, Cambridge: Cambridge University Press; 2nd edition, 1925 (Vol. I), 1927 (Vols II, III); abridged as Principia Mathematica to ∗56, Cambridge: Cambridge University Press, 1962. (Page numbers are to the second edition.)

Secondary Literature

Borel, Émile, 1898, Leçons Sur La Théorie Des Fonctions, Paris.
Bernays, Paul, 1926, “Axiomatische Untersuchungen des Aussagen-Kalkuls der Principia Mathematica”, Mathematische Zeitschrift, 25: 305–320. doi:10.1007/BF01283841
Blackwell, Kenneth, 2005, “A Bibliographical Index for Principia Mathematica”, Russell: The Journal of Bertrand Russell Studies, 25(1): 77–80. doi:10.15173/russell.v25i1.2072
–––, 2011, “The Wit and Humour of Principia Mathematica”, in Griffin, Linsky, and Blackwell 2011: 151–160. doi:10.15173/russell.v31i1.2198
Boolos, George, 1971, “The Iterative Conception of Set”, Journal of Philosophy, 68(8): 215–231. doi:10.2307/2025204
–––, 1994, “The Advantages of Honest Toil over Theft”, in Mathematics and Mind, Alexander George (ed.), Oxford: Oxford University Press, 27–44.
Burgess, John P., 2005, Fixing Frege, Princeton: Princeton University Press.
Cantor, Georg, 1883 [1996], Grundlagen einer allgemeinen Mannigfaltigkeitslehre. Ein mathematisch-philosophischer Versuch in der Lehre des Unendlichen, Leipzig: Teubner. Printed as “Foundations of a General Theory of Manifolds: A Mathematico-Philosophical Investigation into the Theory of the Infinite” in From Kant to Hilbert: A Source Book in the Foundations of Mathematics, Vol. II, William Ewald (trans.), Oxford: Oxford University Press, 1996, 878–920.
–––, 1895 & 1897 [1915], “Beiträge zur Begründung der transfiniten Mengenlehre”, Mathematische Annalen, (1895) 46(4): 481–512 & (1897) 49(2): 207–246. Translated as Contributions to the Founding of the Theory of Transfinite Numbers, Philip E.B. Jourdain (trans.), Chicago: Open Court, 1915. doi:10.1007/BF02124929 (de) doi:10.1007/BF01444205 (de)
Chihara, Charles S., 1973, Ontology and the Vicious Circle Principle, Ithaca, NY: Cornell University Press.
Church, Alonzo, 1974, “Russellian Simple Type Theory”, Proceedings and Addresses of the American Philosophical Association, 47: 21–33. doi:10.2307/3129899
–––, 1976, “Comparison of Russell’s Resolution of the Semantical Antinomies with That of Tarski”, The Journal of Symbolic Logic, 41(04): 747–760. doi:10.2307/2272393
Chwistek, Leon, 1912 [2017], “Zasada sprzeczności w świetle nowszych badań Bertranda Russella”, Rozprawy Akademii Umiejętności (Kraków), Wydzial historyczno-filozoficzny, Series II. 30: 270–334. Translated by Rose Rand as “The Law of Contradiction in the Light of Recent Investigations of Bertrand Russell”, in The Significance of the Lvov-Warsaw School in the European Culture, Anna Brożek, Friedrich Stadler, and Jan Woleński (eds.), Cham: Springer International Publishing, 2017, 227–289. doi:10.1007/978-3-319-52869-4_13
–––, 1921 [1967], “Antynomie logiki formalnej”, Przegląd Filozoficzny, 24: 164–171. Printed as “Antinomies of Formal Logic”, Z. Jordan (trans.), in Polish Logic 1920–1939, Storrs McCall (ed.), Oxford: Clarendon Press, 1967, 338–345.
Collins, Jordan E., 2012, A History of the Theory of Types: Developments after the Second Edition of Principia Mathematica, Saarbrücken: Lambert Academic Publishing.
Copi, Irving M., 1950, “The Inconsistency or Redundancy of Principia Mathematica”, Philosophy and Phenomenological Research, 11(2): 190–199. doi:10.2307/2103637
–––, 1971, The Theory of Logical Types, London: Routledge and Kegan Paul.
Dedekind, Richard, 1872 [1901], Stetigkeit und irrationale Zahlen, Braunschweig: Vieweg. Translated 1901, “Continuity and Irrational Numbers”, Wooster Woodruff Beman (trans.), in Essays on the Theory of Numbers, Chicago: Open Court. doi:10.1007/978-3-322-98548-4
Eliot, T.S., 1927, “A Commentary”, The Monthly Criterion, 6(4): 289–291.
Enderton, Herbert B., 1977, Elements of Set Theory, New York: Academic Press.
Ewald, William and Wilfried Sieg (eds), 2013, David Hilbert’s Lectures on the Foundations of Arithmetic and Logic 1917–1933, Berlin: Springer Verlag. doi:10.1007/978-3-540-69444-1
Frege, Gottlob, 1879 [1967], Begriffsschrift: Eine Der Arithmetischen Nachgebildete Formelsprache des Reinen Denkens, Halle a/S: Louis Nebert. Translated by Stefan Bauer-Mengelberg as “Begriffsschrift, A Formula Language, Modeled Upon that of Arithmetic, for Pure Thought” in Jean van Heijenoort (ed.), From Frege to Gödel: A Source Book in Mathematical Logic, 1879–1931, Cambridge, MA: Harvard University Press, 1967, 1–82. [Frege 1879 available online (de)]
–––, 1884 [1950], Die Grundlagen der Arithmetik: Eine logisch mathematische Untersuchung über den Begriff der Zahl, Breslau: Koebner, translated by J.L. Austin as The Foundations of Arithmetic: A logico-mathematical enquiry into the concept of number, Oxford: Basil Blackwell, 1950.
–––, 1892 [1984], “Über Sinn und Bedeutung”, Zeitschrift für Philosophie und philosophische Kritik, 100: 25–50, translated by Max Black as “On Sense and Meaning” in Gottlob Frege: Collected Papers on Mathematics, Logic, and Philosophy, Brian McGuinness (ed.), Oxford: Basil Blackwell, 1984, 157–177.
–––, 1893/1903 [2013], Grundgesetze der Arithmetik, Band I (1893), Band II (1903), Jena: Verlag Hermann Pohle. Translated (preserving the original pagination) by Philip A. Ebert & Marcus Rossberg with Crispin Wright as Basic Laws of Arithmetic, Oxford: Oxford University Press, 2013.
–––, 1980, Philosophical and Mathematical Correspondence, G. Gabriel, et al. (eds.), Chicago: University of Chicago Press.
Gabbay, Dov M. and John Woods (eds.), 2009, Handbook of the History of Logic, Volume 5: Logic From Russell to Church, Amsterdam: Elsevier/North Holland.
Gandon, Sébastien, 2008, “Which Arithmetization for Which Logicism? Russell on Relations and Quantities in The Principles of Mathematics”, History and Philosophy of Logic, 29(1): 1–30. doi:10.1080/01445340701398530
–––, 2011, “Principia Mathematica, part VI: Russell and Whitehead on Quantity”, Logique et Analyse, 54(214): 225–247. [Gandon 2011 available online]
–––, 2012, Russell’s Unknown Logicism, New York: Palgrave Macmillan.
Gödel, Kurt, 1933 [1995], “The Present Situation in the Foundations of Mathematics”, lecture delivered to the Mathematical Association of America and the American Mathematical Society, Cambridge, MA, December 1933. Printed in Kurt Gödel: Collected Works, Vol. II, Solomon Feferman, et al. (eds.), Oxford and New York: Oxford University Press, 1995, 45–53.
–––, 1944 [1951], “Russell’s Mathematical Logic”, in The Philosophy of Bertrand Russell, Paul Arthur Schilpp (ed.), first edition, Chicago: Northwestern University, 1944; third edition, New York: Tudor, 1951, 123–153.
Grattan-Guinness, I., 2000, The Search for Mathematical Roots, 1870–1940: Logics, Set Theories and the Foundations of Mathematics from Cantor Through Russell to Gödel, Princeton and Oxford: Princeton University Press.
Griffin, Nicholas and Bernard Linsky (eds.), 2013, The Palgrave Centenary Companion to Principia Mathematica, London: Palgrave Macmillan. doi:10.1057/9781137344632
Griffin, Nicholas, Bernard Linsky and Kenneth Blackwell (eds.), 2011, Principia Mathematica at 100, Hamilton, ON: Bertrand Russell Research Centre; also published as a special issue of Russell: The Journal of Bertrand Russell Studies, 31(1). [Griffin, Linsky, and Blackwell 2011 available online]
Guay, Alexandre (ed.), 2012, Autour de Principia Mathematica de Russell et Whitehead, Dijon: Editions Universitaires de Dijon.
Hale, Bob and Crispin Wright, 2001, The Reason’s Proper Study: Essays towards a Neo-Fregean Philosophy of Mathematics, Oxford: Oxford University Press. doi:10.1093/0198236395.001.0001
Hausdorff, Felix, 1906, “Untersuchungen über Ordnungstypen”, Berichte der Königlichen Sächsische Akademie der Wissenschaft (Leipzig), 58: 106–169; 59: 84–159.
Hilbert, David and W. Ackermann, 1928, Grundzüge der theoretischen Logik, Berlin: Julius Springer Verlag. Translated as “Principles of Mathematical Logic”, Providence: American Mathematical Society, 1958.
Hilbert, David and Paul Bernays, 1934, Grundlagen der Mathematik, Berlin: Julius Springer Verlag.
Hinkis, Arie, 2013, Proofs of the Cantor-Bernstein Theorem: A Mathematical Excursion, New York, Dordrecht, London: Birkhäuser.
Hintikka, Jaakko, 2009, “Logicism”, in Irvine 2009: 271–290. doi:10.1016/B978-0-444-51555-1.50010-9
Irvine, Andrew D. (ed.), 2009, Philosophy of Mathematics (Handbook of the Philosophy of Science), Amsterdam: Elsevier. doi:10.1016/B978-0-444-51555-1.X0001-7
Kahle, Reinhard, 2013, “David Hilbert and Principia Mathematica in Poland”, in Griffin and Linsky 2013: 21–34.
Kanamori, Akihiro, 2009, “Set Theory from Cantor to Cohen”, in Irvine 2009: 395–459. doi:10.1016/B978-0-444-51555-1.50014-6
Kleene, S.C., 1952, Introduction to Metamathematics, Princeton: Van Nostrand.
Landini, Gregory, 1998, Russell’s Hidden Substitutional Theory, New York and Oxford: Oxford University Press.
–––, 2011, Russell, London and New York: Routledge.
–––, 2016, “Whitehead’s (Badly) Emended Principia”, History and Philosophy of Logic, 37(2): 114–169. doi:10.1080/01445340.2015.1082063
Link, Godehard (ed.), 2004, One Hundred Years of Russell’s Paradox, Berlin and New York: Walter de Gruyter.
Linsky, Bernard, 1990, “Was the Axiom of Reducibility a Principle of Logic?” Russell, 10: 125–140; reprinted in A.D. Irvine (ed.), 1990, Bertrand Russell: Critical Assessments, 4 vols., London: Routledge, vol. 2, 150–264. doi:10.15173/russell.v10i2.1775
–––, 1999, Russell’s Metaphysical Logic, Stanford: CSLI Publications.
–––, 2002, “The Resolution of Russell’s Paradox in Principia Mathematica”, Philosophical Perspectives, 16: 395–417. doi:10.1111/1468-0068.36.s16.15
–––, 2003, “Leon Chwistek on the No-Classes Theory in Principia Mathematica”, History and Philosophy of Logic, 25(1): 53–71. doi:10.1080/01445340310001614698
–––, 2004, “Classes of Classes and Classes of Functions in Principia Mathematica”, in Link 2004: 435–447.
–––, 2009, “From Descriptive Functions to Sets of Ordered Pairs”, in Alexander Hieke and Hannes Leitgeb, Reduction-Abstraction-Analysis, Vol. 11 of Publications of the Austrian Ludwig Wittgenstein Society, new series, Frankfurt: Ontos Verlag, 259–272.
–––, 2011, The Evolution of Principia Mathematica: Bertrand Russell’s Manuscripts and Notes for the Second Edition, Cambridge: Cambridge University Press. doi:10.1017/CBO9780511760181
–––, 2016, “Propositional Logic from The Principles of Mathematics to Principia Mathematica”, in Early Analytic Philosophy: New Perspectives on the Tradition, Sorin Costreie (ed.), Cham: Springer International Publishing, 213–229. doi:10.1007/978-3-319-24214-9_8
Linsky, Bernard and Kenneth Blackwell, 2005, “New Manuscript Leaves and the Printing of the First Edition of Principia Mathematica”, Russell: The Journal of Bertrand Russell Studies, 25(2): 141–154. doi:10.15173/russell.v25i2.2084
Mares, Edwin D., 2007, “The Fact Semantics for Ramified Type Theory and the Axiom of Reducibility”, Notre Dame Journal of Formal Logic, 48(2): 237–251. doi:10.1305/ndjfl/1179323266
Mayo-Wilson, Conor, 2011, “Russell on Logicism and Coherence”, in Griffin, Linsky, and Blackwell 2011: 63–79. doi:10.15173/russell.v31i1.2206
Myhill, John, 1974, “The Undefinability of the Set of Natural Numbers in the Ramified Principia”, in Bertrand Russell’s Philosophy, George Nakhnikian (ed.), London: Duckworth, 19–27.
Proops, Ian, 2006, “Russell’s Reasons for Logicism”, Journal of the History of Philosophy, 44(2): 267–292. doi:10.1353/hph.2006.0029
Quine, W.V.O., 1951, “Whitehead and Modern Logic”, in The Philosophy of Alfred North Whitehead, P.A. Schilpp (ed.), New York: Tudor Publishing, 125–163.
–––, 1960, Word and Object, Cambridge: MIT Press.
–––, 1963, Set Theory and Its Logic, Cambridge: Harvard University Press.
–––, 1966a, Selected Logic Papers, New York: Random House.
–––, 1966b, Ways of Paradox, New York: Random House.
Ramsey, Frank, 1931, “The Foundations of Mathematics”, in his The Foundations of Mathematics and Other Essays, London: Kegan Paul, Trench, Trubner, 1–61.
Rodriguez-Consuegra, Francisco, 1991, The Mathematical Philosophy of Bertrand Russell, Boston: Birkhäuser Press; repr. 1993.
Shapiro, Stewart (ed.), 2005, The Oxford Handbook of Philosophy of Mathematics and Logic, Oxford: Oxford University Press. doi:10.1093/oxfordhb/9780195325928.001.0001
Sheffer, Henry M., unpublished, Notes on Bertrand Russell’s Lectures (Cambridge, MA 1910), in Harvard University Archives: Henry Maurice Sheffer Personal Archive [accessions], 1891–1970. For further information see URL = <http://id.lib.harvard.edu/alma/990138368470203941/catalog>.
Solomon, Graham, 1989, “What became of Russell’s ‘relation arithmetic’?”, Russell: The Journal of Bertrand Russell Studies, 9(2): 168–173.
Stevens, Graham, 2011, “Logical Form in Principia Mathematica”, in Griffin, Linsky, and Blackwell 2011: 9–28. doi:10.15173/russell.v31i1.2203
Suppes, Patrick, 1960, Axiomatic Set Theory, Princeton: van Nostrand.
Tarski, Alfred, 1956, Ordinal Algebras, Amsterdam: North Holland.
Urquhart, Alasdair, 1988, “Russell’s Zigzag Path to the Ramified Theory of Types”, Russell: The Journal of Bertrand Russell Studies, 8(1): 82–91. doi:10.15173/russell.v8i1.1735
–––, 2012, Review of Bernard Linsky’s The Evolution of Principia Mathematica: Bertrand Russell’s Manuscripts and Notes for the Second Edition, Notre Dame Philosophical Reviews, [Urquhart 2012 available online].
–––, 2013, “Principia Mathematica: The First 100 Years”, in Griffin and Linsky 2013: 3–20.
Wahl, Russell, 2011, “The Axiom of Reducibility”, in Griffin, Linsky, and Blackwell 2011: 45–62. doi:10.15173/russell.v31i1.2205
Wiener, Norbert, 1914, “A Simplification of the Logic of Relations”, Proceedings of the Cambridge Philosophical Society, 17: 387–90. [Wiener 1914 available online]
Wittgenstein, Ludwig, 1922, Tractatus Logico-Philosophicus, C.K. Ogden (trans.), London: Routledge & Kegan Paul.
Wolenski, Jan, 2013, “Principia Mathematica in Poland”, in Griffin and Linsky 2013: 35–55.
Wright, Crispin, 1983, Frege’s Conception of Numbers as Objects, Aberdeen: Aberdeen University Press.
Wrinch, Dorothy, 1919, “On the Exponentiation of Well-Ordered Series”, Proceedings of the Cambridge Philosophical Society, 19: 219–233. [Wrinch 1919 available online]
Zermelo, Ernst, 1908 [1967], “Neuer Beweis für die Möglichkeit einer Wohlordnung”, Mathematische Annalen, 65(1): 107–128. Translated by Stefan Bauer-Mengelberg as “A New Proof of the Possibility of a Well-Ordering”, in Jean van Heijenoort (ed.), From Frege to Gödel: A Source Book in Mathematical Logic, 1879–1931, Cambridge, MA: Harvard University Press, 1967, 183–198. doi:10.1007/BF01450054 (de)

Other Internet Resources

University of Michigan Historical Math Collection:
Internet Archive:
Principia Mathematica: Whitehead and Russell, Stanley Burris, University of Waterloo.

Acknowledgments

Thanks are due to Kenneth Blackwell, Fred Kroon, Jim Robinson and several anonymous referees for their helpful comments on earlier versions of this material and to Allen Hazen for discussions of the second edition of PM and of the iterative conception of sets over many years. Thanks to Andrew Tedder, who checked all the proofs in ∗2 of PM. Thanks to James Toupin, Rodrigo Sabadin Ferreira, Johan Gustafsson, and Gregory Landini for spotting errors in earlier versions of this entry. We are indebted to Axel Boldt for finding a (large) number of errors and also for alerting us to some peculiarities of the PM definitions of sums of classes and of ordinal similarity in Volume II.

Library of Congress Catalog Data: ISSN 1095-5054
