Therecursive functions are a class of functions on thenatural numbers studied incomputability theory, a branch ofcontemporary mathematical logic which was originally known asrecursive function theory. Such functions take their namefrom the process ofrecursion by which the value of afunction is defined by the application of the same function applied tosmaller arguments.
This process may be illustrated by considering the familiar factorialfunction \(x!\)—i.e., the function which returns the product \(1\times 2 \times \ldots \times x\) if \(x > 0\) and 1 otherwise. Analternative recursive definition of this function is as follows:
\[\begin{align}\label{defnfact}\fact(0) & = 1 \\ \nonumber\fact(x+1) & = (x+1) \times \fact(x) \end{align}\]Such a definition might at first appear circular in virtue of the factthat the value of \(\fact(x)\) on the left hand side is defined interms the same function on the righthand side. However acharacteristic feature of recursive definitions is that they allow forthe values of functions which they describe to be calculated bysuccessively “unwinding” the clause for \(x > 0\) untilthe clause for \(x = 0\) (the so-calledbase case) isreached. For instance the value of \(fact(4)\) may be calculated usingthe preceding definition as follows:
\[\begin{align} \label{factcalc}\fact(4) &= 4 \times \fact(3) \\& = 4 \times (3 \times \fact(2)) \nonumber \\& = 4 \times (3 \times (2 \times \fact(1))) \nonumber \\ &=4 \times (3 \times (2 \times 1 \times (\fact(0)))) \nonumber \\& = 4 \times (3 \times (2 \times (1 \times 1))) \nonumber \\& = 24 \nonumber \\\end{align}\]Understood in this way, the defining equations (\ref{defnfact})provide analgorithm for computing \(\fact(x)\)—i.e.,an effective procedure for calculating its values which can be carriedout by a human or mechanical computing device within a finite numberof steps. It is for this reason that a class of recursive definitionssimilar to that exemplified by (\ref{defnfact})—i.e., thegeneral recursive functions—were first employed as themathematical model of computation on which recursive function theorywas originally founded.
Section 1 of this entry provides an overview of the foundationaldevelopments in logic and mathematics which led to the founding ofrecursive function theory in the 1930s. Section 2 surveys differentforms of recursive definitions, inclusive of theprimitiveandpartial recursive functions which are most central to theclassical development of this subject. Section 3 provides an overviewof computability theory, inclusive of the so-calledRecursionTheorem (Section 3.4)—a result which highlights thecentrality of recursion to computation in general as well as itsrelationship to self-reference. Subsequent updates to this entry willprovide an overview of subrecursive hierarchies employed in prooftheory and computer science as well as a more comprehensive treatmentof contemporary computability theory.
Supplement: History of the Ackermann-Péter function
The theory of recursive functions is often presented as a chapter inthe history of the subject originally known asrecursive functiontheory. This subject has its roots in the foundational debates ofthe first half of the twentieth century. Within this context, the needarose to provide a precise analysis of what we would naturallydescribe as inductive or recursive modes of reasoning which play apart in the deductive machinery of axiomatic theories in mathematics.This history will be traced in the current section, with an emphasison how different forms of recursion have been understood asexemplifying various kinds of step-by-step algorithmic processes.
This section assumes some familiarity with some of the terminologyintroduced inSection 2 andSection 3. Readers looking for a technical overview of recursive functions orcomputability theory are advised to start there.
Examples of recursive definitions can be found intermittently in thehistory of ancient and medieval mathematics. A familiar illustrationis the sequence \(F_i\) ofFibonacci numbers\(1,1,2,3,5,8,13, \ldots\) given by the recurrence \(F_0 = 1, F_1 =1\) and \(F_{n} = F_{n-1} + F_{n-2}\) (seeSection 2.1.3). The definition of this sequence has traditionally been attributed tothe thirteenth century Italian mathematician Leonardo of Pisa (alsoknown as Fibonacci) who introduced it in hisLiber Abaci inthe context of an example involving population genetics (see Fibonacci1202 [2003: 404–405]). But descriptions of similar sequences canalso be found in Greek, Egyptian, and Sanskrit sources dating as earlyas 700 BCE (see, e.g., Singh 1985).
General interest in recursion as a mode of function definitionoriginated in the mid-nineteenth century as part of the broaderprogram of arithmetizing analysis and the ensuing discussions of thefoundations of arithmetic itself. In this context, the formulation ofrecursive definitions for number theoretic functions was closely tiedto the isolation of mathematical induction as a mode of reasoningabout the natural numbers. It was in this setting in which Grassmann(1861) and Peirce (1881) first gave the familiar recursive definitionsof addition and multiplication:[1]
\[\begin{align} \label{defnadd}\text{i.}\quad && x + 0 & = x \\ \nonumber\text{ii.}\quad && x + (y+1) & = (x+y)+1\\\end{align}\] \[\begin{align} \label{defnmult}\text{i.}\quad && x \times 0 & = 0 \\ \nonumber\text{ii.}\quad && x \times (y+1) & = (x\times y) + x\end{align}\]They then used these definition to prove the associative, commutative,and distributive laws for these operations.[2]
The first person to employ the expression “definition byrecursion” appears to have been Dedekind in his essayWassind und was sollen die Zahlen (1888). This work presents a settheoretic foundation for arithmetic wherein Dedekind demonstrated thatit was possible to state and prove the existence and uniqueness offunctions defined by primitive recursion as mathematical theorems(§125–126). He formulated recursive definitions of addition(§135), multiplication (§147), and exponentiation(§155) and then also formally proved by induction that thefunctions so defined satisfy the expected algebraic identities. Thefirst two of these definitions would later be adopted by Peano (1889)as defining the symbols \(+\) and \(\times\) in the directaxiomatization of arithmetic he based on Dedekind’smonograph.
The first work devoted exclusively to recursive definability wasSkolem’s (1923) paper
The foundations of elementary arithmetic established by the recursivemode of thought, without the use of apparent variables ranging overinfinite domains.
This work is significant with respect to the subsequent development ofcomputability theory for at least three reasons. First, it contains ainformal description of what we now call theprimitive recursivefunctions. Second, it can be regarded as the first place whererecursive definability is linked to effective computability (see alsoSkolem 1946). And third, it demonstrates that a wide range offunctions and relations are primitive recursive in a manner whichanticipates Gödel’s (1931) use of primitive recursion forthe arithmetization of syntax.
One of Skolem’s stated goals was to present a logical foundationfor number theory which avoids the use of unrestricted quantifiers. Hewas inspired in this regard by the observation that it is possible todevelop much of elementary arithmetic without the use of theexpressions “always” (i.e.,for all) and“sometimes” (i.e.,there exists) which figure inthe formalization of number theory given by Russell and Whitehead inPrincipia Mathematica (1910–1913). This was to beaccomplished by formulating arithmetical theorems as what he referredto asfunctional assertions. These took the form ofidentities between terms defined by primitive recursive operationswhich Skolem referred to asdescriptive functions. Forinstance, the commutativity of addition is expressed in this form byan equation with free variables
\[\begin{equation}\label{funassert}x + y = y + x\end{equation}\]In cases where such statements are provable in the system Skolemdescribes, the intended interpretation is that the claim holdsuniversally for all natural numbers—e.g., \(\forall x \forall y(x + y = y + x)\). But in Skolem’s system there is no means ofnegating such a statement to express a bare existential assertionwithout producing a witness.
Statements like (\ref{funassert}) would later be referred to byHilbert & Bernays (1934) (who provided the first textbooktreatment of recursion) asverifiable in the sense that theirindividual instances can be verified computationally by replacingvariables with concrete numerals. This is accomplished by what Skolemreferred to as the “recursive mode of thought”. The senseof this phrase is clarified by the following properties of the systemhe describes:
Taking these principles as a foundation, Skolem showed how to obtainrecursive definitions of thepredecessor andsubtraction functions, theless than,divisibility, andprimality relations,greatestcommon divisors,least common multiples, andboundedsums and products which are similar to those given in Section 2.1.2 below.
Overall Skolem considered instances of what we would now refer to asprimitive recursion, course of values recursion, double recursion, andrecursion on functions of type \(\mathbb{N} \rightarrow \mathbb{N}\).He did not, however, introduce general schemas so as to systematicallydistinguish these modes of definition. Nonetheless, propertiesi–iv of Skolem’s treatment provide a means of assimilatingcalculations like (\ref{factcalc}) to derivations in quantifier-freefirst-order logic. It is thus not difficult to discern in Skolem(1923) the kernel of the system we now know asPrimitive RecursiveArithmetic (as later formally introduced by Hilbert & Bernays1934: ch. 7).
The next important steps in the development of a general theory ofrecursive function arose as a consequence of the interaction betweenHilbert’s Program and Gödel’s (1931) proof of the Incompleteness Theorems.Hilbert (1900) had announced the goal of proving the consistency ofarithmetic—and ultimately also analysis and set theory—inthe face of the set theoretic paradoxes. His initial plans forcarrying out such a proof are described in a series of lectures andaddresses in the 1910s–1920s which provide a description of whatwould come to be called thefinitary standpoint—i.e.,the fragment of mathematical reasoning pertaining to finitecombinatorial objects which was intended to serve as the secure basisfor a consistency proof. The proof itself was to be carried out usingthe methods of what Hilbert referred to asmetamathematics—i.e., the formal study of axioms andderivations which would grow into the subject now known asproof theory.
In one of his initial descriptions of this program Hilbert (1905)sketched the basic form which a metamathematical proof of consistencymight take. Suppose, for instance, that \(\mathsf{T}\) is amathematical theory about which it is possible to prove the followingconditional:
Were it possible to provide a mathematical demonstration of i), itmight seem possible to conclude
However Poincaré (1906) observed that Hilbert’s approachrelies on mathematical induction in inferring ii from i. He objectedon the basis that this renders Hilbert’s proposed methodcircular in the case that the system \(\mathsf{T}\) in question itselfsubsumes principles intended to formalize induction.[3]
Together with his collaborators Ackermann and Bernays, Hilbertdeveloped metamathematics considerably during the 1910–1920s.This served as the basis of Hilbert’s (1922) lecture wherein hereplied to Poincaré by making a systematic distinction between“formal” occurrences of mathematical induction in theobject language and the metatheoretic use of induction as a“contentual” [inhaltliche] principle used inorder to reason about proofs as finite combinatorial objects. It wasalso in this context in which Hilbert connected the latter form ofinduction to the “construction and deconstruction of numbersigns” (1922 [1996: 1123]).
As is made clear in subsequent presentations, Hilbert understood“number signs” to be unary numerals written in strokenotation of the form
\[\nonumber|, ||, |||, \ldots\]Such expressions can be operated on concretely by adjoining orremoving strokes in a manner which mirrors the arithmetical operationsof successor and predecessor which figure in Skolem’s“recursive mode of thought“. This observation in turninformed Hilbert’s explanation of the meaning of functionalassertions like (\ref{funassert}) in terms of their logicalderivability from recursive definitions which also serve as proceduresfor computing the values of functions they define (Hilbert 1920 [2013:54–57]).
Hilbert first described a logical calculus for finitary number theoryincluding “recursion and intuitive induction for finitetotalities” in 1923 ([1996: 1139]).[4] Although this presentation also included a discussion of definitionby simultaneous recursion, a more extensive treatment of what we wouldnow recognize asrecursion schemes is given in his well knownpaper “On the infinite” (1926). This includes a discussionof what Hilbert callsordinary recursion (which is similar toSkolem’s description of primitive recursion), transfiniterecursion, as well as recursion at higher types. (These differentforms of recursion will be discussed further in thesupplement on the Ackermann-Péter function.) This treatment makes clear that Hilbert and his collaborators hadtaken substantial steps towards developing a general theory ofrecursive definability. Ultimately, however, the influence ofHilbert’s presentations was diminished in light of the moreprecise formulation of primitive recursion which Gödel would soon provide.[5]
Gödel’s (1931 [1986: 157–159]) definition was asfollows:
A number-theoretic function \(\phi(x_1,\ldots,x_n)\) is said to berecursively defined in terms of the number-theoreticfunctions \(\psi(x_1,x_2,\ldots,x_{n-1})\) and \(\mu(x_1,x_2,\ldots,x_{n+1})\) if
\[\begin{align} \label{gprimrec}\text{i.}\quad & \phi(0,x_2,\ldots,x_n) = \psi(x_2,\ldots,x_n) \\ \nonumber\text{ii.}\quad & \phi(k+1,x_2,\ldots,x_n) = \mu(k,\phi(k,x_2,\ldots,x_n),x_2,\ldots,x_n)\end{align}\]holds for all \(x_2,\ldots,x_n,k\).
A number-theoretic function \(\phi\) is said to berecursiveif there is a finite sequence of number-theoretic functions \(\phi_1 ,\phi_2 , \ldots \phi_n\) that ends with \(\phi\) and has the propertythat every function \(\phi_k\) of the sequence is recursively definedin terms of two of the preceding functions, or results from any of thepreceding functions by substitution, or, finally, is a constant or thesuccessor function \(x + 1\)…. A relation \(R(x_1, \ldots ,x_n)\) between natural numbers is said to berecursive ifthere is a recursive function \(\phi(x_1 \ldots , x_n)\) such that,for all \(x_1, x_2, \ldots, x_n\)
\[\begin{equation}\label{prch}R(x_1,\ldots,x_n) \leftrightarrow \phi(x_1,\ldots,x_n) = 0\end{equation}\]
Putting aside Gödel’s use of the term“recursive” rather than “primitive recursive”(which will be explained below), this exposition comes close tocoinciding with the contemporary definition of the primitive recursivefunctions given inSection 2.1.[6] Gödel’s definition also improved upon those of hispredecessors by clearly defining the class of initial functions whichare allowed in primitive recursive definitions and by stating thateach primitive recursive function possesses a definition in terms of asequence of functions showing how it is built up from initialfunctions. This makes clear that the primitive recursive functionsconstitute a mathematically well-defined class of functions on thenatural numbers (which will be denoted here asPR). Gödeladditionally proved that the primitive recursiverelations—defined as characteristic functions via(\ref{prch})—are closed under propositional operations andquantification bounded by a primitive recursive function (seeSection 2.1.2).
The foregoing definition appears in Gödel’s well-known(1931) paper “On formally undecidable propositions ofPrincipia mathematica and related systems I”. As heobserves immediately before presenting it, the definition of primitiverecursion is in fact a digression from the main focus of thepaper—i.e., proving the incompleteness of the axiomatic systemof arithmetic he calls \(\mathsf{P}\). In order to understandGödel’s contribution to the initial development ofrecursive function theory, it will be useful to attend both to somefeatures of this system and also to his proof of the FirstIncompleteness Theorem itself. (Additional details and context areprovided in the entry onGödel’s incompleteness theorems.)
System \(\mathsf{P}\) is obtained from that of Whitehead andRussell’sPrincipia Mathematica (1910–1913) byomitting the ramification of types, taking the natural numbers as thelowest type, and adding for them the second-order Peano axioms. It ishence a fixed formal system with finitely many non-logical axiomssufficient for the development of elementary number theory.[7] Recall also that an arithmetical system is said to be\(\omega\)-consistent if it does not prove both \(\exists x\varphi(x)\) and \(\neg \varphi(\overline{n})\) for each naturalnumber \(n \in \mathbb{N}\) (where \(\overline{n} =_{\mathrm{df}}s(s(\ldots s(0)))\)n-times) and that \(\omega\)-consistencyimpliessimple consistency (i.e., the non-derivability of aformula and its negation).
The incompleteness theorem which Gödel proved states that if\(\mathsf{P}\) is ω-consistent, then there exists a formula\(G_{\mathsf{P}}\) which isundecidable in\(\mathsf{P}\)—i.e., neither provable nor refutable from itsaxioms. In order to obtain such a formula, Gödel firstdemonstrated how it is possible to express various syntactic andmetatheoretic properties of \(\mathsf{P}\)-formulas and proofs asprimitive recursive relations via a technique which has come to beknown as thearithmetization of syntax (see the entry onGödel’s incompleteness theorems). Second, he showed that for every primitive recursive relation\(R(x_1,\ldots,x_k)\) there exists a “class sign” (i.e.,formula) \(\varphi_R(x_1,\ldots,x_n)\) of \(\mathsf{P}\) such that thefact that \(R(x_1,\ldots,x_n)\) holds of (or does not hold of) a giventuple of numbers \(n_1,\ldots,n_k\) is mirrored by the provability (orrefutability) in \(\mathsf{P}\) of the corresponding instance of\(\varphi_R(x_1,\ldots,x_n)\) when the formal numeral \(\overline{n} =s(s(\ldots s(0)))\) (n-times) is substituted for\(x_i\)—i.e.,
\[\begin{align} \label{rep}\text{i.}\quad & \text{if } R(n_1,\ldots,n_k), \text{ then } \mathsf{P} \vdash \varphi_R(\overline{n}_1,\ldots,\overline{n}_k) \\ \nonumber\text{ii.}\quad & \text{if } \neg R(n_1,\ldots,n_k), \text{ then } \mathsf{P} \vdash \neg \varphi_R(\overline{n}_1,\ldots,\overline{n}_k)\end{align}\]According to the terminology Gödel would later introduce in 1934,in such a case \(\varphi_R(x_1,\ldots,x_n)\)represents\(R(x_1,\ldots,x_n)\). In this presentation, he also generalized hisprior definition to say that a function \(f(x_1,\ldots,x_n)\) isrepresentable in \(\mathsf{P}\) just in case there exists a formula\(\varphi_f(x_1,\ldots,x_k,y)\) such that for all \(n_1,\ldots,x_k,m\in \mathbb{N}\),
\[\begin{equation}\label{repfun}f(n_1,\ldots,n_k) = m \textrm{ if and only if } \mathsf{P} \vdash \varphi_f(\overline{n}_1,\ldots,\overline{n}_k,\overline{m})\end{equation}\]Gödel’s arithmetization of syntax provides a means ofassigning to each primitive symbol, term, formula, and proof\(\alpha\) of \(\mathsf{P}\) a uniqueGödel number\(\ulcorner \alpha \urcorner \in \mathbb{N}\) according to itssyntactic structure. This technique takes advantage of the familiarobservation that a finite sequence of numbers \(n_1,\ldots,n_k\) canbe encoded as a product of prime powers \(2^{n_1} \cdot 3^{n_2} \cdot\ldots p_k^{n_k}\) so that various correlative operations on sequencescan be shown to be primitive recursive—e.g., the operation whichtakes two numbers \(x\) and \(y\) encoding sequences and returns thecode \(x * y\) of the result of concatenating \(x\) followed by \(y\).Gödel proceeded on this basis to show that a sequence of notionsabout the syntax and proof theory of \(\mathsf{P}\) are primitiverecursive—e.g., the function \(\textrm{Neg}(x)\) which returnsthe Gödel number of the negation of the formula coded by \(x\)can be defined as \(\ulcorner \neg \urcorner * x\). The availabilityof the relevant recursive definitions thus falls out naturally sincethe inductive definitions of syntactic notions likewell-formedformula generalize the “construction and deconstruction ofnumber signs” in the sense described by Hilbert.[8]
The penultimate definition in Gödel’s list is the relation\(\mathsf{Proof}_{\mathsf{P}}(x,y)\) which holds between theGödel number of a \(\mathsf{P}\)-formula \(\varphi\) and theGödel number of a finite sequence of \(\mathsf{P}\)-formulas\(\psi_1,\ldots, \psi_n\) just in case the latter is a correctlyformed derivation of the former from the axioms of\(\mathsf{P}\)—i.e.,
\(\mathsf{Proof}_{\mathsf{P}}(\ulcorner \psi_1,\ldots, \psi_n\urcorner, \ulcorner \varphi \urcorner))\) iff \(\mathsf{P} \vdash\varphi\) via a derivation \(\psi_1,\ldots,\psi_n\) in which each\(\psi_i\) is either an axiom of \(\mathsf{P}\) or follows from priorformulas via its rules of inference.
From (\ref{rep}) it follows that there exists a formula\(\textrm{Prf}_{\mathsf{P}}(x,y)\) of \(\mathsf{P}\) which represents\(\mathsf{Proof}_{\mathsf{P}}(x,y)\) and thus also a formula
\[\textrm{Prov}_{\mathsf{P}}(y) =_{\textrm{df}} \exists x \textrm{Prf}_{\mathsf{P}}(x,y).\]Gödel famously named the latter formula \(\sc{bew}(x)\) (forbeweisbar) as it can be understood to express that thereexists a proof from the axioms of \(\mathsf{P}\) of the formula withGödel number \(y\). But unlike the other formulas representingprimitive recursive relations which figure in its definition,\(\textrm{Prov}_{\mathsf{P}}(x)\) contains an unbounded existentialquantifier. And thus as Gödel is careful to observe, there is noreason to expect that it defines a primitive recursive relation.
It is, nonetheless, this formula which Gödel uses to construct asentence which is undecidable in \(\mathsf{P}\). This can beaccomplished by the application of the so-calledDiagonalLemma (seeGödel’s incompleteness theorems) which states that for every formula \(\varphi(x)\) of \(\mathsf{P}\),there exists a sentence \(\psi_{\varphi}\) such that
\[\mathsf{P} \vdash \psi_{\varphi} \leftrightarrow \varphi(\overline{\ulcorner \psi_{\varphi} \urcorner})\]When applied to the formula \(\neg \textrm{Prov}_{\mathsf{P}}(x)\),the Diagonal Lemma yields a sentence \(G_{\mathsf{P}}\)—i.e.,the so-calledGödel sentence for\(\mathsf{P}\)—such that \(\mathsf{P} \vdash G_P\leftrightarrow \neg \textrm{Prov}_{\mathsf{P}}(\ulcornerG_{\mathsf{P}} \urcorner)\). \(G_{\mathsf{P}}\) is thus interpretableas “saying of itself” that it is unprovable in\(\mathsf{P}\). Gödel showed that this formula has the followingproperties:
This constitutes what is now known as Gödel’s FirstIncompleteness Theorem.
The proof of this fact relies explicitly on the representability ofthe relation \(\mathsf{Proof}_{\mathsf{P}}(x,y)\) in \(\mathsf{P}\)which in turn derives from its primitive recursiveness. But thetechniques on which Gödel’s proof relies also contributedto the subsequent development of computability theory in severaladditional ways. First, it follows from the possibility of Gödelnumbering the formulas of \(\mathsf{P}\) that we may also effectivelyenumerate them as \(\varphi_0(x), \varphi_1(x), \varphi_2(x),\ldots\)—e.g., in increasing order of \(\ulcorner \varphi_i\urcorner\). This provides a mechanism for referring to formulas viatheir indices which in turn served as an important precedent forKleene’s (1936a) use of a similar indexation of generalrecursive definitions in his proof of the Normal Form Theorem (seeSection 2.2.2). Second, the proof of the Diagonal Lemma also demonstrates how it ispossible to formalize the substitution of terms for free variables ina manner which may be understood to yield an effective form ofCantor’s diagonal argument (see the entry onself-reference). This technique served as an important precedent for the use ofdiagonalization in results such as the undecidability of the HaltingProblem (Turing 1937, seeSection 3.2), the Recursion Theorem (Kleene 1938, seeSection 3.4), and the Hierarchy Theorem (Kleene 1943, seeSection 3.6).
Another significant contribution of Gödel’s paper derivesfrom the fact that after proving the incompleteness of \(\mathsf{P}\),he took several steps towards isolating features of axiomatic theorieswhich are sufficient to ensure that they satisfy analogousundecidability results. In addition to being sufficiently strong tosatisfy (\ref{rep}), the other requirement which he identifies is that“the class of axioms and the rules of inference \(\ldots\) arerecursively definable” (1931 [1986: 181]). As he notes, thesefeatures hold both of Zermelo-Fraenkel set theory \([\mathsf{ZF}\)]and a first-order arithmetical system similar to what we now callfirst-order Peano arithmetic \([\mathsf{PA]}\), relative to anappropriate Gödel numbering of their axioms. In particular, whileneither of these systems isfinitely axiomatizable, they maybe axiomatized by a finite number ofschemes (e.g., ofinduction or comprehension) such that the relation\(\ulcorner\varphi \urcorner\) is the Gödel number of an axiom ofTis primitive recursive. This is soprecisely because membership in the schemes in question is determinedby a inductive condition on formulas whose structure mirrors that of aprimitive recursive definition.
This observation set the stage for Gödel’s subsequentrevisiting of the incompleteness theorems in the lectures (1934)wherein he suggests a significant generalization of his original(1931) definition of recursiveness. Gödel starts out by providingthe following informal characterization of the requirements of thetheories just described:
We require that the rules of inference, and the definitions ofmeaningful formulas and axioms, be constructive; that is, for eachrule of inference there shall be a finite procedure for determiningwhether a given formula \(B\) is an immediate consequence (by thatrule) of given formulas \(A_1, \ldots, A_n\) and there shall be afinite procedure for determining whether a given formula \(A\) is ameaningful formula or an axiom. (Gödel 1934: 346)
He also makes clear that what he calls “recursiveness” isto be initially regarded as aninformal notion which he isattempting to make precise:
Recursive functions have the important property that, for each givenset of values of the arguments, the value of the function can becomputed by a finite procedure. Similarly, recursive relations(classes) are decidable in the sense that, for each givenn-tuple of natural numbers, it can be determined by a finiteprocedure whether the relation holds or does not hold (the numberbelongs to the class or not), since the representing function iscomputable. (Gödel 1934 [1986: 348])
One of Gödel’s goals was thus to provide a mathematicaldefinition of the term “recursive” which generalizes priorexamples of recursive definability in a manner but also captures to asgreat an extent as possible the class of functions computable by afinite procedure. This led him to define the so-calledgeneralrecursive functions (seeSection 1.5) whose isolation in turn played an important role in the formulationofChurch’s Thesis (seeSection 1.6). However Gödel’s definition also took place against thebackdrop of other work which had been inspired by Hilbert’soriginal consideration of different forms of recursive definitions. Itwill now be useful to examine these developments.
Already at the time of (1926), Hilbert had anticipated that it wouldbe possible to formulate definitions of functions whose values couldbe computed in a recursive manner but which are not themselvesprimitive recursive. In order to illustrate how such a definitionmight be obtained, he presented a heuristic argument involving thefollowing sequence of functions:
\[\begin{align*}\alpha_0(x,y) &= x + 1 &\text{(successor)} \\ \alpha_1(x,y) &= x + y &\text{(addition)} \\ \alpha_2(x,y) &= x \times y &\text{(multiplication)} \\ \alpha_3(x,y) &= x^y &\text{(exponentiation)} \\ \alpha_4(x,y) &= \underbrace{x^{x^{\udots^x}}}_{y \textrm{ times}} &\text{(super-exponentiation)} \\ &\vdots\end{align*}\]The functions in this sequence are defined so that\(\alpha_{i+1}(x,y+1)\) is obtained by primitive recursion as\(\alpha_i(\alpha_{i+1}(x,y),x)\), together with an appropriate basecase. It thus makes sense to consider the function
\[\begin{equation}\label{alphadef}\alpha(i,x,y) = \alpha_i(x,y)\end{equation}\]whose first argument \(i\) represents the position of the function\(\alpha_i(x,y)\) in the prior list. For fixed \(i,n,m \in\mathbb{N}\), it is thus possible to effectively compute the value of\(\alpha(i,n,m)\) by first constructing the definition of\(\alpha_i(x,y)\) and then evaluating it at \(n,m\). But it is alsoeasy to see that \(\alpha_{i+1}(x,x)\) will eventually dominate\(\alpha_i(x,x)\) for sufficiently large \(x\). This in turn suggeststhat \(\alpha(i,x,y)\) cannot be defined by a finite number ofapplications of the primitive recursion scheme. It thus follows that\(\alpha(i,x,y)\) is thus not primitive recursive itself.
The specification of \(\alpha(i,x,y)\) just given does not have theform of a recursive definition. But it is possible to define similarfunctions in a manner which generalizes the format of the scheme(\ref{gprimrec}). One means of doing so is to use a simple form ofrecursion at higher types as considered by both Skolem and Hilbert. Tothis end, consider theiteration functional\(\mathcal{Iter}\) which takes as arguments a function \(f: \mathbb{N}\rightarrow \mathbb{N}\) and a natural number \(i\) and returns thefunction which is obtained asi-fold composition of \(f\) withitself. In other words, \(\mathcal{Iter}\) has the type
\[(\mathbb{N} \rightarrow \mathbb{N}) \rightarrow (\mathbb{N} \rightarrow (\mathbb{N} \rightarrow \mathbb{N})).\]Such a function can be formally defined as follows:
\[\begin{aligned}\mathcal{Iter}(f,0) & = id \\\mathcal{Iter}(f,i+1) & = \mathcal{Comp}(f,\mathcal{Iter}(f,i)) \nonumber\end{aligned}\]Here \(id\) denotes the identity function (i.e., \(id(y) = y\))and
\[\mathcal{Comp}(f,\mathcal{Iter}(f,i))\]denotes what we would more conventionally express as \(f \circf^i\)—i.e.,
\[f \circ \underbrace{f \circ \ldots \circ f}_{i \mathrm{\ times}}\]or the result of composing \(f\) with \(f^i\).
We may now define a function \(\beta\) which takes a natural number asinput and returns a function of type \(\mathbb{N} \rightarrow\mathbb{N}\)—i.e., of type \(\mathbb{N} \rightarrow (\mathbb{N}\rightarrow \mathbb{N})\)—as follows:
\[\begin{aligned}\beta(0) & = y +1 \textrm{ (i.e., the successor function)} \\ \beta(i+1) & = \mathcal{Iter}(\beta(i),y)(\beta(i)(1)) \nonumber\end{aligned}\]Since the value of \(\beta(i)\) is a function, here \(y+1\) and
\[\mathcal{Iter}(\beta(m),y)(\beta(i)(1))\]should both be understood as functions of type \(\mathbb{N}\rightarrow \mathbb{N}\) depending on a variable \(y\) which isimplicitly abstracted. In other words, if we employ the notation ofthe \(\lambda\)-calculus, then we should think of these terms as theabstracts \(\lambda y.y+1\) and
\[\lambda y.\mathcal{Iter}(\beta(i),y)(\beta(i)(1)).\]With these definitions in place, it can now be verified that as \(i\)varies over \(\mathbb{N}\), \(\beta(0), \beta(1), \ldots\) correspondto the following sequence of functions of increasing rate ofgrowth:
\[\begin{align*}\beta(0) & = \lambda x.x +1, \\\beta(1) & = \lambda x.2 + (x + 3) - 3 = x+2, \\\beta(2) & = \lambda x.2 \times x - 3, \\\beta(3) & = \lambda x.2^{x+3} - 3, \\\beta(4) &= \lambda x.\underbrace{2^{2^{\udots^2}}}_{x \textrm{ times}} - 3,\\ &\vdots\end{align*}\]This provides one means of defining what is now often called thePéter function (or also theAckermann-Péterfunction) as \(\pi(i,x) = \lambda x.\beta(i)(x)\). \(\pi(i,x)\)has the same order of growth as \(\alpha_i(x,x)\) and it is possibleto prove via the sort of argument sketched above that \(\pi(i,x)\) isnot primitive recursive (see, e.g., Péter 1967: ch. 9).
As with the series of functions \(\alpha_i(x,y)\), it also clear thateach function \(\pi(i,x)\) is effectively computable for each concretenumber \(i\). However in order to define this function uniformly wehave had to define \(\beta\) using the functional \(\mathcal{Iter}\)which itself is defined by recursion on type \(\mathbb{N} \rightarrow\mathbb{N}\). The question thus arise whether it is also possible todefine an extensionally equivalent function by a form of recursion onthe natural numbers themselves.
An affirmative answer was provided by Ackermann (1928a) for theslightly more complicatedAckermann function described in thesupplement and also directly for a function \(\pi(x,y)\) by Péter (1935).In particular, it is possible to formulate a definition of a functionextensionally coincident with \(\beta(i)\) by what Ackermannoriginally referred to assimultaneous recursion as follows:[9]
\[\begin{align}\label{pidef}\pi(0,i+1) & = i + 1\\ \nonumber\pi(x+1,0) & = \pi(x,1)\\ \nonumber\pi(x+1,i+1) & = \pi(i,\pi(i+1,x)) \end{align}\]The third clause in this definition defines the value of\(\pi(i+1,x+1)\) in terms of \(\pi(i,z)\) where the \(z\) isdetermined by the value of \(\pi(i+1,x)\). It may thus not beimmediately obvious that the definition (\ref{pidef}) describes analgorithm for computing the values of \(\pi(i,x)\) which alwaysterminates in the manner illustrated by the calculation(\ref{factcalc}). Note, however, the when we expand the clauses on theright-hand side of this definition, either \(i\) decreases, or \(i\)remains the same and \(x\) decreases. It thus follows that each time\(x\) reaches \(0\), \(i\) will start to decrease so that the basecase is eventually reached. Thus although the value of \(\pi(i,x)\)grows very rapidly—e.g., \(\pi(4,3) = 2^{2^{65536}}-3\)—itis still reasonable to regard (\ref{pidef}) as satisfying Gödel'srequirement that a recursively defined function is computable by afinite procedure.
Systematic consideration of such alternative recursion schemesexemplified by (\ref{pidef}) was initiated by Péter (1932). Itwas also she who introduced the term “primitive recursive”to describe the class of functions given by Gödel’s scheme(\ref{gprimrec}), a choice which would become standard after itsadoption by Kleene (1936a). Péter additionally showed that theprimitive recursive functions are closedcourse of valuesrecursion (seeSection 2.1.3),multiple recursion, andnested recursion of onevariable (see thesupplement). Thus the choice of there term “primitive” not should beunderstood to diminish the richness of the class of primitiverecursive functions. Rather it flags the fact that definitions like(\ref{pidef}) which give rise to more complicated computationalprocess leading out of this class were also regarded as“recursive” by theorists like Hilbert, Ackermann, andPéter from the outset of their studies.
Péter's work in the 1930s also led to her book (Péter1967), whose original German editionRekursive Funktionen(1951) was the first monograph devoted to recursive functions.Together with the later work of Grzegorczyk (1953), these developmentsalso inspired the investigation of various subrecursive hierarchieswhich would later play a role in proof theory and computer science.[10]
The immediate source for Gödel’s discussion of recursion in1934 was not Ackermann or Péter’s work but rather aprivate communication with Herbrand, who in two previous papers (1930,1932) had proposed a related means of generalizing recursivedefinitions. Gödel’s informal description ofHerbrand’s suggestion was as follows:[11]
If \(\phi\) denotes an unknown function, and \(\psi_1,\ldots,\psi_k\)are known functions, and if the \(\psi\)’s and \(\phi\) aresubstituted in one another in the most general fashions and certainpairs of the resulting expressions are equated, then, if the resultingset of functional equations has one and only one solution for\(\phi\), \(\phi\) is a recursive function. (Gödel 1934 [1986:308])
As an illustration, consider the following set of equations:
\[\begin{align} \label{genrecex}\phi(0) &= 0 \\ \nonumber\psi(x) &= \phi(x) + 1\\ \nonumber\phi(x+1) &= \psi(x) + 1\end{align}\]In this case, the “unknown” function denoted by\(\phi(x)\) is specified in terms of the auxiliary function\(\psi(x)\) in such a way that \(\phi(x)\) appears only once on thelefthand side of the equations (other than the base case).Nonetheless, such a system of equations is unlike a primitiverecursive definition in that it does not specify a unique means forcomputing the values of \(\phi(n)\) by “deconstructing”\(n\) in the deterministic manner illustrated by calculations such as(\ref{factcalc}).
In the general case there is indeed no guarantee that there will exista unique extensional function satisfying such a definition. But in thecase of this example it can be shown that \(2 \times x\) is the uniquefunction of type \(\mathbb{N} \rightarrow \mathbb{N}\) satisfying\(\phi(x)\) in the system of equations (\ref{genrecex}). This may beillustrated by considering the following calculation of\(\phi(2)\):
\[\begin{align} \label{genreccal}\text{i.}\quad & \phi(2) = \psi(1) + 1 \\ \nonumber\text{ii.}\quad & \psi(1) = \phi(1) +1 \\ \nonumber\text{iii.}\quad & \phi(1) = \psi(0) + 1 \\ \nonumber\text{iv.}\quad & \psi(0) = \phi(0) + 1 \\ \nonumber\text{v.}\quad & \phi(0) = 0 \\ \nonumber\text{vi.}\quad & \psi(0) = 0 + 1 \\ \nonumber\text{vii.}\quad & \phi(1) = (0 + 1) + 1 \\ \nonumber\text{viii.}\quad& \psi(1) = ((0 + 1) + 1) + 1 \\ \nonumber\text{ix.}\quad & \phi(2) = (((0 + 1) + 1) + 1) + 1 \ (= 4)\end{align}\]As Gödel notes, such a calculation may be understood as aderivation in quantifier-free first-order logic wherein the only ruleswhich are allowed are the substitution of numerals for variables andthe replacement of a term on the righthand side of an equation by anumeral for which the corresponding identity has already beenderived.
Gödel introduced the termgeneral recursive to describea function defined in this manner. Following the modernizedpresentation of Odifreddi (1989: ch. I.2) this class may be specifiedon the basis of the following initial definitions:[12]
Definition 1.1
The class ofnumerals is the smallest set containing 0 andclosed under the successor function \(x \mapsto s(x)\). We write\(\overline{n}\) for the numeral \(s(s(\ldots s(0)))\)n-times.
The class ofterms is the smallest set containing thenumerals,variables \(x_0,x_1, \ldots\) and closed under theoperations \(t \mapsto s(t)\) and \(t_1,\ldots,t_n \mapsto\psi^n_i(t_1,\ldots,t_n)\) where \(t,t_1,\ldots,t_n\) are terms and\(\psi^n_i\) is a primitiven-ary functional symbol.
If \(t\) and \(u\) are terms and \(t\) is of the form\(\psi^n_i(t_1,\ldots,t_n)\) where \(t_1,\ldots,t_n\) do not containany functional symbols other than \(s\), then \(t = u\) is anequation.
Asystem of equations is a finite set of equations.\(\mathcal{E}(\psi_1,\ldots,\psi_n,\vec{x})\) will be used to denote asystem of equations containing basic functional symbols\(\psi_1,\ldots,\psi_n\) and variables among \(\vec{x} = x_1,\ldots,x_k\).
Herbrand (1932) gave a semantic characterization of what it means fora number theoretic function \(f\) to be defined by a system ofequations \(\mathcal{E}(\psi_1,\ldots,\psi_n,\vec{x})\) by requiringboth that there is a solution to the system and that \(f\) coincideswith the function determined as \(\psi_1\) for every solution. He alsosuggested that this fact should be proved intuitionistically, whichmight in turn be thought to yield an effective procedure for computingthe values of \(f\).[13] He did not, however, specify a formal system in which such a proofshould be carried out. And thus Gödel suggested (essentially) thefollowing syntactic replacement for Herbrand’s definition:
Definition 1.2: A function \(f:\mathbb{N}^k\rightarrow \mathbb{N}\) isgeneral recursive if there is asystem of equations \(\mathcal{E}(\psi_1,\ldots,\psi_n,\vec{x})\) suchthat if \(\psi^k_i\) is the leftmost functional symbol in the lastequation of \(\mathcal{E}\) then for all \(n_1,\ldots,n_k, m \in\mathbb{N}\)
\[f(n_1,\ldots,n_k) = m\]if and only if the equation
\[\psi^k_i(\overline{n}_1,\ldots,\overline{n}_k) = \overline {m}\]is derivable from the equations comprising \(\mathcal{E}\) via thefollowing two rules:
In such a case we say that \(\mathcal{E}\)defines \(f\) withrespect to \(\psi^k_i\).
It can be verified that the system of equations (\ref{genrecex}) andthe derivation (\ref{genreccal}) exhibited above satisfy the foregoingrequirements, thus illustrating how it is possible to mechanicallycalculate using a system of general recursive equations. Howevercertain systems—e.g., \(\{\phi(x) = 0, \phi(x) =s(0)\}\)—are inconsistent in the sense of not being satisfied byany function on the natural numbers, while others—e.g.,\(\{\phi(x) = \phi(x)\}\)—are not satisfied uniquely. Oneevident drawback of Gödel’s definition of generalrecursiveness is thus that there is no apparent means of establishingwhether a given system of equations \(\mathcal{E}\) determines aunique function (even if only partially defined). This is one of thereasons why Gödel’s characterization has been replaced byother extensionally equivalent definitions such as Kleene’spartial recursive functions (seeSection 2.2) in the subsequent development of computability theory.
By formalizing his informal characterization of recursiveness viaDefinition 1.2, Gödel succeeded in formulating a definition which subsumes theprimitive recursion scheme (\ref{gprimrec}), the definition of theAckermann-Péter function, as well as several other schemesconsidered by Hilbert. Gödel’s definition of generalrecursiveness thus also defined a classGR of functions of type\(\mathbb{N}^k \rightarrow \mathbb{N}\) which properly subsumes theprimitive recursive functionsPR. Moreover, we now know thatthe class of functions representable in \(\mathsf{P}\) (and in fact infar weaker arithmetical systems) corresponds not to the primitiverecursive functions, but rather to the general recursive functions.Weakening the hypothesis that the set of (Gödel numbers) of theaxioms of a formal system to the requirement that they be generalrecursive rather than primitive recursive thus indeed provides ageneralization of the First Incompleteness Theorem the manner in whichGödel envisioned.
The definition ofGR is also of historical importance becauseit was the first among several equivalent (and nearly contemporaneous)definitions of what were originally called therecursivefunctions but are now often referred to as thecomputablefunctions (seeSection 2.2). These developments also contributed to one of the two final chaptersin the study of recursive definability prior to the initiation ofcomputability theory as an independent subject—i.e., theisolation and eventual adoption of what is now known asChurch’s Thesis.
Church’s Thesis corresponds to the claim that the class offunctions which are computable by a finite mechanicalprocedure—or, as it is traditionally said, areeffectivelycomputable—coincides with the class of general recursivefunctions—i.e.,
There is some historical variation in how authors have glossed thenotion of an effectively computable function which CT purports toanalyze. (For more on this point, see the entries onChurch’s Thesis andComputational Complexity Theory.) Nonetheless there is general agreement that this notion approximatesthat of a function computed by an algorithm and also that a properunderstanding of the thesis requires that this latter notion must beunderstood informally.[14]
On this understanding it may appear that Gödel already proposed aversion of Church’s Thesis in 1934. However, he did notimmediately endorse it upon its first explicit articulation by Church.[15] And since the surrounding history is complex it will be useful torecord the following observations as a prelude toSections 2 and 3.[16]
Gödel delivered the lectures (Gödel 1934) while he wasvisiting Princeton in the spring of 1934. Already at that time Church,together with his students Kleene and Rosser, had made substantialprogress in developing the formal system of function application andabstraction now known as theuntyped lambda calculus. Thissystem also provides a means of representing natural numbers as formalterms—i.e., as so-calledChurch numerals. This leads toa notion of a function beinglambda-definable which issimilar in form to (\ref{repfun}). Church’s definition thus alsocharacterize a class \(\mathbf{L}\) of lambda-definable functionswhich is similar in form to that ofGR. During this period,Kleene demonstrated that a wide range of number theoretic functionswere included in \(\mathbf{L}\), in part by showing how it is possibleto implement primitive recursion in the lambda calculus. Thisultimately led Church to propose in early 1934 that thelambda-definable functions coincide with those possessing the propertywhich he called “effective calculability”.[17]
A natural conjecture was thus that lambda-definability coincidedextensionally with general recursiveness. Unlike (CT)—whichequates an informally characterized class of functions with onepossessing a precise mathematical definition—the statement\(\textbf{GR} = \mathbf{L}\) potentially admits to formaldemonstration. Such a demonstration was given by Church(1936b)—and in greater detail by Kleene 1936b—providingthe first of several extensional equivalence results which Kleene(1952: sec. 60, sec. 62) would eventually cite as evidence of what heproposed to call “Church’s Thesis”.
Church’s Thesis underlies contemporary computability theory inthe sense that it justifies the assumption that by studyingcomputability relative to asingle formalism (such asGR or \(\mathbf{L}\)) we are thereby providing ageneral account of which functions in extension can andcannot be effectively computed in principle by an algorithm. In lightof this, it will be useful to catalog some additional evidence forChurch’s Thesis in the form of the equivalence ofGR withseveral other computational formalisms presented in the StanfordEncyclopedia:
Let \(\mathsf{T}\) be a consistent, computably axiomatizable theoryextending \(\mathsf{Q}\) (i.e., Robinson arithmetic). Then the classof functions \(\mathbf{F}_{\mathsf{T}}\) which is representable in\(\mathsf{T}\) in the sense of (\ref{repfun}) above (with\(\mathsf{T}\) replacing \(\mathsf{P}\)) is such that\(\mathbf{F}_{\mathsf{T}} = \textbf{GR}\). (Seerepresentability in the entry on Gödel’s incompleteness theorems and Odifreddi (1989: ch. I.3).)
The classREC consisting of the total functions which aremembers of the class ofpartial recursive functions (formedby closing the classPR under the unbounded minimizationoperation) is such that \(\textbf{REC} = \textbf{GR}\). (SeeSection 2.2.1 and Odifreddi [1989: ch. I.2].)
The classCL of functions representable inCombinatory Logic (a formal system related to the lambda calculus) is such that\(\textbf{CL} = \textbf{GR}.\) (Seecomputable functions and arithmetic in the entry on combinatory logic and Bimbó [2012: ch. 5.3].)
The class \(\mathbf{T}\) of functions computable by aTuring machine (under several variants of its definition) is such that \(\mathbf{T}= \textbf{GR}\). (Seealternative historical models of computability in the entry on Turing machines and Odifreddi [1989: ch. I.4].)
The class \(\mathbf{U}\) of functions computable byUnlimitedRegister Machines introduced by Shepherdson & Sturgis (1963)is such that \(\mathbf{U} = \textbf{GR}\). (See Cutland [1980: ch.1–3] and Cooper [2004: ch. 2].)
Equivalence results of these forms testify to the mathematicalrobustness of the classGR and thereby also to that of theinformal notion of effective computability itself. As we have seen,Gödel was originally led to the formulation of generalrecursiveness by attempting to analyze the background notion ofrecursive definition as a model of effective computation as inspiredby the foundational developments of the late nineteenth and earlytwentieth centuries.[18] Further discussion of how the work of Church, Turing, and Post can beseen as providing independently motivated analyses of computabilitywhich also support Church’s Thesis can be found in Gandy (1980)and Sieg (1994, 1997, 2009).
In addition to the goal of widening the scope of Gödel’sIncompleteness Theorems, another motivation for work on recursivefunctions during the 1930s was the study of so-calledundecidable (orunsolvable)problems. Theoriginal example of such a problem was that of determining whether agiven formula \(\varphi\) of first-order logic isvalid—i.e., true in all of its models. This was firstdescribed as theEntscheidungsproblem (ordecisionproblem) for first-order logic by Hilbert & Ackermann intheir textbookGrundzüge der theoretischen Logik (1928):[19]
TheEntscheidungsproblem is solved if one knows a procedure,which permits the decision of the universality [i.e., validity] orsatisfiability of a given logical expression by finitely manyoperations. The solution of the problem of decision is of fundamentalimportance to the theory of all domains whose propositions can belogically described using finitely many axioms. (Hilbert &Ackermann 1928: 73)[20]
This passage illustrates another sense in which the question of thedecidability of logical derivability is connected to the concernswhich had initiated Hilbert’s study of metamathematics. For notethat if \(\Gamma\) is afinite set of axioms\(\{\gamma_1,\ldots,\gamma_k\}\), then the question of whether\(\psi\) is a logical consequence of \(\Gamma\) is equivalent towhether the sentence \(\varphi=_{\textrm{df}} (\gamma_1 \wedge \ldots\wedge \gamma_k) \rightarrow \psi\) is logically valid. ByGödel’s Completeness Theorem (see the entry on Gödel) for first-order logic, this is equivalent to the derivability of\(\varphi\) from Hilbert & Ackermann’s axiomatization offirst-order logic. A positive answer to theEntscheidungsproblem could thus be interpreted as showingthat it is possible to mechanize the search for proofs in mathematicsin the sense of allowing us to algorithmically determine if a formulaexpressing an open question (e.g., the Riemann Hypothesis) is alogical consequence of a suitably powerful finitely axiomatized theory(e.g., Gödel-Bernays set theory).
In addition to analyzing the notion of effective computability itself,the mathematical goal of both Turing (1937) and Church (1936a,b) wasto provide a mathematically precisenegative answer to theEntscheidungsproblem. The answers which they provided can beunderstood as proceeding in three phases:
The first of these steps can be undertaken by defining
\[\begin{aligned}V & = \{\ulcorner \varphi \urcorner : \varphi \text{ is logically valid} \} \\& = \{\ulcorner \varphi \urcorner : \mathfrak{M} \models \varphi \text{ for all } \mathcal{L}_{\varphi} \text{-models } \mathfrak{M}\} \end{aligned}\]where \(\ulcorner \cdot \urcorner\) is a Gödel numbering of thelanguage of \(\varphi\) as described inSection 1.3. The second step of Turing and Church’s negative answer to theEntscheidungsproblem relied on their prior specification ofsimilar decision problems for the models \(\mathbf{T}\),\(\mathbf{L}\), andGR. Together with Kleene (1936a), theyshowed the following:
Proposition 1.1: The characteristic functions of thefollowing sets are not computable with respect to the relevantmodel:
\(\HP_T = \{\langle i,n \rangle : \text{the Turing machine $T_i$ haltson input $n$}\}\)
\(\HP_L = \{\ulcorner M \urcorner : \text{the untyped $\lambda$-term$M$ has a normal form}\}\)
\(\HP_{\textit{GR}} = \{\ulcorner \mathcal{E} \urcorner :\) the systemof equations \(\mathcal{E}\)-term determines a general recursivefunction\(\}\)
For instance, Part i ofProposition 1.1 shows that there is no Turing machine which outputs 1 if \(T_i\)halts on \(n\) and 0 otherwise. This is thus a formulation ofTuring’s well-knownunsolvability of the Halting Problem (see the entry on Turing machines). Part ii and iii would also now be described as expressing that thesets \(\HP_T,\) \(\HP_L,\) and \(\HP_{\textit{GR}}\) are allundecidable. By taking into account the equivalence resultssummarized inSection 1.6,Proposition 1.1 thus shows that membership in these sets cannot bedecided relative toany of the models in question.
On this basis, Turing (for \(\mathbf{T}\)) and Church (for\(\mathbf{L}\) andGR) then proved the following:
Proposition 1.2: If \(V\) were decidable (withrespect to any of the models in question), then \(\HP_T, \HP_L\), and\(\HP_{GR}\) would be as well.
The proofs which Turing and Church gave of these facts areconstructive in the sense that they show how to effectively transforman individual instance of one of the models into a first-order formulasuch that the formula is valid if and only if the instance possessesthe property in question—e.g., given a Turing machine \(T_i\)and input \(n \in \mathbb{N}\), we construct a formula\(\varphi_{i,n}\) such that the computation \(T_i(n)\) halts if andonly if \(\varphi_{i,n}\) is valid. This method thus anticipates thedefinition ofmany-one reducibility given inSection 3.5.1 below.
In conjunction with the other arguments which Church and Turing hadalready offered in favor of Church’s Thesis (seeSection 1.6), Propositions1.1 and1.2 can thus be taken to show that theEntscheidungsproblem isindeed not decidable in the informal sense described by Hilbert &Ackermann (1928)—i.e., not decidable by a “mechanicalprocedure using finitely many operations”. As we will see inSection 3, the desire to develop a general theory of such undecidability resultsand the relations which they bear to one another was an importantmotivation for the further development of computability theorystarting in the 1940s.
The developments just described form part of the prehistory of thesubfield of contemporary mathematical logic which was originally knownasrecursive function theory (or more simply asrecursiontheory). This subject was initiated in earnest by Kleene, Turing,and Post starting in the late 1930s, directly on the basis of thepapers containing the equivalence and undecidability resultssummarized inSection 1.6 andSection 1.7. Of particular importance are the papers (1936a, 1938, 1943,1955a,b,c) of Kleene. These respectively contain the definition of thepartial recursive functions, the proof of their equivalence toGR, the Normal Form Theorem, the Recursion Theorem, and thedefinitions of the arithmetical and analytical hierarchies. Of equalimportance are the papers (1937, 1939) of Turing (which respectivelycontain the undecidability of the Halting Problem and the definitionof Turing reducibility) and the paper (1944) of Post (which introducedmany-one and one-one reducibility and formulated what would come to beknown asPost’s Problem).
These developments will be surveyed inSection 3. As we will see there, an important theme in the early stages ofcomputability theory was the characterization of a notion of effectivecomputability which is capable of supporting rigorous proofs groundedin intuitions about algorithmic calculability but which abstracts awayfrom the details of the models mentioned inSection 1.6. To this end, Gödel’s original definition of the generalrecursive equations was replaced in early textbook treatments (e.g.,Shoenfield 1967; Rogers 1987) by Kleene’s definition of thepartial recursive functions in terms of the unbounded minimizationoperator introduced inSection 2.2. This characterization has in turn been replaced by machine-basedcharacterizations such as those of Turing (1937) or Shepherdson &Sturgis (1963) in later textbooks (e.g., Soare 1987; Cutland 1980)which are closer in form to informally described computerprograms.
What is retained in these treatments is an understanding ofcomputation as a means of operating in an effective manner on finitecombinatorial objects which can still be understood to fall under the“recursive mode of thought” as understood by earlytheorists such as Skolem, Hilbert, Gödel, and Péter. Butat the same time, many of the basic definitions and results inrecursive function theory are only indirectly related to recursivedefinability in the informal sense described in this section. In lightof this, Soare (1996) proposed that recursive function theory shouldbe renamedcomputability theory and that we shouldaccordingly refer to what were traditionally known as therecursive functions as thecomputable functions.
Such a change in terminology has been largely adopted in contemporarypractice and is reflected in recent textbooks such as Cooper (2004)and Soare (2016). Nonetheless, both sets of terminology are stillwidely in use, particularly in philosophical and historical sources.Readers are thus advised to keep in mind the terminological discussionat the beginning ofSection 3.
NB: Readers looking for a mathematical overview of recursive functionsare advised to start here. Discussion of the historical context forthe major definitions and results of this section can be found inSection 1.
This section presents definitions of the major classes of recursivelydefined functions studied in computability theory. Of these theprimitive recursive functionsPR and thepartialrecursive functionsPartREC are the most fundamental. Theformer are based on a formalization of the process of recursiondescribed in the introduction to this entry and include virtually allnumber theoretic functions studied in ordinary mathematics. Thepartial recursive functions are formed by closing the primitiverecursive functions under the operation ofunboundedminimization—i.e., that of searching for the smallestwitness to a decidable predicate. The class of recursivefunctionsREC—i.e., the partial recursive functionswhich are defined on all inputs—has traditionally been taken tocorrespond via Church’s Thesis (Section 1.6) to those which can be effectively computed by an algorithm.
The following notational conventions will be employed in the remainderof this entry:
\(\mathbb{N} =\{0,1,2,\ldots\}\) denotes the set of natural numbers,\(\mathbb{N}^k\) denotes the cross product \(\mathbb{N} \times \ldots\times \mathbb{N}\)k-times, and \(\vec{n}\) denotes a vectorof fixed numbers \(n_0,\ldots,n_{k-1}\) (when the arity is clear fromcontext).
Lowercase Roman letters \(f,g,h,\ldots\) denote functions of type\(\mathbb{N}^k \rightarrow \mathbb{N}\) (for some \(k\))—i.e.,the class of functions with domain \(\mathbb{N}^k\) and range\(\mathbb{N}\). For a fixed \(j\), \(f:\mathbb{N}^j \rightarrow\mathbb{N}\) expresses that \(f\) is aj-ary function(or hasarity \(j\))—i.e., \(f\) has domain\(\mathbb{N}^j\) and range \(\mathbb{N}\). Lower case Greek letterswill be used similarly for special functions (e.g., projections) asdefined below.
\(x_0,x_1,x_2, \dots\) are used as formal variables over\(\mathbb{N}\) for the purpose of indicating the argument offunctions. \(x,y,z,\ldots\) will also be used informally for arbitraryvariables from this list. \(\vec{x}\) will be used to abbreviate avector of variables \(x_0,\ldots,x_{k-1}\) (when the arity is clearfrom context).
Boldface letters \(\mathbf{X}, \mathbf{Y}, \mathbf{Z},\ldots\) (orabbreviations likePR) will be used to denote classes offunctions which are subsets of \(\bigcup_{k \in \mathbb{N}}(\mathbb{N}^k \rightarrow \mathbb{N})\).
Calligraphic letters \(\mathcal{F},\mathcal{G},\mathcal{H},\ldots\)(or abbreviations like \(\mathcal{Comp}^j_k\)) will be used to denotefunctionals on \(\mathbb{N}^k \rightarrow\mathbb{N}\)—i.e., operations which map one or more functions oftype \(\mathbb{N}^k \rightarrow \mathbb{N}\) (possibly of differentarities) to other functions.
Uppercase letters \(R,S,T, \ldots\) will be used to denoterelations—i.e., subsets of \(\mathbb{N}^k\)—withthe range \(A,B,C, \ldots\) reserved to denote unaryrelations—i.e., subsets of \(\mathbb{N}\).
Thecharacteristic function of a relation \(R \subseteq\mathbb{N}^k\) is denoted by\(\chi_R(x_0,\ldots,x_{k-1})\)—i.e.,
\[\chi_R(x_0,\ldots,x_{k-1}) = \begin{cases} 1 & \text{ if } R(x_0,\ldots,x_{k-1}) \\ 0 & \text{ if } \neg R(x_0,\ldots,x_{k-1})\end{cases}\]A class \(\mathbf{X}\) of recursively defined functions may bespecified by giving a class of initial functions \(I_{\mathbf{X}}\)which is then closed under one or more functionals from a class\(\textit{Op}_{\mathbf{X}}\). It is in general possible to define aclass in this manner on an arbitrary set of initial functions.However, all of the function classes considered in this entry willdetermine functions of type \(\mathbb{N}^k \rightarrow\mathbb{N}\)—i.e., they will takek-tuples of naturalnumbers as inputs and (if defined) return a single natural number asoutput.
In the case of the primitive recursive functionsPR, theinitial functions include the nullaryzero function\(\mathbf{0}\) which returns the value 0 (and can thus be treated as aconstant symbol), \(s(x)\) denotes the unarysuccessorfunction \(x \mapsto x + 1\), and \(\pi^k_i\) denotes thek-aryprojection function on to the \(i\)th argument(where \(0 \leq i < k\))—i.e.,
\[\pi^k_i(x_0,\ldots,x_i, \ldots x_{k-1}) = x_i\]This class of functions will be denoted by \(I_{\textbf{PR}} =\{\mathbf{0}, s, \pi^k_i\}\). Note that since \(\pi^k_i\) is adistinct function for each \(i,k \in \mathbb{N}\), \(I_{\textbf{PR}}\)already contains infinitely many functions.
The functionals ofPR are those ofcomposition andprimitive recursion. Composition takes \(j\) functions \(g_0,\ldots, g_{j-1}\) of arity \(k\) and a single function \(f\) of arity\(j\) and returns theircomposition—i.e., the function
\[h(x_0,\ldots,x_{k-1}) = f(g_0(x_0,\ldots,x_{k-1}),\ldots,g_{j-1}(x_0,\ldots,x_{k-1}))\]of type \(\mathbb{N}^k \rightarrow \mathbb{N}\). As an example,suppose that \(f\) is the multiplication function\(\textit{mult}(x,y)\), \(g_0\) is the constant 3 function (which wemay think of as implicitly taking a single argument), and \(g_1(x)\)is the successor function \(s(x)\). Then the composition of \(f\) with\(g_0\) and \(g_1\) is the unary function \(h(x) = f(g_0(x),g_1(x)) =mult(3, s(x))\) which we would conventionally denote by \(3 \times(x+1)\).
The operation of composition may be understood as a class offunctionals which for each \(j,k \in \mathbb{N}\) takes as inputs\(j\) functions \(g_0, \ldots, g_{j-1}\) of arity \(k\) and a singlefunction \(f\) of arity \(j\) and returns as output thek-aryfunction \(h\) which composes these functions in the manner justillustrated. This is described by the following scheme:
Definition 2.1: Suppose that \(f:\mathbb{N}^j\rightarrow \mathbb{N}\) and \(g_0, \ldots, g_{j-1} : \mathbb{N}^k\rightarrow \mathbb{N}\). Then the term\(\mathcal{Comp}^j_k[f,g_0,\ldots,g_{j-1}]\) denotes the function
\[f(g_0(x_0,\ldots,x_{k-1}),\ldots,g_{j-1}(x_0,\ldots,x_{k-1}))\]of type \(\mathbb{N}^k \rightarrow \mathbb{N}.\)
Primitive recursion is also a functional operation. In the simplestcase, it operates by taking a single unary function \(g(x)\) and anatural number \(n \in \mathbb{N}\) and returns the unary functiondefined by
\[\begin{align}h(0) & = n \label{prex1}\\ \nonumberh(x+1) & = g(h(x))\end{align}\]In such a definition, the first clause (known as thebasecase) determines the value of \(h\) at 0, while the second clausedetermines how its value at \(x+1\) depends on its value at \(x\). Inthis case it is easy to see that the value of \(x\) determines howmany times the function \(g\) isiterated (i.e., applied toitself) in determining the value of \(h\). For instance, if \(n = 3\)and \(g(x) = x^2\), then \(h(x) = 3^{2^x}\).
The full primitive recursion scheme generalizes (\ref{prex1}) in twoways. First, it allows the value of the function \(h\) at \(x+1\) todepend not just on its own value at \(x\), but also on the value ofthe variable \(x\) itself. This leads to the scheme
\[\begin{align} \label{prex2}h(0) & = n \\ \nonumberh(x+1) & = g(x,h(x))\end{align}\]For instance, the definition of the factorial function \(\fact(x)\)defined in the introduction to this entry can be obtained via(\ref{prex2}) with \(n = 1\) and
\[g(x_0,x_1) = mult(s(\pi^2_0(x_0, x_1)), \pi^2_1(x_0, x_1)).\]A second possible generalization to (\ref{prex1}) results fromallowing the value of \(h\) to depend on a finite sequence ofauxiliary variables known asparameters which may also bearguments to the base case. In the case of a single parameter \(x\),this leads to the scheme
\[\begin{align} \label{prex3}h(x,0) & = f(x) \\ \nonumberh(x,y+1) & = g(x,h(x,y)) \end{align}\]The addition function \(\textit{add}(x,y)\) may, for instance, bedefined in this way by taking \(f(x_0) = x_0\) and \(g(x_0,x_1) =s(x_1)\). This definition can also be thought of as specifying thatthe sum \(x+y\) is the value obtained by iterating the application ofthe successor function \(y\) times starting from the initial value\(x\) in the manner of (\ref{prex1}). Similarly,\(\textit{mult}(x,y)\) may be defined by taking \(f(x_0) = 0\) and\(g(x_0,x_1) = add(x_0,x_1)\). This defines the product \(x \times y\)as the value obtained by iterating the function which adds \(x\) toits argument \(y\) times starting from the initial value 0.
Such definitions may thus be understood to provide algorithms forcomputing the values of the functions so defined.[21] For observe that each natural number \(n\) is either equal to 0 or isof the form \(m+1\) for some \(m \in \mathbb{N}\). If we now introducethe abbreviation \(\overline{n} = s(s(s \ldots (s(\mathbf{0}))))\)n-times, the result of applying the successor function \(s\) toa number denoted by \(\overline{n}\) thus yields the number denoted by\(\overline{n+1}\). We may thus compute the value of \(x + y\) usingthe prior recursive definition of addition as follows:
\[\begin{align}\label{prcalc2}\textit{add}(\overline{2},\overline{3}) & = s(\textit{add}(\overline{2},\overline{2})) \\& = s(s(add(\overline{2},\overline{1}))) \nonumber\\& = s(s(s(\textit{add}(\overline{2},\overline{0})))) \nonumber\\& = s(s(s(\overline{2}))) \nonumber\\& = s(s(s(s(s(\mathbf{0}))))) \nonumber\\& = \overline{5}\nonumber\\\end{align}\]The full definition of the primitive recursion operation combines bothgeneralizations of (\ref{prex1}) into a single scheme which takes asarguments ak-ary function \(f\), a \(k+2\)-ary function \(g\),and returns a \(k+1\)-ary function \(h\) defined as follows
\[\begin{align} \label{prscheme}h(x_0,\ldots,x_{k-1},0) & = f(x_0,\ldots,x_{k-1}) \\ \nonumberh(x_0,\ldots,x_{k-1},y+1) & = g(x_0,\ldots,x_{k-1},y,h(x_0,\ldots,x_{k-1},y))\end{align}\]Here the first \(k\) arguments \(x_0,\ldots,x_{k-1}\) to \(g\) are theparameters, the \(k+1\)st argument \(y\) is therecursionvariable, and the \(k+2\)nd argument \(h(x_0,\ldots,x_{k-1},y)\)gives the prior value of \(h\). An elementary set theoretic argumentshows that for each \(k \in \mathbb{N}\), if \(f\) isk-ary and\(g\) is \(k+2\)-ary, then a there is a unique \(k+1\)-ary function\(h\) satisfying (\ref{prscheme})—see, e.g., Moschovakis (1994:ch. 5).
It will again be useful to introduce a formal scheme for referring tofunctions defined in this manner:
Definition 2.2: Suppose that \(f:\mathbb{N}^k\rightarrow \mathbb{N}\) and \(g: \mathbb{N}^{k+2} \rightarrow\mathbb{N}\). Then the term \(\mathcal{PrimRec}_k[f,g]\) denotes theunique function of type \(\mathbb{N}^{k+1} \rightarrow \mathbb{N}\)satisfying (\ref{prscheme}).
We may now formally define the classPR of primitive recursivefunctions as follows:
Definition 2.3: The class ofprimitive recursivefunctionsPR is the smallest class of functions containingthe initial functions \(I_{\textbf{PR}} = \{\mathbf{0}, s, \pi^k_i\}\)and closed under the functionals
\[\textit{Op}_{\textbf{PR}} = \{\mathcal{Comp}^i_j, \mathcal{PrimRec}_k\}.\]With the definition ofPR in place, we may also define what itmeans for a relation \(R \subseteq \mathbb{N}^k\) to be primitiverecursive:
Definition 2.4: \(R \subseteq \mathbb{N}^k\) is aprimitive recursive relation just in case its characteristicfunction
\[\chi_R(x_0,\ldots,x_{k-1}) = \begin{cases} 1 & \text{ if } R(x_0,\ldots,x_{k-1}) \\ 0 & \text{ if } \neg R(x_0,\ldots,x_{k-1})\end{cases}\]is a primitive recursive function.
Definition 2.4 thus conventionalizes the characterization of a primitive recursiverelation \(R \subseteq \mathbb{N}^k\) as one for which there exists analgorithm similar to that illustrated above which returns the output 1on input \(\vec{n}\) if \(R\) holds of \(\vec{n}\) and the output 0 if\(R\) does not hold of \(\vec{n}\). As will become clear below, mostsets and relations on the natural numbers which are considered ineveryday mathematics—e.g., the setPRIMES of primenumbers or the relation
\[\textit{DIV} = \{\langle n, m \rangle : n\textit{ divides } m \textit{ without remainder}\}\]—are primitive recursive.
The foregoing definition specifiesPR as theminimalclosure of \(I_{\textbf{PR}}\) under the functions in\(\textit{Op}_{\textbf{PR}}\). In other words,PR may beequivalently defined as the subclass of \(\bigcup_{k \in\mathbb{N}}(\mathbb{N}^k \rightarrow \mathbb{N})\) satisfying thefollowing properties:
\[\begin{equation}\label{prmc}\end{equation}\]Another consequence ofDefinition 2.3 is thus that each function \(f \in \textbf{PR}\) possesses aspecification which shows how it may be defined from the initialfunctions \(I_{\textbf{PR}}\) in terms of a finite number ofapplications of composition and primitive recursion. This process maybe illustrated by further considering the definitions of the functions\(\textit{add}(x,y)\) and \(\textit{mult}(x,y)\) given above.
Note first that although the familiar recursive definitions ofaddition (\ref{defnadd}) and multiplication (\ref{defnmult}) fit theformat of (\ref{prex3}), they do not fit the format of(\ref{prscheme}) which in this case requires that the argument \(g\)to the primitive recursion scheme be a 3-ary function. It is, however,possible to provide a definition of \(\textit{add}(x,y)\) in theofficial form by taking \(f(x_0) = \pi^1_0(x_0)\)—i.e., theidentity function—and \(g(x_0,x_1,x_2) =\mathcal{Comp}^1_3[s,\pi^3_1]\)—i.e., the function which resultsfrom composing the successor function with the 3-ary projectionfunction on to its second argument. The expression\(\mathcal{PrimRec}_1[\pi^1_0,\mathcal{Comp}^1_3[s,\pi^3_1]]\) maythen be understood as a term which encodes the definition we haveprovided for addition. Multiplication can then be defined via(\ref{prscheme}) with \(f = \mathbf{0}\) and
\[g(x_0,x_1,x_2) = \mathcal{Comp}^2_3[add,\pi^3_0,\pi^3_2].\]Thus
\[\mathcal{PrimRec}_1[\mathbf{0},\mathcal{Comp}^2_3[add,\pi^3_0,\pi^3_2]]\]—or in explicit form
\[\mathcal{PrimRec}_1[\mathbf{0},\mathcal{Comp}^2_3[\mathcal{PrimRec}_1[\pi^1_0,\mathcal{Comp}^1_3[s,\pi^3_1]],\pi^3_0,\pi^3_2]]\]—can be taken as a similar term encoding the definition ofmultiplication we have abbreviated by \(\textit{mult}(x,y)\).
These examples illustrate that the simpler recursion schemes which areemployed in many informal recursive definitions may be assimilated toDefinition 2.3—e.g., the function \(h(x,y)\) defined in (\ref{prex3}) maybe obtained as\(\mathcal{PrimRec}_1[f,\mathcal{Comp}^2_3[g,\pi^3_1,\pi^3_2]]\).Repeated use of this and similar observations will be made (generallywithout comment) in the examples provided inSection 2.1.2.
Another consequence of the fact thatevery \(f \in\textbf{PR}\) is defined by a term given in this manner by(\ref{prmc}) is the following:
Proposition 2.1: The class of functionsPR iscountable.
This can be demonstrated by showing that it is possible to enumeratePR as \(f_0,f_1,f_2,\ldots\) by introducing a Gödelnumbering of terms formed from the expressions \(\mathbf{0},\) \(s,\)\(\pi^k_i, \mathcal{Comp}^j_k,\) and \(\mathcal{PrimRec}_k\) in themanner described inSection 1.3. Since there are uncountably many functions of type \(\mathbb{N}^k\rightarrow \mathbb{N}\) for all \(k > 0\), this observation alsoprovides a non-constructive demonstration that there exist numbertheoretic functions which are not primitive recursive.
Almost all number theoretic functions and relations encountered inordinary mathematics can be shown to be primitive recursive. In orderto illustrate the extent of this class, we will present here astandard sequence of definitions which can be traced historically toSkolem (1923). This can be used to show that the sequence coding\(\langle \ldots \rangle\) and decoding \((\cdot)_i\) operationsdefined below are primitive recursive. This is in turn required forGödel’s arithmetization of syntax (seeSection 1.3) as well as results like theNormal Form Theorem (2.3) which will be discussed below.
For each \(k \in \mathbb{N}\) the constantk-function definedas \(\const_k(x) = k\) is primitive recursive. In order to show this,we first define the constant 0-function by primitive recursion asfollows:
\[\begin{aligned}\const_0(0) & = \mathbf{0}\\\const_0(x+1) & = \pi^{2}_{1}(x,const_{0}(x))\end{aligned}\]We may then define the constantk-function by repeatedcomposition as
\[const_{k}(x) = \underbrace{s( \ldots (s(const_{0}(x)) \ldots)}_{k \text{ times }} \]We have already seen that the addition function \(\textit{add}(x,y)\)can be defined by primitive recursion in terms of repeated applicationof successor and that the multiplication function\(\mathit{mult}(x,y)\) can be defined by primitive recursion in termsof repeated application of addition. We can continue this sequence byobserving that the exponentiation function \(x^y\) can be defined byprimitive recursion in terms of repeated multiplication as follows:
\[\begin{align} \label{exp}\textit{exp}(x,0) & = \overline{1}\\ \nonumber\textit{exp}(x+1,y) & = \textit{mult}(x,\textit{exp}(x,y)) \end{align}\]The super-exponentiation function
\[x \uarrow y = \underbrace{x^{x^{\udots^x}}}_{y \textrm{ times}}\]can be defined by primitive recursion in terms of repeatedexponentiation as as follows:
\[\begin{align} \label{superexp}\textit{supexp}(x,0) & = \overline{1}\\ \nonumber\textit{supexp}(x+1,y) & = \textit{exp}(x,\textit{supexp}(x,y)) \end{align}\]The sequence of functions
\[\begin{aligned}\alpha_0(x,y) & = x + y, \\\alpha_1(x,y) & = x \times y, \\\alpha_2(x,y) & = x^y, \\\alpha_3(x,y) & = x \uarrow y, \\ &\vdots\\\end{aligned}\]whose \(i+1\)st member is defined in terms of primitive recursion ofthe \(i\)th member form a hierarchy of functions whose values growincreasingly quickly in proportion to their inputs. While eachfunction in this sequence is primitive recursive, we can also considerthe function \(\alpha(x,y)\) defined as \(\alpha_x(y,y)\)—aversion of the so-calledAckermann-Péter functiondefined inSection 1.4—whose values are not bounded by any fixed function \(\alpha_i\). As it canbe shown that the values of \(\alpha(x,y)\) are not bounded by any ofthe functions \(\alpha_i(x,y)\), this shows that \(\alpha(x,y)\)cannot be defined by any finite number of applications of the scheme\(\mathcal{PrimRec}_1\). This provides a constructive proof that thereexist functions of type \(\mathbb{N}^2 \rightarrow \mathbb{N}\) whichare not primitive recursive.
Theproper predecessor function is given by
\[\textit{pred}(y) = \begin{cases}0 & \text{ if } y = 0 \\ y - 1 & \text{otherwise} \end{cases}\]This function is primitive recursive since it may be defined as
\[\begin{align} \label{pred}\textit{pred}(0) & = 0\\ \nonumber\textit{pred}(y+1) & = y\end{align}\]Note that the second clause of (\ref{pred}) does not depend on theprior value of \(\textit{pred}(y)\). But this definition can still beconformed to the scheme (\ref{prscheme}) by taking \(f(x_0) =\mathbf{0}\) and \(g(x_0,x_1) = \pi^2_0\).
Theproper subtraction function is given by
\[x \dotminus y = \begin{cases}x - y & \text{ if } y \leq x \\ 0 & \text{otherwise} \end{cases}\]This function is also primitive recursive since it may be defined as
\[\begin{align} \label{dotminus}x \dotminus 0 & = x \\ \nonumberx \dotminus (y+1) & = \textit{pred}(x \dotminus y)\end{align}\]The absolute difference function is defined as
\[|x - y| = \begin{cases}x - y & \text{ if } y \leq x \\ y - x & \text{otherwise} \end{cases}\]\(|x - y|\) may be defined by composition as \((x \dotminus y) + (y\dotminus x)\) and is hence primitive recursive since \(\dotminus\)is.
Thesignum function is defined as
\[\textit{sg}(x) = \begin{cases}1 & \text{ if } x \neq 0 \\ 0 & \text{otherwise} \end{cases}\]This function may be defined by composition as \(\textit{sg}(x) = 1\dotminus (1 \dotminus x)\) and is hence primitive recursive as is theinverted signum function defined by\(\overline{\textit{sg}}(x) = 1 \dotminus \textit{sg}(y)\) whichreturns 1 if \(x = 0\) and 1 otherwise.
The minimum and maximum functions may be similarly defined bycomposition from functions previously seen to be primitive recursiveas follows:
\[\begin{aligned}\min(x,y) & = \overline{\textit{sg}}(x \dotminus y) \times x + \textit{sg}(y \dotminus x) \times y\\\max(x,y) & = \textit{sg}(x \dotminus y) \times x + \overline{\textit{sg}}(y \dotminus x) \times y\end{aligned}\]The characteristic functions of theless than relation(\(<\)) andequality relation (\(=\)) on the naturalnumbers are definable as follows:
\[\begin{aligned}\chi_<(x,y) & = \textit{sg}(y \dotminus x) \\ \nonumber\chi_=(x,y) & = 1 \dotminus (\textit{sg}(x \dotminus y) + \textit{sg}(y \dotminus x))\end{aligned}\]These relations are hence primitive recursive.
As theless than or equal to relation (\(\leq\)) is logicallyequivalent to \(x < y \vee x = y\) it will follow from the next setof observations that this relation is also primitive recursive. The isadditionally true of \(x > y\), \(x \geq y,\) and \(x \neq y\).
The set of primitive recursive relations isclosed under booleanoperations. In other words, if \(P(\vec{x})\) and \(Q(\vec{x})\)are primitive recursive, then so are \(\neg P(\vec{x})\), \(P(\vec{x})\wedge Q(\vec{x})\), \(P(\vec{x}) \vee Q(\vec{x})\), \(P(\vec{x})\rightarrow Q(\vec{x}),\) and \(P(\vec{x}) \leftrightarrowQ(\vec{x})\).
Given the interdefinability of the classical connectives, this followsupon noting the following:
\[\begin{align*}\chi_{\neg P}(\vec{x}) & = 1 \dotminus \chi_{P}(\vec{x}) \\ \chi_{P \wedge Q}(\vec{x}) & = \chi_{P}(\vec{x}) \times \chi_{Q}(\vec{x})\end{align*}\]Suppose that \(f(\vec{x},z)\) is primitive recursive. Then thebounded sum \(g(\vec{x},y) = \Sigma_{i=0}^y f(\vec{x},i)\)and thebounded product \(h(\vec{x},y) = \Pi_{i=0}^yf(\vec{x},i)\) are both primitive recursive as they may berespectively defined as follows:
\[\begin{aligned}g(\vec{x},0) & = f(\vec{x},0) \\ \nonumberg(\vec{x},y+1) & = g(\vec{x},y) + f(\vec{x},y+1) \\ \nonumberh(\vec{x},0) & = f(\vec{x},0) \\ \nonumber h(\vec{x},y+1) & = g(\vec{x},y) \times f(\vec{x},y+1) \end{aligned}\]The set of primitive recursive relations is also closed underbounded quantification—i.e., if \(R(\vec{x},z)\) is aprimitive recursive relation, then so are the relations \(\forall z\leq y R(\vec{x},z)\) and \(\exists z \leq y R(\vec{x},z).\) These maybe respectively defined as follows:
\[\begin{aligned}u_R(\vec{x},y) & =_{\textrm{df}} \chi_{\forall z \leq y R(\vec{x},z)}(\vec{x},y) = \Pi_{i=0}^y \chi_R(\vec{x},i) \\ \nonumbere_R(\vec{x},y) & =_{\textrm{df}} \chi_{\exists z \leq y R(\vec{x},z)}(\vec{x},y) = sg\left(\Sigma_{i=0}^y \chi_R(\vec{x},i)\right)\end{aligned}\]As it will be useful below, we have here extended our notationalconvention for characteristic functions so as to display free andbound variables in the subscripts of the functions being defined.
The set of primitive recursive relations is also closed underbounded minimization. This is to say that if \(R(\vec{x},z)\)is a primitive recursive relation, then so is the function\(m_R(\vec{x},y)\) which returns the least \(z\) less than or equal to\(y\) such that \(R(\vec{x},z)\) holds if such a \(z\) exists and\(y+1\) otherwise—i.e.,
\[\begin{align} \label{boundedmin} m_R(\vec{x},y) = \begin{cases}\text{the least $z \leq y$ such that $R(\vec{x},z)$} & \text{ if such a $z$ exists} \\y + 1 & \text{ otherwise}\end{cases}\end{align}\]To see this, observe that if \(R(\vec{x},z)\) is primitive recursive,then so is \(\forall z \leq y \neg R(\vec{x},z)\). It is then notdifficult to verify that
\[m_R(\vec{x},y) = \Sigma_{i=0}^y \chi_{\forall z \leq y \neg R(\vec{x},z)}(\vec{x},i).\]A natural number \(y\) is said to bedivisible by \(x\) justin case there exists a \(z\) such that \(x \times z = y\)—i.e.,\(x\) divides \(y\) without remainder. In this case we write \(x\divides y\). Note that if \(x \divides y\) holds, then this must bewitnessed by a divisor \(z \leq y\) such that \(x \times z = y\). Wemay thus define \(x \divides y\) in the following manner which showsthat it is primitive recursive:
\[x \divides y \Longleftrightarrow \exists z \leq y(x \times z = y)\]We may also define thenon-divisibility relations \(x\notdivides y\) as \(\neg(x \divides y)\) which shows that it too isprimitive recursive.
Next recall that a natural number \(x\) isprime just in caseit is greater than 1 and is divisible by only 1 and itself. We maythus define the relation \(\textit{Prime}(x)\) in the following mannerwhich shows that it is primitive recursive:
\[\begin{aligned}\textit{Prime}(x) \Longleftrightarrow \overline{1} < x \wedge \forall z \leq x(z \divides x \rightarrow (z = \overline{1} \vee z = x))\end{aligned}\]The primes form a familiar infinite sequence \(p_0 = 2,\) \(p_1 = 3,\)\(p_2 = 5,\) \(p_3 = 7,\) \(p_4 = 11,\)…. Let \(p(x) =p_x\)—i.e., the function which returns the \(x\)th prime number.\(p(x)\) can be defined by primitive recursion relative to thefunction \(\nextPrime(x)\) which returns the least \(y > x\) suchthat \(y\) is prime as follows:
\[\begin{aligned}p(0) & = \overline{2} \\ \nonumberp(x+1) & = \nextPrime(p(x))\end{aligned}\]Recall that Euclid’s Theorem states that there is always a primenumber between \(x\) and \(x! + 1\) and also that \(x! = \fact(x)\) isprimitive recursive. It thus follows that \(\nextPrime(x)\) can bedefined via bounded minimization as follows:
\[\begin{aligned}\nextPrime(x) = m_{x < z \ \wedge \ \textit{Prime}(z)}(x,\fact(x)+1)\end{aligned}\]It thus follows that \(p(x)\) is primitive recursive.
The foregoing sequence of definitions provides some evidence for therobustness of the class of primitive recursive relations andfunctions. Further evidence is provided by the fact that it ispossible to develop the machinery for coding and decoding finitesequences of natural numbers and for performing various combinatorialoperations on sequences—e.g., adjunction of an element,concatenation, extracting a subsequence, substituting one element foranother, etc. The primitive recursiveness of these operationsunderpins Gödel’s arithmetization of syntax as described inSection 1.3. We present here only the basic definitions required to demonstratethe primitive recursiveness of thek-tupling and projectionfunctions which are required for results in computability theory suchas theNormal Form Theorem (2.3) discussed below.
Given a finite sequence of natural numbers \(n_0,n_1,\ldots,n_{k-1}\)we define itscode to be the number
\[\begin{align} \label{primecode}p_0^{n_0 + 1} \times p_1^{n_1 + 1} \times p_2^{n_2 + 1} \times \ldots \times p_{k-1}^{n_{k-1}+1}\end{align}\]where \(p_i\) is the \(i\)th prime number as defined above. In otherwords, the code of \(n_0,n_1,\ldots,n_{k-1}\) is the natural numberresulting from taking the product of \(p_i^{n_i + 1}\) for \(0 \leq i\leq k-1\). This will be denote by \(\langle n_0,n_1,\ldots,n_{k-1}\rangle\)—e.g.,
\[\begin{aligned}\langle 3,1,4,1,5 \rangle & = 2^{4} \times 3^{2} \times 5^{5} \times 7^{2} \times 11^{6} \\& = 39062920050000.\\\end{aligned}\](Note that 1 is added to each exponent so that, e.g., 3, 1, 4, 1, 5has a distinct code from that of 3, 1, 4, 1, 5, 0, etc.—i.e., sothat the coding operation isinjective.)
The operation which takes a sequence of arbitrary length to its codedoes not have a fixed arity and hence is not given by a singleprimitive recursive function. But it is not hard to see that if werestrict attention to sequences of given length \(k\), then \(\langlen_0,n_1,\ldots,n_{k-1} \rangle : \mathbb{N}^k \rightarrow \mathbb{N}\)is primitive recursive as it is simply the bounded product given by(\ref{primecode}). Consider next the function \(\textit{element}(s,i)= n_i\) where \(s = \langle n_0,n_1,\ldots,n_{k-1} \rangle\) and \(0\leq i \leq k-1\) and which returns 0 when \(i\) is not in this rangeor \(s = 0\) or 1 (and thus not a code of a sequence). In order to seethat \(\textit{element}(s,i)\) is also primitive recursive, firstobserve that it is possible to recover \(\textit{len}(s)\)—i.e.,thelength of the sequence coded by \(s\)—by searchingfor the least \(i < s\) such that \(p_i \divides s\) and \(p_{i+1}\notdivides s\). Since \(s\) also bounds all the primes \(p_i\) whichdivide it we may define
\[\begin{aligned}len(s) = \begin{cases} 0 & \text{ if $s = 0$ or $s = 1$} \\1 + m_{p_z \divides s \wedge p_{z+1} \notdivides s}(s,s) & \text{ otherwise} \end{cases}\end{aligned}\]It is straightforward to see that a function defined by cases withprimitive recursive conditions is primitive recursive. So\(\textit{len}(s)\) is primitive recursive as well. Similarly, it iseasy to see that relation \( Seq(x) \) of being the code of a sequenceis primitive recursive.
Finally observe that \(\textit{element}(s,i)\) is equal to thesmallest exponent \(n\) such that \(p_i^{n+1} \divides s\) but\(p_i^{n+2} \notdivides s\) and that such an exponent is also boundedby \(s\). We may thus provide a primitive recursive definition of\(\textit{element}(s,i)\) as follows:
\[\begin{aligned}\textit{element}(s,i) = \begin{cases} 0 & \text{ if $i \geq len(s)$ or $\neg Seq(s)$}\\m_{p_i^{z+1} \divides s \wedge p_i^{z+2} \notdivides s}(s,s) \dotminus 1 & \text{ otherwise} \end{cases}\end{aligned}\]The conventional abbreviation \((s)_i = \textit{element}(s,i)\) willbe employed for this function below.
The primitive recursive functions and relations encompass a broadclass including virtually all those encountered in ordinarymathematics outside of logic or computability theory. This isillustrated in part by the fact thatPR contains functions suchas \(supexp(x,y)\) which grow far faster than those whose values wecan feasibly compute in practice in the sense studied incomputational complexity theory. But the robustness of the classPR is also attested to by thefact that its definition is invariant with respect to a variety ofmodifications—e.g., with respect to the classes of initialfunctions \(I_{\textbf{PR}}\) and functionals\(\textit{Op}_{\textbf{PR}}\) on which its definition is based.
As an initial illustration, consider the following scheme of so-calledpure iteration:
\[\begin{align} \label{pureiter}h(0,y) & = y \\ \nonumberh(x+1,y) & = g(h(x,y))\end{align}\]It is easy to see that the function \(h\) defined by (\ref{pureiter})from \(g\) in this manner is the \(x^{\mathrm{th}}\)–iterate of\(g\)—i.e., \(g^{x}(y)=_{\mathrm{df}} g(g(\ldots g(y)))\)\(x\)–times with the convention that \(g^0(y) = y\). We willdenote this functional by \(\mathcal{Iter}[g,x]\). The scheme(\ref{pureiter}) thus generalizes (\ref{prex1}) by making the value ofbase case an argument to \(h\). But it is an apparent restriction of(\ref{prscheme}) in the sense that \(h\) cannot depend on either therecursion variable or additional parameters.
Suppose we now consider an alternative class of initial functions\(In_{\mathbf{IT}}\) containing \(s,\pi^k_i\), the binary codingfunction \(\langle x,y \rangle\), and the decoding functions \((x)_0\)and \((x)_1\) defined at the end ofSection 2.1.2. (Note that these operate analogously to the first and secondproduction functions \(\pi^2_0\) and \(\pi^2_1\) operating oncodes of ordered pairs.) Now define \(\mathbf{IT}\) to be thesmallest class of functions containing \(In_{\mathbf{IT}}\) and closedunder the functionals \(\textit{Op}_{\mathbf{IT}} =\{\mathcal{Comp}^i_j,\mathcal{Iter}\}\).
Theorem 2.1 (Robinson 1947): The class\(\mathbf{IT}\) is equal to the classPR of primitive recursivefunctions.
This illustrates that if we slightly enlarge the class of initialfunctions, it is still possible to obtain the entire classPRvia a scheme of functional iteration which at first appears lessgeneral than primitive recursion. See Odifreddi (1989: ch. I.5) for anaccount of further improvements which can be obtained in thisdirection.
Other results show that the classPR also remains stable ifprimitive recursion is replaced with other schemes which may initiallyappear more general. The most familiar of these is the scheme ofcourse of values recursion which is traditionally illustratedusing the so-calledFibonacci function \(\fib(x)\) which wasbriefly discussed at the beginning ofSection 1. This may be defined as follows:
\[\begin{align} \label{fibdefn}fib(0) & = 0\\ \nonumberfib(1) & = 1\\ \nonumberfib(y+1) & = fib(y) + fib(y-1)\end{align}\]This definition can readily be used to calculate the values of\(\fib(x)\) in a recursive manner—e.g.,
\[\begin{aligned} \fib(4) &= \fib(3) + \fib(2) \\ &= (\fib(2) + \fib(1)) + (\fib(1)+\fib(0)) \\ &= ((\fib(1) + \fib(0)) + 1) + (1 + 1) \\ &= ((1 + 1) + 1) + (1 + 1) \\ & = 5\end{aligned}\]This gives rises to the familiar sequence 0, 1, 1, 2, 5, 8, 13, 21,34, 55, 89, 144,… wherein \(F_0 =0,\) \(F_1 = 1,\) and\(F_{i+2} = F_{i+1} + F_i.\) Note, however, the definition(\ref{fibdefn}) cannot be directly assimilated to the primitiverecursion scheme (\ref{prscheme}) since the third clause defines thevalue of \(\fib(y+1)\) in terms ofboth \(\fib(y)\) and\(\fib(y-1)\). It is, however, still possible to show that \(\fib \in\textbf{PR}\). One means of doing this is to again make use of thebinary coding and projection functions to first define an auxiliaryfunction \(g(0) = \langle 0,1 \rangle\) and
\[g(y+1) = \langle (g(y))_1,(g(y))_0 + (g(y))_1 \rangle\]which enumerates the pairs \(\langle F_0,F_1 \rangle\), \(\langle F_1,F_2 \rangle, \ldots\) It is then easy to see that \(\fib(y) =(g(y))_0\).
(\ref{fibdefn}) is thus an instance in which the value of the function\(h\) at \(y\) depends on both of the prior values \(h(y-1)\) and\(h(y-2)\) from its graph (for \(y \geq 2\)). It is, of course, alsopossible to consider cases where \(h(y)\) depends on an arbitrarynumber of its preceding values \(h(0), \ldots, h(y-1)\). To this end,suppose we are given \(h(\vec{x},y)\) and then define
\[\begin{align*}\widetilde{h}(\vec{x},y) &= \Pi_{i = 0}^y p_i^{h(\vec{x},i)+1} \\& = \langle h(\vec{x},0), \ldots, h(\vec{x},y) \rangle.\\\end{align*}\]We then say that \(h(\vec{x},y)\) is defined bycourse of valuesrecursion from \(f(\vec{x})\) and \(g(\vec{x},y,z)\) if
\[\begin{aligned}h(\vec{x},0) & = f(\vec{x}) \\ \nonumberh(\vec{x},y + 1) & = g(\vec{x},y,\widetilde{h}(\vec{x},y))\end{aligned}\]Suppose that we now let \(\mathcal{CV}_k[f,g]\) denote thecorresponding functional operation and let \(\mathbf{CV}\) be thesmallest class of functions containing \(In_{\textbf{PR}}\) and closedunder \(\mathcal{Comp}^j_k\) and \(\mathcal{CV}_k\). Then since it iseasy to see that \(\widetilde{h}(\vec{x},y)\) is primitive recursiveif \(h(\vec{x},y)\) is, we also have the following:
Theorem 2.2 (Péter 1935): The class\(\mathbf{CV}\) is equal to the classPR of primitive recursivefunctions.
Since course of values recursion is used in mathematical practice, itis significant that it does not lead outside the class of primitiverecursive functions. There are, however, a number of other possibleways in which the scheme (\ref{prscheme}) might also be generalized,including what are known asdouble recursion andnestedrecursion. The definition of the function \(\pi(x,y)\) inSection 1.4 exhibits the former since its value at \(x,y\) depends on its valueatboth \(x-1\) and \(y-1\) and also the latter since theoccurrence of the defined function \(\pi(x,y)\) is“nested” within itself (rather than an auxiliary function)on the righthand side of the third clause. See thesupplement on the Ackermann-Péter function for further details on the closure properties of the primitiverecursive functions with respect to these schemes.
We have now seen two ways of showing that there exist number theoreticfunctions which are not primitive recursive—i.e., by observingthat while there are only countably many primitive recursive functionsthere are uncountably many functions of type \(\mathbb{N}^k\rightarrow \mathbb{N}\) (\(k > 0\)) and also by constructing afunction such as \(\alpha(x,y) = \alpha_x(y,y)\) which grows fasterthan any primitive recursive function. A third proof—originallydue to Hilbert & Bernays (1934: ch. 7)—is based on theobservation that it is possible to enumerate the classPR as\(g_0(x),g_1(x),g_2(x), \ldots\)—e.g., by Gödel numberingthe sorts of definitions considered at the end ofSection 2.1.1. If we then consider the modified diagonal function
\[\begin{aligned}\delta(x) = g_x(x) + 1\end{aligned}\]it is easy to see that this function also cannot be primitiverecursive. For if \(\delta(x)\) coincided with some function\(g_j(x)\) in the enumeration, then we would have \(g_j(j) = \delta(j)= g_j(j) + 1\), a contradiction. Note that this also shows thatrelative to such an enumeration theuniversal function\(u_1(i,x) = g_i(x)\) for unary primitive recursive functions cannotitself be primitive recursive as we could otherwise define\(\delta(x)\) as \(u_1(x,x) + 1\). Hilbert & Bernays (1939: ch. 5)would later discuss this observation in regard to what has becomeknown as theirdenotational paradox—see, e.g., Priest1997.
On the other hand, there are intuitively effective procedures forcomputing each of these functions. For instance, in the case of\(\delta(x)\) we can proceed as follows:
As with the definitions of \(\alpha\) and \(u_1\), the foregoingprocedure is effective in the sense discussed inSection 1.6. But the corresponding function cannot be computed by a singleprimitive recursive definition in virtue of the uniformity in thevariable \(x\) at step ii. There is thus a natural motivation forseeking to expand the definition of the classPR so as toencompass such functions.
One means by which this can be accomplished builds on the observationthat the bounded minimization operation \(m_R(\vec{x},y)\) admits to astraightforward algorithmic characterization—i.e., to computethe value of \(m_R(\vec{x},y)\) successively check \(R(\vec{x},0),\)\(R(\vec{x},1),\) …, \(R(\vec{x},z),\)… giving output\(z\) and halting as soon as \(R(\vec{x},z)\) holds and \(y+1\) if nopositive instance is found before \(z = y\). This can be generalizedto the so-calledunbounded search operation. In particular,given a relation \(R(\vec{x},y)\) we can define the operation\(\mu_R(\vec{x},z)\) which returns the least \(z\) such that\(R(\vec{x},z)\) if such a \(z\) exists and is undefined otherwise.Note that if \(R(\vec{x},y)\) is primitive recursive, then it is stillpossible to effectively search for the value of \(\mu_R(\vec{x},y)\)by successively checking \(R(\vec{x},0),\) \(R(\vec{x},1),\)….But since no upper bound is specified in advance, we are notguaranteed that this procedure will always terminate. In particular,if there is no \(z \in \mathbb{N}\) such that \(R(\vec{x},z)\) holds,then the procedure will continue indefinitely. In this case, westipulate that \(\mu_R(\vec{x},y)\) isundefined, from whichit follows that \(\mu_R(\vec{x},y)\) will correspond to what is knownas apartial function—a notion which is made precise bythe following sequence of definitions.
The class of so-calledpartial recursive functions isobtained from our prior definition ofPR by closing under anoperation similar to \(\mu_R(\vec{x},z)\) which is applied tofunctions rather than relations. In order to define this class, wefirst introduce the following conventions regardingpartialfunctions which extends those given at the beginning ofSection 2:
A function \(f:\mathbb{N}^k \rightarrow \mathbb{N}\) is calledtotal if \(f(\vec{n})\) is defined for all \(\vec{n} \in\mathbb{N}^k\). Otherwise \(f(\vec{x})\) is calledpartial.
We write \(f(\vec{n})\darrow\) to express that \(f(\vec{x})\) isdefined at \(\vec{n}\) and additionally \(f(\vec{n})\darrow = m\) if\(f(\vec{n})\) is defined at \(\vec{n}\) and equal to \(m\). Otherwisewe write \(f(\vec{n})\uarrow\) to express that \(f(\vec{x})\) isundefined at \(\vec{n}.\)
Thedomain of \(f(\vec{n})\) is the set \(\textrm{dom}(f) =\{\vec{n} \in \mathbb{N}^k : f(\vec{n}) \darrow\}\).
We write \(f(\vec{x}) \simeq g(\vec{x})\) just in case for all\(\vec{n} \in \mathbb{N}\), either \(f(\vec{n})\) and \(g(\vec{n})\)are both undefined or are both defined and equal.
Suppose we are given a partial function \(f(x_0,\ldots,x_{k-1},y)\).We now introduce terms of the form \(\mu y f(x_0,\ldots,x_{k-1},y)\)defined as follows:
\[\begin{align} \label{murec}\mu y f(x_0,\ldots,x_{k-1},y) = \begin{cases} z & \text{if } z \text{ is such that } \\ &\:\: f(x_0,\ldots,x_{k-1},z) = 0 \text{ and } \\ &\:\: \forall w < z(f(x_0,\ldots,x_1,w)\darrow \neq 0) \\\uarrow & \text{otherwise}\end{cases}\end{align}\]In other words, \(\mu y f(\vec{n},y)\) is equal to the least \(m\)such that \(f(\vec{n},m) = 0\) provided that such an \(m\) exists andalso that \(f(\vec{n},i)\) is defined but not equal to 0 for all \(0\leq i < m\). On the other hand, \(\mu y f(\vec{n},y)\) isundefined just in case either there is no \(m\) such that\(f(\vec{n},m) = 0\) or there is such a \(m\) but \(f(\vec{n},i)\) isundefined for some \(i < m\).
Since this definition determines \(\mu yf(\vec{x},y)\) uniquely,(\ref{murec}) can also be regarded as defining a functional\(\mathcal{Min}_k\) which maps \(k+1\)-ary partial functions intok-ary partial functions. We now define the classes of functionsPartREC andREC as follow:
Definition 2.5: The class ofpartial recursivefunctionsPartREC (also known as the\(\mu\)-recursivefunctions) is the smallest class of partial functions of type\(\mathbb{N}^k \rightarrow \mathbb{N}\) containing the initialfunctions \(I_{\textbf{PR}} = \{\mathbf{0},s,\pi^i_k\}\) and closedunder the functionals
\[\textit{Op}_{\textbf{PartREC}} = \{\mathcal{Comp}^i_j,\mathcal{PrimRec}_k,\mathcal{Min}_k\}.\]We say that a function \(f:\mathbb{N}^k \rightarrow \mathbf{N}\) ispartial recursive if \(f \in \textbf{PartREC}\). Additionallywe say that \(f\) isrecursive if \(f \in \textbf{PartREC}\)and \(f\) is total. The set of recursive functions will be denoted byREC.
Since the use of the name “partial recursive function” todenote this class has been standard usage since the 1930s, we willretain it here. Nonetheless it is potentially confusing in at leasttwo respects. First, since “partial” serves to modify“function“ rather than “recursive“ in theassertion “\(f\) is a partial recursive function”, a morenatural expression would be “recursive partial function”.Second, despite its name, the class of partial recursive functionscontains total functions. In particular, arecursive functionis, by definition, one which ispartial recursive while also beingtotal. We will see inSection 3.2, there also exist partial recursive functions which are genuinelypartial and total functions which are not recursive.
Note finally that if \(f(\vec{x})\) is recursive it may be defined viasome finite number of applications of composition, primitiverecursion, and unbounded minimization in a manner which preserves thetotality of intermediate functions in its definition. Thus althoughthe specification of \(f(\vec{x})\) may involve one or moreapplications of unbounded search, each search required to compute itsvalue is guaranteed to terminate in a finite number of steps. It thusfollows that all of functions inREC are computable by analgorithm (despite the fact that we will soon see that this classcontains functions which are not primitive recursive). Thisconstitutes part of the evidence forChurch’sThesis—i.e., the claim thatREC coincides with theclass of effectively computable functions—which was surveyed inSection 1.6.
Once we have defined the classPartREC, a question whichnaturally arises is whether all partial recursive functions can bedefined in a canonical way. TheNormal FormTheorem—originally due to Kleene (1936a)—provides apositive answer to this question by showing that a single applicationof the unbounded minimization operator suffices to obtain all suchfunctions. In order to formulate this result, it is convenient toofficially extend the application of the \(\mu\)-operator to primitiverecursive relations \(R(\vec{x})\) in the manner discussed at thebeginning of this section—i.e.,
\[\begin{align} \label{unboundedminrel}\mu y R(\vec{x},y) = \begin{cases}\text{the least $y$ such that $R(\vec{x},y)$} & \text{ if such a $y$ exists} \\\uarrow & \text{ otherwise}\end{cases}\end{align}\]Theorem 2.3: For all \(k \in \mathbb{N}\) thereexists a \(k+2\)-ary primitive recursive relation\(T_k(e,\vec{x},s)\)—the so-calledKleeneT-predicate—and a primitive recursive function\(u(x)\) (not depending on \(k\)) satisfying the following condition:for allk-ary partial recursive functions \(f(\vec{x})\) thereexists \(e \in \mathbb{N}\) such that for all \(\vec{n} \in\mathbb{N}^k\)
\[f(\vec{n}) \simeq u(\mu s T_k(e,\vec{n},s))\]Since \(\mu y R(\vec{x},y) \simeq \mu y \chi_{\neg R}(\vec{x},y)\), itis easy to see that the classPartREC can also be obtained byclosing the primitive recursive functions under the operation definedby (\ref{unboundedminrel}). One consequence ofTheorem 2.3 is thus that it is indeed possible to define anyk-ary partialrecursive function \(f(\vec{x})\) by a single application of unboundedsearch applied to the relation \(T_k(e,\vec{x},s)\) for an appropriatechoice of \(e\). More generally, the Normal Form Theorem illustrateshow any such function may be defined from asingle relation\(T_k(e,\vec{x},s)\) wherein the value of \(e\) serves as adescription of the manner in which \(f(\vec{x})\) is defined in termsof the basis functions \(I_{\textbf{PR}}\) and the operations\(\textit{Op}_{\mathbf{PartRec}}\). Such an \(e\) is known as anindex for \(f(\vec{x})\). As we will see inSection 3, the availability of such indices is one of the central features ofthe partial recursive functions which allows them to provide the basisfor a general theory of computability and non-computability.
The complete details of the proof ofTheorem 2.3 are involved. But the basic idea may be summarized as follows:
Every partial recursive function \(f(\vec{x})\) is defined by a term\(\tau\) over the language
\[\mathbf{0},s,\pi^i_j,\mathcal{Comp}^j_k,\mathcal{PrimRec}_k,\mathcal{Min}_k\]in the manner which extends the notation scheme for partial recursivefunction introduced at the end ofSection 2.1.1. By associating the atomic expressions of this language with naturalnumbers in the manner of Gödel numbering \(\ulcorner \cdot\urcorner\) described inSection 1.3 and then employing the coding machinery described at the end ofSection 2.1.2, it is then possible to associate \(\tau\) with a natural number\(\ulcorner \tau \urcorner = e\) which can serve as an index for\(f(\vec{x})\).
The definition of \(T_k(e,\vec{n},s)\) can now be constructed byformalizing the following decision algorithm:
By performing an unbounded search over codes of computation sequencesin this manner, we achieve the dual purposes of both determining ifthe computation described by \(\tau\) on input \(\vec{n}\) halts aftera finite number of steps and, if so, also finding a code \(s\) of acomputation sequence which witnesses this fact.[22] The function \(u(s)\) can then be defined by formalizing theoperation which extracts the output of the computation from the laststep \((s)_{\textit{len}(s)-1}\) of the sequence encoded by \(s\). Inthe case that \(T_k(e,\vec{n},s)\) holds, \(u(s)\) will thuscorrespond to the value \(f(\vec{n})\). Since the foregoing stepsrequire only performing bounded search and checking the localcombinatorial properties of finite sequences, it can additionally beshown that \(T_k(e,\vec{n},s)\) and \(u(x)\) are primitiverecursive.
The techniques used in this proof can also be used to show that\(\alpha(x,y)\), the universalk-ary primitive recursiveevaluation function \(u_k(i,\vec{x})\), and the modified diagonalfunction \(\delta(x)\) are all recursive (despite the fact that wehave seen above that they arenot primitive recursive). Forinstance note that the coding of definitions ofk-ary partialrecursive functions described above also allows us to uniformlyenumerate all primitive recursive functions\(g_0(\vec{x}),g_1(\vec{x}),\ldots\) by considering the codes of termsnot containing \(\mathcal{Min}_k\). We can define in this manner aprimitive recursive function \(r(i)\) enumerating the indices forthese functions such that we can obtain the universal function fork-ary primitive recursive function as \(u_k(i,\vec{x}) = u(\mus T_1(r(i),\vec{x},s)) = g_i(\vec{x})\). But note that since\(g_i(\vec{x})\) is always defined, \(u_1(i,\vec{x})\) is not onlypartial recursive but also total, and hence recursive.
Taking into account the equivalences between models of computationsummarized inSection 1.6, it is also possible to formulate a version ofTheorem 2.3 for each of the models of computation mentioned there. For instance,in the case of the Turing Machine model \(\mathbf{T}\), the analogousversion of the Normal Form Theorem can be used to show that there is asingleuniversal Turing machine (see entry on Turing machines) \(U\) such that every partial recursive function \(f(\vec{x})\)corresponds to that computed by \(U(e,\vec{x})\) for some \(e \in\mathbb{N}\). Complete proofs of this sort were given by Turing (1937:sec. 6) for \(\mathbf{T}\), by Kleene (1936a: sec. 2) for the generalrecursive functionsGR (see also Kleene 1952: sec. 58), byShoenfield (1967: ch. 7.4) for the class \(\mathbf{F}_{\mathsf{PA}}\)of functions representable in Peano Arithmetic, and by Cutland (1980:ch. 5) for the Unlimited Register Machine model \(\mathbf{U}\).
Computability Theory is a subfield of contemporary mathematical logicdevoted to the classification of functions and sets of natural numbersin terms of their absolute and relative computability anddefinability-theoretic properties. This subject is closely related inboth origin and content to the study of recursive functions. This isreflected by the fact that computability theory was known asrecursive function theory (or simplyrecursiontheory) from the time of its inception in the 1930s until thelate 1990s. It is also reflected in the formulation and proof of theso-calledRecursion Theorem which provides a fundamental linkbetween recursive definability and the sort of self-referentialconstructions which are at the core of many methods in computabilitytheory (seeSection 3.4).
For reasons discussed inSection 1.7, contemporary expositions of computability theory are often presentedin an abstract manner which seeks to minimize reference to thespecific features of a model of computation such as the partialrecursive functions. It is thus useful to stress the followingmodifications to the traditional terminology which has been employedinSections 1 and 2 and the more contemporary terminology which will be employed in thissection:
The expressionscomputable function andpartial computable function will be usedinstead of the traditional termsrecursivefunction andpartial recursivefunction as defined inSection 2.2.1.
The expressioncomputable set will be usedinstead of the traditional termrecursiveset. Similarly,computablyenumerable (or c.e.)set willbe used instead of the traditional termrecursivelyenumerable (or r.e.)set (seeSection 3.3).
The other notational conventions introduced at the beginnings ofSection 2.1 andSection 2.2 will be retained in this section.
The first significant result in computability theory wasKleene’s (1936a) proof of the Normal Form Theorem which waspresented inSection 2.2.2. As discussed there, the Normal Form Theorem can be understood asillustrating how it is possible to associate eachk-ary partialcomputable function \(f(\vec{x})\) with a natural number \(e\) knownas itsindex such that \(f(\vec{x}) \simeq \mus(T_k(e,\vec{x},s))\). Such an \(e\) can be thought of as a name for acomputer program built up from the basis functions, composition,primitive recursion, and minimization by which the values\(f(\vec{x})\) can be computed. This also leads to what is known as anindexation ofk-ary partial computable functions
\[\phi^k_0(\vec{x}), \phi^k_1(\vec{x}), \phi^k_2(\vec{x}), \ldots, \phi^k_i(\vec{x}), \ldots\]where \(\phi^k_i(\vec{x}) \simeq \mu s T_k(i,\vec{x},s)\). Such anenumeration provides a uniform means of listing off all partialcomputable functions in the order of their indices. It should benoted, however, that each partial computable function has infinitelymany indices. For instance, given a function \(f:\mathbb{N}^k\rightarrow \mathbb{N}\) computed by \(\phi_e(\vec{x})\), it ispossible to define infinitely many extensionally coincident functionswith distinct indices \(\phi_{e'}(\vec{x}), \phi_{e''}(\vec{x}),\ldots\)—e.g., by “padding” the definition encodedby \(e\) with terms that successively add and then subtract \(m\) foreach \(m \in \mathbb{N}\). As this yields a definition of anextensionally equivalent function, it thus follows that infinitelymany of the \(\phi^k_i(\vec{x})\) will correspond to the same functionin extension.
A result closely related to the Normal Form Theorem is the followingwhich is conventionally known as thes-m-n Theorem:
Theorem 3.1: For all \(n,m \in \mathbb{N}\), there isa primitive recursive function \(s^m_n(i,x_0,\ldots,x_{m-1})\) suchthat
\[\phi^n_{s^m_n(i,x_0,\ldots,x_{m-1})}(y_0,\ldots,y_{n-1}) \simeq \phi^{n+m}_i(x_0,\ldots,x_{m-1},y_0,\ldots,y_{n-1})\]Here the function \(s^m_n(i,\vec{x})\) should be thought of as actingon an index \(i\) for an \(n+m\)-ary partial computable functiontogether with values \(\vec{x}\) for the first \(m\) of its arguments.This function returns an index for another partial computable functionwhich computes then-ary function determined by carrying out\(\phi^{n+m}_i\) with the first \(m\) of its arguments \(\vec{x}\)fixed but retaining the next \(n\) variables \(\vec{y}\) as inputs.Although the formulation of thes-m-n Theorem may at firstappear technical, its use will be illustrated in the proof ofRice’s Theorem (3.4) and theRecursion Theorem (3.5) below.
Another consequence of the Normal Form Theorem is the following:
Theorem 3.2: For every \(k \in \mathbb{N}\), there isa \(k+1\)-ary partial computable function \(\upsilon^k\) which isuniversal in the sense that for allk-ary partial computablefunctions \(f(\vec{x})\), there is an \(i \in \mathbb{N}\) such that\(\upsilon_k(i,\vec{x}) \simeq f(\vec{x})\).
This follows immediately fromTheorem 2.3 by taking \(\upsilon_k(i,\vec{x}) = u(\mu s T_k(i,\vec{x},s))\) where\(i\) is such that \(f(\vec{x}) \simeq \phi^k_i(\vec{x})\) in theenumeration ofk-ary partial computable functions. As\(\upsilon^k(i,\vec{x})\) can be used to compute the values of allk-ary partial computable functions uniformly in their index, itis conventionally referred to as thek-aryuniversalpartial computable function.
It is useful to observe that while we have just defined such afunction for each \(k\), it is also possible to define a binaryfunction \(\upsilon(i,x)\) which treats its second argument as a codefor a finite sequence \(x_0,\ldots,x_{k-1}\) and then computes in thesame manner as thek-ary universal function so that we have\(\upsilon(i,\langle x_0,\ldots, x_{k-1} \rangle) \simeq\upsilon^k(i,x_0,\ldots,k_{k-1})\). This provides a means of replacingthe prior enumerations ofk-ary partial computable functionswith a single enumeration of unary functions
\[\phi_0(x), \phi_1(x), \phi_2(x), \ldots, \phi_i(x), \ldots\]where
\[\begin{align*}\phi_i(\langle x_0,\ldots, x_{k-1} \rangle) & \simeq \upsilon(i,\langle x_0,\ldots, x_{k-1} \rangle)\\& \simeq \phi^k_i(x_0,\ldots, x_{k-1})\end{align*}\]Together withTheorem 2.3,Theorem 3.1 andTheorem 3.2 codify the basic properties of a model of computation which make itsuitable for the development of a general theory of computability. InSection 2 such a model has been defined in the form of the partial recursivefunctions. But as was discussed briefly at the end ofSection 2.2.2, versions of these results may also be obtained for the other modelsof computation discussed inSection 1.6. This licenses the freer usage of computer-based analogies and otherappeals to Church’s Thesis employed in most contemporarytreatments of computability theory which will also be judiciouslyemployed in the remainder of this entry.
Having just seen that there is a universal partial computable function\(\upsilon(i,x)\), a natural question is whether this function is alsocomputable (i.e.,total). A negative answer is providedimmediately by observing that by using \(\upsilon(i,x)\) we may defineanother modified diagonal function \(d(x) = \upsilon(x,x) + 1\) whichis partial computable (since \(\upsilon(i,x)\) is). This in turnimplies that \(d(x) \simeq \phi_j(x)\) for some \(j\). But now notethat if \(\upsilon(i,x)\) were total, then \(d(j)\) would be definedand we would then have
\[\begin{align*}d(j) & = \phi_j(j) \\& = \upsilon(j,j) + 1 \\& = \phi_j(j) + 1,\end{align*}\]a contradiction. Comparing this situation with that described at thebeginning ofSection 2.2 we can see that the partial computable functions differ from theprimitive recursive functions in admitting a universal function withinthe same class but at the same time giving up the requirement that thefunctions in the class must be total. In other words, while\(\upsilon(i,x) \in \textbf{PartREC}\), the discussion inSection 2.2.2 shows that \(u_1(i,\vec{x}) \in \textbf{REC} - \textbf{PR}\).
Since it is easy to see how the minimization operation can be used todefine partial functions, the foregoing observations are expected.What is more surprising is that there are mathematically well-definedtotal functions which are not computable. Building on thediscussion of theEntscheidungsproblem inSection 1.7, the most famous example of such a function derives from the so-calledHalting Problem (see entry on Turing machines) for the Turing Machine model. This was originally formulated byTuring (1937) as follows:
Given an indexation of \(T_0, T_1, \ldots\) of Turing machines, doesmachine \(T_i\) halt on the input \(n\)?
An equivalent question can also be formulated in terms of the partialrecursive functions:
Is the partial computable function \(\phi_i(x)\) defined for input\(n\)?
The pairs of natural numbers \(\langle i,n \rangle\) corresponding topositive answers to this question determine a subset of \(\mathbb{N}\times \mathbb{N}\) as follows:
\[\begin{aligned} \HP = \{\langle i,n \rangle : \phi_i(n) \darrow\} \end{aligned}\]A set (orproblem) is said to beundecidable just incase its characteristic function is not computable. For instance let\(h(x,y) = \chi_{\HP}(x,y)\) and observe that this, by definition, isatotal function. The so-calledundecidability of theHalting Problem may now be formulated as follows:
Theorem 3.3: \(h(x,y)\) is not a computablefunction.
Proof: Suppose for a contradiction that \(h(x,y)\) werecomputable. Consider the function \(g(x)\) defined as
\[\begin{aligned}g(x) = \begin{cases} 0 & \text{if } h(x,x) \darrow = 0 \\\uarrow & \text{otherwise}\end{cases}\end{aligned}\]On the assumption that \(h(x,y)\) is computable, \(g(x)\) is partialcomputable since, e.g., it may be computed by a program which on input\(x\) computes \(h(x,x)\) and returns 0 just in case \(h(x,x) = 0\)and otherwise goes into an infinite loop. It hence follows that \(g(x)\simeq \phi_j(x)\) for some \(j \in \mathbb{N}\). But now observe thatone of the following two alternatives must hold: i) \(g(j) \darrow\);or ii) \(g(j)\uarrow\). We may thus reason by cases as follows:
Suppose that \(g(j) \darrow\). Then \(h(j,j) = 0\) by definition of\(g(x)\). Since \(h(i,x)\) is the characteristic function of \(\HP\),this means \(\phi_j(j) \uarrow\). But then since \(g(x) \simeq\phi_j(x)\), \(g(j) \uarrow\), a contradiction.
Suppose that \(g(j) \uarrow\). Then \(h(j,j) \neq 0\) by definition of\(g(x)\). Since \(h(x,y)\) is the characteristic function of \(\HP\)(and hence total), the only other possibility is that \(h(j,j) = 1\)which in turn implies that \(\phi_j(j) \darrow\). But then since\(g(x) \simeq \phi_j(x)\), \(g(j) \darrow\), a contradiction. □
\(h(x,y)\) thus provides an initial example of a mathematicallywell-defined total function which is not computable. Othernon-computable functions can be defined by considering decisionproblems similar to \(\HP\). Some well-known examples are as follows:
\[\begin{align} \label{undecexs}K & = \{i : \phi_i(i) \darrow\} \\ Z &= \{i : \phi_i(n)\darrow = 0 \text{ for all $n \in \mathbb{N}$}\} \nonumber \\ \TOT & = \{i : \phi_i(n) \darrow \text{ for all $n \in \mathbb{N}$}\} \nonumber \\\textit{FIN} & = \{i : \phi_i(n)\darrow \text{ for at mostfinitely many distinct } \text{$n \in \mathbb{N}$}\}\nonumber\\& = \{i : W_i \text{ is finite} \} \nonumber \end{align}\]Suppose we let \(k(x), z(x), \textit{tot}(x)\), and\(\textit{fin}(x)\) be the characteristic functions of these sets. Bymaking suitable modifications to the proof ofTheorem 3.3 it is possible to directly show the following:
Proposition 3.1: None of the functions \(k(x), z(x),\textit{tot}(x)\), and \(\textit{fin}(x)\) are computable.
For instance in the case of \(k(x)\), we may argue as follows:
As this is again a contradictory situation, we may conclude that\(k(x)\) is not computable.
Note that each of the sets \(I\) defined in (\ref{undecexs}) has thefollowing property: if \(j \in I\) and \(\phi_j(x) \simeq \phi_k(x)\),then \(k \in I\) as well. Sets with this property are known asindex sets as they collect together the indices of allpartial computable functions which share a common“semantic” property—i.e., one which is completelydetermined by their graphs such as being coincident with the constant0 function in the case of \(Z\) or being defined on all inputs in thecase of \(\TOT\). An index set \(I\) is callednon-trivial if\(I \neq \emptyset\) or \(I \neq \mathbb{N}\)—i.e., it fails toeither include or exclude all indices. It is easy to see that all ofthe sets defined in (\ref{undecexs}) are non-trivial index sets. Theundecidability of these sets thus follows from the following moregeneral result:
Theorem 3.4 (Rice 1953): If \(I\) is a non-trivialindex set, then \(I\) is undecidable.
Proof: Let \(I\) be a non-trivial index set and suppose for acontradiction that \(\chi_I(x)\) is computable. Consider theeverywhere undefined unary function \(u(x)\)—i.e., \(u(n)\uarrow\) for all \(n \in \mathbb{N}\). Since \(u(x)\) is partialcomputable, there is an index \(b\) such that \(\phi_b(x) \simequ(x)\). We may suppose without loss of generality that \(b \not\inI\). (If it is the case that \(b \in I \neq \mathbb{N}\), then we canswitch the role of \(I\) with its complement \(\overline{I}\) in thefollowing argument and obtain the same result). Since \(I \neq\emptyset\), we can also choose an index \(a \in I\) and define afunction as follows:
\[\begin{aligned} f(x,y) = \begin{cases} \phi_a(y) & \text{if } k(x) = 1 \ \ \ \text{(i.e., if $\phi_x(x) \darrow$)} \\\uarrow & \text{if } k(x) = 0 \ \ \ \text{(i.e., if $\phi_x(x) \uarrow$)} \end{cases} \nonumber\end{aligned}\]Note that \(f(x,y)\) is partial computable since it is defined bycases in terms of \(\phi_a(x)\) based on the value of \(\phi_x(x)\).There is thus an index \(c\) such that \(f(x,y) \simeq \phi_c(x,y)\).By applying thes-m-n Theorem (3.1), we thus have that \(\phi_c(x,y) \simeq \phi_{s^2_1(c,x)}(y)\). Butnote that we now have the following sequences of implications:
(by our choice of \(a \in I\))
\[\begin{align*}k(x) = 0 & \Leftrightarrow f(x,y) \simeq \phi_b(y) \\& \Leftrightarrow \phi_{s^2_1(c,x)}(y) \simeq \phi_b(y) \\& \Leftrightarrow s^2_1(c,x) \not\in I\end{align*}\](by our assumptions that \(b\) is an index for \(u(x)\)—theeverywhere undefined function—and that \(b \not\in I\)).
It hence follows that the value of \(k(x)\) may be computed byapplying the following algorithm:
Either by invoking Church’s Thesis or by formalizing the prioralgorithm as a partial recursive definition, it follows that \(k(x)\)is computable. But this contradictsProposition 3.1 which shows that \(k(x)\) is not computable. □
Rice’s Theorem (3.4) provides a means of showing that many decision problems of practicalimport are undecidable—e.g., of determining whether a programalways returns an output or whether it correctly computes a givenfunction (e.g., addition or multiplication). Its proof also shows thatif \(I\) is a non-trivial index set, the problem of deciding \(x \inK\) can be “reduced” to that of deciding \(x \in I\) inthe following sense:if we could effectively decide thelatter,then we could also effectively decide the former byfirst calculating \(s^2_1(c,x)\) and then checking if this value is in\(I\). This method of showing undecidability will be formalized by thenotion of amany-one reduction described inSection 3.5 below.
A set \(A \subseteq \mathbb{N}\) is said to becomputable (orrecursive according to the older terminology ofSection 2) just in case its characteristic function is. More generally we havethe following:
Definition 3.1: A relation \(R \subseteq\mathbb{N}^k\) iscomputable just in case \(\chi_R(\vec{x})\)is computable.
This definition extends the definition of a primitive recursiverelation given inSection 2.1—e.g., since sets likePRIMES andDIV are primitiverecursive they areipso facto computable. Via Church’sThesis, the notion of a computable set thus also generalizes theaccompanying heuristic about effective decidability—i.e., \(R\)is computable just in case there is an algorithm for deciding if\(R(\vec{n})\) holds which always returns an answer after a finite(although potentially unbounded) number of steps. On the other hand,it follows from the observations recorded inSection 3.2 that none ofHP,K,Z,TOT, orFIN are computable sets.
A related definition is that of acomputably enumerable (orc.e.)set—i.e., one whose members can beenumerated by an effective procedure. (In the older terminology ofSection 2 such a set is said to berecursively enumerable which istraditionally abbreviatedr.e.) Officially we have thefollowing:
Definition 3.2: \(A \subseteq \mathbb{N}\) iscomputably enumerable (or c.e.) if \(A = \emptyset\) or \(A\)is the range of a computable function—i.e.,
\[A = \{m : \phi_e(n)\darrow = m \text{ for some } n \in \mathbb{N}\}\]for some index \(e\) of a total computable function.
This definition can be extended to relations by viewing \(m\) as acode for a finite sequence in the obvious way—i.e., \(R\subseteq \mathbb{N}^k\) is c.e. just in case there is acomputable function \(\phi_e(x)\) such that \(R(n_0, \ldots, n_k)\) ifand only if \(\phi_e(n) = \langle n_0, \ldots, n_k \rangle\) for some\(n \in \mathbb{N}\).
If \(A\) is computably enumerable, its members may thus be listed offas
\[A = \{\phi_e(0), \phi_e(1), \phi_e(2), \ldots \}\]possibly with repetitions—e.g., the constant function\(\const_{17}(x)\) enumerates the singleton set \(\{17\}\), which isthereby c.e. It is easy to see that a computable set \(A\) iscomputably enumerable. For if \(A = \emptyset\), then \(A\) isc.e. by definition. And if \(A \neq \emptyset\), we may choose\(a \in A\) and then define
\[\begin{align} \label{cefromc}f(x) = \begin{cases} x & \text{if } \chi_A(x) = 1 \\ a & \text{otherwise} \end{cases}\end{align}\]In this case \(f(x)\) is computable and has \(A\) as its range.
In proving facts about computably enumerable sets, it is oftenconvenient to employ one of several equivalent definitions:
Proposition 3.2: Suppose \(A \subseteq \mathbb{N}\).Then the following are equivalent:
\(A\) is computably enumerable.
\(A = \emptyset\) or \(A\) is the range of a primitive recursivefunction.
\(A = \{n \in \mathbb{N}: \exists y R(n,y)\}\) for a computablerelation \(R\).
\(A\) is the domain of a partial computable function.
The proof ofProposition 3.2 is largely a matter of unpacking definitions. For instance, to seethat iv implies i, suppose that \(A =\textrm{dom}(\phi_e)\)—i.e., \(A = \{n \in \mathbb{N} :\phi_e(n) \darrow\}\). If \(A = \emptyset\) it is automatically c.e.Otherwise, there is an element \(a \in A\). We may now define
\[\begin{aligned}f(x) = \begin{cases} (x)_0 & \text{if } T_1(e,(x)_0,(x)_1) \\ a & \text{otherwise} \end{cases}\end{aligned}\]\(f(x)\) thus treats its input as a pair \(\langle n,s \rangle\)consisting of an input \(n\) to \(\phi_e(x)\) and a computationsequence \(s\) as defined in the proof of theNormal Form Theorem (2.3). As \(x\) varies over \(\mathbb{N}\), it thus steps through allpossible inputs \((x)_0\) to \(\phi_e\) and also all possiblewitnesses \((x)_1\) to the fact that the computation of \(\phi_e\) on\((x)_0\) halts. It then returns \((x)_0\) if \((x)_1\) is such awitness to a halting computation and \(a\) otherwise. Thus the rangeof \(f(x)\) will correspond to that of \(\phi_e(x)\). And as\(T_1(e,x,s)\) is computable (and in fact primitive recursive)relation, it is easy to see that \(f(x)\) is a computable functionwith range \(A\). This shows that \(A\) is c.e. as desired.
Part iv ofProposition 3.2 also provides a convenient uniform notation for computably enumerablesets—i.e., if \(A = \textrm{dom}(\phi_e)\) we denote \(A\) by\(W_e = \{n : \phi_e(n) \darrow\}\). The sequence \(W_0,W_1, W_2,\ldots\) thus provides a uniform enumeration of c.e. setsrelative to our prior enumeration of unary partial computablefunctions. This notation also aids the formulation of thefollowing:
Proposition 3.3:
The computably enumerable sets are effectively closed under union,intersection, and cross product—i.e., there are computablefunctions \(\textit{un}(x,y),\) \(\textit{int}(x,y)\) and\(\textit{cr}(x,y)\) such that if \(A = W_i\) and \(B = W_j\) then
\[A \cup B = W_{\textit{un}(i,j)}, A \cap B = W_{\textit{int}(i,j)}\]and
\[\{\langle x,y \rangle : x \in A \ \& \ y \in B\} = W_{\textit{cr}(i,j)}.\]The computable sets are additionally closed under complementation andrelative complementation—i.e., if \(A\) and \(B\) are recursive,then so are \(\overline{A}\) and \(A - B\).
The proofs of these facts are also straightforward upon appeal toChurch’s Thesis. For instance, if \(\textrm{dom}(\phi_i) = A\)and \(\textrm{dom}(\phi_j) = B\) then \(\textit{un}(i,j)\) can betaken to be an index for a program which simulates the computation of\(\phi_i(n)\) and \(\phi_j(n)\) in alternate stages and halts just incase one of these subcomputations halt. Note also that if \(A = W_i\)is computable, then \(\chi_{\overline{A}}(x) = 1 \dotminus \chi_A(x)\)is also computable, from which it follows that \(\overline{A}\) is computable.[23]
A related observation is the following:
Proposition 3.4 (Post 1944): \(A\) is computable ifand only if \(A\) and \(\overline{A}\) are both computablyenumerable.
The left-to-right direction is subsumed underProposition 3.3. For the right-to-left direction, suppose that \(A =\textrm{dom}(\phi_i)\) and \(\overline{A} = \textrm{dom}(\phi_j)\).Then to decide \(n \in A\) we can perform an unbounded search for acomputation sequence \(s\) such that either \(T_1(i,n,s)\) or\(T_1(j,n,s)\), accepting in the first case and rejecting in thesecond. Since \(A \cup \overline{A} = \mathbb{N}\), the search mustalways terminate and since \(A \cap \overline{A} = \emptyset\), theconditions are exclusive. Thus by again appealing to Church’sThesis, \(A\) is computable.
We have seen that the computable sets are contained in the computablyenumerable sets. Two questions which arise at this stage are asfollows:
A positive answer to both is provided by the following:
Corollary 3.1: Recall the set \(K = \{i : \phi_i(i)\darrow\}\)—i.e., the so calledDiagonal HaltingProblem. \(K\) is computably enumerable but not computable while\(\overline{K}\) is not computably enumerable.
\(K\) is clearly c.e. as it is the domain of \(\mu sT_1(x,x,s)\). On the other hand, we have seen that the characteristicfunction of \(K\)—i.e., the function \(\chi_K(x) = k(x)\) asdefined inSection 3.2—is not computable. Thus \(K\) is indeed a computably enumerable setwhich is not computable. To see that \(\overline{K}\) is not c.e.,observe that if it were, then \(K\) would be computable byProposition 3.4. This in turn suggests a sense in which it is “harder” todecide membership in \(K\) than in any computable set. The hierarchiesintroduced inSections 3.5 andSection 3.6 will provide a means of making such observations precise.
The result which is now most often referred to asKleene’sRecursion Theorem can be used to unify a number of effectivediagonal arguments similar to that underlyingTheorem 3.3 and has a wide range of applications both in computability theory andother areas of mathematical logic and computer science.[24] Although its statement is straightforward, both its significance andthe following proof become clearer upon considering subsequentapplications.
Theorem 3.5 (Kleene 1938): Suppose that \(f(x)\) is atotal computable function. Then there is a number \(n \in \mathbb{N}\)such that \(\phi_n(y) \simeq \phi_{f(n)}(y)\).
Proof: Consider the function \(g(x,y)\) defined as follows:
\[\begin{aligned}g(x,y) = \begin{cases} \phi_{\phi_x(x)}(y) & \text{if } \phi_x(x) \darrow \\ \uarrow & \text{otherwise} \end{cases} \end{aligned}\]As it is evident that \(g(x,y)\) is partial computable, \(g(x,y)\simeq \phi_e(x,y)\) for some \(e\). It thus follows by thes-m-n Theorem (3.1) that \(\phi_e(x,y) \simeq \phi_{s^2_1(e,x)}(y)\). Let \(b(x) =s^2_1(e,x)\) and note that we then have \(\phi_{b(x)}(y)\) is the samefunction as \(\phi_{\phi_x(x)}(y)\) provided that \(\phi_x(x)\) isdefined. Note that \(b(x)\) is a total computable function and isdefined independently of the given function \(f(x)\).
Next let \(k\) be an index for the composition of \(f(x)\) with\(b(x)\)—i.e., \(\phi_k(x) \simeq f(b(x))\). We now claim that\(n = b(k)\) is the number called for in the statement of the theorem.For first note that since \(b(x)\) and \(f(x)\) are both total,\(\phi_k(x)\) is also total and thus \(\phi_k(k)\) is defined. Fromthis it follows that \(\phi_{b(k)}(y) \simeq \phi_{\phi_k(k)}(y)\). Wenow have the following sequence of functional identities:
\[\phi_n(y) \simeq \phi_{b(k)}(y) \simeq \phi_{\phi_k(k)}(y) \simeq \phi_{f(b(k))}(y) \simeq \phi_{f(n)}(y)\]□
The Recursion Theorem is sometimes also referred to as theFixedPoint Theorem. Note, however, thatTheorem 3.5 does not guarantee the existence of an extensional fixed point forthe given function \(f(x)\)—i.e., a number \(n\) such that\(f(n) = n\). (In fact it is evident that there are computablefunctions for which no such value exists—e.g.,\(f(x)= x+1\).) ?But suppose we view \(f(x)\)instead as a mapping on indices to partial computable functions or,more figuratively, as a means of transforming aprogram forcomputing a partial computable function into another program. On thisinterpretation, the theorem expresses that for every such computabletransformation there is some program \(n\) such that the function\(\phi_n(y)\) which it computes is the same as the function\(\phi_{f(n)}(y)\) computed by its image \(f(n)\) under thetransformation.
As it may at first appear such an \(n\) is defined in a circularmanner, it is alsoprima facie unclear why such a programmust exist. Indeed Soare (2016: 28–29) remarks that theforegoing proof of the Recursion Theorem is “very short butmysterious” and is “best visualized as a diagonal argumentthat fails”. In order to clarify both this comment and theproof, consider the matrix depicted in Figure 1 whose rows \(R_i\)enumerate not the values of partial computable functions but ratherthe functions themselves—i.e., the row \(R_i\) will contain thefunctions \(\phi_{\phi_i(0)}, \phi_{\phi_i(1)}, \ldots\) with theunderstanding that if \(\phi_i(j) \uarrow\), then \(\phi_{\phi_i(j)}\)denotes the totally undefined function. (Such a depiction isoriginally due to Owings 1973.)
Figure 1: The matrix of partialcomputable functions employed in the proof of the Recursion Theorem (3.5)
We may think of the function \(f(x)\) given inTheorem 3.5 as inducing a transformation on the rows so that \(R_i\) is mapped to\(R_{f(i)}\). To this end, let \(h_f(x)\) be an index to the totalcomputable function which composes \(f\) with \(\phi_x\) so that wehave
\[\begin{aligned}\phi_{h_f(x)}(y) \simeq f(\phi_x(y))\end{aligned}\]Next consider the diagonal of this matrix—i.e., \(D =\phi_{\phi_0(0)}, \phi_{\phi_1(1)}, \ldots\) Since the indices to thefunctions which comprise \(D\) are given effectively, it must be thecase that \(D\) itself corresponds to some row \(R_d\)—i.e.,
\[\begin{align} \label{dr} \phi_{\phi_d(i)}(y) \simeq \phi_{\phi_i(i)}(y) \text{ for all } i \in \mathbb{N}\end{align}\]But now consider the image of \(R_d\) under \(f\)—i.e., the row\(R_{h_f(d)} = \phi_{\phi_{h_f(d)}(0)}, \phi_{\phi_{h_f(d)}(1)},\ldots\) It follows from (\ref{dr}) that we must have
\[\begin{equation} \label{lastrecthm1}\phi_{\phi_d(h_f(d))}(y) \simeq \phi_{\phi_{h_f(d)}(h_f(d))}(y)\end{equation}\]But note that by the definition of \(h_f\), \(\phi_{h_f(d)}(h_f(d)) =f(\phi_d(h_f(d))\) and thus also from (\ref{lastrecthm1})
\[\begin{equation} \label{lastrecthm2}\phi_{\phi_d(h_f(d))}(y) \simeq \phi_{f(\phi_d(h_f(d))}(y)\end{equation}\]But now note that since \(f,\phi_d\) and \(h_f\) are all total, thevalue \(\phi_d(h_f(d))\) is defined. Thus setting \(n =\phi_d(h_f(d))\) it follows from (\ref{lastrecthm2}) that \(\phi_n(y)\simeq \phi_{f(n)}(y)\) as desired.
As mentioned above, the Recursion Theorem may often be used to presentcompact proofs of results which would traditionally be described asinvolvingself-reference. For instance, an immediateconsequence is that for every \(f(x)\) there is an \(n\) such that\(W_n = W_{f(n)}\). To see this note thatTheorem 3.5 entails the existence of such an \(n\) such that \(\phi_n(x) \simeq\phi_{f(n)}\) for every computable \(f(x)\). But since the domains ofthe functions must then coincide, it follows that \(W_n =W_{f(n)}\).
It is useful to record the following alternative form of the RecursionTheorem:
Corollary 3.2: For every partial computable function\(f(x,y)\), there is an index \(n\) such that \(\phi_n(y) \simeqf(n,y)\).
Proof: By thes-m-n Theorem (3.1), \(f(x,y) \simeq \phi_{s^2_1(i,x)}(y)\) for some \(i\). But then theexistence of the required \(n\) follows by applyingTheorem 3.5 to \(s^2_1(i,x)\). □
Here are some easy consequences in the spirit described above whichmake use of this formulation:
There is a number \(n\) such that \(\phi_n(x) = x + n\). (This followsby taking \(f(x,y) = y + x\) inCorollary 3.2. Analogous observations yield the existence of \(n\) such that\(\phi_n(x) = x \times n, \phi_n(x) = x^n\), etc.)
There is a number \(n\) such that \(W_n = \{n\}\). (Take
\[f(x,y) = \begin{cases}0 & \text{if } x = y \\ \uarrow & \text{otherwise} \end{cases}\]inCorollary 3.2.)
Consider a term \(\tau\) corresponding to a “program”which determines the partial computable program with index \(\ulcorner\tau \urcorner\) (as described inSection 2.2.2). We say that such a program isself-reproducing if for allinputs \(x\), the computation of \(\tau\) on \(x\) outputs \(\ulcorner\tau \urcorner\). Since in order to construct \(\tau\) it would seemthat we need to know \(\ulcorner \tau \urcorner\) in advance, it mightappear that self-reproducing programs need not exist. Note, however,that transposed back into our official terminology, the existence ofsuch a program is equivalent to the existence of a number \(n\) suchthat \(\phi_n(x) = n\). And this is guaranteed by applyingCorollary 3.2 to the function \(f(x,y) = x\).
For further discussions of the Recursion Theorem in regard toself-reference and more advanced applications in computability theorysee, e.g., Cutland (1980: ch. 11), Rogers (1987: ch. 11), Odifreddi(1989: ch. II.2), and Y. Moschovakis (2010).
Before leaving the Recursion Theorem, it will finally be useful toreflect on how it bears on the general concept of recursivedefinability as discussed inSections 1 and 2. Consider, for instance, a simple definition such as
\[\begin{align} \label{recex}h(0) & = k \\ \nonumberh(y+1) & = g(h(y))\end{align}\]In the case that \(g(y)\) is primitive recursive, we have remarkedthat it is possible to show that there exists a unique function\(h(y)\) satisfying (\ref{recex}) by an external set-theoreticargument. But we may also consider the case in which \(g(y)\) isassumed to be computable relative to a model of computation\(\mathbf{M}\) which differs from the primitive recursive functions inthat it does not natively support recursion as a mode ofcomputation—e.g., the Turing Machine model \(\mathbf{T}\) orUnlimited Register Machine model \(\mathbf{U}\). If we simply set down(\ref{recex}) as a definition in this case, we would have noapriori assurance that \(h(y)\) is computable relative to\(\mathbf{M}\) even if \(g(x)\) is.
Upon examination, however, it is clear that the only features of amodel of computation on which the proof ofTheorem 3.5 relies are the existence of an indexation for which a version of thes-m-n Theorem (3.1) is available. If \(\mathbf{M}\) satisfies these conditions, the claimthat \(h(y)\) is computable relative to \(\mathbf{M}\) is equivalentto \(h(y) \simeq \phi_n(y)\) where \(n\) is an index drawn from somesuitable indexation of the \(\mathbf{M}\)-computable functions. Butsince thes-m-n Theorem for \(\mathbf{M}\) allows us to treatan index as a variable, we can also consider the function defined by
\[\begin{aligned}f(x,0) & = k \\ \nonumberf(x,y+1) & = g(\phi_x(y))\end{aligned}\]Now note that the existence of an \(n\) such that \(f(n,y) \simeq\phi_n(y)\) is again guaranteed byCorollary 3.2. This in turn yields
\[\begin{aligned}\phi_n(0) & = k \\ \nonumber\phi_n(y+1) & = g(\phi_n(y))\end{aligned}\]This illustrates how the existence of a computable function satisfyinga recursive definition such as (\ref{recex}) follows from theRecursion Theorem even if we have not started out by characterizing a“computable function” as one defined“recursively” in the informal sense discussed inSection 1. And this in turn helps to explain whyTheorem 3.5 has come to be known as theRecursion Theorem.
A central topic in contemporary computability theory is the study ofrelative computability—i.e., if weassume thatwe are able to decide membership in a given set or compute a givenfunction, which other sets or functions would we be able to decide orcompute? This question may be studied using the notion of areduction of one set \(A\) to another \(B\) which wasintroduced informally by Kolmogorov (1932) as a means of transforminga “solution” of \(A\) into a “solution” of \(B\).[25] Turing (1939) provided the first formal definition of a computationalreduction in his study of ordinal logics. However, it was Post whofirst proposed to systematically study reducibility notions and theirassociated degree structures in his highly influential paper“Recursively enumerable sets of positive integers and theirdecision problems” (1944).
Therein Post explains the basic idea of a reduction and itssignificance as follows:
Related to the question of solvability or unsolvability of problems isthat of the reducibility or non-reducibility of one problem toanother. Thus, if problem \(P_1\) has been reduced to problem \(P_2\),a solution of \(P_2\) immediately yields a solution of \(P_1\), whileif \(P_1\) is proved to be unsolvable, \(P_2\) must also beunsolvable. For unsolvable problems the concept of reducibility leadsto the concept of degree of unsolvability, two unsolvable problemsbeing of the samedegree of unsolvability if each isreducible to the other, one of lower degree of unsolvability thananother if it is reducible to the other, but that other is notreducible to it, of incomparable degrees of unsolvability if neitheris reducible to the other. A primary problem in the theory ofrecursively enumerable sets is the problem of determining the degreesof unsolvability of the unsolvable decision problems thereof. We shallearly see that for such problems there is certainly a highest degreeof unsolvability. Our whole development largely centers on the singlequestion of whether there is, among these problems, a lower degree ofunsolvability than that, or whether they are all of the same degree ofunsolvability. (Post 1944: 289)
In order to appreciate this passage, it is again useful to think of aset \(A \subseteq \mathbb{N}\) as being associated with theproblem of deciding membership in \(A\)—e.g., given anatural number \(n\), is \(n\) prime? (i.e., \(n \in\textit{PRIMES}\)?) or is the \(n\)th partial computable function withinput \(n\) defined? (i.e., \(n \in K\)?). But even given thiscorrespondence, the assertion that a solution to a problem \(B\)“immediately yields” a solution to \(A\) may still beanalyzed in a number of different ways. Two of the most importantpossibilities are as follows:
Assuming that there is an algorithm for deciding questions of the form\(n \in B\), then it is possible to specify an algorithm for decidingquestions of the form \(n \in A\).
Assuming that we had access to an “oracle”capable of answering arbitrary questions of the form \(n \in B\) in asingle step, then it is possible to specify an algorithm employing theoracle for deciding \(n \in A\).
The formalization of these relations between problems leads to thenotions ofmany-one reducibility andTuringreducibility which provide distinct but complementary analyses ofthe notions \(A\)is no harder to solve than \(B\) and alsothe degree of unsolvability (ordifficulty)of\(A\) is equal to that of \(B\).[26] The latter notion came first historically and was introduced byTuring (1939) and in an equivalent form by Kleene (1943). However itwas Post (1944) who both introduced the former notion and alsoinitiated the general study of Turing reducibility. In fact the finalsentence of the passage quoted above describes an important technicalquestion about the Turing degrees which would shape the earlydevelopment of computability theory (i.e., “Post’sproblem” given asQuestion 3.1 below).
We have already seen an example of many-one reducibility in the proofofRice’s Theorem (3.4). In particular, the proof showed that the problem of decidingmembership in \(K\) can be reduced to that of deciding membership inany non-trivial index set \(I\) in the following sense: for all \(n\),if \(n \in K\) then \(s^2_1(c,n) \in I\). Thus if there were analgorithm for deciding membership in \(I\), we would be able to decidewhether \(n \in K\) by using it to test whether \(s^2_1(c,n) \in I\).The function \(s^2_1(c,x)\) (whose computability is given by thes-m-n Theorem) is thus a so-calledmany-one reductionof \(K\) to \(I\).
The formal definition generalizes this example as follows:
Definition 3.3: Given sets \(A, B \subseteq\mathbb{N}\), \(A\) is said to bemany-one (orm-one)reducible to \(B\) if there is a computable function \(f(x)\)such that for all \(n \in \mathbb{N}\),
\[n \in A \text{ if and only if } f(n) \in B\]In this case we write \(A \leq_m B\).
Using this notation, the foregoing example thus shows that \(K \leq_mI\). These observations can be generalized as follows:
Proposition 3.5: Suppose that \(A \leq_m B\).
If \(B\) is computable, then so is \(A\).
If \(B\) is computably enumerable, then so is \(A\).
By contraposingProposition 3.5 it thus follows that in order to show that a set \(B\) isnon-computable (or non-c.e.) it suffices to show that there is a knownnon-computable (or non-c.e.) \(A\) such that \(A\) is many-onereducible to \(B\). For instance suppose that we had first proven thatthe Diagonal Halting Problem \(K = \{i : \phi_i(i) \darrow\} = A\) isnon-computable. Then in order to show that the Halting Problem \(\HP =\{\langle i,n \rangle : \phi_i(n) \darrow\} = B\) is alsonon-computable, it suffices to note that \(f(x) = \langle x,x\rangle\)—i.e., the computable pairing function of \(x\) withitself—is a many-one reduction showing \(K \leq_m \HP\).
Reducibility notions also typically come with an associated notion ofwhat it means for a designated set to becomplete relative toa class of sets—i.e., a set to which every set in the class maybe reduced and which is itself a member of the class. As an initialexample we have the following:
Definition 3.4: A set \(B\) is said to bemany-one (orm-)complete for the computablyenumerable sets just in case the following conditions hold:
\(B\) is computable enumerable;
For all computably enumerable sets \(A\), \(A \leq_m B\).
An obvious example of a complete c.e. set is \(\HP\). For since \(\HP= \{\langle i,n \rangle : \exists s T_1(i,n,s)\}\) and \(T_1(x,y,z)\)is a computable relation, it follows fromProposition 3.2 that \(\HP\) is c.e. And on the other hand, if \(A = W_i\), then \(n\in A\) if and only if \(\langle i,n \rangle \in \HP\) thus showingthat \(W_i \leq_m \HP\).
It is, nonetheless, standard to take \(K\) rather than \(\HP\) as thecanonical complete c.e. Although it might at first seem that \(K\)contains “less computational information” than \(\HP\), itis not hard to see that \(K\) is also such that every c.e. set ism-reducible to it. For supposing that \(A = W_i\), we maydefine a function
\[\begin{aligned} \label{redK}f(x,y) = \begin{cases} 1 & \text{ if } \phi_i(x) \darrow \text{ (i.e., $x \in A$)} \\\uarrow & \text{otherwise}\end{cases}\end{aligned}\]As \(f(x,y)\) is clearly partial computable, thes-m-n Theorem (3.1) gives a total recursive function \(s^2_1(i,x)\) such that \(f(x,y)\simeq \phi_{s^2_1(i,x)}(y)\). We then have
\[n \in A \ \Leftrightarrow \ \phi_i(n) \darrow \ \Leftrightarrow \ \phi_{s^2_1(i,n)}(s^2_1(i,n)) \darrow \ \Leftrightarrow \ s^2_1(i,n) \in K\]These biconditionals hold because \(\phi_i(n) \darrow\) just in case\(\phi_{s^2_1(i,n)}(y)\) is \(\const_1(x)\) (i.e., the constant1-function) as opposed to the everywhere undefined function just incase \(\phi_{s^2_1(i,n)}(s^2_1(i,n)) \darrow\). But as the latercondition is equivalent to \(s^2_1(i,n) \in K\), \(s^2_1(i,x)\) is amany-one reduction showing \(A \leq_m K\).
This illustrates a sense in which deciding membership in \(K\) canalso be understood as universal for computably enumerable sets or,alternatively, that there is no c.e. set which is any“harder” to solve than \(K\). Nonetheless, there areproblems that are harder to solve than \(K\) in the sense that theycould not be solved even if we possessed a decision algorithm for\(K\). For instance, it will follow from results given below thatwhile \(K\) ism-reducible to \(\TOT\), \(\TOT\) is notm-reducible to \(K\). This illustrates howm-reducibility can be used to study therelativedifficulty of solving computational problems.
These considerations lead naturally to the notion of adegree ofdifficulty—another concept which can be made precise withrespect to different reducibility notions. The version for many-onereducibility is given by the following definition:
Definition 3.5: If \(A\) and \(B\) are many-onereducible to each other—i.e., \(A \leq_m B\) and \(B \leq_mA\)—then we say that \(A\) and \(B\) aremany-oneequivalent and we write \(A \equiv_m B\).
It follows immediately fromDefinition 3.3 that \(\leq_m\) is reflexive. It is also clearly transitive. (For if\(f(x)\) and \(g(x)\) are computable functions which respectivelyserve as many-one reductions showing \(A \leq_m B\) and \(B \leq_mC\), then their composition \(f(g(x))\) is a many-one reductionshowing \(A \leq_m C\).) As it thus follows that \(\equiv_m\) is anequivalence relation, it also makes sense to define the following:
Definition 3.6: \(\textrm{deg}_m(A)\)—themany-one (orm-)degree of \(A\)—is theequivalence class of \(A\) with respect to \(\equiv_m\)—i.e.,\(\textrm{deg}_m(A) = \{B \subseteq \mathbb{N} : B \equiv_m A\}\). Wecall anm-degreecomputable if it contains acomputable set andc.e. if it contains a computablyenumerable set.
Them-degree \(\textrm{deg}(A)\) of \(A\) collects together allsets which are many-one equivalent to it. It can thus be thought of asan abstract representation of the relative difficulty of decidingmembership in \(A\) when this latter notion is in turn explicated interms ofm-reducibility. For instance, since our priorobservations show that \(\textrm{deg}_m(\HP) = \textrm{deg}_m(K)\),they are thus “equally difficult” c.e. degrees.
It is traditional to use boldface lower case Roman letters\(\mathbf{a},\mathbf{b}, \ldots\) to denote degrees (although itshould be kept in mind that these aresets of sets of naturalnumbers). We next define an ordering onm-degrees asfollows:
Definition 3.7: Let \(\mathbf{a}\) and \(\mathbf{b}\)bem-degrees. We then define
\(\mathbf{a} \leq_m \mathbf{b}\) just in case there is a set \(A \in\mathbf{a}\) and a set \(B \in \mathbf{b}\) such that \(A \leq_mB\).
\(\mathbf{a} <_m \mathbf{b}\) just in case \(\mathbf{a} \leq_m\mathbf{b}\) and \(\mathbf{a} \neq \mathbf{b}\).
It is easy to see that \(<_m\) is a partial order onm-degrees—i.e., irreflexive, antisymmetric, andtransitive. We accordingly introduce the structure \(\mathcal{D}_m =\langle \{\textrm{deg}_m(A) : A \subseteq \mathbb{N}\},<_m\rangle\) which is traditionally known as themany-one(orm-)degrees.
Together with the similar study of the Turing degrees (which will bedefined inSection 3.5.2), investigating the structure of \(\mathcal{D}_m\) has been a majorfocus of research in computability theory since the time ofPost’s (1944) introduction of the many-one degrees. Someproperties of this structure are as follows:
Proposition 3.6:
Them-degrees are not closed under complementation—i.e.,there exist sets \(A\) such that \(A \not\equiv_m \overline{A}\) andthus \(\overline{A} \not\in \textrm{deg}(A)\).
\(\mathbf{0} =_{\textrm{df}} \textrm{deg}_m(\emptyset) =\{\emptyset\}\) and \(\mathbf{n} =_{\textrm{df}}\textrm{deg}_m(\mathbb{N}) = \{\mathbb{N}\}\) are distinctm-degrees both of which are (trivially) computable.
There is exactly one computablem-degree \(\mathbf{0}_m\) otherthan \(\mathbf{0}\) and \(\mathbf{n}\)—i.e., \(\mathbf{0}_m =\textrm{deg}(A)\) for any computable set \(A \neq \emptyset, A\neq\mathbb{N}\). Additionally, \(\mathbf{0}_m\) is minimal in\(\mathcal{D}_m\) in the sense that \(\mathbf{0}_m \leq_m \mathbf{a}\)for all degrees \(\mathbf{a}\) other than \(\mathbf{0}\) and\(\mathbf{n}\).
If \(\mathbf{b}\) is a c.e. degree and \(\mathbf{a} \leq_m\mathbf{b}\), then \(\mathbf{a}\) is also a c.e. degree.
There is amaximum c.e. m-degree—i.e.,\(\textrm{deg}_m(K) =_{\textrm{df}} \mathbf{0}'_m\)—in the sensethat \(\mathbf{a} \leq \mathbf{0}'_m\) for all c.e. degrees\(\mathbf{a}\).
Any pair ofm-degrees \(\mathbf{a},\mathbf{b}\) have aleast upper bound \(\mathbf{c}\)—i.e., \(\mathbf{a}\leq_m \mathbf{c}\) and \(\mathbf{b} \leq_m \mathbf{c}\) and\(\mathbf{c}\) is \(\leq_m\)-less than any other upper bound of\(\mathbf{a}\) and \(\mathbf{b}\). Since we have seen that \(\leq_m\)is also a partial order, this implies that \(\mathcal{D}_m\) isadditionally anupper semi-lattice.
There exists a c.e. degree \(\mathbf{a}\) properly between\(\mathbf{0}_m\) and \(\mathbf{a} < \mathbf{0}'_m\)—i.e.,\(\mathbf{0}_m < \mathbf{a} < \mathbf{0}'_m\).
Post (1944) demonstrated part vii by showing that there existso-calledsimple sets—i.e., sets \(A\) which arec.e. and such that \(\overline{A}\) is infinite but does notcontain an infinite c.e. subset. It is easy to see that a simpleset cannot be computable. But on the other hand, Post also showed thata simple set cannot bem-complete. And it thus follows that if\(A\) is simple \(\mathbf{a} =_{\textrm{df}} \textrm{deg}_m(A) \neq\mathbf{0}_m\) but \(A \not\equiv_m K\) and thus \(\mathbf{a} <\mathbf{0}'_m\). Suppose we now understand “degrees ofunsolvability” in the passage quoted at the beginning of thissection as corresponding to the c.e. m-degrees. It thusfollows from part v ofProposition 3.6 that \(\mathbf{0}'_m\) is indeed a “highest” such degreeand also from part vii that there is a lower but still“unsolvable” (i.e., non-computable) degree.
Although the other parts ofProposition 3.6 have straightforward proofs, they provide some insight into the factthat \(\mathcal{D}_m\) is itself a highly complex structure (see,e.g., Odifreddi 1999b: 1). Nonetheless the first two parts of thistheorem are often taken to illustrate awkward features of the many-onedegrees as an abstract representation of computationaldifficulty—i.e., the exceptional behavior of\(\textrm{deg}_m(\emptyset)\) and \(\textrm{deg}_m(\mathbb{N})\) andthe fact a set and its complement may inhabit different degrees (as iseasy to see is exemplified by \(K\) and \(\overline{K}\)). It ispartly in light of these features that the Turing degrees\(\mathcal{D}_T\) are the structure which are now most widely studiedin computability theory. But as Post also alludes, it is relative to\(\mathcal{D}_T\) for which he was originally unable to demonstratethe existence of a c.e. set of an intermediate degree ofunsolvability.
The notion ofrelative computability mentioned at thebeginning of this section is now standardly analyzed in terms ofcomputability in a set \(A \subseteq \mathbb{N}\).Informally, we say that a function \(f(\vec{x})\) is computable in\(A\) just in case there exists an algorithm which is effective in thetraditional sense with the exception of the fact its computation mayrely on computing one or more values \(\chi_A(y)\). These values arein turn assumed to be available to the algorithm in a single step eventhough \(\chi_A(y)\) may not itself be computable—e.g., if \(A =K\).
This notion was originally introduced by Turing (1939) who describedwhat he referred to as anoracle (oro-)machine variant of the standard Turing Machine model\(\mathbf{T}\). An o-machine is otherwise like a normal Turing machinebut also possesses a read-onlyoracle tape (and correspondingread-only head) on which the characteristic function of a set \(A\) isassumed to be written at the beginning of its computation. Thetransitions of an o-machine are determined by its internal statetogether with the currently scanned symbols on both its read-writetape and the oracle tape, thus formalizing the idea that the machinemay “consult the oracle” about the characteristic functionof \(A\) one or more times during the course of its computation.[27]
Kleene (1943) described an analogous idea for thegeneralrecursive functions as follows:
A function \(\phi\) which can be defined from given functions\(\psi_1, \ldots, \psi_k\) by a series of applications of generalrecursive schemata we callgeneral recursive in the givenfunctions; and in particular, a function definableab initioby these means we callgeneral recursive. (Kleene 1943:44)
The former part of this characterization differs from the definitionof general recursiveness given inSection 1.5 in allowing that in addition to the initial functions \(\mathbf{0}\)and \(s(x)\), the functions \(\psi_1, \ldots, \psi_k\) can also enterinto systems of equations which define the function \(\phi\). Thiscorresponds to the assumption that the values of \(\psi_1, \ldots,\psi_k\) are available in the course of a computation without the needfor further calculation.
It is also possible to modify the definition of thepartialrecursive functions given inSection 2.2.1 to allow such relativization to an additional class of initialfunctions. Since relativization to a finite set of functions can beaccomplished by successive relativization to a single function and thegraph of a function can also be coded into a set, this is nowstandardly achieved as follows:
Definition 3.8: Given a set \(A \subseteq\mathbb{N}\), we define the class ofA-partial recursivefunctions \(\textbf{PartREC}^A\) to be the smallest class ofpartial functions containing the initial functions \(I_A =\{\mathbf{0},s,\pi^i_k,\chi_A(x)\}\) and closed under thefunctionals
\[\textit{Op}_{\textbf{PartREC}} = \{\mathcal{Comp}^i_j,\mathcal{PrimRec}_k,\mathcal{Min}_k\}.\]There are, of course, uncountably many subsets of the natural numbers.But for each such \(A \subseteq \mathbb{N}\), we may still understand\(\chi_A(x)\) as a new primitive functional symbol which can beemployed in constructing one of countably manyA-partialrecursive definitions in the manner discussed inSection 2.1.1. It is thus also possible to list off all of the unaryA-partial recursive functions relative to the codes of theirdefinitions to obtain a uniform enumeration
\[\begin{aligned}\phi_0^{A}(x), \phi_1^{A}(x), \phi^{A}_2(x), \ldots\end{aligned}\]and similarly for other arities. It is thus not difficult to see thatwe can thereby also obtain relativized versions of results like thes-m-n Theorem (3.1) and the Universality Theorem (3.2) as exemplified by the following:
Theorem 3.6: For all \(A \subseteq \mathbb{N}\),there is anA-partial computable function \(\upsilon\) which isuniversal in the sense that for all unaryA-partialcomputable functions \(f(\vec{x})\), there is an \(i \in \mathbb{N}\)such that \(\upsilon^{A}(i,x) \simeq f(x)\).
These observations in turn license the use of the expressioncomputable in \(A\) to describe both a function\(f(\vec{x})\) which isA-partial recursive and total and alsoa set \(B\) such that \(\chi_B(x)\) is computable in \(A\). We alsouse the expressioncomputably enumerable (c.e.)in\(A\) to describe a set \(B\) which is the range of anA-partial recursive function and the notation \(W^A_e\) todenote the domain of \(\phi^{A}_e(x)\). It is then straightforward tosee that many of our prior proofs aboutnon-computabilityalso carry over to the relativized setting—e.g., \(K^A = \{i :\phi^{A}_i(i)\darrow\}\) is an example of a set which is computablyenumerable in \(A\) but not computable in \(A\).
We may now state the definition ofTuring reducibility asfollows:
Definition 3.9: Given sets \(A, B \subseteq\mathbb{N}\), \(A\) is said to beTuring (or \(T\)-)reducible to \(B\) just in case \(A\) is computable in \(B\).In this case we write \(A \leq_T B\).
It is a consequence of this definition that \(A \leq_T B\) just incase \(\chi_A(x)\) coincides with the (total) \(B\)-computablefunction given by \(\phi^{B}_e(x)\) for some index \(e\). For instanceif we adopt Turing’s characterization of relative computability,we may think of \(e\) as describing a program for a machine which canconsult \(B\) as an oracle. In this case, \(A \leq_T B\) means that itis possible to decide if \(n \in A\) by carrying out the programdescribed by \(e\) on the input \(n\) which may in turn requireperforming queries to the oracle \(B\) during the course of itscomputation.
We may also define a notion of completeness with respect to \(\leq_T\)as follows:
Definition 3.10: We say that \(B\) isTuringcomplete if \(B\) is c.e. and all c.e. sets \(A\) aresuch that \(A \leq_T B\).
It is easy to see that \(A \leq_m B\) implies \(A \leq_T B\). (For if\(f(x)\) is am-reduction of \(A\) to \(B\), then consider theprogram which first computes \(f(n)\) and then, using \(B\) an asoracle, checks if \(f(n) \in B\), outputting 1 if so and 0 if not.) Itthus follows that \(K\) is also Turing complete—i.e., itembodies the maximum “degree of unsolvability” amongthe c.e. sets when this notion is understood in terms of Turingreducibility as well as many-one reducibility.
Such observations can be made precise by first defining the notion ofTuring equivalence:
Definition 3.11: If \(A\) and \(B\) are Turingreducible to each other—i.e., \(A \leq_T B\) and \(B \leq_TA\)—then we say that \(A\) and \(B\) areTuringequivalent and we write \(A \equiv_T B\).
As it is again easy to see that \(\equiv_T\) is an equivalencerelation, we may also define the notion ofTuring degree asfollows:
Definition 3.12: \(\textrm{deg}_T(A)\)—theTuring degree of \(A\)—is the equivalence class of\(A\) with respect to \(\equiv_T\)—i.e., \(\textrm{deg}_T(A) =\{B \subseteq \mathbb{N} : B \equiv_T A\}\).
We can now define an ordering on Turing degrees as follows:
Definition 3.13: Let \(\mathbf{a}\) and\(\mathbf{b}\) be Turing degrees. We then define
\(\mathbf{a} \leq_T \mathbf{b}\) just in case there is a set \(A \in\mathbf{a}\) and a set \(B \in \mathbf{b}\) such that \(A \leq_TB\).
\(\mathbf{a} <_T \mathbf{b}\) just in case \(\mathbf{a} \leq_T\mathbf{b}\) and \(\mathbf{a} \neq \mathbf{b}\).
As with them-degrees, we say that \(\mathbf{a}\) is acomputable Turing degree if it contains a computable set andacomputably enumerable (c.e.)degree if it containsa c.e. set. If we consider the structure
\[\mathcal{D}_T = \langle \{\textrm{deg}_T(A) : A \subseteq \mathbb{N}\},\leq_T\rangle\]—which is known as theTuring degrees—it is againeasy to see that \(\leq_T\) is a partial order. Some observationswhich illustrate the relationship between \(\mathcal{D}_T\) and themany-one degrees \(\mathcal{D}_m\) are as follows:
Theorem 3.7:
There is exactly one computable Turing degree denoted by\(\mathbf{0}_T = \textrm{deg}_T(\emptyset)\) (which is often written\(\mathbf{0}\) when there is no possibility of ambiguity with them-degrees). \(\mathbf{0}_T\) consists of all of the computablesets and is the unique minimum Turing degree.
For all sets \(A\), and \(A \equiv_T \overline{A}\) and thus also\(\textrm{deg}_T(A) = \textrm{deg}_T(\overline{A})\).
\(\textrm{deg}_T(K)\) is the maximum amongst all c.e. Turingdegrees.
For any sets \(A,B\), \(\textrm{deg}_m(A) \subseteq\textrm{deg}_T(A)\) and if
\[\textrm{deg}_m(A) \leq_m \textrm{deg}_m(B),\]then
\[\textrm{deg}_T(A) \leq_T \textrm{deg}_T(B).\]Since \(\emptyset\) and \(\mathbb{N}\) are both (trivially) computablesets, by part i) we have \(\textrm{deg}_T(\emptyset) =\textrm{deg}_T(\mathbb{N}) = \mathbf{0}_T\), unlike them-degrees. And also unlike them-degrees we have by partii that \(\textrm{deg}_T(A) = \textrm{deg}_T(\overline{A})\). (For ifwe can decide \(B\) via an algorithm which uses \(A\) an as oracle,then we can also decide it using \(\overline{A}\) as an oracle bysimply swapping the responses obtained in our former algorithm.)
The structures of both \(\mathcal{D}_T\) and the c.e. degrees
\[\mathcal{E}_T = \langle \{\textrm{deg}_T(A) : A \text{ is c.e.}\}, \leq_T\rangle\]have been extensively investigated since the 1950s. One of their mostbasic properties may be considered by first defining the operation ofsets
\[A \oplus B = \{2n : n \in A\} \cup \{2n+1 : n \in B\}.\]\(A \oplus B\) is called theeffective join of \(A\) and\(B\) as it encodes the “information” contained in \(A\)on the even members of \(A \oplus B\) and that contained \(B\) on itsodd members. \(A \oplus B\) is c.e. if both \(A\) and \(B\) are.Suppose we also define the operation
\[\textrm{deg}_T(A) \vee \textrm{deg}_T(B) =_{\textrm{df}} \textrm{deg}(A \oplus B)\]on the degrees \(\mathbf{a} = \textrm{deg}_T(A)\) and \(\mathbf{b} =\textrm{deg}_T(B)\). Then it is not difficult to see that \(\mathbf{a}\vee \mathbf{b}\) is theleast upper bound of \(\mathbf{a}\)and \(\mathbf{b}\) with respect to the structure \(\mathcal{D}_T\).Like them-degrees, \(\mathcal{D}_T\) and \(\mathcal{E}_T\)both form anupper semi-lattice—i.e., a partial orderin which least upper bounds always exist.[28]
Given \(A \subseteq \mathbb{N}\), we may also consider \(K^A =\{n :\phi^{A}_n(n) \darrow\}\)—i.e., the set considered above whichcorresponds to the Diagonal Halting Problem relativized to the oracle\(A\). \(K^A\) is referred to as thejump of \(A\) for whichwe also write \(A'\). This notation is also used to denote anoperation on Turing degrees by setting \(\mathbf{a}' =\textrm{deg}_T(A')\) for some representative \(A \in \mathbf{a}\). Thefollowing collects together several facts about the jump operation onboth sets and degrees:
Proposition 3.7: For any set \(A, B \subseteq\mathbb{N}\) with \(\textrm{deg}_T(A) = \mathbf{a}\) and\(\textrm{deg}_T(B) = \mathbf{b}\):
If \(A\) is computable, then \(K^A \equiv_T K\).
\(A'\) is c.e. in \(A\) but not computable in \(A\).
If \(A \leq_T B\), then \(A' \leq_T B'\) and if \(A \equiv_T B\), then\(A' \equiv_T B'\).
\(\mathbf{a} <_T \mathbf{a}'\)
If \(\mathbf{a} \leq_T \mathbf{b}\), then \(\mathbf{a}' \leq_T\mathbf{b}'\).
\(\mathbf{0}' \leq_T \mathbf{a}'\)
If \(B\) is c.e. in \(A\), then \(\mathbf{b} \leq_T\mathbf{a}'\).
Part ii ofProposition 3.7 records the fact that the basic result that \(K\) is c.e. butnot computable holds for computability relativized to any set \(A\).From this it follows that \(A <_T A'\) and thus also that theresult of iterating the jump operation on any set \(A\) yields asequence
\[\begin{aligned}A^{(0)} & = A, \\A^{(1)} & = \left(A^{(0)}\right)' = A', \\A^{(2)} & = \left(A^{(1)}\right)' = A'', \\\vdots \\A^{(i+1)} &= \left(A^{(i)}\right)', \\\vdots\end{aligned}\]for which \(A^{(0)} <_T A^{(1)} <_T A^{(2)} <_T \ldots\). Asbenchmarks in the Turing degrees we also define the sets
\[\begin{aligned}\emptyset^0 & = \emptyset, \\\emptyset' & = K, \\\emptyset'' & = K', \\\vdots \\\emptyset^{(i+1)} & = K^{(i)'}, \\\vdots\end{aligned}\]and the degrees \(\mathbf{0}^{(n)} =\textrm{deg}_T(\emptyset^{(n)})\). Note that the latter correspond toa linearly ordered sequence
\[\mathbf{0} <_T \mathbf{0}' < _T\mathbf{0}'' <_T \ldots <_T \mathbf{0}^{(n)} <_T \ldots\]Figure 2: The Turing degrees\(\mathcal{D}_T\). [Anextended text-based description of figure 2 is available.]
As depicted in Figure 2, it is possible to use this sequence toclassify many of the problems defined inSection 3.2:
\(\mathbf{0} = \textrm{deg}_T(\emptyset) = \{A : A \text{ iscomputable}\}\)
\(\mathbf{0}' = \textrm{deg}_T(K) = \textrm{deg}_T(\HP)\)
\(\mathbf{0}'' = \textrm{deg}_T(\TOT) =\textrm{deg}_T(\textit{FIN})\)
Such classifications illustrate how the position of a set within\(\mathcal{D}_T\) can be understood as a measure of how far away it isfrom being computable—i.e., of itsdegree ofunsolvability ordifficulty. However unlike otherconventional measurement scales, the structure of \(\mathcal{D}_T\) isneither simple nor is it always straightforward to discern. Someevidence to this effect was provided by the fact that the answer tothe following question was posed but left unanswered by Post (1944):[29]
Question 3.1 (Post’s Problem): Isthere a c.e. degree \(\mathbf{a}\) such that \(\mathbf{0} <_T\mathbf{a} <_T \mathbf{0}'\)?
Post’s problem was eventually answered in the positiveindependently by Friedberg (1957) and Muchnik (1956) who showed thefollowing:
Theorem 3.8: There are c.e. sets \(A\) and \(B\)such that \(A \nleq_T B\) and \(B \nleq_T A\). Thus if \(\mathbf{a} =\textrm{deg}_T(A)\) and \(\mathbf{b} = \textrm{deg}_T(B)\), then\(\mathbf{a} \nleq_T \mathbf{b}\) and \(\mathbf{b} \nleq_T\mathbf{a}\) and hence also \(\mathbf{0} <_T \mathbf{a} <_T\mathbf{0}'\) and \(\mathbf{0} <_T \mathbf{b} <_T\mathbf{0}'\).
The proof ofFriedberg-Muchnik Theorem (3.8) required the development of a new technique known as theprioritymethod (or also as theinjury method) which has become acentral tool in the subsequent development of computability theory.The method provides a means of constructing a c.e. set \(A\) witha certain property \(P\) which may be described as follows:
In the case ofTheorem 3.8, this technique is used to simultaneously construct the twoc.e. sets \(A\) and \(B\) of degree intermediate between\(\mathbf{0}\) and \(\mathbf{0}'\) by alternating between therequirements \(R_{2i}\) which entail that \(A \neq \{n : \phi^{B}_i(n)\darrow = 1\}\) at even stages to ensure \(A \nleq_T B\) andrequirements \(R_{2i+1}\) which entail that \(B \neq \{n :\phi^{A}_i(n) \darrow = 1\}\) at odd stages so as to ensure \(B\nleq_T A\).
Sophisticated application of the priority method have been employed incomputability theory from the 1960s onward to investigate thestructure of \(\mathcal{D}_T\) and \(\mathcal{E}_T\).[30] Some illustrative results which can be obtained either in this manneror more elementary techniques are as follows:
There are continuum (i.e., \(2^{\aleph_0}\)) many distinct Turingdegrees. In particular, although for a given degree \(\mathbf{a}\) theset of \(\mathbf{b}\) such that \(\mathbf{b} \leq_T \mathbf{a}\) iscountable, the set of \(\mathbf{b}\) such that \(\mathbf{a} <_T\mathbf{b}\) is uncountable (Kleene & Post 1954).
For every degree \(\mathbf{a} \not\equiv_T \mathbf{0}\), there existsa degree \(\mathbf{b}\) which isincomparable to\(\mathbf{a}\)—i.e., \(\mathbf{b} \nleq_T \mathbf{a}\) and\(\mathbf{a} \nleq_T \mathbf{b}\). Moreover, there is a set of\(2^{\aleph_0}\) pairwise incompatible degrees (Kleene & Post1954).
There areminimal degrees \(\mathbf{m}\)—i.e., degreesfor which there is no \(\mathbf{a}\) such that \(\mathbf{0} <_T\mathbf{a} <_T \mathbf{m}\) (Sacks 1963b). Thus in general\(<_T\) isnot a dense order. (But by fact vii below,there are not minimal c.e. degrees.)
There are pairs of degrees \(\mathbf{a}\) and \(\mathbf{b}\) which donot possess a greatest lower bound. Thus although \(\mathcal{D}_T\) isan upper semi-lattice, it is not a lattice (Kleene & Post 1954).The same is true of \(\mathcal{E}_T\) (Lachlan 1966).
Every countable partially ordered set can be embedded into\(\mathcal{D}_T\) (Thomason 1971). However this isnot trueof \(\mathcal{E}_T\) into which there are finite non-distributivelattices which cannot be embedded (Lachlan & Soare 1980).
There is a non-c.e. degree \(\mathbf{a} <_T \mathbf{0}'\)(Shoenfield 1960).
For any c.e. degrees \(\mathbf{a} <_T \mathbf{b}\), there is ac.e. degree \(\mathbf{c}\) such that \(\mathbf{a} <_T\mathbf{c} <_T \mathbf{b}\) (Sacks 1964). Thus unlike the Turingdegrees in general, the c.e. degreesare denselyordered.
For any c.e. degree \(\mathbf{a} >_T \mathbf{0}\), there areincomparable c.e. degrees \(\mathbf{b},\mathbf{c} <_T\mathbf{a}\) such that \(\mathbf{a} = \mathbf{b} \cup \mathbf{c}\)(Sacks 1963b).
Let \(\textrm{Th}({\mathcal{D}_T})\) be the first-order theory of thestructure \(\mathcal{D}_T\) in the language with the with \(\equiv_T\)and \(\leq_T\). Not only is \(\textrm{Th}({\mathcal{D}_T})\)undecidable (Lachlan 1968), it is fact many-one equivalent totruesecond-order arithmetic (Simpson 1977).
As is easily shown to be true of the join operation \(\mathbf{a} \vee\mathbf{b}\), the jump operation \(\mathbf{a}' = \mathbf{b}\) isdefinable in \(\mathcal{D}_T\) in the language with \(\equiv_T\) and\(\leq_T\) (Shore & Slaman 1999).
These properties attest to the complexity of \(\mathcal{D}_T\) as amathematical structure. A related question is whether\(\mathcal{D}_T\) isrigid in the following sense:
Question 3.2: Does there exist anon-trivialautomorphism of \(\mathcal{D}_T\)—i.e., a mapping \(\pi:\mathcal{D}_T \rightarrow \mathcal{D}_T\) which preserves \(\leq_T\)and is not the identity?
A negative answer to this question would show that the relation of\(\textrm{deg}_T(A)\) to other degrees uniquely determines the degreeof unsolvability of \(A\) relative to \(\mathcal{D}_T\). Recent workhas pointed in this direction (see, e.g., Slaman 2008). Nonetheless,at the time of the 2020 update to this entry,Question 3.2 remains a significant open problem in computability theory whoseorigins can be traced back to the original foundational work ofTuring, Post, and Kleene surveyed above.
The many-one degrees \(\mathcal{D}_m\) and the Turing degrees\(\mathcal{D}_T\) are sometimes referred to ashierarchies inthe sense that they determine an ordering on\(\mathcal{P}(\mathbb{N})\)—i.e., the set of subsets of thenatural numbers—in terms of relative computability. In a seriesof papers from the 1940s and 1950s, Kleene (initiating in 1943) andMostowski (initiating in 1947) realized that it was also possible toimpose another sort of ordering on \(\mathcal{P}(\mathbb{N})\) interms of the logical complexity of the simplest predicate whichdefines a set \(A \subseteq \mathbb{N}\) in the languages of first- orsecond-order arithmetic. This idea leads to what are known as thearithmetical andanalytical hierarchies, both ofwhich can be understood as classifying sets in terms of theirdefinitional (ordescriptive) complexity. As we willsee, the resulting classifications are related to those determinedrelative to \(\mathcal{D}_T\) in terms of relative computability. Theyare also similar in form to other definability hierarchies studied incomputational complexity theory (e.g., thepolynomial hierarchy) anddescriptive set theory (e.g., theBorel andprojective hierarchies).
Recall that according to the definitions given inSection 3.3, arelation \(R \subseteq \mathbb{N}^k\) is said to becomputable just in case its characteristic function\(\chi_R(\vec{x})\) is a computable function andcomputablyenumerable just in case it is the range of a computable function.In order to introduce the arithmetical hierarchy, it is useful toemploy an alternative characterization of computable and computablyenumerable relations in the form of a semantic analog to theproof-theoretic notion ofarithmetical representabilitydiscussed inSection 1.3.
Recall that thelanguage of first-order arithmetic\(\mathcal{L}_a\) contains the primitive symbols\(\{<,',+,\times,0\}\) respectively intended to denote the lessthan relation, successor, addition, and multiplication functions onthe natural numbers as well as the first natural number 0. Afirst-order arithmetical formula is one built up from theseexpressions using variables, propositional connectives, and thefirst-order quantifiers \(\forall x, \exists x\) where the variablesare intended to range over the natural numbers \(\mathbb{N}\). Recallalso that thestandard model of first-order arithmetic is thestructure \(\mathfrak{N} = \langle \mathbb{N},0,<,s,+,\times\rangle\) in which the symbols of \(\mathcal{L}_a\) receive theirintended interpretations. Finally we say that an\(\mathcal{L}_a\)-formula \(\varphi(\vec{x})\)defines arelation \(R \subseteq \mathbb{N}^k\) just in case \(R = \{\langlen_1,\ldots,n_k \rangle : \mathfrak{N} \models \varphi(n_1,\ldots,n_k)\}\).[31] For instance \(x < y \vee x = y\) defines the less-than-or-equalrelation \(\leq\), \(\exists y(y + y = x)\) defines the even numbers,and
\[\forall y \forall z(y \times z = x \rightarrow y = s(0) \vee y = x)\]defines the prime numbers.
Definition 3.14: A formula \(\varphi(\vec{x})\) of\(\mathcal{L}_a\) is said to be in the class \(\Delta^0_0\) if itcontains onlybounded first-order quantifiers—i.e.,those of the form \(\exists x(x < t \wedge \ldots)\) and \(\forallx(x < t \rightarrow \ldots)\) for \(t\) an \(\mathcal{L}_a\)-termnot containing \(x\). A formula is said to be in the class\(\Sigma^0_1\) if it is of the form \(\exists \vec{y}\varphi(\vec{x},\vec{y})\) for \(\varphi(\vec{x},\vec{y}) \in\Delta^0_0\) and to be in the class \(\Pi^0_1\) if it is of the form\(\forall \vec{y} \varphi(\vec{x},\vec{y})\) for\(\varphi(\vec{x},\vec{y}) \in \Delta^0_0\). Finally, a formula\(\varphi(\vec{x})\) is said to be in the class \(\Delta^0_1\) if itis semantically equivalent to both a \(\Sigma^0_1\)-formula\(\psi(\vec{x})\) and a \(\Pi^0_1\)-formula\(\chi(\vec{x})\)—i.e., \(\mathfrak{N} \models\varphi(\vec{n})\) iff \(\mathfrak{N} \models \psi(\vec{n})\) iff\(\mathfrak{N} \models \chi(\vec{n})\) for all \(\vec{n} \in\mathbb{N}^k\).
It is standard to extend this syntactic classification of formulas interms of quantifier complexity to sets and relations on the naturalnumbers which can be defined by a formula in a given class. Thus, forinstance, \(x < y \vee x = y\) is a \(\Delta^0_0\)-formula and therelation \(\leq\) on \(\mathbb{N} \times \mathbb{N}\) is accordinglysaid to be \(\Delta^0_0\). On the other hand, while \(\exists y(y + y= x)\) is a \(\Sigma^0_1\)-formula, the set of even numbers is alsodefined by \(\exists y < x(x = 0 \vee y + y = x)\). Thus this setis also classified as \(\Delta^0_0\) in virtue of the existence ofthis latter definition which is simpler in the sense measured by thearithmetical hierarchy.
The first step in relating such classifications tocomputability-theoretic notions is provided by the following:
Proposition 3.8:
A relation \(R \subseteq \mathbb{N}^k\) is computably enumerable ifand only if there is a \(\Sigma^0_1\)-formula which defines\(R(\vec{x})\).
A relation \(R \subseteq \mathbb{N}^k\) is computable if and only ifthere is a \(\Delta^0_1\)-formula which defines\(R(\vec{x})\).
Proposition 3.8 may be proved by directly showing that for each partial recursivefunction \(\phi_e(\vec{x})\) it is possible to construct acorresponding \(\mathcal{L}_a\)-formula \(\varphi(\vec{x})\) whoselogical structure mimics the steps in the definition of the former.This can be achieved by formalizing primitive recursion using anarithmetically definable coding of finite sequences and expressingminimization using an unbounded existential quantifier (see, e.g.,Kaye 1991: ch. 3). But it is also possible to obtainProposition 3.8 in a uniform manner by showing that there is a so-calleduniversal formula for \(\Sigma^0_1\). In order to specifysuch a formula, first note that it is possible to effectivelyenumerate all \(\Delta^0_0\)-formulas with \(k+1\) free variables as\(\psi^{k+1}_0(x,\vec{y}), \psi^{k+1}_1(x,\vec{y}), \ldots\) and thendefine a corresponding enumeration of \(\Sigma^0_1\)-formulas as\(\varphi^k_0(\vec{y}) = \exists x \psi_0(x,\vec{y}),\)\(\varphi^k_1(\vec{y}) = \exists x \psi_1(x,\vec{y}),\)…. Wethen have the following:
Theorem 3.9 (Kleene 1943): For all \(k\), thereexists a \(\Sigma^0_1\)-formula \(\sigma_{k,1}(x,\vec{y})\) such thatfor all \(\Sigma^0_1\)-formulas withk-free variables\(\varphi^k_e(\vec{y})\), the following biconditional
\[\sigma_{k,1}(e,\vec{m}) \leftrightarrow \varphi^k_e(\vec{m})\]holds in the standard model \(\mathfrak{N}\) for all \(\vec{m} \in\mathbb{N}^k\).
Theorem 3.9 can be demonstrated by first observing that the truth of a\(\Sigma^0_1\)-formula \(\varphi^k_e(\vec{x})\) is equivalent to\(\mathfrak{N} \models \psi^k_e(n,\vec{m})\) for some \(n \in\mathbb{N}\). Next note that the sequence of observations recorded inSection 2.1.2 suffices to show that all \(\Delta^0_0\)-definable relations areprimitive recursive. We may thus consider an algorithm which on input\(e,\vec{m}\) uses \(e\) to construct \(\psi^k_e(x,\vec{y})\) and thenperforms an unbounded search for an \(n\) such that\(\psi^k_e(n,\vec{m})\) holds. By an appeal to Church’s Thesis(which can, of course, be replaced by an explicit construction) thereis a computable function \(f(e)\) for which we have the following:
\[\mathfrak{N} \models \varphi^k_e(\vec{m}) \text{ if and only if } \mu s(T_k(f(e),\vec{m},s)) \darrow\]In order to construct the formula \(\sigma_{k,1}(e,\vec{y})\) promisedbyTheorem 3.9, observe that standard techniques from the arithmetization of syntaxallow us to obtain a \(\Delta^0_1\)-formula \(\tau_k(x,\vec{y},z)\)which defines the Kleene \(T\)-predicate \(T_k(x,\vec{y},z)\)introduced inSection 2.2.2. We may finally define \(\sigma_{k,1}(e,\vec{y}) = \exists z\tau_k(f(e),\vec{y},z)\). The first part ofProposition 3.8 now follows by letting \(e\) be such that\(\textrm{dom}(\phi^k_e(\vec{x})) = R\) and then taking\(\sigma_{k,1}(e_0,\vec{x}) \in \Sigma^0_1\) where \(e_0\) is suchthat \(f(e_0) = e\). This is often formulated as what is known as theEnumeration Theorem which can be compared toTheorem 3.2:
Proposition 3.9: A relation \(R \subseteq\mathbb{N}^k\) is computably enumerable if and only if there is anumber \(e\) (known as ac.e. index for \(R\)) such that\(R\) is defined by \(\exists z \tau_k(e,\vec{y},z)\).
The second part ofProposition 3.8 follows by observing that if \(R\) is recursive then both \(R\) and\(\overline{R}\) are c.e. Thus if \(e\) is a c.e. index for\(R\), then \(\overline{R}\) is defined by \(\neg \exists z\tau_k(e,\vec{x},z)\) which is equivalent to a \(\Pi^0_1\)-formulasince \(\tau_k(x,\vec{y},z) \in \Delta^0_1\).
The formula classes \(\Delta^0_1\) and \(\Sigma^0_1\) thus provide analternative arithmetical characterization of the computable andcomputably enumerable sets. These classes also define the lowestlevels of thearithmetical hierarchy which in full generalityis defined as follows:
Definition 3.15: In order to simplify notation, theclasses \(\Sigma^0_0\) and \(\Pi^0_0\) are both used as alternativenames for the class \(\Delta^0_0\). A formula is said to be in theclass \(\Sigma^0_{n+1}\) if it is of the form \(\exists \vec{y}\varphi(\vec{x},\vec{y})\) for \(\varphi(\vec{x},\vec{y}) \in\Pi^0_n\) and to be in the class \(\Pi_{n+1}\) if it is of the form\(\forall \vec{y} \varphi(\vec{x},\vec{y})\) for\(\varphi(\vec{x},\vec{y}) \in \Sigma^0_n\). A formula\(\varphi(\vec{x})\) is \(\Delta^0_{n+1}\) if it is semanticallyequivalent to both a \(\Sigma^0_{n+1}\)-formula \(\psi(\vec{x})\) anda \(\Pi^0_{n+1}\)-formula \(\chi(\vec{x})\).
It thus follows that a formula is \(\Sigma^0_{n}\) just in case it isof the form
\[\exists \vec{x}_1 \forall \vec{x}_2 \exists \vec{x}_3 \ldots \mathsf{Q} \vec{x}_n \varphi(\vec{x}_1,\vec{x}_2,\vec{x}_3,\ldots,\vec{x}_n)\](where there are \(n\) alternations of quantifier types and\(\mathsf{Q}\) is \(\forall\) if \(n\) is even and \(\exists\) if\(n\) is odd). Similarly a \(\Pi^0_n\)-formula is of the form
\[\forall \vec{x}_1 \exists \vec{x}_2 \forall \vec{x}_3 \ldots \mathsf{Q} \vec{x}_n \varphi(\vec{x}_1,\vec{x}_2,\vec{x}_3,\ldots,\vec{x}_n).\]The notations \(\Sigma^0_n\), \(\Pi^0_n\), and \(\Delta^0_n\) are alsostandardly used to denote the classes of sets and relations which aredefinable by a formula in the corresponding syntactic class. Forinstance it follows from the second part ofProposition 3.8 that \(\textit{PRIMES}\) is \(\Delta^0_1\) (since it is computable)and from the first part that \(\HP\) and \(K\) are \(\Sigma^0_1\)(since they are c.e.). It thus follows that their complements\(\overline{HP}\) and \(\overline{K}\) are both \(\Pi^0_1\). It isalso not hard to see that \(\TOT\) is \(\Pi^0_2\) as the fact that\(\phi_x(y)\) is total may be expressed as \(\forall y \exists z\tau_1(x,y,z)\) by using the arithmetized formulation of the\(T\)-predicate introduced above. Similarly, \(\textit{FIN}\) is\(\Sigma^0_2\)-definable since the fact that \(\phi_x(y)\) is definedfor only finitely many arguments is expressible as \(\exists u \forally\forall z(u < y \rightarrow \neg \tau_1(x,y,z))\).
It is a consequence of the Prenex Normal Form Theorem for first-orderlogic that every \(\mathcal{L}_a\)-formula \(\varphi(\vec{y})\) isprovably equivalent to one of the form \(\mathsf{Q}_1 x_1 \mathsf{Q}_2x_2 \ldots \mathsf{Q}_{n} \varphi(\vec{x},\vec{y})\) for\(\mathsf{Q}_i \equiv \exists\) or \(\forall\) (e.g., Boolos, Burgess,& Jeffrey 2007: ch. 19.1). It thus follows that up to provableequivalence, every such formula is \(\Sigma^0_n\) or \(\Pi^0_n\) forsome \(n \in \mathbb{N}\). Since it is conventional to allow thatblocks of quantifiers may be empty in theDefinition 3.15, it follows that
\[\Sigma^0_n \subseteq \Delta^0_{n+1} \subseteq \Sigma^0_{n+1}\]and
\[\Pi^0_n \subseteq \Delta^0_{n+1} \subseteq \Pi^0_{n+1}.\]The fact that these inclusions are strict is a consequence of theso-calledHierarchy Theorem, a simple form of which may bestated as follows:
Theorem 3.10 (Kleene 1943): For all \(n \geq 1\),there exists a set \(A \subseteq \mathbb{N}\) which is\(\Pi^0_n\)-definable but not \(\Sigma^0_n\)-definable and hence alsoneither\(\Sigma^0_m\)- nor\(\Pi^0_m\)-definable for any \(m < n\). It thus follows that\(\overline{A}\) is \(\Sigma^0_n\)-definable but not\(\Pi^0_n\)-definable and hence alsoneither\(\Sigma^0_m\)- nor\(\Pi^0_m\)-definable for any \(m < n\).
It is again possible to proveTheorem 3.10 by a direct syntactic construction. For instance, building on thedefinition of the universal \(\Sigma^0_1\)-predicate\(\sigma_{k,1}(\vec{y})\), it may be shown that for every level\(\Sigma^0_n\) of the arithmetical hierarchy, there is a\(\Sigma^0_n\)-formula \(\sigma_{k,n}(x,\vec{y})\) which defines\(\Sigma^0_n\)-satisfaction in the standard model in thesense that
\[\begin{aligned}\mathfrak{N} \models \sigma_{k,n}(\ulcorner \varphi(y) \urcorner,\vec{m}) \leftrightarrow \varphi(\vec{m}) \end{aligned}\]for all \(\varphi(\vec{x}) \in \Sigma^0_n\) and \(\vec{m} \in\mathbb{N}^k\) (and where we have also defined our Gödelnumbering to agree with the indexation of \(\Sigma^0_n\)-formulasintroduced above). Now consider the \(\Pi^0_n\)-formula \(\lambda(x) =\neg \sigma_{2,n}(x,x) \in \Pi^0_n\) and let \(A\) be the set definedby \(\lambda(x)\). A standard diagonal argument shows that \(A\)cannot be \(\Sigma^0_n\)-definable and also that if \(\ulcorner\sigma_{2,n}(x,x) \urcorner = l\) in the enumeration of\(\Sigma^0_n\)-formulas then \(\neg \sigma_{2,n}(l,l)\) is a\(\Pi^0_n\)-formula which cannot be provably equivalent to a\(\Sigma^0_k\)-formula (see, e.g., Kaye 1991: ch. 9.3). Thus as Kleene(1943: 64) observed, part of the significance of the Hierarchy Theoremis that it illustrates how theLiar Paradox may be formalized to yield a stratified form of Tarski’sTheorem on the undefinability of truth (see the entry onself-reference).
We may also define a notion of completeness with respect to the levelsof the arithmetical hierarchy as follows: \(A\) is\(\Sigma^0_n\)-complete if \(A\) is \(\Sigma^0_n\)-definableand for all \(\Sigma^0_n\)-definable \(B\), we have \(B \leq_m A\)(and similarly for \(\Pi^0_n\)-complete). It is not hard toshow that in addition to being many-one complete, \(K\) is also\(\Sigma^0_1\)-complete. Similarly \(\overline{K}\) is\(\Pi^0_1\)-complete, \(INF\) is \(\Sigma^0_2\)-complete, and \(TOT\)is \(\Pi^0_2\)-complete. These observations can be subsumed under amore general result which relates the arithmetical hierarchy to theTuring degrees and from whichTheorem 3.10 can also be obtained as a corollary.
Theorem 3.11 (Post 1944):
\(A\) is \(\Sigma^0_{n+1}\)-definable iff \(A\) is computablyenumerable in some \(\Pi^0_n\)-definable set iff \(A\) is computablyenumerable in some \(\Sigma_n\)-definable set.
\(\emptyset^{(n)}\) is \(\Sigma^0_n\)-complete for all \(n >0\).
\(B\) is \(\Sigma^0_{n+1}\)-definable if and only if \(B\) iscomputably enumerable in \(\emptyset^{(n)}\).
\(B\) is \(\Delta^0_{n+1}\)-definable if and only if \(B \leq_T\emptyset^{(n)}\).
The various parts ofTheorem 3.11 follow from prior definitions together with Propositions3.2 and3.7. Note in particular that it follows from parts ii and iv ofTheorem 3.11 together with part vii ofProposition 3.7 that \(\emptyset^{(n)}\) is an example of a set in the class\(\Sigma^0_n - \Pi^0_n\) from which it also follows that\(\overline{\emptyset^{(n)}} \in \Pi^0_n - \Sigma^0_n\). Thisobservation in turn strengthens the Hierarchy Theorem (3.10) by showing that \(\Delta^0_n \subsetneq \Sigma^0_n\) and \(\Delta^0_n\subsetneq \Pi^0_n\) as depicted in Figure 3.
Figure 3: The Arithmetical Hierarchy.[Anextended text-based description of figure 3 is available.]
Part iv ofTheorem 3.11 can also be understood as generalizingProposition 3.4 (i.e., Post’s Theorem). In particular, it characterizes the\(\Delta^0_{n+1}\)-definable sets as those sets \(B\) such that both\(B\) and \(\overline{B}\) are computably enumerable in some\(\Sigma^0_n\)-complete set such as \(\emptyset^{(n)}\). Restrictingto the case \(n = 1\), this observation can also be used to provide anindependent computational characterization of the\(\Delta^0_2\)-definable sets, extending those given for the\(\Sigma^0_1\)-definable and \(\Delta^0_1\)-definable sets byProposition 3.8.
Definition 3.16: A set \(A\) is said to belimitcomputable if there is a computable sequence of finite sets\(\{A^s : s \in \mathbb{N}\}\) such that
\[n \in A \text{ if and only if } \textrm{lim}_s A^s(n) = 1\]where \(\lim_s A^s(n) = 1\) means that \(\lim_s \chi_{A_s}(n)\) existsand is equal to 1.
If \(A\) is c.e., then it is clear that \(A\) is limit computable. Forif \(A\) is the range of a computable function \(\phi_e(x)\), then wemay take \(A^s\) to be \(\{\phi_e(0), \ldots, \phi_e(s)\}\) in whichcase \(A^0 \subseteq A^1 \subseteq A^2 \subseteq \ldots\) In thegeneral case of limit computability, the sequence of sets \(\{A^s : s\in \mathbb{N}\}\) may be thought of as an approximation of \(A\)which need not grow monotonically in this way but can rather both growand shrink as long as there is always a stage \(s\) such that for all\(s \leq t\), \(n \in A^t\) if \(n \in A\) and \(n \not\in A^t\) if\(n \not\in A\). Following Putnam (1965), a limit computable set canalso thus also be described as a so-calledtrial-and-errorpredicate—i.e., one for which membership can be determinedby following a guessing procedure which eventually converges to thecorrect answer to the questions of the form \(n \in A\)?
The following is traditionally referred to asThe LimitLemma:
Theorem 3.12 (Shoenfield 1959): The following areequivalent:
\(A\) is limit computable.
\(A \leq_T \emptyset'\)
We have seen that part iv ofProposition 3.11 characterizes the sets Turing reducible to \(\emptyset'\) as the\(\Delta^0_2\)-definable sets.Theorem 3.12 thus extends the characterizations of the computable (i.e.,\(\Delta^0_1\)-definable) and computably enumerable (i.e.,\(\Sigma^0_1\)-definable) sets given inProposition 3.8 by demonstrating the coincidence of the \(\Delta^0_2\)-definable setsand those which are limit computable.
Kleene introduced what is now known as theanalyticalhierarchy in a series of papers (1955a,b,c) which built directlyon his introduction of the arithmetical hierarchy in 1943. Hisproximal motivation was to provide a definability-theoreticcharacterization of the so-calledhyperarithmeticalsets—i.e., those which are computable from transfiniteiterates of the Turing jump through the constructive ordinals. On theother hand, Mostowski (1947) had already noticed similarities betweenthe arithmetical hierarchy of sets of natural numbers and resultsabout hierarchies ofpoint sets studied in descriptive settheory—i.e., sets of elements ofPolish spaces(complete, separable metrizable spaces such as the real numbers,Cantor space, or Baire space)—which have their origins in thework of Borel, Lebesgue, Lusin, and Suslin in the early twentiethcentury. Beginning in his PhD thesis under Kleene, Addison (1954)refined Mostowski’s comparisons by showing that Kleene’sinitial work could also be used to provide effective versions ofseveral classical results in this tradition. We present here thefundamental definitions regarding the analytical hierarchy togetherwith some of some results illustrating how it is connected it to theseother developments.
Note that in the general case a formula\(\varphi(x_1,\ldots,x_j,X_1,\ldots, X_k)\) of \(\mathcal{L}^2_a\) mayhave both free number variables \(x_1,\ldots, x_j\) and free setvariables \(X_1,\ldots,X_k\). If \(R \subseteq \mathbb{N}^j \times\mathcal{P}(\mathbb{N})^k\) is defined by such a formula, then it issaid to beanalytical. Kleene (1955a) proved a normal formtheorem for analytical relations which shows that if \(R\) isanalytical then it is definable by an \(\mathcal{L}^2_a\)-formula ofthe form
\[\forall X_1 \exists X_2 \forall X_3 \ldots \mathsf{Q} X_n \psi(X_1,X_2,X_3,\ldots,X_n)\]or
\[\exists X_1 \forall X_2 \exists X_3 \ldots \mathsf{Q} X_n \psi(X_1,X_2,X_3,\ldots,X_n)\]where \(\psi(\vec{X})\) contains only number quantifiers and\(\mathsf{Q}\) is \(\forall\) or \(\exists\) depending on where \(n\)is even or odd. It thus possible to classify both\(\mathcal{L}^2_a\)-formulas and the sets they define into classes asfollows:
Definition 3.18:
We denote by both \(\Sigma^1_0\) and \(\Pi^1_0\) the class of\(\mathcal{L}^2_a\)-formulas containing no set quantifiers (i.e., aso-calledarithmetical formulas). An \(\mathcal{L}^2_a\)formula is in the class \(\Sigma^1_{n+1}\) if it is of the form\(\exists X \psi(X)\) where \(\psi \in \Pi^1_n\) and a relation is\(\Sigma^1_{n+1}\)-definable if it is defined by a\(\Sigma^1_{n+1}\)-formula. Similarly an \(\mathcal{L}^2_a\)-formulais in the class \(\Pi^1_{n+1}\) if it is of the form \(\forall X\psi(X)\) where \(\psi \in \Sigma^1_n\) and a relation is\(\Pi^1_{n+1}\)-definable if it is defined by a\(\Pi^1_{n+1}\)-formula. A relation is\(\Delta^1_n\)-definable just in case it is definable by botha \(\Sigma^1_n\)- and a \(\Pi^1_n\)-formula.
It hence follows that, as in the case of the arithmetical hierarchy,we have
\[\Sigma^1_n \subseteq \Delta^1_{n+1} \subseteq \Sigma^1_{n+1}\]and
\[\Pi^1_n \subseteq \Delta^1_{n+1} \subseteq \Pi^1_{n+1}.\]In addition, a version of the Enumeration Theorem for arithmeticalsets can also be proven which can be used to obtain the followinggeneralization of the Hierarchy Theorem:
Theorem 3.13 (Kleene 1955a): For all \(n \geq 1\),there exists a set \(A \subseteq \mathbb{N}\) which is\(\Pi^1_n\)-definable but not \(\Sigma^1_n\)-definable and hence alsoneither\(\Sigma^1_m\)- nor\(\Pi^1_m\)-definable for any \(m < n\). It thus follows that\(\overline{A}\) is \(\Sigma^1_n\)-definable but not\(\Pi^1_n\)-definable and also neither \(\Sigma^1_m\)- nor\(\Pi^1_m\)-definable for any \(m < n\).
In order to provide some illustrations of the levels of the analyticalhierarchy, it is useful to record the following:
Definition 3.19: A set \(A \subseteq \mathbb{N}\) isimplicitly definable in \(\mathcal{L}^2_a\) just in casethere is an arithmetical formula \(\varphi(X)\) with \(X\) as its solefree set variable and no free number variables such that \(A\) is theunique set satisfying \(\varphi(X)\) in the standard model of\(\mathcal{L}^2_a\).
True Arithmetic (\(\textrm{TA}\)) corresponds to the set ofGödel numbers of first-order arithmetical sentences true in thestandard model of \(\mathcal{L}_a\)—i.e., \(\textrm{TA} =\{\ulcorner \varphi \urcorner : \varphi \in \mathcal{L}_a \ \wedge \\mathfrak{N} \models \varphi\}\). Prior to the definition of theanalytical hierarchy itself, Hilbert & Bernays had already showedthe following:
Theorem 3.14 (Hilbert & Bernays 1939:§5.2e): \(\textrm{TA}\) is implicitly definable in\(\mathcal{L}^2_a\).
It is then not difficult to show the following:
Proposition 3.10 (Spector 1955): If \(A\) isimplicitly definable, then \(A\) is \(\Delta^1_1\)-definable in\(\mathcal{L}^2_a\).
It thus follows that \(\textrm{TA}\) is \(\Delta^1_1\)-definable. Onthe other hand, it follows from Tarski’s Theorem on theundefinability of truth that \(\textrm{TA}\) is not arithmeticallydefinable—i.e., \(\textrm{TA} \not\in \Sigma^0_n \cup \Pi^0_n\)for any \(n \in \mathbb{N}\). This in turn shows that the analyticalsets properly extend the arithmetical ones.
The class of \(\Delta^1_1\)-definable subsets of \(\mathbb{N}\) isalso related to Kleene’s original study of the class ofhyperarithmetical sets, customarily denoted \(\textrm{HYP}\). Thedefinition of \(\textrm{HYP}\) depends on that of a system ofconstructive ordinal notations known as \(\mathcal{O} = \langle O,<_O \rangle\) which Kleene had introduced in 1938. (It was also inthe context of defining \(\mathcal{O}\) in which he proved theRecursion Theorem 3.5—see Rogers 1987: ch. 11.7, and Y. Moschovakis 2010.) \(\textrm{HYP}\) canbe informally characterized as the class of sets of natural numbers\(A\) such that \(A \leq_T \emptyset^{(\alpha)}\) where \(\alpha\) isan ordinal which receives a notation \(e \in O\)—i.e., \(A \in\textrm{HYP}\) just in case it is computable from a transfiniteiteration of the Turing jump up to the first non-recursive ordinal \(\omega^{ck}_1\).[32] Kleene’s original result was as follows:[33]
Theorem 3.15 (Kleene 1955b): A set \(A \subseteq\mathbb{N}\) is \(\Delta^1_1\)-definable if and only if \(A \in\textrm{HYP}\).
The next step up the analytical hierarchy involves thecharacterization of the \(\Pi^1_1\)-definable sets. Kleene (1955a)originally established his normal form theorem for\(\mathcal{L}^2_a\)-formulas using a variant of the language ofsecond-order arithmetic which containsfunction quantifiers\(f,g,h,\ldots\) which are intended to range over \(\mathbb{N}\rightarrow \mathbb{N}\) instead of set quantifiers intended to rangeover \(\mathcal{P}(\mathbb{N})\) (Rogers 1987: ch. 16.2). In thissetting, it is possible to show the following:
Proposition 3.11: \(A \in \Pi^1_1\) if and only ifthere is a computable (i.e., \(\Delta^0_1\)-definable) relation\(R(x,f)\) such that
\[x \in A \text{ if and only if } \forall f \exists yR(x,\overline{f}(y))\]where \(\overline{f}(y)\) denotes \(\langlef(0),\ldots,f(y-1)\rangle\).
For each such relation, we may also define a computable tree\(\textit{Tr}_x\) consisting of the finite sequences \(\sigma \in\mathbb{N}^{< \mathbb{N}}\) such that for all proper initialsubsequences \(\tau \subset \sigma\), \(\neg R(x,\tau)\) holds. A leafnode in this tree thus corresponds to the first place for which \(R\)holds. An infinite path in \(\textit{Tr}_x\) thus corresponds to afunction \(f\) such that \(\forall y \neg R(x,\overline{f}(y))\),which is in turn a witness to \(x \not\in A\). It thus follows that\(x \in A\) if and only if \(\textit{Tr}_x\) is well-founded. Since itis straightforward to effectively enumerate computable trees, it isalso not difficult to show the following:
Proposition 3.12: The set \(T\) of indices towell-founded computable trees ism-complete for the\(\Pi^1_1\)-definable sets—i.e., \(T \in \Pi^1_1\) and for all\(A \in \Pi^1_1\), \(A \leq_m T\).
Recalling that \(O\) denotes the set of natural numbers which arenotations for ordinals in Kleene’s \(\mathcal{O}\), a relatedresult is the following:
Proposition 3.13: \(O\) is \(\Pi^1_1\)-complete.
It can then be shown using theHierarchy Theorem 3.13 that neither \(T\) nor \(O\) is \(\Sigma^1_1\)-definable. Theseresults provide the basis for an inductive analysis of the structureof \(\Delta^1_1\)- and \(\Pi^1_1\)-definable sets in terms ofconstructive ordinals which builds onTheorem 3.15 (see Rogers 1987: ch. 16.4).
The foregoing results all pertain to the use of\(\mathcal{L}^2_a\)-formulas to describe sets of natural numbers. Theinitial steps connecting the analytical hierarchy to classicaldescriptive set theory are mediated by considering formulas\(\varphi(X)\) which define subclasses \(\mathcal{X} \subseteq\mathcal{P}(\mathbb{N})\). In this case, \(A \in \mathcal{X}\) may beidentified with the graph of its characteristic function\(\chi_A(x)\)—i.e., as an infinite sequence whose \(n\)thelement is 1 if \(n \in A\) and 0 if \(n \not\in A\). In this way aformula \(\psi(X)\) with a single free set variable may be understoodas defining a subset of theCantor space \(\mathcal{C} =2^{\mathbb{N}}\) consisting of all infinite 0-1 sequences and aformula \(\psi(\vec{X})\) with \(X_1,\ldots,X_k\) free as defining asubclass of \(2^{\mathbb{N}} \times \ldots \times2^{\mathbb{N}}\).
In descriptive set theory, a parallel sequence of topologicaldefinitions of subclasses of \(\mathcal{C}\) is given in the contextof defining the Borel sets and projective sets. First recall that onemeans of defining a topology on \(\mathcal{C}\) is to take as basicopen sets all sets of functions \(f: \mathbb{N} \rightarrow \{0,1\}\)such that \(\overline{f}(k) = \sigma\) for some \(\sigma \in 2^{<\mathbb{N}}\) and \(k \in \mathbb{N}\). Theboldface BorelHierarchy on \(\mathcal{C}\) is now given by defining\(\mathbf{\Sigma^0_1}\) to be the collection of all open sets of\(\mathcal{C}\), \(\mathbf{\Pi^0_{n}}\) (for \(n \geq 1\)) to be theset of all complements \(\overline{A}\) of sets \(A \in\mathbf{\Sigma^0_1}\), and \(\mathbf{\Sigma^0_{n+1}}\) to be the setof all countable unions \(\bigcup_{i \in \mathbb{N}} A_i\) where \(A_i\in \mathbf{\Pi^0_n}\). (Thus \(\mathbf{\Pi^0_1}\) denotes the set ofclosed sets, \(\mathbf{\Sigma^0_2}\) denotes the so-called\(F_{\sigma}\) sets, \(\mathbf{\Pi^0_2}\) the \(G_{\delta}\) sets,etc.) The union of these classes corresponds to theboldface Borelsets \(\mathbf{B}\) which may also be characterized as thesmallest class of sets containing the open sets of \(\mathcal{C}\)which is closed under countable unions and complementation. Theso-calledanalytic sets are defined to be the continuousimages of the Borel sets and are denoted by \(\mathbf{\Sigma^1_1}\)while theco-analytic sets are defined to be the complementsof analytic sets and are denoted by \(\mathbf{\Pi^1_1}\). Finally,\(\mathbf{\Delta^1_1}\) is used to denote the intersection of theanalytic and co-analytic sets.
Addison observed (1958, 1959) that these classical definitions can beeffectivized by restricting to computable unions in the definition ofthe \(\mathbf{\Sigma^0_n}\) sets. This leads to the so-calledlightface version of the Borel hierarchy—customarilydenoted using the same notations \(\Sigma^0_n\) and \(\Pi^0_n\) usedfor the levels of arithmetical hierarchy—and correspondingdefinitions of \(\Sigma^1_1\) (i.e., lightface analytic), \(\Pi^1_1\)(i.e., lightface co-analytic), and \(\Delta^1_1\) sets. In particular,this sequence of definitions suggests an analogy betweenTheorem 3.15 and the following classical result of Suslin:
Theorem 3.16 (Suslin 1917): The class of\(\mathbf{\Delta}^1_1\) sets is equal to the class of Borel sets\(\mathbf{B}\).
An effective form ofTheorem 3.16 relating the \(\Delta^1_1\) subsets of \(\mathcal{C}\) to thelightface Borel sets representable by computable codes can be obtainedfrom Kleene’s original proof ofTheorem 3.15 (see, e.g., Y. Moschovakis 2009: ch. 7B). Addison also showed that itis similarly possible to obtain an effective version of Lusin’sTheorem (1927)—i.e., “any two disjoint analytic sets canbe separated by a Borel set”—and Kondô’stheorem (1939)—i.e., “every \(\mathbf{\Pi^1_1}\)-relationcan be uniformized by a \(\mathbf{\Pi^1_1}\)-relation”. See Y.Moschovakis (2009: ch. 2E,4E) and also Simpson (2009: ch.V.3,VI.2)
Historical surveys of the early development of recursive functions andcomputability theory are provided by Sieg (2009), Adams (2011), andSoare (2016: part V). Many of the original sources discussed in§1 are anthologized in Davis (1965), van Heijenoort (1967), and Ewald(1996). Textbook presentation of computability theory at an elementaryand intermediate level include Hopcroft & Ulman (1979), Cutland(1980), Davis, Sigal, & Weyuker (1994), and Murawski (1999). Theoriginal textbook expositions of the material presented in§2 and§3 (up to the formulation of Post’s problem) include Kleene(1952), Shoenfield (1967), and Rogers (1987; first edition 1967). Thestructure of the many-one and Turing Degrees is presented in moreadvanced textbooks such as Sacks (1963a), Shoenfield (1971), Hinman(1978), Soare (1987), Cooper (2004), and Soare (2016). In addition toShoenfield (1967: ch. 7) and Rogers (1987: ch. 16), the classictreatment of the hyperarithmetical and analytical hierarchies is Sacks(1990). Classical and effective descriptive set theory are developedin Y. Moschovakis (2009, first edition 1980) and Kechris (1995).Simpson (2009) develops connections between computability theory andreverse mathematics. (This corresponds to the axiomatic study ofsubtheories of full second-order arithmetic formulated in the language\(\mathcal{L}^2_a\). Such theories form a hierarchy \(\mathsf{RCA}_0\subset \mathsf{WKL}_0 \subset \mathsf{ACA}_0 \subset \mathsf{ATR}_0\subset \Pi^1_1\text{-}\mathsf{CA}_0 \) in which much of classicalmathematics can be developed and whose models can be characterized bycomputability-theoretic means — e.g., the recursive sets formthe minimal \(\omega\)-model of \(\mathsf{RCA}_0 \), the arithmeticalsets form the minimal \(\omega\)-model of \(\mathsf{ACA}_0\), etc.See the entry onReverse Mathematics.) Treatment of sub-recursive hierarchies and connections to prooftheory and theoretical computer science are provided by Péter(1967), Rose (1984), Clote & Kranakis (2002: ch. 6–7), andSchwichtenberg & Wainer (2011). Many of the historical andmathematical topics surveyed in this entry are also presented ingreater detail in the two volumes of Odifreddi’sClassicalRecursion Theory (1989, 1999a), which contain many additionalhistorical references.
Note: In cases where an English translation is available, pagereferences in the main text and notes are to the indicatedtranslations of the sources cited below.
How to cite this entry. Preview the PDF version of this entry at theFriends of the SEP Society. Look up topics and thinkers related to this entry at the Internet Philosophy Ontology Project (InPhO). Enhanced bibliography for this entryatPhilPapers, with links to its database.
chance: versus randomness |Church, Alonzo |Church-Turing Thesis |computability and complexity |computational complexity theory |computer science, philosophy of |Gödel, Kurt |Gödel, Kurt: incompleteness theorems |Hilbert, David: program in the foundations of mathematics |lambda calculus, the |learning theory, formal |logic: combinatory |paradoxes: and contemporary logic |proof theory |reverse mathematics |self-reference |Turing, Alan |Turing machines
This work has been partially supported by the ANR projectTheGeometry of Algorithms – GoA (ANR-20-CE27-0004). Theauthors would like to thank Mark van Atten, Benedict Eastaugh,Marianna Antonutti Marfori, Christopher Porter, and MátéSzabó for comments on an earlier draft of this entry. Thanksare also owed to Piergiorgio Odifreddi and S. Barry Cooper for theirwork on the prior versions (2005, 2012).
View this site from another server:
The Stanford Encyclopedia of Philosophy iscopyright © 2025 byThe Metaphysics Research Lab, Department of Philosophy, Stanford University
Library of Congress Catalog Data: ISSN 1095-5054