The Lambda Calculus

First published Wed Dec 12, 2012; substantive revision Tue Jul 25, 2023

The \(\lambda\)-calculus is, at heart, a simple notation for functionsand application. The main ideas areapplying a function to anargument and forming functions byabstraction. The syntax ofbasic \(\lambda\)-calculus is quite sparse, making it an elegant,focused notation for representing functions. Functions and argumentsare on a par with one another. The result is a non-extensional theoryof functions as rules of computation, contrasting with an extensionaltheory of functions as sets of ordered pairs. Despite its sparsesyntax, the expressiveness and flexibility of the \(\lambda\)-calculusmake it a cornucopia of logic and mathematics. This entry developssome of the central highlights of the field and prepares the readerfor further study of the subject and its applications in philosophy,linguistics, computer science, and logic.

1. Introduction

The \(\lambda\)-calculus is an elegant notation for working withapplications offunctions toarguments. Totake a mathematical example, suppose we are given a simple polynomialsuch as \(x^2 -2\cdot x+5\). What is the value of this expression when\(x = 2\)? We compute this by ‘plugging in’ 2 for \(x\) inthe expression: we get \(2^2 -2\cdot 2+5\), which we can furtherreduce to get the answer 5. To use the \(\lambda\)-calculus torepresent the situation, we start with the \(\lambda\)-term

\[\lambda x[x^2 -2\cdot x+5].\]

The \(\lambda\) operators allows us toabstract over \(x\).One can intuitively read ‘\(\lambda x[x^2 -2\cdot x+5]\)’as an expression that is waiting for a value \(a\) for the variable\(x\). When given such a value \(a\) (such as the number 2), the valueof the expression is \(a^2 -2\cdot a+5\). The‘\(\lambda\)’ on its own has no significance; it merelybinds the variable \(x\), guarding it, as it were, fromoutside interference. The terminology in \(\lambda\)-calculus is thatwe want toapply this expression to anargument, andget a value. We write ‘\(Ma\)’ to denote the applicationof the function \(M\) to the argument \(a\). Continuing with theexample, we get:

\[\begin{align}(\lambda x[x^2 -2\cdot x+5])2 \rhd 2^2& -2\cdot 2+5 &\langle \text{Substitute 2 for } x\rangle \\&= 4-4+5 &\langle\text{Arithmetic}\rangle \\&= 5 &\langle\text{Arithmetic}\rangle\end{align}\]

The first step of this calculation, plugging in ‘2’ foroccurrences of \(x\) in the expression ‘\(x^2 -2\cdot x + 5\)’, is the passage from anabstraction term to another term by theoperation of substitution. The remaining equalities are justified bycomputing with natural numbers.

This example suggests the central principle of the\(\lambda\)-calculus, called\(\beta\)-reduction, which is also sometimescalled\(\beta\)-conversion:

\[\tag{\(\beta\)}(\lambda x[M])N \rhd M[x := N]\]

The understanding is that we canreduce orcontract\((\rhd)\) an application \((\lambda xM)N\) of an abstraction term(the left-hand side, \(\lambda xM)\) to something (the right-handside, \(N)\) by simply plugging in \(N\) for the occurrences of \(x\)inside \(M\) (that’s what the notation ‘\(M[x :=N]\)’ expresses).\(\beta\)-reduction,or\(\beta\)-conversion, is the heart of the\(\lambda\)-calculus. When one actually applies \(\beta\)-reduction toreduce a term, there is an important proviso that has to be observed.But this will be described in Section 2.1, when we discuss bound andfree variables.

1.1 Multi-argument operations

What about functions of multiple arguments? Can the\(\lambda\)-calculus represent operations such as computing the lengthof the hypotenuse of a right triangle:

Hypotenuse of a right triangle with legs of length \(x\) and \(y\Rightarrow \sqrt{x^2 + y^2}\).

The length-of-hypotenuse operation maps two positive real numbers\(x\) and \(y\) to another positive real number. One can representsuch multiple-arity operations using the apparatus of the\(\lambda\)-calculus by viewing the operation as taking one input at atime. Thus, the operation can be seen as taking one input, \(x\), apositive real number, and producing as its value not anumber, but anoperation: namely, the operation thattakes a positive real number \(y\) as input and produces as output thepositive real number \(\sqrt{x^2 + y^2}\). One could summarize thediscussion by saying that the operation,hypotenuse-length,that computesthe length of the hypotenuse of a right triangle given the lengths\(a\) and \(b\) of its legs, is:

hypotenuse-length \(:=\lambda a[\lambda b[\sqrt{a^2 + b^2}]]\)

By the principle of \(\beta\)-reduction, we have, for example, thathypotenuse-length 3, theapplication ofhypotenuse-lengthto 3, is\(\lambda b[\sqrt{3^2 + b^2}]\), which is a function of that is‘waiting’ for another argument. The \(\lambda\)-termhypotenuse-length3 can be viewedas a function that computes the length of the hypotenuse of a righttriangle one of whose legs has length 3. We find, finally, that (hypotenuse-length3)4—theapplication ofhypotenuse-lengthto 3 and then to4—is 5, as expected.

Another way to understand the reduction of many-place functions toone-place functions is to imagine a machine \(M\) that initiallystarts out by loading the first \(a\) of multiple arguments \(a,b,\ldots\) into memory. If one then suspends the machine after it hasloaded the first argument into memory, one can view the result asanother machine M\(_a\) that is awaiting one fewer input; the firstargument is now fixed.

1.2 Non-Extensionality

An important philosophical issue concerning the \(\lambda\)-calculusis the question of its underlying concept of functions. In set theory,a function is standardly understood as a set of argument-value pairs.More specifically, a function is understood as a set \(f\) of orderedpairs satisfying the property that \((x,y) \in f\) and \((x,z) \in f\)implies \(y = z\). If \(f\) is a function and \((x,y) \in f\), thismeans that the function f assigns the value \(y\) to the argument\(x\). This is the concept offunctions-as-sets. Consequently, the notionof equality of functions-as-sets is equalityqua sets, which,under the standard principle of extensionality, entails that twofunctions are equal precisely when they contain the same orderedpairs. In other words, two functions are identical if and only if theyassign the same values to the same arguments. In this sense,functions-as-sets areextensional objects.

In contrast, the notion of a function at work in \(\lambda\)-calculusis one where functions are understood asrules: a function isgiven by a rule for how to determine its values from its arguments.More specifically, we can view a \(\lambda\)-term \(\lambda x[M]\) asa description of an operation that, given \(x\), produces \(M\); thebody \(M\) of the abstraction term is, essentially, arulefor what to do with \(x\). This is the conception offunctions-as-rules. Intuitively, given rules\(M\) and \(N\), we cannot in general decide whether \(\lambda x[M]\)is equal to \(\lambda x[N]\). The two terms might ‘behave’the same (have the same value given the same arguments), but it maynot be clear what resources are needed for showing the equality of theterms. In this sense, functions-as-rules arenon-extensionalobjects.

To distinguish the extensional concept of functions-as-sets from thenon-extensional concept of functions-as-rules, the latter is oftenreferred to as an‘intensional’ function concept,in part because of the ostensibly intensional concept of a ruleinvolved. This terminology is particularly predominant in thecommunity of mathematical logicians and philosophers of mathematicsworking on the foundations of mathematics. But from the perspective ofthe philosophy of language, the terminology can be somewhatmisleading, since in this context, the extensional-intensionaldistinction has a slightly different meaning.

In the standard possible-worlds framework of philosophical semantics,we would distinguish between an extensional and an intensionalfunction concept as follows. Let us say that two functions areextensionally equivalent at a world if and only if theyassign the same values to the same arguments at that world. And let ussay that two functions areintensionally equivalent if andonly if they assign the same values to the same arguments atevery possible-world. To illustrate, consider the functionshighest-mountain-on-earth andhighest-mountain-in-the-Himalayas,wherehighest-mountain-on-earthassignsthe highest mountain on earth as the value to every argument andhighest-mountain-in-the-Himalayasassigns the highest mountain in the Himalayas as the value to everyargument. The two functions are extensionally equivalent (at theactual world), but not intensionally so. At the actual world, the twofunctions assign the same value to every argument, namely Mt. Everest.Now consider a world where Mt. Everest is not the highest mountain onearth, but say, Mt. Rushmore is. Suppose further that this is so, justbecause Mt. Rushmore is 30.000 feet/9.100 m higher than it is at theactual world, while Mt. Everest, with its roughly 29.000 feet/8.800 m,is still the highest mountain in the Himalayas. At that world,highest-mountain-on-earthnowassigns Mt. Rushmore as the value to every argument, whilehighest-mountain-in-the-Himalayasstill assigns Mt. Everest to every object. In other words,highest-mountain-on-earthandhighest-mountain-in-the-Himalayasare extensionally equivalent (at the actual world) but notintensionally equivalent.

A function concept may now be calledextensional if and onlyif it requires functions that are extensionally equivalent at theactual world to be identical. And a function concept may be classifiedasintensional if and only if it requires intensionallyequivalent functions to be identical. Note that these classificationsare conceptually different from the distinctions commonly used in thefoundations of mathematics. On the terminology used in the foundationsof mathematics, functions-as-sets are classified as extensional sincethey use the axiom of extensionality as their criterion of identity,and functions-as-rules are classified as intensional because they relyon the ostensibly intensional concept of a rule. In the presentpossible-worlds terminology, function concepts are classified asextensional or intensional based of their behavior atpossible-worlds.

An issue from which conceptual confusion might arise is that the twoterminologies potentially pass different verdicts on the functionconcept at work in the \(\lambda\)-calculus. To see this, consider thefollowing two functions:

\[\begin{align}\addone &:= \lambda x[x+1] \\\addtwosubtractone &:= \lambda x[[x+2]-1] \end{align}\]

These two functions are clearly extensionally equivalent: they assignthe same value to the same input at the actual world. Moreover, givenstandard assumptions in possible worlds semantics, the two functionsare alsointensionally equivalent. If we assume thatmathematical facts, like facts about addition and subtraction, arenecessary in the sense that they are the same at every possible world,then we get that the two functions give the same value to thearguments atevery possible world. So, an intensionalfunction concept would require the two functions to be identical. Inthe \(\lambda\)-calculus, however, it’s not clear at all that weshould identify the two functions. Formally speaking, without the helpof some other principle, we cannot show that the two \(\lambda\)-termsdenote the same function. Moreover, informally speaking, on theconception offunctions-as-rules, it’s not even clearthat weshould identify them: the two terms involve genuinelydifferent rules, and so we might be tempted to say that they denotedifferent functions.

A function concept that allows for intensionally equivalent functionsto be distinct is calledhyperintensional.The point is that in possible-worlds terminology, the function conceptat work in the \(\lambda\)-calculus may be regarded not as intentionalbuthyperintensional—in contrast to what theterminology common in the foundations of mathematics says. Note thatit’s unclear how an intensional semantic framework, like thepossible-worlds framework, could even in principle account for anon-intensional function concept. On the semantics of the\(\lambda\)-calculus, see section7. The point here was simply to clarify any conceptual confusions thatmight arise from different terminologies at play in philosophicaldiscourse.

The hyperintensionality of the \(\lambda\)-calculus is particularlyimportant when it comes to its applications as a theory of not onlyfunctions, but more generally\(n\)-ary relations. On this,see section9.3. It is effectively the hyperintensionality of the \(\lambda\)-calculusthat makes it an attractive tool in this context. It should be noted,however, that the \(\lambda\)-calculus can be made extensional (aswell as intensional) by postulating additional laws concerning theequality of \(\lambda\)-terms. On this, see section5.

2. Syntax

The official syntax of the \(\lambda\)-calculus is quite simple; it iscontained in the next definition.

Definition For the alphabet of the language of the\(\lambda\)-calculus we take the left and right parentheses, left andright square brackets, the symbol ‘\(\lambda\)’, and aninfinite set of variables. The class of\(\lambda\)-terms is defined inductively asfollows:

Every variable is a \(\lambda\)-term.
If \(M\) and \(N\) are \(\lambda\)-terms, then so is\((MN)\).
If \(M\) is a \(\lambda\)-term and \(x\) is a variable, then\((\lambda x[M])\) is a \(\lambda\)-term.

By ‘term’ we always mean ‘\(\lambda\)-term’.Terms formed according to rule (2) are calledapplicationterms. Terms formed according to rule (3) are calledabstraction terms.

As is common when dealing with formal languages that have groupingsymbols (the left and right parenthesis, in our case), someparentheses will be omitted when it is safe to do so (that is, whenthey can be reintroduced in only one sensible way). Juxtaposing morethan two \(\lambda\)-terms is, strictly speaking, illegal. To avoidthe tedium of always writing all needed parentheses, we adopt thefollowing convention:

Convention (association to the left): When more thantwo terms \(M_1 M_2 M_3 \ldots M_n\) are juxtaposed we can recover themissing parentheses byassociating to theleft: reading from left to right, group \(M_1\) and\(M_2\) together, yielding \((M_1 M_2)M_3 \ldots M_n\); then group\((M_1 M_2)\) with \(M_3\): \(((M_1 M_2)M_3)\ldots M_n\), and soforth.

The convention thus gives a unique reading to any sequence of\(\lambda\)-terms whose length is greater than 2.

2.1 Variables, bound and free

The function of \(\lambda\) in an abstraction term \((\lambda x[M]\))is that itbinds the variable appearingimmediately after it in the term \(M\). Thus \(\lambda\) is analogousto the universal and existential quantifiers \(\forall\) and\(\exists\) of first-order logic. One can define, analogously, thenotions of free and bound variable in the expected way, asfollows.

Definition The syntactic functions \(\mathbf{FV}\)and \(\mathbf{BV}\) (for ‘free variable’ and ‘boundvariable’, respectively) are defined on the set of\(\lambda\)-terms by structural induction thus:

For every variable \(x\), term \(M\), and term \(N\):

\[\begin{array}{lll} &\text{Free} &\text{Bound} \\(1) &\mathbf{FV}(x) = \{ x \} \quad &\mathbf{BV}(x) = \varnothing \\(2) &\mathbf{FV}(MN) = \mathbf{FV}(M) \cup \mathbf{FV}(N) &\mathbf{BV}(MN) = \mathbf{BV}(M) \cup \mathbf{BV}(N) \\(3) &\mathbf{FV}(\lambda x[M]) = \mathbf{FV}(M) - \{ x \} &\mathbf{BV}(\lambda x[M]) = \mathbf{BV}(M) \cup \{ x \}\end{array}\]

If \(\mathbf{FV}(M) = \varnothing\) then \(M\) is called acombinator.

Clause (3) in the two definitions supports the intention that\(\lambda\) binds variables (ensures that they are not free). Note thedifference between \(\mathbf{BV}\) and \(\mathbf{FV}\) forvariables.

As is typical in other subjects where the concepts appear, such asfirst-order logic, one needs to be careful about the issue; a casualattitude about substitution can lead to syntactic difficulties.^[1] We can defend a casual attitude by adopting the convention that weare interested not in terms themselves, but in a certain equivalenceclass of terms. We now define substitution, and then lay down aconvention that allows us to avoid such difficulties.

Definition (substitution) We write ‘\(M[x :=N]\)’ to denote the substitution of \(N\) for the freeoccurrences of \(x\) in \(M\). A precise definition^[2] by recursion on the set of \(\lambda\)-terms is as follows: for allterms \(A\), \(B\), and \(M\), and for all variables \(x\) and \(y\),we define

\(x[x := M] \equiv M\)
\(y[x := M] \equiv y\) (\(y\) distinct from \(x)\)
\((AB)[x := M] \equiv A[x := M]B[x := M]\)
\((\lambda x[A])[x := M] \equiv \lambda x[A]\)
\((\lambda y[A])[x := M] \equiv \lambda y[A[x := M]]\) (\(y\)distinct from \(x)\)

Clause (1) of the definition simply says that if we are to substitute\(M\) for \(x\) and we are dealing simply with \(x\), then the resultis just \(M\). Clause (2) says that nothing happens when we aredealing (only) with a variable different from \(x\) but we are tosubstitute something for \(x\). Clause (3) tells us that substitutionunconditionally distributes over applications. Clauses (4) and (5)concern abstraction terms and parallel clauses (1) and (2) (or rather,clauses (2) and (1), in opposite order): If the bound variable \(z\)of the abstraction term \(\lambda z[A]\) is identical to the variable\(x\) for which we are to do a substitution, then we do not performany substitution (that is, substitution “stops”). Thiscoheres with the intention that \(M[x := N]\) is supposed to denotethe substitution of \(N\) for thefree occurrences of \(x\)in \(M\). If \(M\) is an abstraction term \(\lambda x[A]\) whose boundvariable is \(x\), then \(x\) does not occurr freely in \(M\), sothere is nothing to do. This explains clause 4. Clause (5), finally,says that if the bound variable of an abstraction term differs from\(x\), then at least \(x\) has the “chance ” to occurfreely in the abstraction term, and substitution continues into thebody of the abstraction term.

Definition (change of bound variables,\(\alpha\)-convertibility). The term \(N\) is obtained from the term\(M\) by achange of bound variables if,roughly, any abstraction term \(\lambda x[A]\) inside \(M\) has beenreplaced by \(\lambda y[A[x := y]]\).

Let us say that terms \(M\) and \(N\) are\(\alpha\)-convertible if there is asequence of changes of bound variables starting from \(M\) and endingat \(N\).

Axiom.\(\beta\)-conversion(stated with a no-capture proviso):

\( (\lambda x[M])N \rhd M[x := N]\),
provided no variable that occurrs free in \(N\) becomes bound afterits substitution into \(M\).

Roughly, we need to adhere to the principle that free variables oughtto remain free; when an occurrence of a variable is threatened tobecome bound by a substitution, simply perform enough\(\alpha\)-conversions to sidestep the problem. If we keep this inmind, we can work with \(\lambda\)-calculus without worrying aboutthese nettlesome syntactic difficulties. So, for example, wecan’t apply the function \(\lambda x[\lambda y[x(y-5)]]\) to theargument \(2y\) because upon substitution of “\(2y\)” for“\(x\)”, the “\(y\)” in “\(2y\)”would be captured by the variable-binding operator “\(\lambday\)”. Such a substitution would yield a function different fromthe one intended. However, we can first transform \(\lambda x[\lambday[x(y-5)]]\) to \(\lambda x[\lambda z[x(z-5)]]\) by\(\alpha\)-conversion, and then apply this latter function to theargument \(2y\). So whereas the following isnot a valid useof \(\beta\)-conversion: \[ (\lambda x[\lambda y[x(y-5)]])2y \rhd \lambda y[2y(y-5)]\] we can validly use\(\beta\)-conversion to conclude: \[ (\lambda x[\lambda z[x(z-5)]])2y \rhd \lambda z[2y(z-5)]\] This example helps oneto see why the proviso to \(\beta\)-conversion is so important. Theproviso is really no different from the one used in the statement ofan axiom of the predicate calculus, namely: \(\forall x\phi \to\phi^{\tau}_x\), provided no variable that is free in the term\(\tau\) before the substitution becomes bound after thesubstitution.

The syntax of \(\lambda\)-calculus is quite flexible. One can form allsorts of terms, even self-applications such as \(xx\). Such termsappear at first blush to be suspicious; one might suspect that usingsuch terms could lead to inconsistency, and in any case one might findoneself reaching for a tool with which to forbid such terms. If onewere to view functions and sets of ordered pairs of a certain kind,then the \(x\) in \(xx\) would be a function (set of ordered pairs)that contains as an element a pair \((x,y)\) whose first element wouldbe \(x\) itself. But no set can contain itself in this way, lest theaxiom of foundation (or regularity) be violated. Thus, from a settheoretical perspective such terms are clearly dubious. Below one canfind a brief sketch of one such tool, type theory. But in fact suchterms do not lead to inconsistency and serve a useful purpose in thecontext of \(\lambda\)-calculus. Moreover, forbidding such terms, asin type theory, does not come for free (e.g., some of theexpressiveness of untyped \(\lambda\)-calculus is lost).

2.2 Combinators

As defined earlier, acombinator is a\(\lambda\)-term with no free variables. One can intuitivelyunderstand combinators as ‘completely specified’operations, since they have no free variables. There are a handful ofcombinators that have proven useful in the history of\(\lambda\)-calculus; the next table highlights some of these specialcombinators. Many more could be given (and obviously there areinfinitely many combinators), but the following have concisedefinitions and have proved their utility. Below is a table of somestandard \(\lambda\)-terms and combinators.

Name	Definition & Comments
\(\bS\)	\(\lambda x[\lambda y[\lambda z[xz(yz)]]]\) Keep in mind that ‘\(xz(yz)\)’ is to be understood as theapplication \((xz)(yz)\) of \(xz\) to \(yz. \bS\) can thus beunderstood as a substitute-and-apply operator: \(z\)‘intervenes’ between \(x\) and \(y\): instead of applying\(x\) to \(y\), we apply \(xz\) to \(yz\).
\(\mathbf{K}\)	\(\lambda x[\lambda y[x]]\) The value of \(\mathbf{K}M\) is the constant function whose value forany argument is simply \(M.\)
\(\mathbf{I}\)	\(\lambda x[x]\) The identity function.
\(\mathbf{B}\)	\(\lambda x[\lambda y[\lambda z[x(yz)]]]\) Recall that ‘\(xyz\)’ is to be understood as \((xy)z\), sothis combinator is not a trivial identity function.
\(\mathbf{C}\)	\(\lambda x[\lambda y[\lambda z[xzy]]]\) Swaps an argument.
\(\mathbf{T}\)	\(\lambda x[\lambda y[x]]\) Truth value true. Identical to \(\mathbf{K}\). We shall see later howthese representations of truth values plays a role in the blending oflogic and \(\lambda\)-calculus.
\(\mathbf{F}\)	\(\lambda x[\lambda y[y]]\) Truth value false.
\(\boldsymbol{\omega}\)	\(\lambda x[xx]\) Self-application combinator
\(\boldsymbol{\Omega}\)	\(\boldsymbol{\omega \omega}\) Self-application of the self-application combinator. Reduces toitself.
\(\mathbf{Y}\)	\(\lambda f[(\lambda x[f(xx)])(\lambda x[f(xx)]\))] Curry’s paradoxical combinator. For every \(\lambda\)-term\(X\), we have: \[\begin{align}\mathbf{Y}X &\rhd (\lambda x[X(xx)])(\lambda x[X(xx)]) \\ &\rhd X((\lambda x[X(xx)])(\lambda x[X(xx)]))\end{align}\] The first step in the reduction shows that\(\mathbf{Y}\)X reduces to the application term \((\lambdax[X(xx)])(\lambda x[X(xx)]\)), which is recurring in the third step.Thus, \(\mathbf{Y}\) has the curious property that \(\mathbf{Y}\)X andX\((\mathbf{Y}\)X) reduce to a common term.
\(\boldsymbol{\Theta}\)	\((\lambda x[\lambda f[f(xxf)]])(\lambda x[\lambda f[f(xxf)]]\)) Turing’s fixed-point combinator. For every \(\lambda\)-term\(X\), \(\boldsymbol{\Theta}X\) reduces to\(X(\boldsymbol{\Theta}X)\), which one can confirm by hand.(Curry’s paradoxical combinator \(\mathbf{Y}\) does not havethis property.)

Below is a table of notational conventions employed in this entry.

Notation	Reading & Comments
\(MN\)	The application of the function \(M\) to the argument \(N\). Usually, parentheses are used to separate the function from theargument, like so: ‘\(M(N)\)’. However, in\(\lambda\)-calculus and kindred subjects the parentheses are used asgrouping symbols. Thus, it is safe to write the function and theargument adjacent to one other.
\(PQR\)	The application of the function \(PQ\)—which is itself theapplication of the function \(P\) to the argument \(Q\)—to\(R\). If we do not use parentheses to separate function and argument, howare we to disambiguate expressions that involve three or more terms,such as ‘\(PQR\)’? Recall our convention that we are tounderstand such officially illegal expressions by working from left toright, always putting parentheses around adjacent terms. Thus,‘\(PQR\)’ is to be understood as \((PQ)R\).‘\(PQRS\)’ is \(((PQ)R)S\). The expression‘\((PQ)R\)’ is disambiguated; by our convention, it isidentical to \(PQR\). The expression ‘\(P(QR)\)’ is alsoexplicitly disambiguated; it is distinct from \(PQR\) because it isthe application of \(P\) to the argument \(QR\) (which is itself theapplication of the function \(Q\) to the argument \(R)\).
\((\lambda x[M])\)	The \(\lambda\) term thatbinds thevariable \(x\) in the \(\boldsymbol{body}\) term \(M\). The official vocabulary of the \(\lambda\)-calculus consists of thesymbol ‘\(\lambda\)’, left ‘(’and right‘)’ parentheses, and a set of variables (assumed to bedistinct from the three symbols ‘\(\lambda\)’,‘(’, and ‘)’ lest we have syntacticchaos). Alternative notation. It is not necessary to includetwo kinds of grouping symbols (parentheses and square brackets) in thesyntax. Parentheses or square brackets alone would obviously suffice.The two kinds of brackets are employed in this entry for the sake ofreadability. Given the two kinds of grouping symbols, we couldeconomize further and omit the parentheses from abstraction terms, sothat ‘\((\lambda x[M]\))’ would be written as‘\(\lambda x[M]\)’. Some authors write ‘\(\lambda x.M\)’ or ‘\(\lambdax\cdot M\)’, with a full stop or a centered dot separating thebound variable from the body of the abstraction term. As with thesquare brackets, these devices are intended to assist reading\(\lambda\)-terms; they are usually not part of the official syntax.(One sees this device used in earlier works of logic, such asPrincipia Mathematica, where the function of the symbol. in expressions such as ‘\(\forallx\).\(\phi\)’ is to get us to read the whole ofthe formula \(\phi\) as under the scope of the \(\forall x\).) Some authors write abstraction terms without any device separating thebound variable from the body: such terms are crisply written as, e.g.,‘\(\lambda xx\)’, ‘\(\lambda yx\)’. Thepractice is not without its merits: it is about as concise as one canask for, and permits an even simpler official syntax of the\(\lambda\)-calculus. But this practice is not flawless. In‘\(\lambda xyz\)’, is the bound variable \(x\) or is it\(xy\)? Usually the names of variables are single letters, andtheoretically this is clearly sufficient. But it seems undulyrestrictive to forbid the practice of giving longer names tovariables; indeed, such constructions arise naturally in computerprogramming languages. For the sake of uniformity, we will adopt the square bracket notationin this entry. (Incidentally, this notation is used in (Turing,1937).)
\(M[x := A]\)	The \(\lambda\)-term that is obtained by substituting the\(\lambda\)-term A for all free occurrences of \(x\) inside \(M\). A bewildering array of notations to represent substitution can befound in the literature on \(\lambda\)-calculus and kindredsubjects: \[M[x/A], M[A/x], M_{x}^A, M_{A}^x, [x/A]M,\ldots\] Which notation to use for substitution seems to be a personal matter.In this entry we use a linear notation, eschewing superscripts andsubscripts. The practice of representing substitution with‘:=’ comes from computer science, where ‘:=’is read in some programming languages as assigning a value to avariable. As with the square brackets employed to write abstraction terms, thesquare brackets employed to write substitution are not officially partof the syntax of the \(\lambda\)-calculus. \(M\) and A are terms,\(x\) is a variable; \(M[x := A]\) is another term.
\(M \equiv N\)	The \(\lambda\)-terms \(M\) and \(N\) are identical: understoodas sequences of symbols, \(M\) and \(N\) have the same length andcorresponding symbols of the sequences are identical. The syntactic identity relation \(\equiv\) is not part of the officialsyntax of \(\lambda\)-calculus; this relation between\(\lambda\)-terms belongs to the metatheory of \(\lambda\)-calculus.It is clearly a rather strict notion of equality between\(\lambda\)-terms. Thus, it is not the case (if \(x\) and \(y\) aredistinct variables) that\(\lambda x[x] \equiv\lambda y[y]\), even though these two terms clearly‘behave’ in the same way in the sense that both areexpressions of the identity operation \(x \Rightarrow x\). Later wewill develop formal theories of equality of \(\lambda\)-terms with theaim of capturing this intuitive equality of \(\lambda x[x]\) and\(\lambda y[y]\).

3. Brief history of \(\lambda\)-calculus

\(\lambda\)-calculus arose from the study of functions as rules.Already the essential ingredients of the subject can be found inFrege’s pioneering work (Frege, 1893). Frege observed, as we didabove, that in the study of functions it is sufficient to focus onunary functions (i.e., functions that take exactly one argument). (Theprocedure of viewing a multiple-arity operation as a sequence ofabstractions that yield an equivalent unary operation is calledcurrying the operation. Perhaps it would bemore historically accurate to call the operationfregeing, but there are often miscarriagesof justice in the appellation of mathematical ideas.) In the 1920s,the mathematician Moses Schönfinkel took the subject further withhis study of so-calledcombinators. As was common in theearly days of the subject, Schönfinkel was interested in thekinds of transformations that one sees in formal logic, and hiscombinators were intended to be a contribution to the foundations offormal logic. By analogy with the reduction that one sees in classicalpropositional logic with the Sheffer stroke, Schöfinkelestablished the astonishing result that the all functions (in thesense of all transformations) could be given in terms of thecombinators \(\mathbf{K}\) and \(\bS\); later we will see thedefinition of these combinators.

Theorem For every term \(M\) made up of\(\mathbf{K}\) and \(\bS\) and the variable \(x\), there exists a term\(F\) (built only from \(\mathbf{K}\) and \(\bS)\) such that we canderive \(Fx = M\).

(The proof that these two suffice to represent all functions is beyondthe scope of this entry. For further discussion, see the entry oncombinatory logic.) One can prove the theorem constructively: there is an algorithm that,given \(M\), produces the required \(F\). Church called this \(F\)‘\(\lambda x[M]\)’ (Church, 1932).^[3] From this perspective, the \(\beta\)-rule can be justified: if‘\(\lambda x[M]\)’ is to be a function \(F\) satisfying\(Fx = M\), then \(\lambda x[M]\)x should transform to \(M\). This isjust a special case of the more general principle that for all \(N,(\lambda x[M])N\) should transform to \(M[x := N]\).

Although today we have more clearly delimited systems of abstractionand rewriting, in its early days \(\lambda\)-calculus and combinatorylogic (à la Schönfinkel) were bound up with investigationsof foundations of mathematics. In the hands of Curry, Church, Kleene,and Rosser (some of the pioneers in the subject) the focus was ondefining mathematical objects and carrying out logical reasoninginside the these new systems. It turned out that these early attemptsat so-called illative \(\lambda\)-calculus and combinatory logic wereinconsistent. Curry isolated and polished the inconsistency; theresult is now known as Curry’s paradox. See the entry onCurry’s paradox and appendix B of (Barendregt, 1985).

The \(\lambda\)-calculus earns a special place in the history of logicbecause it was the source of the first undecidable problem. Theproblem is: given \(\lambda\)-terms \(M\) and \(N\), determine whether\(M = N\). (A theory of equational reasoningabout \(\lambda\)-terms has not yet been defined; the definition willcome later.) This problem was shown to be undecidable.

Another early problem in the \(\lambda\)-calculus was whether it isconsistent at all. In this context, inconsistency means that all termsare equal: one can reduce any \(\lambda\)-term \(M\) to any other\(\lambda\)-term \(N\). That this is not the case is an early resultof \(\lambda\)-calculus. Initially one had results showing thatcertain terms were not interconvertible (e.g., \(\mathbf{K}\) and\(\bS)\); later, a much more powerful result, the so-calledChurch-Rosser theorem, helped shed more light on \(\beta\)-conversionand could be used to give quick proofs of the non-inter-convertibilityof whole classes of \(\lambda\)-terms. See below for more detaileddiscussion of consistency.

The \(\lambda\)-calculus was a somewhat obscure formalism until the1960s, when, at last, a ‘mathematical’ semantics wasfound. Its relation to programming languages was also clarified. Tillthen the only models of \(\lambda\)-calculus were‘syntactic’, that is, were generated in the style ofHenkin and consisted of equivalence classes of \(\lambda\)-terms (forsuitable notions of equivalence). Applications in the semantics ofnatural language, thanks to developments by Montague and otherlinguists, helped to ‘spread the word’ about the subject.Since then the \(\lambda\)-calculus enjoys a respectable place inmathematical logic, computer science, linguistics (see, e.g., Heim andKratzer 1998), and kindred fields.

4. Reduction

Various notions of reduction for \(\lambda\)-terms are available, butthe principal one is \(\beta\)-reduction, which we have already seenearlier. Earlier we used the notation ‘\(\rhd\)’; we canbe more precise. In this section we discuss \(\beta\)-reduction andsome extensions.

Definition (one-step \(\beta\)-reduction\(\rhd_{\beta ,1})\) For \(\lambda\)-terms \(A\) and \(B\), we saythat \(A\) \(\beta\)-reduces in one step to \(B\), written \(A\rhd_{\beta ,1} B\), just in case there exists an (occurrence of a)subterm \(C\) of \(A\), a variable \(x\), and \(\lambda\)-terms \(M\)and \(N\) such that \(C \equiv(\lambda x[M])N\) and \(B\) is \(A\)except that the occurrence of \(C\) in \(A\) is replaced by \(M[x :=N]\).

Here are some examples of \(\beta\)-reduction:

The variable \(x\) does not \(\beta\)-reduce to anything. (It does nothave the right shape: it is simply a variable, not an application termwhose left-hand side is an abstraction term.)
\((\lambda x[x])a \rhd_{\beta ,1} a\).
If \(x\) and \(y\) are distinct variables, then \((\lambda x[y])a\rhd_{\beta ,1} y\).
The \(\lambda\)-term \((\lambda x[(\lambda y[xy])a])b]\)\(\beta\)-reduces in one step to two different \(\lambda\)-terms:
\[(\lambda x[(\lambda y[xy])a])b \rhd_{\beta ,1} (\lambda y[by])a\]
and
\[(\lambda x[(\lambda y[xy])a])b \rhd_{\beta ,1} (\lambda x[xa])b\]
Moreover, one can check that these two terms \(\beta\)-reduce in onestep to a common term: \(ba\). We thus have:
\((\lambda y[by])a\)
\(\nearrow\) \(\searrow\)
\((\lambda x[(\lambda y[xy])a])b\) \(ba\)
\(\searrow\) \(\nearrow\)
\((\lambda x[xa])b\)

As with any binary relation, one can ask many questions about therelation \(\rhd_{\beta ,1}\) holding between \(\lambda\)-terms, andone can define various derived notions in terms of \(\rhd_{\beta,1}\).

Definition A\(\beta\)-reductionsequence from a \(\lambda\)-term \(A\) to a\(\lambda\)-term \(B\) is a finite sequence \(s_1 , \ldots s_n\) of\(\lambda\)-terms starting with \(A\), ending with \(B\), and whoseadjacent terms \((s_k,s_{k+1})\) satisfy the property that \(s_k\rhd_{\beta ,1} s_{k+1}\).

More generally, any sequence \(s\)—finite orinfinite—starting with a \(\lambda\)-term \(A\) is said to be a\(\beta\)-reduction sequence commencing with \(A\) provided that theadjacent terms \((s_k,s_{k+1})\) of \(s\) satisfy the property that\(s_k \rhd_{\beta ,1} s_{k+1}\).

Continuing with \(\beta\)-reduction Example 1, there are no\(\beta\)-reduction sequences at all commencing with the variable\(x\).
Continuing with \(\beta\)-reduction Example 2, the two-termsequence
\[(\lambda x[x])a, a\]
is a \(\beta\)-reduction sequence from \((\lambda x[x])a\) to \(a\).If \(a\) is a variable, then this \(\beta\)-reduction sequence cannotbe prolonged, and there are no other \(\beta\)-reduction sequencescommencing with \((\lambda x[x])a\); thus, the set of\(\beta\)-reduction sequences commencing with \((\lambda x[x])a\) isfinite and contains no infinite sequences.
The combinator\(\boldsymbol{\Omega}\) has the curious property that \(\Omega\rhd_{\beta ,1} \Omega\). Every term of every \(\beta\)-reductionsequence commencing with \(\boldsymbol{\Omega}\) (finite or infinite)is equal to \(\boldsymbol{\Omega}\).
Consider the term \(\mathbf{K}a\boldsymbol{\Omega}\). There areinfinitely many reduction sequences commencing with this term:
- \(\bK a\boldsymbol{\Omega} \rhd_{\beta ,1} a\)
- \(\bK a\boldsymbol{\Omega} \rhd_{\beta ,1} \bKa\boldsymbol{\Omega} \rhd_{\beta ,1} a\)
- \(\bK a\boldsymbol{\Omega} \rhd_{\beta ,1} \bKa\boldsymbol{\Omega} \rhd_{\beta ,1} \bK a\boldsymbol{\Omega}\rhd_{\beta ,1} a\)
- \(\bK a\boldsymbol{\Omega} \rhd_{\beta ,1} \bKa\boldsymbol{\Omega} \rhd_{\beta ,1} \bK a\boldsymbol{\Omega}\ldots\)
If \(a\) is a variable, one can see that all finite reductionsequences commencing with \(\bK a\boldsymbol{\Omega}\) end at \(a\),and there is exactly one infinite reduction sequence.

Definition A\(\beta\)-redex of a \(\lambda\)-term \(M\)is (an occurrence of) a subterm of \(M\) of the form \((\lambdax[P])Q\). (‘redex’ comes from ‘reducibleexpression.) A \(\beta\)-redex is simply a candidate for anapplication of \(\beta\)-reduction. Doing so, onecontracts the \(\beta\)-redex. A term issaid to be in\(\beta\)-normal form if ithas no \(\beta\)-redexes.

(Can a term have multiple \(\beta\)-normal forms? The answer isliterally ‘yes’, but substantially the answer is‘no’: If a \(M\) and \(M'\) are \(\beta\)-normal forms ofsome term, then \(M\) is \(\alpha\)-convertible to \(M'\) Thus,\(\beta\)-normal forms are unique up to changes of boundvariables.)

So far we have focused only on one step of \(\beta\)-reduction. Onecan combine multiple \(\beta\)-reduction steps into one by taking thetransitive closure of the relation \(\rhd_{\beta ,1}\).

Definition For \(\lambda\)-terms \(A\) and \(B\), onesays that \(A\)\(\beta\)-reduces to \(B\),written \(A \rhd_{\beta} B\), if either \(A \equiv B\) or there existsa finite \(\beta\)-reduction sequence from \(A\) to \(B\).

Definition A term \(M\)has a\(\beta\)-normal form if there exists a term \(N\) suchthat \(N\) is in \(\beta\)-normal form an \(M \rhd_{\beta} N\).

Reducibility as defined is a one-way relation: it is generally nottrue that if \(A \rhd_{\beta} B\), then \(B \rhd_{\beta} A\). However,depending on one’s purposes, one may wish to treat \(A\) and\(B\) as equivalent if either \(A\) reduces to \(B\) or \(B\) reducesto \(A\). Doing so amounts to considering the reflexive, symmetric,and transitive closure of the relation \(\rhd_{\beta ,1,}\).

Definition For \(\lambda\)-terms \(A\) and \(B\), wesay that \(A =_{\beta} B\) if either \(A \equiv B\) or there exists asequence \(s_1 , \ldots s_n\) starting with \(A\), ending with \(B\),and whose adjacent terms \((s_k,s_{k+1})\) are such that either \(s_k\rhd_{\beta ,1} s_{k+1}\) or \(s_{k+1} \rhd_{\beta ,1} s_k\).

4.1 Other notions of reduction

We have thus far developed the theory of \(\beta\)-reduction. This isby no means the only notion of reduction available in the\(\lambda\)-calculus. In addition to \(\beta\)-reduction, a standardrelation between \(\lambda\)-terms is that of\(\eta\)-reduction:

Definition (one-step \(\eta\)-reduction) For\(\lambda\)-terms \(A\) and \(B\), we say that \(A\) \(\beta\eta\)-reduces in one step to \(B\), written \(A \rhd_{\beta \eta ,1}B\), just in case there exists an (occurrence of a) subterm \(C\) of\(A\), a variable \(x\), and \(\lambda\)-terms \(M\) and \(N\) suchthat either

\(C \equiv(\lambda x[M])N\) and \(B\) is \(A\) except that theoccurrence of \(C\) in \(A\) is replaced by \(M[x := N]\)

\(C \equiv(\lambda x[Mx]\)) and \(B\) is \(A\) except that theoccurrence of \(C\) in \(A\) is replaced by \(M\).

The first clause in the definition of \(\rhd_{\beta \eta ,1}\) ensuresthat the relation extends the relation of one-step\(\beta\)-reduction. As we did for the relation of one-step\(\beta\)-reduction, we can replay the development for\(\eta\)-reduction. Thus, one has the notion of an \(\eta\)-redex, andfrom \(\rhd_{\eta ,1}\) one can define the relation \(\rhd_{\eta}\)between \(\lambda\)-terms as the reflexive and transitive closure of\(\rhd_{\eta ,1}\), which captures zero-or-more-steps of\(\eta\)-reduction. Then one defines \(=_{\eta}\) as the symmetric andtransitive closure of \(\rhd_{\eta}\).

If \(A \rhd_{\eta ,1} B\), then the length of \(B\) is strictlysmaller than that of \(A\). Thus, there can be no infinite\(\eta\)-reductions. This is not the case of \(\beta\)-reduction, aswe saw above in\(\beta\)-reduction sequence examples 3 and4.

One can combine notions of reduction. One useful combination is toblend \(\beta\)- and \(\eta\)-reduction.

Definition (one-step \(\beta \eta\)-reduction)\(\lambda x[Mx] \rhd_{\beta \eta ,1} M\) and \((\lambda x[M]N))\rhd_{\beta \eta ,1} M[x := N]\). A \(\lambda\)-term \(A\) \(\beta\eta\)-reduces in one step to a \(\lambda\)-term \(B\) just in caseeither \(A\) \(\beta\)-reduces to \(B\) in one step or \(A\)\(\eta\)-reduces to \(B\) in one step.

Again, one can replay the basic concepts of reduction, as we did for\(\beta\)-reduction, for this new notion of reduction \(\beta\eta\).

4.2 Reduction strategies

Recall that a term is said to be in \(\beta\)-normal form if it has no\(\beta\)-redexes, that is, subterms of the shape \((\lambda x[M]\))N.A term has a \(\beta\)-normal form if it can be reduced to a term in\(\beta\)-normal form. It should be intuitively clear that if a termhas a \(\beta\)-normal form, then we can find one by exhaustivelycontracting all all \(\beta\)-redexes of the term, then exhaustivelycontracting all \(\beta\)-redexes of all resulting terms, and soforth. To say that a term has a \(\beta\)-normal form amounts tosaying that this blind search for one will eventually terminate.

Blind search for \(\beta\)-normal forms is not satisfactory. Inaddition to be aesthetically unpleasant, it can be quite inefficient:there may not be any need to exhaustively contract all\(\beta\)-redexes. What is wanted is astrategy—preferably, a computable one—for findinga \(\beta\)-normal form. The problem is to effectively decide, ifthere are multiple \(\beta\)-redexes of a term, which ought to bereduced.

Definition A\(\beta\)-reductionstrategy is a function whose domain is the set of all\(\lambda\)-terms and whose value on a term \(M\) not in\(\beta\)-normal form is a redex subterm of \(M\), and whose value onall terms M in \(\beta\)-normal form is simply \(M\).

In other words, a \(\beta\)-reduction strategy selects, whenever aterm has multiple \(\beta\)-redexes, which one should be contracted.(If a term is in \(\beta\)-normal form, then nothing is to be done,which is why we require in the definition of \(\beta\)-reductionstrategy that it does not change any term in \(\beta\)-normal form.)One can represent a strategy \(S\) as a relation \(\rhd_S\) on\(\lambda\)-terms, with the understanding that \(M \rhd_S N\) providedthat \(N\) is obtained from \(M\) in one step by adhering to thestrategy S. When viewed as relations, strategies constitute asubrelation of \(\rhd_{\beta ,1}\).

A \(\beta\)-reduction strategy may or may not have the property thatadhering to the strategy will ensure that we (eventually) reach a\(\beta\)-normal form, if one exists.

Definition A \(\beta\)-reduction strategy \(S\) isnormalizing if for all \(\lambda\)-terms\(M\), if \(M\) has a \(\beta\)-normal form \(N\), then the sequence\(M, S(M), S(S(M)),\ldots\) terminates at \(N\).

Some \(\beta\)-reduction strategies are normalizing, but others arenot.

Therightmost strategy, whereby wealways choose to reduce the rightmost \(\beta\)-redex (if there areany \(\beta\)-redexes) is not normalizing. Consider, for example, thetermKI\(\Omega\). This term has two\(\beta\)-redexes: itself, and\(\Omega\) (which,recall, is the term\(\omega\omega\equiv(\lambda\)x[\(xx])(\lambda\)x[\(xx]\))). Byworking with left-hand \(\beta\)-redexes, we can \(\beta\)-reduceKI\(\Omega\) to \(\mathbf{I}\) in two steps. If weinsist on working with the rightmost \(\beta\)-redex\(\Omega\) we reduceKI(\(\Omega\))to \(\mathbf{KI}\)(\(\Omega \)), then\(\mathbf{KI}\)(\(\Omega\)), ….
Theleftmost strategy, whereby we alwayschoose to reduce the leftmost \(\beta\)-redex (if there are any\(\beta\)-redexes) is normalizing. The proof of this fact is beyondthe scope of this entry; see (Barendregt, 1985, section 13.2) fordetails.

Once we have defined a reduction strategy, it is natural to askwhether one can improve it. If a term has a \(\beta\)-normal form,then a strategy will discover a normal form; but might there be ashorter \(\beta\)-reduction sequence that reaches the same normal form(or a term that is \(\alpha\)-convertible to that normal form)? Thisis the question ofoptimality. Defining optimal strategiesand showing that they are optimal is generally considerably moredifficult than simply defining a strategy. For more discussion, see(Barendregt, 1984 chapter 10).

For the sake of concreteness, we have discussed only\(\beta\)-reduction strategies. But in the definitions above thenotion of reduction \(\beta\) is but one possibility. For any notion\(R\) of reduction we have the associated theory of \(R\)-reductionstrategies, and one can replay the problems of normalizability,optimality, etc., for \(R\).

5. \(\lambda\)-theories

We discussed earlier how the \(\lambda\)-calculus is a non-extensionaltheory of functions. If, in the non-extensional spirit, we understand\(\lambda\)-terms as descriptions, how should we treat equality of\(\lambda\)-terms? Various approaches are available. In this section,let us treat the equality relation = as a primitive, undefinedrelation holding between two \(\lambda\)-terms, and try toaxiomatize the properties that equality should have. The taskis to identity axioms and formulate suitablerules ofinference concerning the equality of\(\lambda\)-terms.

Some obvious properties of equality, having nothing to do with\(\lambda\)-calculus, are as follows:

\[\tag{Reflexivity}\frac{}{X=X}\] \[\tag{Symmetry}\frac{X=Y}{Y=X}\]\[\tag{Transitivity}\frac{X=Y \quad Y=Z}{X=Z}\]

As is standard in proof theory, the way to read these rules ofinference is that above the horizontal rule \(\frac{}{\phantom{X=X}}\)are thepremises of the rule (which areequations) and the equation below the horizontal rule is theconclusion of the rule of inference. In thecase of the reflexivity rule, nothing is written above the horizontalrule. We understand such a case as saying that, for all terms \(X\),we may infer the equation \(X = X\) from no premises.

5.1 The basic theory \(\lambda\)

The three rules of inference listed in the previous section governingequality have nothing to do with the \(\lambda\)-calculus. Thefollowing lists rules of inference that relate the undefined notion ofequality and the two term-building operations of the\(\lambda\)-calculus, application and abstraction.

\[\frac{M=N}{AM=AN} \quad \frac{M=N}{MA=NA}\]\[\tag{\(\boldsymbol{\xi}\)}\frac{M=N}{\lambda x[M] = \lambda x[N]}\]

Together, these rules of inference say that = is acongruence relation on the set of\(\lambda\)-terms: it ‘preserves’ both the application andabstraction term-building operations

The final rule of inference, \(\beta\)-conversion, is the mostimportant:

\[\tag{\(\boldsymbol{\beta}\)}\frac{}{(\lambda x[M])A = M[x := A]}\]

As before with the reflexivity rule, the rule \(\boldsymbol{\beta}\)has no premises: for any variable \(x\) and any terms \(M\) and \(A\),one can infer the equation \((\lambda x[M])A = M[x := A]\) at anypoint in a formal derivation in the theory\(\boldsymbol{\lambda}\).

5.2 Extending the basic theory \(\lambda\)

A number of extensions to \(\boldsymbol{\lambda}\) are available.Consider, for example, the rule (\(\boldsymbol{\eta}\)), whichexpresses the principle of \(\eta\)-reduction as a rule ofinference:

\[\tag{\(\boldsymbol{\eta}\)}\frac{}{\lambda x[Mx] = M} \text{ provided } x \not\in \mathbf{FV}(M)\]

Rule \(\boldsymbol{\eta}\) tells us that a certain kind of abstractionis otiose: it is safe to identify \(M\) with the function that, givenan argument \(x\), applies \(M\) to \(x\). Through this rule we canalso see that all terms are effectively functions. One can intuitivelyjustify this rule using the principle of \(\beta\)-reduction.

\[\tag{\(\mathbf{Ext}\)}\frac{Mx=Nx}{M=N}\text{ provided } x \not\in \mathbf{FV}(M) \cup \mathbf{FV}(N)\]

One can view rule \(\mathbf{Ext}\) as a kind of generalizationprinciple. If we have derived that \(Mx = Nx\), but \(x\) figures inneither \(M\) nor \(N\), then we have effectively shown that \(M\) and\(N\) are alike. Compare this principle to the principle of universalgeneralization in first-order logic: if we have derived \(\phi(x)\)from a set \(\Gamma\) of hypotheses in which \(x\) is not free, thenwe can conclude that \(\Gamma\) derives \(\forall x\phi\).

Another productive principle in the \(\lambda\)-calculus permits us toidentify terms that ‘act’ the same:

\[\tag{\(\boldsymbol{\omega}\)}\frac{\text{For all terms }x, Mx=Nx}{M=N}\]

The rule \(\boldsymbol{\omega}\) has infinitely many hypotheses: onthe assumption that \(Mx = Nx\), no matter what \(x\) may be, then wecan conclude that \(M = N\). The \(\boldsymbol{\omega}\) rule is ananalogue in the \(\lambda\)-calculus of the rule of inference underthe same name in formal number theory, according to which one canconclude the universal formula \(\forall x\phi\) provided one hasproofs for \(\phi(x := \mathbf{0}), \phi(x := \mathbf{1}),\ldots\) .Note that unlike the rule \(\mathbf{Ext}\), the condition that \(x\)not occur freely in \(M\) or \(N\) does not arise.

6. Consistency of the \(\lambda\)-calculus

Is the \(\lambda\)-calculus consistent? The question might not bewell-posed. The \(\lambda\)-calculus is not a logic for reasoningabout propositions; there is no apparent notion of contradiction\((\bot)\) or a method of forming absurd propositions (e.g., \(p\wedge \neg p)\). Thus ‘inconsistency’ of the\(\lambda\)-calculus cannot mean that \(\bot\), or some formulatantamount to \(\bot\), is derivable. A suitable notion of‘consistent’ is, however, available. Intuitively, a logicis inconsistent if it permits us to derive too much. The theory\(\lambda\) is a theory of equations. We can thustake inconsistency of\(\lambda\) to mean:allequations are derivable. Such a property, if it were true of\(\lambda\), would clearly show that\(\lambda\) is of little use as a formal theory.

Early formulations of the idea of \(\lambda\)-calculus by A. Churchwere indeed inconsistent; see (Barendregt, 1985, appendix 2) or(Rosser, 1985) for a discussion. To take a concrete problem: how do weknow that the equation \(\bK = \mathbf{I}\) is not a theorem of\(\lambda\)? The two terms are obviously intuitivelydistinct. \(\bK\) is a function of two arguments, whereas\(\mathbf{I}\) is a function of one argument. If we could show that\(\bK = \mathbf{I}\), then we could show that \(\mathbf{KK} =\mathbf{IK}\), whence \(\mathbf{KK} = \bK\) would be a theorem of\(\lambda\), along with many other equations thatstrike us as intuitively unacceptable. But when we’reinvestigating a formal theory such as\(\lambda\),intuitive unacceptability by no means implies underivability. What ismissing is a deeper understanding of \(\beta\)-reduction.

An early result that gave such an understanding is known as theChurch-Rosser theorem:

Theorem (Church-Rosser) If \(P \rhd_{\beta} Q\) and\(P \rhd_{\beta}\) R, then there exists a term \(S\) such that both\(Q \rhd_{\beta} S\) and \(R \rhd_{\beta} S\).

(The proof of this theorem is quite non-trivial and is well-beyond thescope of this entry.) The result is a deep fact about\(\beta\)-reduction. It says that no matter how we diverge from \(P\)by \(\beta\)-reductions, we can always converge again to a commonterm.

The Church-Rosser theorem gives us, among other things, that the plain\(\lambda\)-calculus—that is, the theory\(\lambda\) of equations between\(\lambda\)-terms—is consistent, in the sense that not allequations are derivable.

As an illustration, we can use the Church-Rosser theorem to solve theearlier problem of showing that the two terms \(\bK\) and\(\mathbf{I}\) are not identified by\(\lambda\). Thetwo terms are in \(\beta\)-normal form, so from them there are no\(\beta\)-reduction sequences at all. If \(\bK = \mathbf{I}\) were atheorem of\(\lambda\), then there would be a term\(M\) from which there is a \(\beta\)-reduction path to both\(\mathbf{I}\) and \(\bK\). The Church-Rosser theorem then implies thetwo paths diverging from \(M\) can be merged. But this is impossible,since \(\bK\) and \(\mathbf{I}\) are distinct \(\beta\)-normalforms.

The Church-Rosser theorem implies the existence of \(\beta\)-reductionsequences commencing from \(\bK\) and from \(\mathbf{I}\) that end ata common term. But there are no \(\beta\)-reduction sequences at allcommencing from \(\mathbf{I}\), because it is in \(\beta\)-normalform, and likewise for \(\bK\).

Theorem\(\lambda\) is consistent,in the sense that not every equation is a theorem.

To prove the theorem, it is sufficient to produce one underivableequation. We have already worked through an example: we used theChurch-Rosser theorem to show that the equation \(\bK = \mathbf{I}\)is not a theorem of\(\lambda\). Of course,there’s nothing special about these two terms. A significantgeneralization of this result is available:if \(M\) and \(N\) in\(\beta\)-normal form but \(M\) is distinct from \(N\), then theequation \(M = N\) is not a theorem of\(\lambda\). (This simple condition forunderivability does not generally hold if we add additional rules ofinference to\(\lambda\).)

The theories\(\lambda \eta\) and\(\lambda\omega\) are likewise consistent. One can prove theseconsistency results along the lines of the consistency proof for\(\lambda\) by extending the Church-Rosser theorem tothe wider senses of derivability of these theories.

7. Semantics of \(\lambda\)-calculus

As we’ve said at the outset, the \(\lambda\)-calculus is, atheart, about functions and their applications. But it is surprisinglydifficult to cash this idea out in semantic terms. A natural approachwould be to try to associate with every \(\lambda\)-term \(M\) afunction \(f_M\) over some domain \(D\) and to interpret applicationterms \((MN)\) using function application as \(f_M(f_N).\) But thisidea quickly runs into difficulties. To begin with, it’s easy tosee that, in this context, we can’t use the standardset-theoretic concept of functions-as-sets (see section1.2 of this entry). According to this concept, remember, a function \(f\)is a set of argument-value pairs, where every argument gets assigned aunique value. The problem arises in the context ofself-applications. Remember from section2.1 that the untyped \(\lambda\)-calculus allows \(\lambda\)-terms suchas \((xx)\), which intuitively apply \(x\) to itself. On the semanticpicture we’re exploring, we can obtain the associated function\(f_{(xx)}\) for the term \((xx)\) by taking the function \(f_x\) for\(x\) and applying it to itself:

\[f_{(xx)}=f_x(f_x)\]

But following functions-as-sets, this would mean that theset\(f_x\) needs to contain an argument-value pair that has \(f_x\) asits first component and \(f_{(xx)}\) as the second:

\[f_{x}=\{\ldots, (f_{x},f_{(xx)})), \ldots\}\]

But this would make \(f_x\) a non-well-founded object: defining\(f_x\) would involve \(f_x\) itself. In fact, sets like this areexcluded in standard axiomatic set theory by the axiom of foundation(also known as the axiom of regularity). —This is furthersemantic evidence that the concept of a function underlyingthe \(\lambda\)-calculus can’t be the extensionalfunctions-as-sets concept.

But the problem runs even deeper than that. Even when we use anon-extensional notion of a function, such as the functions-as-rulesconception (see again section1.2), we run into difficulties. In the untyped \(\lambda\)-calculus,everything can both be function and an argument to functions.Correspondingly, we should want our domain \(D\) to include, in somesense, the function space \(D^D\), which contains all and only thefunctions with both arguments and values from \(D\). To see this:

Every element of \(D\) can be a function that applies to elementsof \(D\), and what’s returned can then be again be an argumentfor elements of \(D\). So, every element of \(D\) intuitivelycorresponds to a member of \(D^D\).
If, in turn, we take a member of \(D^D\), i.e., a function witharguments and values from \(D\), this is precisely the kind of thingwe want to include in our domain \(D\). So, intuitively, we want everymember of \(D^D\) to correspond to a member of \(D\).

In short, we want there to be aone-to-one correspondencebetween our domain and its own function space, i.e., we want them tosatisfy the ‘equation’ \(X\cong X^X\). But this isimpossible since it contradicts Cantor’s theorem.

Given these difficulties, the question arises whether it’spossible to give a set-theoretic model for the \(\lambda\)-calculus inthe first place? It turns out that it is. D. Scott was the first todescribe such a model in an unpublished manuscript from 1969. Thismodel, \(D_\infty\), solves the aforementioned problems withCantor’s theorem by suitably restricting the function space\(D^D\), by only lettingsome members of \(D^D\) correspondto members of \(D\). Covering Scott’s construction goes beyondthe scope of this entry, since it involves advanced tools from algebraand topology; see (Meyer 1982), (Barendregt, 1985, chapter 18.2), or(Hindley and Seldin, 2008, chapter 16) for details. Instead,we’ll discuss the more general question:Whatis amodel for the \(\lambda\)-calculus? That is, leaving aside for amoment the question whether sets are functions, rules, or somethingaltogether different, we ask what kind of mathematical structure amodel for the \(\lambda\)-calculus is in the first place.

7.1 \(\lambda\)-Models

It turns out that there are multiple, essentially equivalent, ways ofdefining the notion of a model for the \(\lambda\)-calculus; see(Barendregt, 1985, chapter 5) or (Hindley and Seldin, 2008, chapter15). In the following, we’ll discuss what we consider the mostpalatable notion for philosophers familiar with the standard semanticsfor first-order logic (see, e.g., the entry onClassical Logic), the so-calledsyntactical\(\lambda\)-models. These models first appear in thework of (Hindley and Longo, 1980), (Koymans, 1982), and (Meyer 1982).They derive their name from the fact that their clauses closelycorrespond to the syntactic rules of the calculus\(\boldsymbol{\lambda}\). This is somewhat unsatisfactory andmotivates ‘syntax-free’ definitions (see below). At thesame time, the syntactical \(\lambda\)-models provide a fairlytransparent and accessible route into the world of \(\lambda\)-models.In addition, despite their conceptual shortcomings, syntactical modelshave proven a technically useful tool in the semantical study of the\(\lambda\)-calculus.

In order to avoid the set-theoretic problems mentioned above, mostdefinitions of \(\lambda\)-models use so-calledapplicative structures. The idea is to treatthe denotations of \(\lambda\)-terms not as set-theoretic functions,but as unanalyzed, first-order ‘function-objects’,instead. Correspondingly, then, we treat function application as anunanalyzed binary operation on these function-objects:

Definition Anapplicative structure is apair \((D,\cdot)\), where \(D\) is some set and \(\cdot\) a binaryoperation on \(D\). To avoid trivial models, we usually assume that\(D\) has at least two elements.

Applicative structures are, in a sense, first-order models of functionspaces that satisfy the problematic equation \(X\cong X^X\).\(\lambda\)-models, in turn, are defined over them.

For the definition of our \(\lambda\)-models, we work withvaluations—a concept familiar fromfirst-order semantics. Valuations assign denotations to the variablesand are used primarily in the semantic clauses for the\(\lambda\)-operator. Additionally, they can be used to expressgeneral claims over the domain, in a way that is familiar from thesemantics for the first-order quantifiers \(\exists x\) and \(\forallx\).

Definition Avaluation in an applicativestructure \((D,\cdot)\) is a function \(\rho\) that assigns an element\(\rho(x)\in D\) to every variable \(x\).

As a useful piece of notation, for \(\rho\) a valuation in someapplicative structure \((D,\cdot)\), \(x\) a variable, and \(d\in D\)an object, we define the valuation \(\rho[x\mapsto d]\) by sayingthat: \[\rho[x\mapsto d](y)=\begin{cases} d & \text{ if }y=x\\ \rho(y) & \text{otherwise}\end{cases}\] That is, \(\rho[x\mapsto d]\) is the result ofchanging the value of \(x\) to be \(d\), while leaving all other othervalues under \(\rho\) unchanged.

Definition Asyntactical \(\lambda\)-modelis a triple \(\mathfrak{M}=(D,\cdot,\llbracket \ \rrbracket)\), where\((D,\cdot)\) is an applicative structure and \(\llbracket \\rrbracket\) is a function that assigns to every \(\lambda\)-term Mand valuation \(\rho\) a denotation \(\llbracket M\rrbracket_\rho\inD\) subject to the following constraints:

\(\llbracket x\rrbracket_\rho=\rho(x)\)
\(\llbracket MN\rrbracket_\rho=\llbracket M\rrbracket_\rho\cdot\llbracket N\rrbracket_\rho\)
\(\llbracket \lambda xM\rrbracket_\rho\cdot d=\llbracketM\rrbracket_{\rho[x\mapsto d]}\), for all \(d\in D\)
\(\llbracket \lambda xM\rrbracket_\rho = \llbracket \lambdaxN\rrbracket_\rho\), whenever for all \(d\in D\), we have \(\llbracketM\rrbracket_{\rho[x\mapsto d]}=\llbracket N\rrbracket_{\rho[x\mapstod]}\)
\(\llbracket M\rrbracket_\rho=\llbracket M\rrbracket_\sigma\),whenever \(\rho(x)=\sigma(x)\) for all \(x\in \mathbf{FV}(M)\)

Intuitively, in a model \(\mathfrak{M}\), \(\llbracketM\rrbracket_\rho\) is the function-object denoted by the\(\lambda\)-term \(M\) under the valuation \(\rho\).

It is now straight-forward to define what it means for a\(\lambda\)-model \(\mathfrak{M}\) to satisfy an equation \(M=N\),symbolically \(\mathfrak{M}\vDash M=N\):

Definition (satisfaction).

\[\mathfrak{M}\vDash M=N\text{ iff for all }\rho\text{, we have }\llbracket M\rrbracket_\rho=\llbracket N\rrbracket_\rho\]

In words: an equation \(M=N\) holds in a model \(\mathfrak{M}\) justin case the \(\lambda\)-terms \(M\) and \(N\) have the same denotationunder every valuation in the underlying applicative structure.

Note that clauses 3. and 4. from the definition of a syntactical\(\lambda\)-model directly mirror the \(\boldsymbol{\lambda}\)-rules\(\boldsymbol{\beta}\) and \(\boldsymbol{\xi}\), respectively (seesection5.1 above). This is the ‘syntactic’ nature of our models.While this might be semantically unsatisfactory (see below), it makesit relatively straight-forward to prove a soundness theorem for thesemantics provided by the syntactical \(\lambda\)-models; see(Barendregt, 1985, Theorem 5.3.4) and (Hindley and Seldin, 2008.Theorem 15.12):

Theorem For all terms \(M,N\), if \(M=N\) isderivable in \(\boldsymbol{\lambda}\), then for all syntactical\(\lambda\)-models \(\mathfrak{M}\), we have that \(\mathfrak{M}\vDashM=N\).

This theorem provides a first ‘sanity-check’ for thesemantics. But note that, so far, we haven’t shown that thereexist any syntactical \(\lambda\)-models at all.

This worry is addressed by constructing so-called ‘termmodels’, which are not unlike the well-known Henkinconstructions from first-order semantics. In order to define thesemodels, we first need the notion of a\(\boldsymbol{\lambda}\)-equivalence class for a given\(\lambda\)-term \(M\). This class contains precisely the terms that\(\boldsymbol{\lambda}\) proves identical to \(M\):

\[[M]_{\boldsymbol{\lambda}}=\{N:\boldsymbol{\lambda}\text{ proves }M=N\}\]

We then define theterm model for\(\boldsymbol{\lambda}\), \(\mathfrak{T}\), by setting:

\(D=\{[M]_\boldsymbol{\lambda}:M\text{ is a}\lambda\text{-term}\}\)
\([M]_\boldsymbol{\lambda}\cdot[N]_\boldsymbol{\lambda}=[MN]_\boldsymbol{\lambda}\)
\(\llbracketM\rrbracket_\rho=[M[x_1:=N_1]\ldots[x_n:=N_n]]_\boldsymbol{\lambda}\),where \(\mathbf{FV}(M)=\{x_1,\ldots,x_n\}\) and \(\rho(x_1)=N_1,\ldots,\rho(x_n)=N_n\)

It is easily seen that this indeed defines a syntactical\(\lambda\)-model. In fact, it is easily checked that in the termmodel for \(\boldsymbol{\lambda}\), we have that:

\[ \mathfrak{T}\vDash M=N\text{ iff}\boldsymbol{\lambda}\text{ derives }M=N. \]

This paves a way for a very simple completeness proof for\(\boldsymbol{\lambda}\) with respect to the class of syntactical\(\lambda\)-models; see (Meyer, 1982, 98–99) for one of the fewexplicit mentions of this kind of result in the literature:

Theorem For all terms \(M,N\), if for all syntactical\(\lambda\)-models \(\mathfrak{M}\), we have that \(\mathfrak{M}\vDashM=N\), then \(M=N\) is derivable in \(\boldsymbol{\lambda}\).

The proof is a simple proof by contraposition, which uses the termmodel \(\mathfrak{T}\) as a countermodel to any non-derivable identityin \(\boldsymbol{\lambda}\).

But there are reasons to be dissatisfied with the syntactical\(\lambda\)-models as a semantics for the \(\lambda\)-calculus. Forone, by virtue of clauses 3. and 4. mirroring rules\(\boldsymbol{\beta}\) and \(\boldsymbol{\xi}\), the soundness resultis ‘baked into’ the semantics, as it were. This isunsatisfactory from a semantic perspective since it means that via thesyntactical \(\lambda\)-models, we don’t really learn anythingdirectly about what conditions an applicative structure needs tosatisfy in order to adequately model the \(\lambda\)-calculus.

A related worry is that the clauses 3. and 4. are notrecursive in nature. That is, they don’t allow us tocompute the denotation of a complex \(\lambda\)-term from thedenotations of its parts and information about the syntactic operationused to combine them. In our syntax (see section2), there are two ways of constructing complex \(\lambda\)-terms:application terms of the form \(MN\) and abstraction terms of the form\((\lambda x[M])\). Clause 1. of our syntactical \(\lambda\)-models isa recursive clause for the syntactical application operation, but wedon’t have a recursive clause for the syntactical operation of\(\lambda\)-abstraction. Clauses 3. and 4. are rather conditions onthe denotation function \(\llbracket \ \rrbracket\) than recursiveclauses. This is unsatisfactory since it means that we’re notreally given a compositional semantics for the \(\lambda\)-operator bythe syntactical \(\lambda\)-models.

These worries are taken care of in the development ofsyntax-free \(\lambda\)-models. Acomprehensive discussion of syntax-free models goes beyond the scopeof this entry; but see (Barendregt, 1985, chapter 5.2) and (Hinley andSeldin, 2008, chapter 15B) for the details. Suffice it to say that thedefinition of syntax-free \(\lambda\)-models involves determiningprecisely under which conditions an applicative structure is suitablefor interpreting the \(\lambda\)-calculus. The resulting\(\lambda\)-models, then, indeed provide (something much closer to) arecursive, compositional semantics, where the syntactical operation of\(\lambda\)-abstraction is interepreted by a corresponding semanticoperation on applicative structures.

It is worth noting, however, that syntactical \(\lambda\)-models andthe syntax-free \(\lambda\)-models are, in a certain sense,equivalent: every syntactical \(\lambda\)-model defines a syntax-free\(\lambda\)-model and vice versa; see (Barendregt, 1985, theorem5.3.6) and (Hinley and Seldin, 2008, theorem 15.20) for the details.From a technical perspective, this result allows us to freely movebetween the different presentations of \(\lambda\)-models and to use,in a given context, the notion of a model that is most expedient. Atthe same time, there may bephilosophical reasons to preferone presentation over the other, such as the semantic worries aboutsyntactical \(\lambda\)-models mentioned above.

Before moving to model constructions, let us briefly mention thatthere are various ways of approaching \(\lambda\)-models. Oneparticularly interesting approach we’ve neglected so far is fromthe perspective of category theory and categorical logic. There arewell-known model descriptions using so-called ‘Cartesian closedcategories’; see (Koymans, 1982). Covering these modeldescriptions goes beyond the scope of the present entry since itrequires a familiarity with a wide range of concepts from categorytheory; see the entryCategory Theory for a sense of the machinery involved. For the details of these modeldescriptions, instead, (Barendregt, 1985, sections 5.4–6). Inrecent years, there has been a renewed interest in categoricalapproaches to the \(\lambda\)-calculus, which have mainly focused ontyped versions of the \(\lambda\)-calculus (see sections8.2 and9.1.2 below) but also include the untyped \(\lambda\)-calculus discussed inthis article. See, for example, (Hyland, 2017) for a recentdiscussion.

7.2 Model Constructions

The term model we’ve seen in section7.1 is rather trivial: it directly reflects the syntactic structure ofthe \(\lambda\)-terms by modeling precisely syntactic equality modulo\(\boldsymbol{\lambda}\)-provable equality. This makes the term modelmathematically and philosophically rather uninteresting. Theconstruction and study of more interesting concrete \(\lambda\)-modelsis one of the principal aims of the model theory for the\(\lambda\)-calculus.

We’ve already mentioned what’s perhaps the most important,but was definitely the first non-trivial model for the\(\lambda\)-calculus: Scott’s \(D_\infty\). But there are alsoother interesting model constructions, such as Plotkin andScott’s graph model \(P_\omega\), first described in (Plotkin1972) and (Scott, 1974). These model constructions, however, usuallyrely on fairly involved mathematical methods, both for theirdefinitions and for verifying that they are indeed \(\lambda\)-models.Consequently, covering these constructions goes beyond the scope ofthis entry; see (Hinley and Seldin, 2008, chapter 16F) for an overviewof various model constructions and (Barendregt, 1985, chapter 18) formany of the formal details.

One of the advantages of having different models is that one seesdifferent aspects of equality in the \(\lambda\)-calculus: each of thedifferent models takes a different view on what \(\lambda\)-terms getidentified. An interesting question in this context is:What isthe \(\lambda\)-theory of a given class of models? In thiscontext, we call a class \(\mathcal{C}\) of \(\lambda\)-modelscomplete just in case every (consistent) \(\lambda\)-theoryis satisfied by some model in \(\mathcal{C}\). See (Salibra, 2003) foran overview of various completeness andincompletenessresults for interesting classes of \(\lambda\)-models.

8. Extensions and Variations

8.1 Combinatory logic

A sister formalism of the \(\lambda\)-calculus, developed slightlyearlier, deals with variable-free combinations.Combinatory logic is indeed even simplerthan the \(\lambda\)-calculus, since it lacks a notion of variablebinding.

The language of combinatory logic is built up fromcombinators and variables. There is someflexibility in precisely which combinators are chosen as basic, butsome standard ones are \(\mathbf{I}, \bK , \bS, \mathbf{B}\) and\(\mathbf{C}\). (The names are not arbitrary.)

As with the \(\lambda\)-calculus, with combinatory logic one isinterested inreducibility andprovability. Theprincipal reduction relations are:

Combinator	Reduction Axiom
\(\bI\)	\(\bI x = x\)
\(\bK\)	\(\bK xy = x\)
\(\bS\)	\(\bS xyz = xz(yz)\)
\(\bB\)	\(\bB xyz = x(yz)\)
\(\bC\)	\(\bC xyz = xzy\)

There is a passage from \(\lambda\)-calculus to combinatory logic viatranslation. It turns out that although combinatory logic lacks anotion of abstraction, one can define such a notion and therebysimulate the \(\lambda\)-calculus in combinatory logic. Here is onetranslation; it is defined recursively.

Rule	Expression	Translation	Condition
1	\(x\)	\(x\)	(unconditional)
2	\(MN\)	M\(^\)N\(^\)	(unconditional)
3	\(\lambda x[M]\)	\(\bK\)M	\(x\) does not occur freely in M
4	\(\lambda x[x]\)	\(\bI\)	(unconditional)
5	\(\lambda x[Mx]\)	M	\(x\) does not occur freely in M
6	\(\lambda x[MN]\)	\(\bB M(\lambda x[N)]^*\)	\(x\) does not occur freely in M
7	\(\lambda x[MN]\)	\(\bC (\lambda x[M])^*\)N	\(x\) does not occur freely in \(N\)
8	\(\lambda x[MN]\)	\(\bS M^N^\)	\(x\) occurs freely in both \(M\) and \(N\)

This translation works inside-out, rather than outside-in. Toillustrate:

The translation of the term \(\lambda y[y]\), a representative of theidentity function, is mapped by this translation to the identitycombinator \(\bI\) (because of Rule 4), as expected.
The \(\lambda\)-term \(\lambda x[\lambda y[x]]\) that we have beencalling ‘\(\bK\)’is mapped by this translation to:
\[\begin{align}\lambda x[\lambda y[x]] &\equiv \lambda x[\bK x] &\langle \text{Rule 1}\rangle \\ &\equiv \bK &\langle \text{Rule 3} \rangle\end{align}\]
The \(\lambda\)-term \(\lambda x[\lambda y[yx]]\) that switches itstwo arguments is mapped by this translation to:
\[\begin{align}\lambda x[\lambda y[yx]] &\equiv \lambda x[\bC(\lambda y[y])^* x] &\langle\text{Rule 8}\rangle \\ &\equiv \lambda x[\bC\bI x] &\langle\lambda y[y] \equiv \bI,\text{ by Rule 4}\rangle \\ &\equiv \bB\bC\bI)(\lambda x[x])^* &\langle\text{Rule 7}\rangle \\ &\equiv \bB(\bC\bI)\bI &\langle(\lambda x[x])^* \equiv \bI,\text{ by Rule 4}\rangle\end{align}\]
We can confirm that the \(\lambda\)-term \(\lambda x[\lambda y[yx]]\)and the translated combinatory logic term \(\bB(\bC\bI)\bI\) haveanalogous applicative behavior: for all \(\lambda\)-terms \(P\) and\(Q\) we have
\[(\lambda x[\lambda y[yx]])PQ \rhd (\lambda y[yP]) \rhd QP;\]
likewise, for all combinatory logic terms \(P\) and \(Q\) we have
\[\bB(\bC\bI)\bI PQ \rhd (\bC\bI)(\bI P)Q \rhd \bI Q(\bI P) \rhd Q(\bI P) \rhd QP\]

We can give but a glimpse of combinatory logic; for more on thesubject, consult the entry oncombinatory logic. Many of the issues discussed here for \(\lambda\)-calculus haveanalogues in combinatory logic, and vice versa.

8.2 Adding types

In many contexts of reasoning and computing it is natural todistinguish between different kinds of objects. The way thisdistinction is introduced is by requiring that certain formulas,functions, or relations accept arguments or permit substitution onlyof some kinds of objects rather than others. We might require, forexample, that addition + take numbers as arguments. The effect of thisrestriction is to forbid, say, the addition of 5 and the identityfunction \(\lambda x.x\).^[4] Regimenting objects into types is also the idea behind the passagefrom (unsorted, or one-sorted) first-order logic tomany-sorted first-order logic. (See (Enderton, 2001) and(Manzano, 2005) for more about many-sorted first-order logic.) As itstands, the \(\lambda\)-calculus does not support this kind ofdiscrimination; any term can be applied to any other term.

It is straightforward to extend the untyped \(\lambda\)-calculus sothat it discriminates between different kinds of objects. This entrylimits itself to thetype-free \(\lambda\)-calculus. See theentries ontype theory andChurch’s type theory for a detailed discussion of the extensions of \(\lambda\)-calculusthat we get when we add types, and see (Barendregt, Dekkers, Statman,2013) for a book length treatment of the subject.

From a model-theoretic perspective, it’s interesting to add that(Scott, 1980) uses the semantic fact that categorical models for theuntyped \(\lambda\)-calculus (see section7.1) derive from the categorical models of the typed \(\lambda\)-calculusto argue for a conceptual priority of the typed over the untypedcalculus.

9. Applications

9.1 Logic à la \(\lambda\)

Here are two senses in which \(\lambda\)-calculus is connected withlogic.

9.1.1 Terms as logical constants

In thetable of combinators above, we defined combinators \(\bT\) and \(\bF\) and said that theyserve as representations in the \(\lambda\)-calculus of the truthvalues true and false, respectively. How do these terms function astruth values?

It turns out that when one is treating \(\lambda\)-calculus as a kindof programming language, one can write conditional statements“If \(P\) then \(A\) else \(B\)” simply as \(PAB\), whereof course \(P, A\), and \(B\) are understood as \(\lambda\)-terms. If\(P \rhd \bT\), that is, P is ‘true’, then we have

\[\text{if-}P\text{-then-}A\text{-else-}B := PAB \rhd \bT AB \rhd A,\]

(recall that, by definition, \(\bT \equiv \bK\)) and if \(P \rhd\bF\), that is, \(P\) is ‘false’, then

\[\text{if-}P\text{-then-}A\text{-else-}B := PAB \rhd \bF AB \rhd B,\]

(recall that, by definition, \(\mathbf{F} \equiv \mathbf{KI})\) whichis just what we expect from a notion of if-then-else. If \(P\) reducesneither to \(\mathbf{T}\) nor \(\mathbf{F}\), then we cannot ingeneral say what \(\text{if-}P\text{-then-}A\text{-else-}B\) is.

The encoding we’ve just sketched of some of the familiar truthvalues and logical connectives of classical truth-table logic does notshow that \(\lambda\)-calculus and classical logic are intimatelyrelated. The encoding shows little more than embeddibility of therules of computation of classical truth-table logic in\(\lambda\)-calculus. Logics other than classical truth-table logiccan likewise be represented in the \(\lambda\)-calculus, if one hassufficient computable ingredients for the logic in question (e.g., ifthe logical consequence relation is computable, or if a derivabilityrelation is computable, etc.). For more on computing with\(\lambda\)-calculus, see section9.2 below. A more intrinsic relationship between logic and\(\lambda\)-calculus is discussed in the next section.

9.1.2 Typed \(\lambda\)-calculus and the Curry-Howard-de Bruijn correspondence

The correspondence to be descried here between logic and the\(\lambda\)-calculus is seen with the help of an apparatus known astypes. This section sketches the beginnings of thedevelopment of the subject known astypetheory. We are interested in developing type theory onlyso far as to make the so-called Curry-Howard-de Bruijn correspondencevisible. A more detailed treatment can be found in the entry ontype theory; see also (Hindley, 1997) and (Barendregt, Dekkers, Statman,2013).

Type theory enriches the untyped \(\lambda\)-calculus by requiringthat terms be giventypes. In the untyped\(\lambda\)-calculus, the application \(MN\) is a legal termregardless of what \(M\) and \(N\) are. Such freedom permits one toform such suspicious terms as \(xx\), and thence terms such as theparadoxical combinator \(\mathbf{Y}\). One might wish to exclude termslike \(xx\) on the grounds that \(x\) is serving both as a function(on the left-hand side of the application) and as an argument (on theright-hand side of the application). Type theory gives us theresources for making this intuitive argument more precise.

Assigning types to terms The language of type theorybegins with an (infinite) set oftypevariables (which is assumed to be disjoint from the setof variables of the \(\lambda\)-calculus and from the symbol‘\(\lambda\)’ itself). The set of types is made up of typevariables and the operation \(\sigma \rightarrow \tau\). Variables intype theory now come with atype annotation(unlike the unadorned term variables of untyped \(\lambda\)-calculus).Typed variables are rendered ‘\(x : \sigma\)’; theintuitive reading is ‘the variable \(x\) has the type\(\sigma\)’. The intuitive reading of the judgment ‘\(t :\sigma \rightarrow \tau\)’ is that the term \(t\) is a functionthat transforms arguments of type \(\sigma\) into arguments of type\(\tau\). Given an assignment of types to term variables, one has thetyping rules:

\[(M : \sigma \rightarrow \tau)(N : \sigma) : \tau\]

and

\[(\lambda x : \sigma[M : \tau]) : \sigma \rightarrow \tau\]

The above two rules define the assignment of types to applications andto abstraction terms. The set of terms of type theory is the set ofterms built up according to these formation rules.

The above definition of the set of terms of type theory is sufficientto rule out terms such as \(xx\). Of course, ‘\(xx\)’ isnot a typed term at all for the simple reason that no types have beenassigned to it. What is meant is that there is no type \(\sigma\) thatcould be assigned to \(x\) such that ‘\(xx\)’ could beannotated in a legal way to make a typed term. We cannot assign to\(x\) a type variable, because then the type of the left-hand \(x\)would fail to be a function type (i.e., a type of the shape‘\(\sigma \rightarrow \tau\)’). Moreover, we cannot assignto \(x\) a function type \(\sigma \rightarrow \tau\), because thenthen \(\sigma\) would be equal to \(\sigma \rightarrow \tau\), whichis impossible.

As a leading example, consider the types that are assigned to thecombinators \(\bI\), \(\bK\), and \(\bS\):

Combinator	Type^[5]
\(\bI\)	\(a \rightarrow a\)
\(\bK\)	\(a \rightarrow(b \rightarrow a)\)
\(\bS\)	\( (a \rightarrow(b \rightarrow c)) \rightarrow ((a \rightarrowb) \rightarrow(a \rightarrow c))\)

(See Hindley (1997)Table of principal types for a moreextensive listing.) If we read ‘\(\rightarrow\)’ asimplication and type variables as propositional variables, then werecognize three familiar tautologies in the right-hand column of thetable. The language used is meager: there are only propositionalvariables and implication; there are no other connectives.

The table suggests an interesting correspondence between the typed\(\lambda\)-calculus and formal logic. Could it really be that thetypes assigned to formulas, when understood as logical formulas, arevalid? Yes, though ‘validity’ needs to understood not asclassical validity:

Theorem If \(\tau\) is the type of some\(\lambda\)-term, then \(\tau\) is intuitionistically valid.

The converse of this theorem holds as well:

Theorem If \(\phi\) is an intuitionistically validlogical formula whose only connective is implication\((\rightarrow)\), then \(\phi\) is the type of some\(\lambda\)-term.

The correspondence can be seen when one identifies intuitionisticvalidity with derivability in a certain natural deduction formalism.For a proof of these two theorems, see (Hindley, 1997, chapter 6).

The correspondence expressed by the previous two theorems betweenintuitionistic validity and typability is known as theCurry-Howard-de Bruijn correspondence, after three logicianswho noticed it independently. The correspondence, as stated, isbetween only propositional intuitionistic logic, restricted to thefragment containing only the implication connective \(\rightarrow\).One can extend the correspondence to other connectives and toquantifiers, too, but the most crisp correspondence is at the level ofthe implication-only fragment. For details, see (Howard, 1980).

9.2 Computing

One can represent natural numbers in a simple way, as follows:

Definition (ordered tuples, natural numbers) Theordered tuple \(\langle a_0,\ldots a_n\rangle\) of \(\lambda\)-termsis defined as \(\lambda x[x a_0\ldots a_n]\). One then defines the\(\lambda\)-term \(\ulcorner n\urcorner\) corresponding to the naturalnumber \(n\) as: \(\ulcorner 0\urcorner = \mathbf{I}\) and, for every\(k\), \(\ulcorner k + 1\urcorner = \langle \mathbf{F}, \ulcornerk\urcorner\rangle\).

The \(\lambda\)-term corresponding to the number 1, on thisrepresentation, is:
\[\begin{align}\ulcorner 1 \urcorner &\equiv \langle\bF,\ulcorner 0\urcorner\rangle \\ &\equiv \langle\bF,\bI\rangle \\ &\equiv \lambda x[x\mathbf{FI}]\end{align}\]
The \(\lambda\)-term corresponding to the number 2, on thisrepresentation, is:
\[\begin{align}\ulcorner 2 \urcorner &\equiv \langle\bF,\ulcorner 1\urcorner\rangle \\ &\equiv \lambda x[x\mathbf{F}\lambda x[x\mathbf{FI}]]\end{align}\]
Similarly, \(\ulcorner 3\urcorner\) is \(\lambda x[x\mathbf{F}\lambdax[x\mathbf{F}\lambda x[x\mathbf{FI}]]]\).

Various representations of natural numbers are available; thisrepresentation is but one.^[6]

Using the ingredients provided by the \(\lambda\)-calculus, one canrepresent all recursive functions. This shows that the model isexactly as expressive as other models of computing, such as Turingmachines and register machines. For a more detailed discussion of therelation between these different models of computing, see the sectioncomparing the Turing and Church approaches in the entry on theChurch-Turing Thesis.

Theorem For every recursive function \(f\) of arity\(n\), there exists a \(\lambda\)-term \(f^*\) such that

for all natural numbers \(a_1,\ldots a_n\): \(f(a_1,\ldots a_n) = y\)iff \(\boldsymbol{\lambda} \vdash f^*\langle\bar{a}_1,\ldots,\bar{a}_n\rangle = \bar{y}\)

For a proof, seethe appendix.

Since the class of recursive functions is an adequate representationof the class of all computable (number-theoretic) functions, thanks tothe work above we find that all computable (number-theoretic)functions can be faithfully represented in the\(\lambda\)-calculus.

9.3 Relations

The motivation for the \(\lambda\)-calculus given at the beginning ofthe entry was based on reading \(\lambda\)-expressions as descriptionsof functions. Thus, we have understood ‘\(\lambda x[M]\)’to be a (or the) function that, given \(x\), gives \(M\) (whichgenerally, though not necessarily, involves x). But it is notnecessary to read \(\lambda\)-terms as functions. One could understand\(\lambda\)-terms as denoting relations, and read an abstraction term‘\(\lambda x[M]\)’ as the unary relation (or property)\(R\) that holds of an argument \(x\) just in case \(M\) does (seeCarnap 1947, p. 3). On the relational reading, we can understand anapplication term \(MN\) as a form of predication. One can make senseof these terms using the principle of \(\beta\)-conversion:

\[(\lambda x[M])a = M[x := A],\]

which says that the abstraction relation \(\lambda x[M]\), predicatedof A, is the relation obtained by plugging in A for all freeoccurrences of \(x\) inside \(M\).

As a concrete example of this kind of approach to\(\lambda\)-calculus, consider an extension of first-order logic whereone can form new atomic formulas using \(\lambda\)-terms, in thefollowing way:

Syntax: For any formula \(\phi\) and any finitesequence \(x_1 , \ldots ,x_n\) of variables, the expression‘\(\lambda x_1 \ldots x_n [\phi]\)’ is a predicate symbolof arity n. Extend the notion of free and bound variables (using thefunctions \(\mathbf{FV}\) and \(\mathbf{BV})\) in such a way that

\[\mathbf{FV}(\lambda x_1 \ldots x_n [\phi]) = \mathbf{FV}(\phi) - \{ x_1 , \ldots x_n \}\]

and

\[\mathbf{BV}(\lambda x_1 \ldots x_n [\phi]) = \mathbf{BV}(\phi) \cup \{ x_1 , \ldots x_n \}\]

Deduction Assume as axioms the universal closures ofall equivalences

\[\lambda x_1 \ldots x_n [\phi](t_1 ,\ldots t_n) \leftrightarrow \phi[x_1 ,\ldots x_n := t_1,\ldots t_n]\]

where \(\phi[x_1 ,\ldots x_n := t_1,\ldots t_n]\) denotes thesimultaneous substitution of the terms \(t_k\) for the variables\(x_k\) \((1 \le k \le n)\).

Semantics For a first-order structure \(A\) and anassignment \(s\) of elements of \(A\) to variables, define

\[\begin{align}A \vDash &\lambda x_1 \ldots x_n [\phi](t_1 ,\ldots t_n) [s] \text{ iff } \\ &A \vDash \phi[x_1 ,\ldots x_n := t_1,\ldots t_n] [s]\end{align}\]

According to this approach, one can use a \(\lambda\) to treatessentially any formula, even complex ones, as if they were atomic. Wesee the principle of \(\beta\)-reduction in the deductive and semanticparts. That this approach adheres to the relational reading of\(\lambda\) terms can be seen clearly in the semantics: according tothe standard Tarski-style semantics for first-order logic, theinterpretation of a formula (possibly with free variables) denotes aset of tuples of elements of the structure, as we vary the variableassignment that assigns elements of the structure to thevariables.

One can ‘internalize’ this functional approach. This isdone in the case of variousproperty theories, formaltheories for reasoning about properties as metaphysical objects(Bealer 1982, Zalta 1983, Menzel 1986, 1993, and Turner 1987). Thiskind of theory is employed in certain metaphysical investigationswhere properties are metaphysical entities to be investigated. Inthese theories, metaphysical relations are (or are among) the objectsof interest; just as we add term-building symbols + and \(\times\) informal theories of arithmetic to build numbers, \(\lambda\) is used inproperty theory to build relations. This approach contrasts with theapproach above. There, \(\lambda\) was added to the grammar offirst-order logic by making it a recipe for building atomic formulas;it was a new formula-building operator, like \(\vee\) or\(\rightarrow\) or the other connectives. In the case of propertytheories, the \(\lambda\) plays a role more like + and \(\times\) informal theories of arithmetic: it is used to construct relations(which, in this setting, are to be understood as a kind ofmetaphysical object). Unlike + and \(\times\), though, the \(\lambda\)binds variables.

To give an illustration of how \(\lambda\) is used in this setting,let us inspect the grammar of a typical application (McMichael andZalta, 1980). One typically has apredication operator (or,more precisely, a family of predication operators) \(p_k (k \ge 0)\).In a language where we have terms \(\mary\) and \(\john\) and a binaryrelation loves, we can formally express:

John loves Mary: \(\loves(\john ,\mary)\)
The property that John loves Mary: \(\lambda[\loves(\john,\mary)]\) (note that the \(\lambda\) is binding no variables; wemight call this ‘vacuous binding’. Such properties can beunderstood as propositions.)
The property of an object \(x\) that John loves it: \(\lambda x[\loves(\john,x)]\).
The property that Mary is loved by something: \(\lambda[\existsx(\loves(x,\mary))]\) (another instance of vacuous binding, viz.,proposition)
The predication of the property of \(x\) that John loves \(x\) toMary: \(p_1 (\lambda x[\loves(\john,x)],\mary)\).
The (0-ary) predication of the property that John loves Mary:\(p_0 (\lambda x[\loves(\john,\mary)])\).
The property of objects \(x\) and \(y\) that \(x\) loves \(y\):\(\lambda xy[\loves(x,y)]\).
The property of an objects \(x\) that \(x\) loves itself:\(\lambda x[\loves(x,x)]\).
The predication of the property of objects \(x\) and \(y\) that\(x\) loves \(y\) to John and Mary (in that order): \(p_2 (\lambdaxy[\loves(x,y)],\john,\mary)\).

We reason with these \(\lambda\)-terms using a \(\beta\)-conversionprinciple such as:

\[\begin{align}p_n (\lambda x_1,&\ldots x_n [A], t_1 , \ldots ,t_n) \leftrightarrow \\ &A[x_1 ,\ldots x_n := t_1,\ldots, t_n]\end{align}\]

Formally, the predication operator p\(_k\) is a \((k+1)\)-arypredicate symbol. The first argument is intended to be a\(\lambda\)-term of \(k\) arguments, and the rest of the arguments areintended to be the arguments of the body of the \(\lambda\)-term. The\(\beta\)-principle above says that the predication of an \(n\)-ary\(\lambda\)-term \(L\) to \(n\) terms holds precisely when the body of\(L\) holds of those terms.

It turns out that in these theories, we may or may not be able to befully committed to the principle of \(\beta\)-conversion. Indeed, insome property theories, the full principle of \(\beta\)-conversionleads to paradox, because one can replay a Russell-style argument whenthe full principle of \(\beta\)-conversion is in place. In suchsettings, one restricts the formation of \(\lambda\)-formulas byrequiring that the body of a \(\lambda\)-term not contain further\(\lambda\)-terms or quantifiers. For further discussion, see (Orilia,2000).

One of the reasons why property theories formulated in the\(\lambda\)-calculus are of a particular philosophical importance isthehyperintensional nature of the calculus (see section1.2). A property concept may be called‘hyperintensional’ if and only if it does notidentify necessarily coextensional properties, i.e., properties thatare instanciated by exactly the same objects at every possible world.The properties and relations described by the theories of Bealer,Zalta, Menzel, and Turner have exactly this characteristic. In otherwords, the theories are hyperintensional property theories. Recentyears have seen a significant rise of interest in hyperintensionalconcepts of properties in metaphysics (Nolan 2014), andcorrespondingly property theories formulated in the\(\lambda\)-calculus will likely experience a rise of interest aswell.

In the context of the foundations of mathematics, Zalta andOppenheimer (2011) argue for the conceptual priority of the relationalinterpretation of \(\lambda\)-terms over the functional one.

Bibliography

Baader, Franz and Tobias Nipkow, 1999,Term Rewriting and AllThat, Cambridge: Cambridge University Press.
Barendregt, Henk, 1985,The Lambda Calculus: Its Syntax andSemantics (Studies in Logic and the Foundations of Mathematics103), 2nd edition, Amsterdam: North-Holland.
Barendregt, Henk, 1993, “Lambda calculi with types”,in S. Abramsky, D. Gabbay, T. Maibaum, and H. Barendregt (eds.),Handbook of Logic in Computer Science (Volume 2), New York:Oxford University Press, pp. 117–309.
Barendregt, Henk, Wil Dekkers, and Richard Statman., 2013,Lambda Calculus With Types, Cambridge: Cambridge UniversityPress.
Bealer, George, 1982,Quality and Concept, Oxford:Clarendon Press.
van Benthem, Johan, 1998,A Manual of Intensional Logic,Stanford: CSLI Publications.
Carnap, Rudolf, 1947,Meaning and Necessity, Chicago:University of Chicago Press.
Church, Alonzo, 1932, “A set of postulates for thefoundation of logic”,Annals of Mathematics (2ndSeries), 33(2): 346–366.
Cutland, Nigel J., 1980,Computability, Cambridge:Cambridge University Press.
Doets, Kees and Jan van Eijk, 2004,The Haskell Road to Logic,Maths and Programming, London: College Publications.
Enderton, Herbert B., 2001,A Mathematical Introduction toLogic, 2nd edition, San Diego: Harcourt/Academic Press.
Frege, Gottlob, 1893,Grundgesetze der Arithmetik, Jena:Verlag Hermann Pohle, Band I; partial translation asThe BasicLaws of Arithmetic, M. Furth (trans.), Berkeley: University ofCalifornia Press, 1964.
Kleene, Stephen C., 1981, “Origins of recursive functiontheory”,Annals of the History of Computing, 3(1):52–67.
Heim, Irene and Angelika Kratzer, 1998,Semantics inGenerative Grammar, Malden, MA: Blackwell.
Hindley, J. Roger, 1997,Basic Simple Type Theory(Cambridge Tracts in Theoretical Computer Science 42), New York:Cambridge University Press.
Hindley, J. Roger and G. Longo, 1980, “Lambda-calculusModels and Extensionality.”Zeitschrift fürmathematische Logik und Grundlagen der Mathematik, 26:289–310.
Hindley, J. Roger and Jonathan P. Seldin, 2008,Lambda-Calculus and Combinators: An Introduction, 2ndedition, Cambridge: Cambridge University Press.
Howard, William A., 1980, “The formula-as-types notion ofconstruction”, in J. Hindley and J. Seldin (eds.),To H. B.Curry: Essays on Combinatory Logic, Lambda-Calculus, andFormalism, London: Academic Press, pp. 479–490.
Hyland, J. Martin E., 2017, “Classical Lambda Calculus inModern Dress”,Mathematical Structures in ComputerScience, 27(5): 762–781.
Koymans, C.P.J., 1982, “Models of the LambdaCalculus”,Information and Control, 52:306–332.
Manzano, Maria, 2005,Extensions of First-order Logic(Cambridge Tracts in Theoretical Computer Science 19), Cambridge:Cambridge University Press.
McCarthy, John, 1960, “Recursive functions of symbolicexpressions and their computation by machine (Part I)”,Communications of the ACM, 3(4): 184–195.
McMichael, Alan and Edward N. Zalta, 1980, “An alternativetheory of nonexistent objects”,Journal of PhilosophicalLogic, 9: 297–313.
Menzel, Christopher, 1986, “A complete, type-free secondorder logic of properties, relations, and propositions”,Technical Report #CSLI-86-40, Stanford: CSLI Publications.
Menzel, Christopher, 1993, “The proper treatment ofpredication in fine-grained intensional logic”,Philosophical Perspectives 7: 61–86.
Meyer, Albert R., 1982, “What is a model of the lambdacalculus?”, InInformation and Control, 52(1):87–122.
Nederpelt, Rob, with Herman Geuvers and Roel de Vriejer (eds.),1994,Selected Papers on Automath (Studies in Logic and theFoundations of Mathematics 133), Amsterdam: North-Holland.
Nolan, Daniel, 2014, “Hyperintensional metaphysics”,Philosophical Studies, 171(1); 149–160.
Orilia, Francesco, 2000, “Property theory and the revisiontheory of definitions”,Journal of Symbolic Logic,65(1): 212–246.
Partee, Barbara H., with Alice ter Meulen and Robert E. Wall,1990,Mathematical Methods in Linguistics, Berlin:Springer.
Plotkin, G.D., 1972,A Set-Theoretical Definition ofApplication, School of Artificial Intelligence, Memo MIP-R-95,University of Edinburgh.
Revesz, George E., 1988,Lambda-Calculus, Combinators, andFunctional Programming, Cambridge: Cambridge University Press;reprinted 2008.
Rosser, J. Barkley, 1984, “Highlights of the History of theLambda-Calculus”,Annals of the History of Computing,6(4): 337–349.
Salibra, Antonio, 2003, “Lambda calculus: models andtheories”, inProceedings of the Third AMAST Workshop onAlgebraic Methods in Language Processing (AMiLP-2003), No. 21,University of Twente, pp. 39–54.
Schönfinkel, Moses, 1924, “On the building blocks ofmathematical logic”, in J. van Heijenoort (ed.),From Fregeto Gödel: A Source Book in Mathematical Logic, Cambridge,MA: Harvard University Press, 1967, pp. 355–366.
Scott, Dana, 1974, “The LAMBDA language”,Journalof Symbolic Logic, 39: 425–427.
–––, 1980, “Lambda Calculus: Some Models,Some Philosophy”, in J. Barwise, H.J. Keisler, and K. Kunen(eds.),The Kleene Symposium, Amsterdam: North-Holland, pp.223–265.
Troelstra, Anne and Helmut Schwichtenberg, 2000,Basic ProofTheory (Cambridge Tracts in Theoretical Computer Science 43), 2ndedition, Cambridge: Cambridge University Press.
Turing, Alan M., 1937, “Computability and\(\lambda\)-definability”,Journal of Symbolic Logic,2(4): 153–163.
Turner, Richard, 1987, “A theory of properties”,Journal of Symbolic Logic, 52(2): 455–472.
Zalta, Edward N., 1983,Abstract Objects: An Introduction toAxiomatic Metaphysics, Dordrecht: D. Reidel.
Zalta, Edward N. and Paul Oppenheimer, 2011, “Relationsversus functions at the foundations of logic: type-theoreticconsiderations”,Journal of Logic and Computation 21:351–374.
Zerpa, L., 2021, “The Teaching and Learning of the UntypedLambda Calculus Through Web-Based e-Learning Tools”, in K. AraiIntelligent Computing (Lecture Notes in Networks and Systems:Volume 285), Cham: Springer, pp. 419–436.

Academic Tools

How to cite this entry.
Preview the PDF version of this entry at theFriends of the SEP Society.
Look up topics and thinkers related to this entry at the Internet Philosophy Ontology Project (InPhO).
Enhanced bibliography for this entryatPhilPapers, with links to its database.

Other Internet Resources

The Lambda Calculator, a tool for working with \(\lambda\)-terms with an eye toward theiruse in formal semantics of natural language.
Lambda calculus reduction workbench, for visualizing reduction strategies.
“\(\lambda\)-Calculus: Then and Now,” useful handout on the milestones in, contributors to, andbibliography on the \(\lambda\)-calculus, presented at the severalTuring Centennial conferences. There also exists avideo recording of the lecture given on the occasion of Princeton University’scelebration of the Turing Centennial in 2012.

In addition, see the very helpful discussion in (Zerpa 2021) onthe use of e-learning tools for teaching and learning the untyped\(\lambda\)-calculus.

Acknowledgments

The first author wishes to acknowledge the contributions of HenkBarendregt, Elizabeth Coppock, Reinhard Kahle, Martin Sørensen,and Ed Zalta in helping to craft this entry. He also thanks Nic McPheefor introducing him to the \(\lambda\)-calculus.

The second author would like to acknowledge the useful comments andsuggestions of Fabrizio Cariani, Cameron Moy, Peter Percival, and EdZalta.

Open access to the SEP is made possible by a world-wide funding initiative.
The Encyclopedia Now Needs Your Support
Please Read How You Can Help Keep the Encyclopedia Free

Browse

About

Support SEP

Mirror Sites

View this site from another server:

USA (Main Site)Philosophy, Stanford University

Info about mirror sites

Library of Congress Catalog Data: ISSN 1095-5054

Movatterモバイル変換