The Turing Test

First published Wed Apr 9, 2003; substantive revision Mon Oct 4, 2021

The phrase “The Turing Test” is most properly used torefer to a proposal made by Turing (1950) as a way of dealing with thequestion whether machines can think. According to Turing, the questionwhether machines can think is itself “too meaningless” todeserve discussion (442). However, if we consider the moreprecise—and somehow related—question whether a digitalcomputer can do well in a certain kind of game that Turing describes(“The Imitation Game”), then—at least inTuring’s eyes—we do have a question that admits of precisediscussion. Moreover, as we shall see, Turing himself thought that itwould not be too long before we did have digital computers that could“do well” in the Imitation Game.

The phrase “The Turing Test” is sometimes used moregenerally to refer to some kinds of behavioural tests for the presenceof mind, or thought, or intelligence in putatively minded entities.So, for example, it is sometimes suggested that The Turing Test isprefigured in Descartes’Discourse on the Method.(Copeland (2000:527) finds an anticipation of the test in the 1668writings of the Cartesian de Cordemoy. Abramson (2011a) presentsarchival evidence that Turing was aware of Descartes’ languagetest at the time that he wrote his 1950 paper. Gunderson (1964)provides an early instance of those who find that Turing’s workis foreshadowed in the work of Descartes.)

The phrase “The Turing Test” is also sometimes used torefer to certain kinds of purely behavioural allegedly logicallysufficient conditions for the presence of mind, or thought, orintelligence, in putatively minded entities. So, for example, NedBlock’s “Blockhead” thought experiment is often saidto be a (putative) knockdown objection to The Turing Test. (Block(1981) contains a direct discussion of The Turing Test in thiscontext.) Here, what a proponent of this view has in mind is the ideathat it islogically possible for an entity to pass the kindsof tests that Descartes and (at least allegedly) Turing have inmind—to use words (and, perhaps, to act) in just the kind of waythat human beings do—and yet to be entirely lacking inintelligence, not possessed of a mind, etc.

The subsequent discussion takes up the preceding ideas in the order inwhich they have been introduced. First, there is a discussion ofTuring’s paper (1950), and of the arguments contained therein.Second, there is a discussion of current assessments of variousproposals that have been called “The Turing Test” (whetheror not there is much merit in the application of this label to theproposals in question). Third, there is a brief discussion of somerecent writings on The Turing Test, including some discussion of thequestion whether The Turing Test sets an appropriate goal for researchinto artificial intelligence. Finally, there is a very shortdiscussion of Searle’s Chinese Room argument, and, inparticular, of the bearing of this argument on The Turing Test.

For other introductory discussions of the Turing Test, from a range ofperspectives, see, for example: Copeland (2000), Damassino and Novelli(2020), French (2000), Korukonda (2003), Moor (2008), Neufeld andFinnestad (2020a) (2020b), Proudfoot and Copeland (2008), Saygin etal. (2000), and Shieber (2004). For further information about Turinghimself, see, for example: Cooper and van Leeuwen (2013), Copeland etal. (2017), Hodges (1983), Millican and Clark (1999) and Turing(1992).

1. Turing (1950) and the Imitation Game

Turing (1950) describes the following kind of game. Suppose that wehave a person, a machine, and an interrogator. The interrogator is ina room separated from the other person and the machine. The object ofthe game is for the interrogator to determine which of the other twois the person, and which is the machine. The interrogator knows theother person and the machine by the labels ‘X’and ‘Y’—but, at least at the beginning ofthe game, does not know which of the other person and the machine is‘X’—and at the end of the game says either‘X is the person andY is the machine’or ‘X is the machine andY is theperson’. The interrogator is allowed to put questions to theperson and the machine of the following kind: “WillXplease tell me whetherX plays chess?” Whichever of themachine and the other person isX must answer questions thatare addressed toX. The object of the machine is to try tocause the interrogator to mistakenly conclude that the machine is theother person; the object of the other person is to try to help theinterrogator to correctly identify the machine. About this game,Turing (1950) says:

I believe that in about fifty years’ time it will be possible toprogramme computers, with a storage capacity of about 10⁹,to make them play the imitation game so well that an averageinterrogator will not have more than 70 percent chance of making theright identification after five minutes of questioning. … Ibelieve that at the end of the century the use of words and generaleducated opinion will have altered so much that one will be able tospeak of machines thinking without expecting to be contradicted.

There are at least two kinds of questions that can be raised aboutTuring’s predictions concerning his Imitation Game. First, thereare empirical questions, e.g., Is it true that we now—or willsoon—have made computers that can play the imitation game sowell that an average interrogator has no more than a 70 percent chanceof making the right identification after five minutes of questioning?Second, there are conceptual questions, e.g., Is it true that, if anaverage interrogator had no more than a 70 percent chance of makingthe right identification after five minutes of questioning, we shouldconclude that the machine exhibits some level of thought, orintelligence, or mentality?

There is little doubt that Turing would have been disappointed by thestate of play at the end of the twentieth century. Participants in theLoebner Prize Competition—an annual event in which computerprogrammes are submitted to the Turing Test— had come nowherenear the standard that Turing envisaged. A quick look at thetranscripts of the participants for the preceding decade reveals thatthe entered programs were all easily detected by a range ofnot-very-subtle lines of questioning. Moreover, major players in thefield regularly claimed that the Loebner Prize Competition was anembarrassment precisely because we were still so far from having acomputer programme that could carry out a decent conversation for aperiod of five minutes—see, for example, Shieber (1994). It waswidely conceded on all sides that the programs entered in the LoebnerPrize Competition were designed solely with the aim of winning theminor prize of best competitor for the year, with no thought that theembodied strategies would actually yield something capable of passingthe Turing Test.

At the end of the second decade of the twenty-first century, it isunclear how much has changed. On the one hand, there have beeninteresting developments in language generators. In particular, therelease of Open AI’s GPT-3 (Brown, et al. 2020, Other InternetResources) has prompted a flurry of excitement. GPT-3 is quite good atgenerating fiction, poetry, press releases, code, music, jokes,technical manuals, and news articles. Perhaps, as Chalmers speculates(2020, Other Internet Resources), GPT-3 “suggests a potentialmindless path to artificial general intelligence”. But, ofcourse, GPT-3 is not close to passing the Turing Test: GPT-3 neitherperceives nor acts, and it is, at best, highly contentious whether itis a site of understanding. What remains to be seen is whether, withinthe next couple of generations of language generators – GPT-4 orGPT-5 – we have something that can be linked to perceptualinputs and behavioural outputs in a way that does produce somethingcapable of passing the Turing Test. (For further discussion, seeFloridi and Chiriatti (2020).)

On the other hand, as, for example, Floridi (2008) complains, thereare other ways in which progress has been frustratingly slow. In 2014,claims emerged that, because the computer programEugeneGoostman had fooled 33% of judges in the Turing Test 2014competition, it had “passed the Turing Test”. But therehave been other one-off competitions in which similar results havebeen achieved. Back in 1991,PC Therapist had 50% of judgesfooled. And, in a 2011 demonstration,Cleverbot had an evenhigher success rate. In all three of these cases, the size of thetrial was very small, and the result was not reliably projectible: inno case were there strong grounds for holding that an averageinterrogator had no more than a 70% chance of making the rightdetermination about the relevant program after five minutes ofquestioning. Moreover—and much more importantly—we mustdistinguish between the test the Turing proposed, and the particularprediction that he made about how things would be by the end of thetwentieth century. The percentage chance of making the correctidentification, the time interval over which the test takes place, andthe number of conversational exchanges required are all adjustableparameters in the Test, despite the fact that they are fixed in theparticular prediction that Turing made. Even if Turing was very farout in the prediction that he made about how things would be by theend of the twentieth century, it remains possible that the test thathe proposes is a good one. However, before one can endorse thesuggestion that the Turing Test is good, there are various objectionsthat ought to be addressed.

Some people have suggested that the Turing Test is chauvinistic: itonly recognizes intelligence in things that are able to sustain aconversation with us. Why couldn’t it be the case that there areintelligent things that are unable to carry on a conversation, or, atany rate, unable to carry on a conversation with creatures like us?(See, for example, French (1990).) Perhaps the intuition behind thisquestion can be granted; perhaps it is unduly chauvinistic to insistthat anything that is intelligent has to be capable of sustaining aconversation with us. (On the other hand, one might think that, giventhe availability of suitably qualified translators, it ought to bepossible for any two intelligent agents that speak different languagesto carry on some kind of conversation.) But, in any case, the chargeof chauvinism is completely beside the point. What Turing claims isonly that, if something can carry out a conversation with us, then wehave good grounds to suppose that that thing has intelligence of thekind that we possess; he does not claim that only something that cancarry out a conversation with us can possess the kind of intelligencethat we have.

Other people have thought that the Turing Test is not sufficientlydemanding: we already have anecdotal evidence that quite unintelligentprograms (e.g., ELIZA—for details of which, see Weizenbaum(1966)) can seem to ordinary observers to be loci of intelligence forquite extended periods of time. Moreover, over a short period oftime—such as the five minutes that Turing mentions in hisprediction about how things will be in the year 2000—it mightwell be the case that almost all human observers could be taken in bycunningly designed but quite unintelligent programs. However, it isimportant to recall that, in order to pass Turing’s Test, it isnot enough for the computer program to fool “ordinaryobservers” in circumstances other than those in which the testis supposed to take place. What the computer program has to be able todo is to survive interrogation by someone who knows that one of theother two participants in the conversation is a machine. Moreover, thecomputer program has to be able to survive such interrogation with ahigh degree of success over a repeated number of trials. (Turing saysnothing about how many trials he would require. However, we can safelyassume that, in order to get decent evidence that there is no morethan a 70% chance that a machine will be correctly identified as amachine after five minutes of conversation, there will have to be areasonably large number of trials.) If a computer program could dothis quite demanding thing, then it does seem plausible to claim thatwe would have at leastprima facie reason for thinking thatwe are in the presence of intelligence. (Perhaps it is worthemphasizing again that there might be all kinds of intelligentthings—including intelligent machines—that would not passthis test. It is conceivable, for example, that there might bemachines that, as a result of moral considerations, refused to lie orto engage in pretence. Since the human participant is supposed to doeverything that he or she can to help the interrogator, the question“Are you a machine?” would quickly allow the interrogatorto sort such (pathological?) truth-telling machines from humans.)

Another contentious aspect of Turing’s paper (1950) concerns hisrestriction of the discussion to the case of “digitalcomputers.” On the one hand, it seems clear that thisrestriction is really only significant for the prediction that Turingmakes about how things will be in the year 2000, and not for thedetails of the test itself. (Indeed, it seems that if the test thatTuring proposes is a good one, then it will be a good test for anykinds of entities, including, for example, animals, aliens, and analogcomputers. That is: if animals, aliens, analog computers, or any otherkinds of things, pass the test that Turing proposes, then there willbe as much reason to think that these things exhibit intelligence asthere is reason to think that digital computers that pass the testexhibit intelligence.) On the other hand, it is actually a highlycontroversial question whether “thinking machines” wouldhave to be digital computers; and it is also a controversial questionwhether Turing himself assumed that this would be the case. Inparticular, it is worth noting that the seventh of the objections thatTuring (1950) considers addresses the possibility of continuous statemachines, which Turing explicitly acknowledges to be different fromdiscrete state machines. Turing appears to claim that, even if we arecontinuous state machines, a discrete state machine would be able toimitate us sufficiently well for the purposes of the Imitation Game.However, it seems doubtful that the considerations that he gives aresufficient to establish that, if there are continuous state machinesthat pass the Turing Test, then it is possible to make discrete statemachines that pass the test as well. (Turing himself was keen to pointout that some limits had to be set on the notion of“machine” in order to make the question about“thinking machines” interesting:

It is natural that we should wish to permit every kind of engineeringtechnique to be used in our machine. We also wish to allow thepossibility that an engineer or team of engineers may construct amachine which works, but whose manner of operation cannot besatisfactorily described by its constructors because they have applieda method which is largely experimental. Finally, we wish to excludefrom the machines men born in the usual manner. It is difficult toframe the definitions so as to satisfy these three conditions. Onemight for instance insist that the team of engineers should all be ofone sex, but this would not really be satisfactory, for it is probablypossible to rear a complete individual from a single cell of the skin(say) of a man. To do so would be a feat of biological techniquedeserving of the very highest praise, but we would not be inclined toregard it as a case of ‘constructing a thinking machine’.(435/6)

But, of course, as Turing himself recognized, there is a large classof possible “machines” that are neither digital norbiotechnological.) More generally, the crucial point seems to be that,while Turing recognized that the class of machines is potentially muchlarger than the class of discrete state machines, he was himselfvery confident that properly engineered discrete statemachines could succeed in the Imitation Game (and, moreover, at thetime that he was writing, there were certain discrete statemachines—“electronic computers”—that loomedvery large in the public imagination).

2. Turing (1950) and Responses to Objections

Although Turing (1950) is pretty informal, and, in some ways ratheridiosyncratic, there is much to be gained by considering thediscussion that Turing gives of potential objections to his claim thatmachines—and, in particular, digital computers—can“think”. Turing gives the following labels to theobjections that he considers: (1) The Theological Objection; (2) The“Heads in the Sand” Objection; (3) The MathematicalObjection; (4) The Argument from Consciousness; (5) Arguments fromVarious Disabilities; (6) Lady Lovelace’s Objection; (7)Argument from Continuity of the Nervous System; (8) The Argument fromInformality of Behavior; and (9) The Argument from Extra-SensoryPerception. We shall consider these objections in the correspondingsubsections below. (In some—but not all—cases, thecounter-arguments to these objections that we discuss are alsoprovided by Turing.)

2.1 The Theological Objection

Substance dualists believe that thinking is a function of anon-material, separately existing, substance that somehow“combines” with the body to make a person. So—theargument might go—making a body can never be sufficient toguarantee the presence of thought: in themselves, digital computersare no different from any other merely material bodies in beingutterly unable to think. Moreover—to introduce the“theological” element—it might be further addedthat, where a “soul” is suitably combined with a body,this is always the work of the divine creator of the universe: it isentirely up to God whether or not a particular kind of body is imbuedwith a thinking soul. (There is well known scriptural support for theproposition that human beings are “made in God’simage”. Perhaps there is also theological support for the claimthat only God can make things in God’s image.)

There are several different kinds of remarks to make here. First,there are many serious objections to substance dualism. Second, thereare many serious objections to theism. Third, even if theism andsubstance dualism are both allowed to pass, it remains quite unclearwhy thinking machines are supposed to be ruled out by this combinationof views. Given that God can unite souls with human bodies, it is hardto see what reason there is for thinking that God could not unitesouls with digital computers (or rocks, for that matter!). Perhaps, onthis combination of views, there is no especially good reason why,amongst the things that we can make, certain kinds of digitalcomputers turn out to be the only ones to which God givessouls—but it seems pretty clear that there is also noparticularly good reason for ruling out the possibility that God wouldchoose to give souls to certain kinds of digital computers. Evidencethat God is dead set against the idea of giving souls to certain kindsof digital computers is not particularly thick on the ground.

2.2 The ‘Heads in the Sand’ Objection

If there were thinking machines, then various consequences wouldfollow. First, we would lose the best reasons that we have forthinking that we are superior to everything else in the universe(since our cherished “reason” would no longer be somethingthat we alone possess). Second, the possibility that we might be“supplanted” by machines would become a genuine worry: ifthere were thinking machines, then very likely there would be machinesthat could think much better than we can. Third, the possibility thatwe might be “dominated” by machines would also become agenuine worry: if there were thinking machines, who’s to saythat they would not take over the universe, and either enslave orexterminate us?

As it stands, what we have here is not an argument against the claimthat machines can think; rather, we have the expression of variousfears about what might follow if there were thinking machines. Someonewho took these worries seriously—and who was persuaded that itis indeed possible for us to construct thinking machines—mightwell think that we have here reasons for giving up on the project ofattempting to construct thinking machines. However, it would be amajor task—which we do not intend to pursue here—todetermine whether there really are any good reasons for taking theseworries seriously.

2.3 The Mathematical Objection

Some people have supposed that certain fundamental results inmathematical logic that were discovered during the 1930s—byGödel (first incompleteness theorem) and Turing (the haltingproblem)—have important consequences for questions about digitalcomputation and intelligent thought. (See, for example, Lucas (1961)and Penrose (1989); see, too, Hodges (1983:414) who mentionsPolanyi’s discussions with Turing on this matter.) Essentially,these results show that within a formal system that is strong enough,there are a class of true statements that can be expressed but notproven within the system (see the entry onGödel’s incompleteness theorems). Let us say that such a system is “subject to the Lucas-Penroseconstraint” because it is constrained from being able to prove aclass of true statements expressible within the system.

Turing (1950:444) himself observes that these results frommathematical logic might have implications for the Turing test:

There are certain things that [any digital computer] cannot do. If itis rigged up to give answers to questions as in the imitation game,there will be some questions to which it will either give a wronganswer, or fail to give an answer at all however much time is allowedfor a reply. (444)

So, in the context of the Turing test, “being subject to theLucas-Penrose constraint” implies the existence of a class of“unanswerable” questions. However Turing noted that in thecontext of the Turing test, these “unanswerable” questionsare only a concern if humans can answer them. His “short”reply was that it is not clear that humans are free from such aconstraint themselves. Turing then goes on to add that he does notthink that the argument can be dismissed “quite solightly.”

To make the argument more precise, we can write it as follows:

Let C be a digital computer.
Since C is subject to the Lucas-Penrose constraint, there is an“unanswerable” question q for C.
If an entity, E, is not subject to the Lucas-Penrose constraint,then there are no “unanswerable” questions for E.
The human intellect is not subject to the Lucas-Penroseconstraint.
Thus, there are no “unanswerable” questions for thehuman intellect.
The question q is therefore “answerable” to the humanintellect.
By asking question q, a human could determine if the responder isa computer or a human.
Thus C may fail the Turing test.

Once the argument is laid out as above, it becomes clear that premise(3) should be challenged. Putting that aside, we note that oneinterpretation of Turing’s “short” reply is thatclaim (4) is merely asserted—without any kind of proof. The“short” reply then leads us to examine whether humans arefree from the Lucas-Penrose constraint.

If humans are subject to the Lucas-Penrose constraint then theconstraint does not provide any basis for distinguishing humans fromdigital computers. If humans are free from the Lucas-Penroseconstraint, then (granting premise 3) it follows that digitalcomputers may fail the Turing test and thus, it seems, cannotthink.

However, there remains a question as to whether being free from theconstraint is necessary for the capacity to think. It may be that theTuring test is too strict. Since, by hypothesis, we are free from theLucas-Penrose constraint, we are, in some sense, too good at askingand answering questions. Suppose there is a thinking entity that issubject to the Lucas-Penrose constraint. By an argument analogous tothe one above, it can fail the Turing test. Thus, an entity which canthink would fail the Turing test.

We can respond to this concern by noting that the construction ofquestions suggested by the results from mathematicallogic—Gödel, Turing, etc.—are extremely complicated,and require extremely detailed information about the language andinternal programming of the digital computer (which, of course, is notavailable to the interrogators in the Imitation Game). At the veryleast, much more argument is required to overthrow the view that theTuring Test could remain a very high quality statistical test for thepresence of mind and intelligence even if digital computers differfrom human beings in being subject to the Lucas-Penrose constraint.(See Bowie 1982, Dietrich 1994, Feferman 1996, Abramson 2008, andSection 6.3 of the entry onGödel’s incompleteness theorems, for further discussion.)

2.4 The Argument from Consciousness

Turing cites Professor Jefferson’sLister Oration for1949 as a source for the kind of objection that he takes to fall underthis label:

Not until a machine can write a sonnet or compose a concerto becauseof thoughts and emotions felt, and not by the chance fall of symbols,could we agree that machine equals brain—that is, not only writeit but know that it had written it. No mechanism could feel (and notmerely artificially signal, an easy contrivance) pleasure at itssuccesses, grief when its valves fuse, be warmed by flattery, be mademiserable by its mistakes, be charmed by sex, be angry or depressedwhen it cannot get what it wants. (445/6)

There are several different ideas that are being run together here,and that it is profitable to disentangle. One idea—the one uponwhich Turing first focuses—is the idea that the only way inwhich one could be certain that a machine thinks is to be the machine,and to feel oneself thinking. A second idea, perhaps, is that thepresence of mind requires the presence of a certain kind ofself-consciousness (“not only write it but know that it hadwritten it”). A third idea is that it is a mistake to take anarrow view of the mind, i.e. to suppose that there could be abelieving intellect divorced from the kinds of desires and emotionsthat play such a central role in the generation of human behavior(“no mechanism could feel …”).

Against the solipsistic line of thought, Turing makes the effectivereply that he would be satisfied if he could secure agreement on theclaim that we might each have just as much reason to suppose thatmachines think as we have reason to suppose thatother peoplethink. (The point isn’t that Turing thinks that solipsism is aserious option; rather, the point is that following this line ofargument isn’t going to lead to the conclusion that there arerespects in which digital computers could not be our intellectualequals or superiors.)

Against the other lines of thought, Turing provides a little“viva voce” that is intended to illustrate thekind of evidence that he supposes one might have that a machine isintelligent. Given the right kinds of responses from the machine, wewould naturally interpret its utterances as evidence ofpleasure, grief, warmth, misery, anger, depression, etc.Perhaps—though Turing doesn’t say this—the only wayto make a machine of this kind would be to equip it with sensors,affective states, etc., i.e., in effect, to make an artificialperson. However, the important point is that if the claimsabout self-consciousness, desires, emotions, etc. are right, thenTuring can accept these claims with equanimity:his claim isthen that a machine with a digital computing “brain” canhave the full range of mental states that can be enjoyed by adulthuman beings.

2.5 Arguments from Various Disabilities

Turing considers a list of things that some people have claimedmachines will never be able to do: (1) be kind; (2) be resourceful;(3) be beautiful; (4) be friendly; (5) have initiative; (6) have asense of humor; (7) tell right from wrong; (8) make mistakes; (9) fallin love; (10) enjoy strawberries and cream; (11) make someone fall inlove with one; (12) learn from experience; (13) use words properly;(14) be the subject of one’s own thoughts; (15) have as muchdiversity of behavior as a man; (16) do something really new.

An interesting question to ask, before we address these claimsdirectly, is whether we should suppose that intelligent creatures fromsome other part of the universe would necessarily be able to do thesethings. Why, for example, should we suppose that there must besomething deficient about a creature that does not enjoy—or thatis not able to enjoy—strawberries and cream? True enough, wemight suppose that an intelligent creature ought to have the capacityto enjoy some kinds of things—but it seems unduly chauvinisticto insist that intelligent creatures must be able to enjoy just thekinds of things that we do. (No doubt, similar considerations apply tothe claim that an intelligent creature must be the kind of thing thatcan make a human being fall in love with it. Yes, perhaps, anintelligent creature should be the kind of thing that can love and beloved; but what is so special about us?)

Setting aside those tasks that we deem to be unduly chauvinistic, weshould then ask what grounds there are for supposing that no digitalcomputing machinecould do the other things on the list.Turing suggests that the most likely ground lies in our prioracquaintance with machines of all kinds: none of the machines that anyof us has hitherto encountered has been able to do these things. Inparticular, the digital computers with which we are now familiarcannot do these things. (Except perhaps for make mistakes: after all,even digital computers are subject to “errors offunctioning.” But this might be set aside as an irrelevantcase.) However, given the limitations of storage capacity andprocessing speed of even the most recent digital computers, there areobvious reasons for being cautious in assessing the merits of thisinductive argument.

(A different question worth asking concerns the progress that has beenmade until now in constructing machines that can do the kinds ofthings that appear on Turing’s list. There is at least room fordebate about the extent to which current computers can: make mistakes,use words properly, learn from experience, be beautiful, etc.Moreover, there is also room for debate about the extent to whichrecent advances in other areas may be expected to lead to furtheradvancements in overcoming these alleged disabilities. Perhaps, forexample, recent advances in work on artificial sensors may one daycontribute to the production of machines that can enjoy strawberriesand cream. Of course, if the intended objection is to the notion thatmachines can experience any kind of feeling of enjoyment, then it isnot clear that work on particular kinds of artificial sensors is tothe point.)

2.6 Lady Lovelace’s Objection

One of the most popular objections to the claim that there can bethinking machines is suggested by a remark made by Lady Lovelace inher memoir on Babbage’s Analytical Engine:

The Analytical Engine has no pretensions to originate anything. It cando whatever we know how to order it to perform (cited by Hartree,p. 70)

The key idea is that machines canonly do what we know how toorder them to do (or that machines can never do anything really new,or anything that would take us by surprise). As Turing says, one wayto respond to these challenges is to ask whether we can ever doanything “really new.” Suppose, for instance, that theworld is deterministic, so that everything that we do is fullydetermined by the laws of nature and the boundary conditions of theuniverse. There is a sense in which nothing “really new”happens in a deterministic universe—though, of course, theuniverse’s being deterministic would be entirely compatible withour being surprised by events that occur within it. Moreover—asTuring goes on to point out—there are many ways in which evendigital computers do things that take us by surprise; more needs to besaid to make clear exactly what the nature of this suggestion is.(Yes, we might suppose, digital computers are“constrained” by their programs: they can’t doanything that is not permitted by the programs that they have. Buthuman beings are “constrained” by their biology and theirgenetic inheritance in what might be argued to be just the same kindof way: they can’t do anything that is not permitted by thebiology and genetic inheritance that they have. If a program weresufficiently complex—and if the processor(s) on which it ranwere sufficiently fast—then it is not easy to say whether thekinds of “constraints” that would remain would necessarilydiffer in kind from the kinds of constraints that are imposed bybiology and genetic inheritance.)

Bringsjord et al. (2001) claim that Turing’s response to theLovelace Objection is “mysterious” at best, and“incompetent” at worst (p.4). In their view,Turing’s claim that “computers do take us bysurprise” is only true when “surprise” is given avery superficial interpretation. For, while it is true that computersdo things that we don’t intend them to do—becausewe’re not smart enough, or because we’re not carefulenough, or because there are rare hardware errors, orwhatever—it isn’t true that there are any cases in whichwe should want to say that a computer hasoriginatedsomething. Whatever merit might be found in this objection, it seemsworth pointing out that, in the relevant sense oforigination, human beings “originate something”on more or less every occasion in which they engage in conversation:they produce new sentences of natural language that it is appropriatefor them to produce in the circumstances in which they findthemselves. Thus, on the one hand—for all that Bringsjord et al.have argued—The Turing Test is a perfectly good test for thepresence of “origination” (or “creativity,” orwhatever). Moreover, on the other hand, for all that Bringsjord et al.have argued, it remains an open question whether a digital computingdevice is capable of “origination” in this sense (i.e.capable of producing new sentences that are appropriate to thecircumstances in which the computer finds itself). So we are notoverly inclined to think that Turing’s response to the LovelaceObjection is poor; and we are even less inclined to think that Turinglacked the resources to provide a satisfactory response on thispoint.

2.7 Argument from Continuity of the Nervous System

The human brain and nervous system is not much like a digitalcomputer. In particular, there are reasons for being skeptical of theclaim that the brain is a discrete-state machine. Turing observes thata small error in the information about the size of a nervous impulseimpinging on a neuron may make a large difference to the size of theoutgoing impulse. From this, Turing infers that the brain is likely tobe a continuous-state machine; and he then notes that, sincediscrete-state machines are not continuous-state machines, there mightbe reason here for thinking that no discrete-state machine can beintelligent.

Turing’s response to this kind of argument seems to be that acontinuous-state machine can be imitated by discrete-state machineswith very small levels of error. Just as differential analyzers can beimitated by digital computers to within quite small margins of error,so too, the conversation of human beings can be imitated by digitalcomputers to margins of error that would not be detected by ordinaryinterrogators playing the imitation game. It is not clear that this isthe right kind of response for Turing to make. If someone thinks thatreal thought (or intelligence, or mind, or whatever) can only belocated in a continuous-state machine, then the fact—if, indeed,it is a fact—that it is possible for discrete-state machines topass the Turing Test shows only that the Turing Test is no good. Abetter reply is to ask why one should be so confident that realthought, etc. can only be located in continuous-state machines (if,indeed, it is right to suppose that we are not discrete-statemachines). And, before we ask this question, we would do well toconsider whether we really do have such good reason to suppose that,from the standpoint of our ability to think, we are not essentiallydiscrete-state machines. (As Block (1981) points out, it seems thatthere is nothing in our concept of intelligence that rules outintelligent beings with quantised sensory devices; and nor is thereanything in our concept of intelligence that rules out intelligentbeings with digital working parts.)

2.8 Argument from Informality of Behavior

This argument relies on the assumption that there is no set of rulesthat describes what a person ought to do in every possible set ofcircumstances, and on the further assumption that there is a set ofrules that describes what a machine will do in every possible set ofcircumstances. From these two assumptions, it is supposed tofollow—somehow!—that people are not machines. As Turingnotes, there is some slippage between “ought” and“will” in this formulation of the argument. However, oncewe make the appropriate adjustments, it is not clear that an obviousdifference between people and digital computers emerges.

Suppose, first, that we focus on the question of whether there aresets of rules that describe what a person and a machine“will” do in every possible set of circumstances. If theworld is deterministic, then there are such rules for both persons andmachines (though perhaps it is not possible to write down the rules).If the world is not deterministic, then there are no such rules foreither persons or machines (since both persons and machines can besubject to non-deterministic processes in the production of theirbehavior). Either way, it is hard to see any reason for supposing thatthere is a relevant difference between people and machines that bearson the description of what they will do in all possible sets ofcircumstances. (Perhaps it might be said that what the objectioninvites us to suppose is that, even though the world is notdeterministic, humans differ from digital machines precisely becausethe operations of the latter are indeed deterministic. But, if theworld is non-deterministic, then there is no reason why digitalmachines cannot be programmed to behave non-deterministically, byallowing them to access input from non-deterministic features of theworld.)

Suppose, instead, that we focus on the question of whether there aresets of rules that describe what a person and a machine“ought” to do in every possible set of circumstances.Whether or not we suppose that norms can be codified—and quiteapart from the question of which kinds of norms are inquestion—it is hard to see what grounds there could be for thisjudgment, other than the question-begging claim that machines are notthe kinds of things whose behavior could be subject to norms. (And, inthat case, the initial argument is badly mis-stated: the claim oughtto be that, whereas there are sets of rules that describe what aperson ought to do in every possible set of circumstances, there areno sets of rules that describe what machinesought to do inall possible sets of circumstances!)

2.9 Argument from Extra-Sensory Perception

The strangest part of Turing’s paper is the few paragraphs onESP. Perhaps it is intended to be tongue-in-cheek, though, if it is,this fact is poorly signposted by Turing. Perhaps, instead, Turing wasinfluenced by the apparently scientifically respectable results of J.B. Rhine. At any rate, taking the text at face value, Turing seems tohave thought that there was overwhelming empirical evidence fortelepathy (and he was also prepared to take clairvoyance, precognitionand psychokinesis seriously). Moreover, he also seems to have thoughtthat if the human participant in the game was telepathic, then theinterrogator could exploit this fact in order to determine theidentity of the machine—and, in order to circumvent thisdifficulty, Turing proposes that the competitors should be housed in a“telepathy-proof room.” Leaving aside the point that, as amatter of fact, there is no current statistical support fortelepathy—or clairvoyance, or precognition, ortelekinesis—it is worth asking what kind of theory of the natureof telepathy would have appealed to Turing. After all, if humans canbe telepathic, why shouldn’t digital computers be so as well? Ifthe capacity for telepathy were a standard feature of any sufficientlyadvanced system that is able to carry out human conversation, thenthere is no in-principle reason why digital computers could not be theequals of human beings in this respect as well. (Perhaps this responseassumes that a successful machine participant in the imitation gamewill need to be equipped with sensors, etc. However, as we notedabove, this assumption is not terribly controversial. A plausibleconversationalist has to keep up to date with goings-on in theworld.)

After discussing the nine objections mentioned above, Turing goes onto say that he has “no very convincing arguments of a positivenature to support my views. If I had I should not have taken suchpains to point out the fallacies in contrary views.” (454)Perhaps Turing sells himself a little short in this self-assessment.First of all—as his brief discussion of solipsism makesclear—it is worth asking what grounds we have for attributingintelligence (thought, mind) to other people. If it is plausible tosuppose that we base our attributions on behavioral tests orbehavioral criteria, then his claim about the appropriate test toapply in the case of machines seems apt, and his conjecture thatdigital computing machines might pass the test seems like areasonable—though controversial—empirical conjecture.Second, subsequent developments in the philosophy of mind—and,in particular, the fashioning of functionalist theories of themind—have provided a more secure theoretical environment inwhich to place speculations about the possibility of thinkingmachines. If mental states are functional states—and if mentalstates are capable of realisation in vastly different kinds ofmaterials—then there is some reason to think that it is anempirical question whether minds can be realised in digital computingmachines. Of course, this kind of suggestion is open to challenge; weshall consider some important philosophical objections in the laterparts of this review.

3. Some Minor Issues Arising

There are a number of much-debated issues that arise in connectionwith the interpretation of various parts of Turing (1950), and that wehave hitherto neglected to discuss. What has been said in the firsttwo sections of this document amounts to our interpretation of whatTuring has to say (perhaps bolstered with what we take to be furtherrelevant considerations in those cases where Turing’s remarkscan be fairly readily improved upon). But since some of thisinterpretation has been contested, it is probably worth noting wherethe major points of controversy have been.

3.1 Interpreting the Imitation Game

Turing (1950) introduces the imitation game by describing a game inwhich the participants are a man, a woman, and a human interrogator.The interrogator is in a room apart from the other two, and is set thetask of determining which of the other two is a man and which is awoman. Both the man and the woman are set the task of trying toconvince the interrogator that they are the woman. Turing recommendsthat the best strategy for the woman is to answer all questionstruthfully; of course, the best strategy for the man will require somelying. The participants in this game also use teletypewriter tocommunicate with one another—to avoid clues that might beoffered by tone of voice, etc. Turing then says: “We now ask thequestion, ‘What will happen when a machine takes the part of Ain this game?’ Will the interrogator decide wrongly as oftenwhen the game is played like this as he does when the game is playedbetween a man and a woman?” (434).

Now, of course, it ispossible to interpret Turing as hereintending to say what he seems literally to say, namely, that the newgame is one in which the computer must pretend to be a woman, and theother participant in the game is a woman. (For discussion, see, forexample, Genova (1994) and Traiger (2000).) And it is alsopossible to interpret Turing as intending to say that the newgame is one in which the computer must pretend to be a woman, and theother participant in the game is a man who must also pretend to be awoman. However, as Copeland (2000), Piccinini (2000), and Moor (2001)convincingly argue, the rest of Turing’s article, and materialin other articles that Turing wrote at around the same time, verystrongly support the claim that Turing actually intended the standardinterpretation that we gave above, viz. that the computer is topretend to be a human being, and the other participant in the game isa human being of unspecified gender. Moreover, as Moor (2001) argues,there is no reason to think that one would get a better test if thecomputer must pretend to be a woman and the other participant in thegame is a man pretending to be a woman; and, indeed, there is somereason to think that one would get a worse test. Perhaps it would makeno difference to the effectiveness of the test if the computer mustpretend to be a woman, and the other participant is a woman (any morethan it would make a difference if the computer must pretend to be anaccountant and the other participant is an accountant); however, thisconsideration is simply insufficient to outweigh the strong textualevidence that supports the standard interpretation of the imitationgame that we gave at the beginning of our discussion of Turing (1950).(For a dissenting view about many of the matters discussed in thisparagraph, see Sterrett (2000; 2020).)

3.2 Turing’s Predictions

As we noted earlier, Turing (1950) makes the claim that:

I believe that in about fifty years’ time it will be possible toprogramme computers, with a storage capacity of about 10⁹,to make them play the imitation game so well that an averageinterrogator will not have more than 70 percent chance of making theright identification after five minutes of questioning. … Ibelieve that at the end of the century the use of words and generaleducated opinion will have altered so much that one will be able tospeak of machines thinking without expecting to be contradicted.

Most commentators contend that this claim has been shown to bemistaken: in the year 2000,no-one was able to programcomputers to make them play the imitation game so well that an averageinterrogator had no more than a 70% chance of making the correctidentification after five minutes of questioning. Copeland (2000)argues that this contention is seriously mistaken: “about fiftyyears” is by no means “exactly fifty years,” and itremains open that we may soon be able to do the required programming.Against this, it should be noted that Turing (1950) goes onimmediately to refer to how things will be “at the end of thecentury,” which suggests that not too much can be read into thequalifying “about.” However, as Copeland (2000) pointsout, there are other more cautious predictions that Turing makeselsewhere (e.g., that it would be “at least 100 years”before a machine was able to pass an unrestricted version of histest); and there are other predictions that are made in Turing (1950)that seem to have been vindicated. In particular, it is plausible toclaim that, in the year 2000, educated opinion had altered to theextent that, in many quarters, one could speak of the possibility ofmachines’ thinking—and of machines’learning—without expecting to be contradicted. As Moor (2001)points out, “machine intelligence” is not the oxymoronthat it might have been taken to be when Turing first started thinkingabout these matters.

3.3 A Useful Distinction

There are two different theoretical claims that are run together inmany discussions of The Turing Test that can profitably be separated.One claim holds that the general scheme that is described inTuring’s Imitation Game provides a good test for the presence ofintelligence. (If something can pass itself off as a person undersufficiently demanding test conditions, then we have very good reasonto suppose that that thing is intelligent.) Another claim holds thatan appropriately programmed computer could pass the kind of test thatis described in the first claim. We might call the first claim“The Turing Test Claim” and the second claim “TheThinking Machine Claim”. Some objections to the claims made inTuring (1950) are objections to the Thinking Machine Claim, but notobjections to the Turing Test Claim. (Consider, for example, theargument of Searle (1982), which we discuss further in Section 6.)However, other objections are objections to the Turing Test Claim.Until we get to Section 6, we shall be confining our attention todiscussions of the Turing Test Claim.

3.4 A Further Note

In this article, we follow the standard philosophical conventionaccording to which “a mind” means “at least onemind”. If “passing the Turing Test” impliesintelligence, then “passing the Turing Test” implies thepresence of at least one mind. We cannot here explore recentdiscussions of “swarm intelligence”, “collectiveintelligence”, and the like. However, it is surely clear thattwo people taking turns could “pass the Turing Test” incircumstances in which we should be very reluctant to say that thereis a “collective mind” that has the minds of the two ascomponents.

4. Assessment of the Current Standing of The Turing Test

Given the initial distinction that we made between different ways inwhich the expression The Turing Test gets interpreted in theliterature, it is probably best to approach the question of theassessment of the current standing of The Turing Test by dividingcases. True enough, we think that there is a correct interpretation ofexactly what test it is that is proposed by Turing (1950); but acomplete discussion of the current standing of The Turing Test shouldpay at least some attention to the current standing of other teststhat have been mistakenly supposed to be proposed by Turing(1950).

There are a number of main ideas to be investigated. First, there isthe suggestion that The Turing Test provides logically necessary andsufficient conditions for the attribution of intelligence. Second,there is the suggestion that The Turing Test provides logicallysufficient—but not logically necessary—conditions for theattribution of intelligence. Third, there is the suggestion that TheTuring Test provides “criteria”—defeasiblesufficient conditions—for the attribution of intelligence.Fourth—and perhaps not importantly distinct from the previousclaim—there is the suggestion that The Turing Test provides(more or less strong) probabilistic support for the attribution ofintelligence. We shall consider each of these suggestions in turn.

4.1 (Logically) Necessary and Sufficient Conditions

It is doubtful whether there are very many examples of people who haveexplicitly claimed that The Turing Test is meant to provide conditionsthat are both logically necessary and logically sufficient for theattribution of intelligence. (Perhaps Block (1981) is one such case.)However, some of the objections that have been proposed against TheTuring Test only make sense under the assumption that The Turing Testdoes indeed provide logically necessary and logically sufficientconditions for the attribution of intelligence; and many more of theobjections that have been proposed against The Turing Test only makesense under the assumption that The Turing Test provides necessary andsufficient conditions for the attribution of intelligence, where themodality in question is weaker than the strictly logical, e.g., nomicor causal.

Consider, for example, those people who have claimed that The TuringTest is chauvinistic; and, in particular, those people who haveclaimed that it is surely logically possible for there to be somethingthat possesses considerable intelligence, and yet that is not able topass The Turing Test. (Examples: Intelligent creatures might fail topass The Turing Test because they do not share our way of life;intelligent creatures might fail to pass The Turing Test because theyrefuse to engage in games of pretence; intelligent creatures mightfail to pass The Turing Test because the pragmatic conventions thatgovern the languages that they speak are so very different from thepragmatic conventions that govern human languages. Etc.) None of thiscan constitute objections to The Turing Test unless The Turing Testdeliversnecessary conditions for the attribution ofintelligence.

French (1990) offers ingenious arguments that are intended to showthat “the Turing Test provides a guarantee not of intelligence,but of culturally-oriented intelligence.” But, of course,anything that has culturally-oriented intelligencehasintelligence; so French’s objections cannot be taken to bedirected towards the idea that The Turing Test provides sufficientconditions for the attribution of intelligence. Rather—as weshall see later—French supposes that The Turing Test establishessufficient conditions that no machine will ever satisfy. That is, inFrench’s view, what is wrong with The Turing Test is that itestablishes utterly uninteresting sufficient conditions for theattribution of intelligence.

Floridi and Chiriatti (2020: 683) say that The Turing Test providesnecessary but insufficient conditions for intelligence: not passingThe Turing Test disqualifies an AI from being intelligent, but passingThe Turing Test is not sufficient to qualify an AI as intelligent.However, they also say that “any reader ... will be wellacquainted with the nature of the test, so we shall not describeit.” The account that they would give of The Turing Testmust be quite different from the account of The Turing Test that wehave been presenting.

4.2 Logically Sufficient Conditions

There are many philosophers who have supposed that The Turing Test isintended to provide logically sufficient conditions for theattribution of intelligence. That is, there are many philosophers whohave supposed that The Turing Test claims that it is logicallyimpossible for something that lacks intelligence to pass The TuringTest. (Often, this supposition goes with an interpretation accordingto which passing The Turing Test requires rather a lot, e.g.,producing behavior that is indistinguishable from human behavior overan entire lifetime.)

There are well-known arguments against the claim that passing TheTuring Test—or any other purely behavioral test—provideslogically sufficient conditions for the attribution of intelligence.The standard objection to this kind of analysis ofintelligence (mind, thought) is that a being whose behavior wasproduced by “brute force” methods ought not to count asintelligent (as possessing a mind, as having thoughts).

Consider, for example, Ned Block’sBlockhead. Blockheadis a creature that looks just like a human being, but that iscontrolled by a “game-of-life look-up tree,” i.e. by atree that contains a programmed response for every discriminable inputat each stage in the creature’s life. If we agree that Blockheadis logically possible, and if we agree that Blockhead is notintelligent (does not have a mind, does not think), then Blockhead isa counterexample to the claim that the Turing Test provides alogically sufficient condition for the ascription of intelligence.After all, Blockhead could be programmed with a look-up tree thatproduces responses identical with the ones thatyou wouldgive over the entire course ofyour life (given the sameinputs).

There are perhaps only two ways in which someone who claims that TheTuring Test offers logically sufficient conditions for the attributionof intelligence can respond to Block’s argument. First, it couldbe denied that Blockhead is a logical possibility; second, it could beclaimed that Blockhead would be intelligent (have a mind, think).

In order to deny that Blockhead is a logical possibility, it seemsthat what needs to be denied is the commonly accepted link betweenconceivability and logical possibility: it certainly seems thatBlockhead isconceivable, and so, if (properly circumscribed)conceivability is sufficient for logical possibility, then it seemsthat we have good reason to accept that Blockhead is a logicalpossibility. Since it would take us too far away from our presentconcerns to explore this issue properly, we merely note that itremains a controversial question whether (properly circumscribed)conceivability is sufficient for logical possibility. (For furtherdiscussion of this issue, see Crooke (2002).)

The question of whether Blockhead is intelligent (has a mind, thinks)may seem straightforward, but—despite Block’s confidentassertion that Blockhead “has all of the intelligence of atoaster”—it is not obvious that we should deny thatBlockhead is intelligent. Blockhead may not be a particularlyefficient processor of information; but it is at least a processor ofinformation, and that—in combination with the behavior that isproduced as a result of the processing of information—might wellbe taken to be sufficient grounds for the attribution ofsomelevel of intelligence to Blockhead. For further critical discussion ofthe argument of Block (1981), see McDermott (2014), and Pautz andStoljar (2019).

4.3 Criteria

In hisPhilosophical Investigations, Wittgenstein famouslywrites: “An ‘inner process’ stands in need ofoutward criteria” (580). Exactly what Wittgenstein meant by thisremark is unclear, but one way in which it might be interpreted is asfollows: in order to be justified in ascribing a “mentalstate” to some entity, there must be some true claims about theobservable behavior of that entity that, (perhaps) together with othertrue claims about that entity (not themselves couched in“mentalistic” vocabulary), entail that the entity has themental state in question. If no true claims about the observablebehavior of the entity can play any role in the justification of theascription of the mental state in question to the entity, then thereare no grounds for attributing that kind of mental state to theentity.

The claim that, in order to be justified in ascribing a mental stateto an entity, there must be some true claims about the observablebehavior of that entity that alone—i.e. without the addition ofany other true claims about that entity—entail that the entityhas the mental state in question, is a piece of philosophicalbehaviorism. It may be—for all that we are able toargue—that Wittgenstein was a philosophical behaviorist; it maybe—for all that we are able to argue—that Turing was one,too. However, if we go by the letter of the account given in theprevious paragraph, then all that need follow from the claim that theTuring Test is criterial for the ascription of intelligence (thought,mind) is that, when other true claims (not themselves couched in termsof mentalistic vocabulary) are conjoined with the claim that an entityhas passed the Turing Test, it then follows that the entity inquestion has intelligence (thought, mind).

(Note that the parenthetical qualification that the additional trueclaims not be couched in terms of mentalistic vocabulary is only oneway in which one might try to avoid the threat of trivialization. Thedifficulty is that the addition of the true claim that an entity has amind will always produce a set of claims that entails that that entityhas a mind, no matter what other claims belong to the set!)

To see how the claim that the Turing Test is merely criterial for theascription of intelligence differs from the logical behaviorist claimthat the Turing Test provides logically sufficient conditions for theascription of intelligence, it suffices to consider the question ofwhether it isnomically possible for there to be a“hand simulation” of a Turing Test program. Many peoplehave supposed that there is good reason to deny that Blockhead is anomic (or physical) possibility. For example, inThe Physics ofImmortality, Frank Tipler provides the following argument indefence of the claim that it is physically impossible to “handsimulate” a Turing-Test-passing program:

If my earlier estimate that the human brain can code as much as10¹⁵ bits is correct, then since an average book codesabout 10⁶ bits … it would require more than 100million books to code the human brain. It would take at least thirtyfive-story main university libraries to hold this many books. We knowfrom experience that we can access any memory in our brain in about100 seconds, so a hand simulation of a Turing Test-passing programwould require a human being to be able to take off the shelf, glancethrough, and return to the shelf all of these 100 million books in 100seconds. If each book weighs about a pound (0.5 kilograms), and on theaverage the book moves one yard (one meter) in the process of takingit off the shelf and returning it, then in 100 seconds the energyconsumed in just moving the books is 3 x 10¹⁹ joules; therate of energy consumption is 3 x 10¹¹ megawatts. Since ahuman uses energy at a normal rate of 100 watts, the power required isthe bodily power of 3 x 10¹⁵ human beings, about a milliontimes the current population of the entire earth. A typical largenuclear power plant has a power output of 1,000 megawatts, so a handsimulation of the human program requires a power output equal to thatof 300 million large nuclear power plants. As I said, a man can nomore hand-simulate a Turing Test-passing program than he can jump tothe Moon. In fact, it is far more difficult. (40)

While there might be ways in which the details of Tipler’sargument could be improved, the general point seems clearly right: thekind of combinatorial explosion that is required for a look-up treefor a human being is ruled out by the laws and boundary conditionsthat govern the operations of the physical world. But, if this isright, then, while it may be true that Blockhead is alogicalpossibility, it follows that Blockhead is not anomic orphysical possibility. And then it seems natural to hold thatThe Turing Test does indeed providenomically sufficientconditions for the attribution of intelligence: given everything elsethat we already know—or, at any rate, take ourselves toknow—about the universe in which we live, we would be fullyjustified in concluding that anything that succeeds in passing TheTuring Test is, indeed, intelligent (possessed of a mind, and soforth).

There are ways in which the argument in the previous paragraph mightbe resisted. At the very least, it is worth noting that there is aserious gap in the argument that we have just rehearsed. Even if wecan rule out “hand simulation” of intelligence, it doesnot follow that we have ruled out all other kinds of mere simulationof intelligence. Perhaps—for all that has been argued sofar—there are nomically possible ways of producing meresimulations of intelligence. But, if that’s right, then passingThe Turing Test need not be so much as criterial for the possession ofintelligence: it need not be that given everything else that wealready know—or, at any rate, take ourselves to know—aboutthe universe in which we live, we would be fully justified inconcluding that anything that succeeds in passing The Turing Test is,indeed, intelligent (possessed of a mind, and so forth).

(McDermott (2014) calculates that a look-up table for a participantwho makes 50 conversational exchanges would have about10²²²⁷⁸ nodes. It is tempting to take this calculation toestablish that it is neither nomically nor physically possible forthere to be a “hand simulation” of a Turing Test program,on the grounds that the required number of nodes could not be fittedinto a space much much larger than the entire observableuniverse.)

4.4 Probabilistic Support

When we look at the initial formulation that Turing provides of histest, it is clear that he thought that the passing of the test wouldprovide probabilistic support for the hypothesis of intelligence.There are at least two different points to make here. First, theprediction that Turing makes is itself probabilistic: Turingpredicts that, in about fifty years from the time of his writing, itwill be possible to programme digital computers to make them play theimitation game so well that an average interrogator will have no morethan a seventy per cent chance of making the right identificationafter five minutes of questioning. Second, the probabilistic nature ofTuring’s prediction provides good reason to think that thetest that Turing proposes is itself of a probabilisticnature: a given level of success in the imitation gameproduces—or, at any rate, should produce—a specifiablelevel of increase in confidence that the participant in question isintelligent (has thoughts, is possessed of a mind). Since Turingdoesn’t tell us how he supposes that levels of success in theimitation game correlate with increases in confidence that theparticipant in question is intelligent, there is a sense in which TheTuring Test is greatly underspecified. Relevant variables clearlyinclude: the length of the period of time over which the questioningin the game takes place (or, at any rate, the “amount” ofquestioning that takes place); the skills and expertise of theinterrogator (this bears, for example, on the “depth” and“difficulty” of the questioning that takes place); theskills and expertise of the third player in the game; and the numberof independent sessions of the game that are run (particularly whenthe other participants in the game differ from one run to the next).Clearly, a machine that is very successful in many different runs ofthe game that last for quite extended periods of time and that involvehighly skilled participants in the other roles has a much strongerclaim to intelligence than a machine that has been successful in asingle, short run of the game with highly inexpert participants. Thata machine has succeeded in one short run of the game against inexpertopponents might provide some reason for increase in confidence thatthe machine in question is intelligent: but it is clear that resultson subsequent runs of the game could quickly overturn this initialincrease in confidence. That a machine has done much better thanchance over many long runs of the imitation game against a variety ofskilled participants surely provides much stronger evidence that themachine is intelligent. (Given enough evidence of this kind, it seemsthat one could be quite confident indeed that the machine isintelligent, while still—of course—recognizing thatone’s judgment could be overturned by further evidence, such asa series of short runs in which it does much worse than chance againstparticipants who use the same strategy over and over to expose themachine as a machine.)

The probabilistic nature of The Turing Test is often overlooked. Trueenough, Moor (1976, 2001)—along with various othercommentators—has noted that The Turing Test is“inductive,” i.e. that “The Turing Test”provides no more than defeasible evidence of intelligence. However, itis one thing to say that success in “a rigorous Turingtest” provides no more than defeasible evidence of intelligence;it is quite another to note the probabilistic features to which wehave drawn attention in the preceding paragraph. Consider, forexample, Moor’s observation (Moor 2001:83) that “…inductive evidence gathered in a Turing test can be outweighed by newevidence. … If new evidence shows that a machine passed theTuring Test by remote control run by a human behind the scenes, thenreassessment is called for.” This—and other similarpassages—seems to us to suggest that Moor supposes that a“rigorous Turing test” is a one-off event in which themachine either succeeds or fails. But this interpretation of TheTuring Test is vulnerable to the kind of objection lodged byBringsjord (1994): even on a moderately long single run withrelatively expert participants, it may not be all that unlikely thatan unintelligent machine serendipitously succeeds in the imitationgame. In our view, given enough sufficiently long runs with differentsufficiently expert participants, the likelihood of serendipitoussuccess can be made as small as one wishes. Thus, whileBringsjord’s “argument from serendipity” has forceagainst some versions of The Turing Test, it has no force against themost plausible interpretation of the test that Turing actuallyproposed.

It is worth noting that it is quite easy to construct moresophisticated versions of “The Imitation Game” that yieldmore fine-grained statistical data. For example, rather than gettingthe judges to issue Yes/No verdicts about both of the participants inthe game, one could get the judges to provide probabilistic answers.(“I give a 75% probability to the claim that A is the machine,and only 25% probability to the claim that B is the machine.”)This point is important when one comes to consider criticisms of the“methodology” implicit in “The Turing Test”.(For further discussion of the probabilistic nature of “TheTuring Test”, see Shieber (2007).)

5. Alternative Tests

Some of the literature about The Turing Test is concerned withquestions about the framing of a test that can provide a suitableguide to future research in the area of Artificial Intelligence. Theidea here is very simple. Suppose that we have the ambition to producean artificially intelligent entity. What tests should we take assetting the goals that putatively intelligent artificial systemsshould achieve? Should we suppose that The Turing Test provides anappropriate goal for research in this field? In assessing theseproposals, there are two different questions that need to be borne inmind. First, there is the question whether it is a useful goal for AIresearch to aim to make a machine that can pass the given test(administered over the specified length of time, at the specifieddegree of success). Second, there is the question of the appropriateconclusion to draw about the mental capacities of a machine that doesmanage to pass the test (administered over the specified length oftime, at the specified degree of success).

Opinion on these questions is deeply divided. Some people suppose thatThe Turing Test does not provide a useful goal for research in AIbecause it is far too difficult to produce a system that can pass thetest. Other people suppose that The Turing Test does not provide auseful goal for research in AI because it sets a very narrow target(and thus sets unnecessary restrictions on the kind of research thatgets done). Some people think that The Turing Test provides anentirely appropriate goal for research in AI; while other people thinkthat there is a sense in which The Turing Test is not really demandingenough, and who suppose that The Turing Test needs to be extended invarious ways in order to provide an appropriate goal for AI. We shallconsider some representatives of each of these positions in turn.

There are some people who continue to endorse The Turing Test. Forexample, Neufeld and Finnestad (2020a) (2020b) argue that The TuringTest is no barrier to progress in AI, requires no significantredefinition, and does not shut down other avenues of investigation.Maybe we do better just to take The Turing Test to define a watershedrather than a threshold towards which we might hope to makeincremental progression.

5.1 The Turing Test is Too Hard

Some people have claimed that The Turing Test doesn’t set anappropriate goal for current research in AI because we are plainly sofar away from attaining this goal. Amongst these people there are somewho have gone on to offer reasons for thinking that it is doubtfulthat we shall ever be able to create a machine that can pass TheTuring Test—or, at any rate, that it is doubtful that we shallbe able to do this at any time in the foreseeable future. Perhaps themost interesting arguments of this kind are due to French (1990); atany rate, these are the arguments that we shall go on to consider.(Cullen (2009) sets out similar considerations.)

According to French, The Turing Test is “virtuallyuseless” as a real test of intelligence, because nothing withouta “human subcognitive substrate” could pass the test, andyet the development of an artificial “human cognitivesubstrate” is almost impossibly difficult. At the very least,there are straightforward sets of questions that reveal“low-level cognitive structure” and that—inFrench’s view—are almost certain to be successful inseparating human beings from machines.

First, if interrogators are allowed to draw on the results of researchinto, say,associative priming, then there is data that willvery plausibly separate human beings from machines. For example, thereis research that shows that, if humans are presented with series ofstrings of letters, they require less time to recognize that a stringis a word (in a language that they speak) if it is preceded by arelated word (in the language that they speak), rather than by anunrelated word (in the language that they speak) or a string ofletters that is not a word (in the language that they speak). Providedthat the interrogator has accurate data about average recognitiontimes for subjects who speak the language in question, theinterrogator can distinguish between the machine and the human simplyby looking at recognition times for appropriate series of strings ofletters. Or so says French. It isn’t clear to us that this isright. After all, the design of The Turing Test makes it hard to seehow the interrogator will get reliable information about responsetimes to series of strings of symbols. The point of putting thecomputer in a separate room and requiring communication by teletypewas precisely to rule out certain irrelevant ways of identifying thecomputer. If these requirements don’t already rule outidentification of the computer by the application of tests ofassociative priming, then the requirements can surely be altered tobring it about that this is the case. (Perhaps it is also worth notingthat administration of the kind of test that French imagines is notordinary conversation; nor is it something that one would expect thatany but a few expert interrogators would happen upon. So, even if thecircumstances of The Turing Test do not rule out the kind of procedurethat French here envisages, it is not clear that The Turing Test willbe impossibly hard for machines to pass.)

Second, at a slightly higher cognitive level, there are certain kindsof “ratings games” that French supposes will be veryreliable discriminators between humans and machines. For instance, the“Neologism Ratings Game”—which asks participants torank made-up words on their appropriateness as names for given kindsof entities—and the “Category RatingGame”—which asks participants to rate things of onecategory as things of another category—are both, according toFrench, likely to prove highly reliable in discriminating betweenhumans and machines. For, in the first case, the ratings that humansmake depend upon large numbers of culturally acquired associations(which it would be well-nigh impossible to identify and describe, andhence which it would (arguably) be well-nigh impossible to programinto a computer). And, in the second case, the ratings that peopleactually make are highly dependent upon particular social and culturalsettings (and upon the particular ways in which human life isexperienced). To take French’s examples, there would bewidespread agreement amongst competent English speakers in thetechnologically developed Western world that “Flugblogs”is not an appropriate name for a breakfast cereal, while“Flugly” is an appropriate name for a child’s teddybear. And there would also be widespread agreement amongst competentspeakers of English in the developed world that pens rate higher asweapons than grand pianos rate as wheelbarrows. Again, there arequestions that can be raised about French’s argument here. It isnot clear to us that the data upon which the ratings games rely is asreliable as French would have us suppose. (At least one of us thinksthat “Flugly” would be an entirely inappropriate name fora child’s teddy bear, a response that is due to the similaritybetween the made-up word “Flugly” and the word“Fugly,” that had some currency in the primarilyundergraduate University college that we both attended. At least oneof us also thinks that young children would very likely be delightedto eat a cereal called “Flugblogs,” and that a good answerto the question about ratings pens and grand pianos is that it alldepends upon the pens and grand pianos in question. What if the grandpiano has wheels? What if the opponent has a sword or a sub-machinegun? It isn’t obvious that a refusal to play this kind ofratings game would necessarily be a give-away that one is a machine.)Moreover, even if the data is reliable, it is not obvious that any buta select group of interrogators will hit upon this kind of strategyfor trying to unmask the machine; nor is it obvious that it isimpossibly hard to build a machine that is able to perform in the wayin which typical humans do on these kinds of tests. In particular,if—as Turing assumes—it is possible to make learningmachines that can be “trained up” to learn how to dovarious kinds of tasks, then it is quite unclear why these machinescouldn’t acquire just the same kinds of “subcognitivecompetencies” that human children acquire when they are“trained up” in the use of language.

There are other reasons that have been given for thinking that TheTuring Test is too hard (and, for this reason, inappropriate insetting goals for current research into artificial intelligence). Ingeneral, the idea is that there may well be features of humancognition that are particularly hard to simulate, but that are not inany sense essential for intelligence (or thought, or possession of amind). The problem here is not merely that The Turing Test really doestest forhuman intelligence; rather, the problem here is thefact—if indeed it is a fact—that there are quiteinessential features of human intelligence that are extraordinarilydifficult to replicate in a machine. If this complaint isjustified—if, indeed, there are features of human intelligencethat are extraordinarily difficult to replicate in machines,and that could and would be reliably used to unmask machinesin runs of The Turing Test—then there is reason to worry aboutthe idea that The Turing Test sets an appropriate direction forresearch in artificial intelligence. However, as our discussion ofFrench shows, there may be reason for caution in supposing that thekinds of considerations discussed in the present section show that weare already in a position to say that The Turing Test does indeed setinappropriate goals for research in artificial intelligence.

5.2 The Turing Test is Too Narrow

There are authors who have suggested that The Turing Test does not seta sufficiently broad goal for research in the area of artificialintelligence. Amongst these authors, there are many who suppose thatThe Turing Test is too easy. (We go on to consider some of theseauthors in the next sub-section.) But there are also some authors whohave supposed that, even if the goal that is set by The Turing Test isvery demanding indeed, it is nonetheless too restrictive.

Objection to the notion that the Turing Test provides a logicallysufficient condition for intelligence can be adapted to the goal ofshowing that the Turing Test is too restrictive. Consider, forexample, Gunderson (1964). Gunderson has two major complaints to makeagainst The Turing Test. First, he thinks that success inTuring’s Imitation Game might come for reasons other than thepossession of intelligence. But, second, he thinks that success in theImitation Game would be but one example of the kinds of things thatintelligent beings can do and—hence—in itself could not betaken as a reliable indicator of intelligence. By way of analogy,Gunderson offers the case of a vacuum cleaner salesman who claims thathis product is “all-purpose” when, in fact, all it does isto suck up dust. According to Gunderson, Turing is in the sameposition as the vacuum cleaner salesmanif he is prepared tosay that a machine is intelligent merely on the basis of its successin the Imitation Game. Just as “all purpose” entails theability to do a range of things, so, too, “thinking”entails the possession of a range of abilities (beyond the mereability to succeed in the Imitation Game).

There is an obvious reply to the argument that we have here attributedto Gunderson, viz. that a machine that is capable of success in theImitation Game is capable of doing a large range of different kinds ofthings. In order to carry out a conversation, one needs to have manydifferent kinds of cognitive skills, each of which is capable ofapplication in other areas. Apart from the obvious general cognitivecompetencies—memory, perception, etc.—there are manyparticular competencies—rudimentary arithmetic abilities,understanding of the rules of games, rudimentary understanding ofnational politics, etc.—which are tested in the course ofrepeated runs of the Imitation Game. It is inconceivable that thatthere be a machine that is startlingly good at playing the ImitationGame, and yet unable to do well atany other tasks that mightbe assigned to it; and it is equally inconceivable that there is amachine that is startlingly good at the Imitation Game and yet thatdoes not have a wide range of competencies that can be displayed in arange of quite disparate areas. To the extent that Gunderson considersthis line of reply, all that he says is that there is no reason tothink that a machine that can succeed in the Imitation Gamemust have more than a narrow range of abilities; we thinkthat there is no reason to believe that this reply should be takenseriously.

More recently, Erion (2001) has defended a position that has someaffinity to that of Gunderson. According to Erion, machines might be“capable of outperforming human beings in limited tasks inspecific environments, [and yet] still be unable to act skillfully inthe diverse range of situations that a person with common sensecan” (36). On one way of understanding the claim that Erionmakes, he too believes that The Turing Test only identifies oneamongst a range of independent competencies that are possessed byintelligent human beings, and it is for this reason that he proposes amore comprehensive “Cartesian Test” that “involves amore careful examination of a creature’s language, [and] alsotests the creature’s ability to solve problems in a wide varietyof everyday circumstances” (37). In our view, at least when TheTuring Test is properly understood, it is clear that anything thatpasses The Turing Test must have the ability to solve problems in awide variety of everyday circumstances (because the interrogators willuse their questions to probe these—and other—kinds ofabilities in those who play the Imitation Game).

5.3 The Turing Test is Too Easy

There are authors who have suggested that The Turing Test should bereplaced with a more demanding test of one kind or another. It is notat all clear that any of these tests actually proposes a better goalfor research in AI than is set by The Turing Test. However, in thissection, we shall not attempt to defend that claim; rather, we shallsimply describe some of the further tests that have been proposed, andmake occasional comments upon them. (One preliminary point upon whichwe wish to insist is that Turing’s Imitation Game was devisedagainst the background of the limitations imposed by then currenttechnology. It is, of course, not essential to the game that tele-textdevices be used to prevent direct access to information about the sexor genus of participants in the game. We shall not advert to theserelatively mundane kinds of considerations in what follows.)

5.3.1 The Total Turing Test

Harnad (1989, 1991) claims that a better test than The Turing Testwill be one that requires responses to all of our inputs, and notmerely to text-formatted linguistic inputs. That is, according toHarnad, the appropriate goal for research in AI has to be to constructa robot with something like human sensorimotor capabilities. Harnadalso considers the suggestion that it might be an appropriate goal forAI to aim for “neuromolecular indistinguishability,” butrejects this suggestion on the grounds that once we know how to make arobot that can pass his Total Turing Test, there will be no problemsabout mind-modeling that remain unsolved. It is an interestingquestion whether the test that Harnad proposes sets a more appropriategoal for AI research. In particular, it seems worth noting that it isnot clear that there could be a system that was able to pass TheTuring Test and yet that was not able to pass The Total Turing Test.Since Harnad himself seems to think that it is quite likely that“full robotic capacities [are] … necessary to generate… successful linguistic performance,” it is unclear whythere is reason to replace The Turing Test with his extended test.(This point against Harnad can be found in Hauser (1993:227), andelsewhere.)

5.3.2 The Lovelace Test

Bringsjord et al. (2001) propose that a more satisfactory aim for AIis provided by a certain kind of meta-test that they call the LovelaceTest. They say that an artificial agentA, designed by humanH, passes the Lovelace Test just in case three conditions are jointlysatisfied: (1) the artificial agentA produces outputO; (2)A’s outputtingO is not theresult of a fluke hardware error, but rather the result of processesthatA can repeat; and (3)H—or someone whoknows whatH knows and who hasH’sresources—cannot explain howA producedO byappeal toA’s architecture, knowledge-base and corefunctions. Against this proposal, it seems worth noting that there arequestions to be raised about the interpretation of the thirdcondition. If a computer program is long and complex, then no humanagent can explain incomplete detail how the output wasproduced. (Why did the computer output 3.16 rather than 3.17?) But ifwe are allowed to give a highly schematic explanation—thecomputer took the input, did some internal processing and thenproduced an answer—then it seems that it will turn out to bevery hard to support the claim that human agents ever do anythinggenuinely creative. (After all, we too take external input, performinternal processing, and produce outputs.) What is missing from theaccount that we are considering is any suggestion about theappropriatelevel of explanation that is to be provided. Itis quite unclear why we should suppose that there is a relevantdifference between people and machines at any level of explanation;but, if that’s right, then the test in question is trivial. (Onemight also worry that the proposed test rules outby fiat thepossibility that creativity can be best achieved by using genuinerandomising devices.)

5.3.3 The Truly Total Turing Test

Schweizer (1998) claims that a better test than The Turing Test willadvert to the evolutionary history of the subjects of the test. Whenwe attribute intelligence to human beings, we rely on an extensivehistorical record of the intellectual achievements of human beings. Onthe basis of this historical record, we are able to claim that humanbeings are intelligent; and we can rely upon this claim when weattribute intelligence to individual human beings on the basis oftheir behavior. According to Schweizer, if we are to attributeintelligence to machines, we need to be able to advert to a comparablehistorical record of cognitive achievements. So, it will only be whenmachines have developed languages, written scientific treatises,composed symphonies, invented games, and the like, that we shall be ina position to attribute intelligence to individual machines on thebasis of their behavior. Of course, we can still use The Turing Testto determine whether an individual machine is intelligent: but ouranswer to the question won’t depend merely upon whether or notthe machine is successful in The Turing Test; there is the further“evolutionary” condition that also must be satisfied.Against Schweizer, it seems worth noting that it is not at all clearthat our reason for granting intelligence to other humans on the basisof their behavior is that we have prior knowledge of the collectivecognitive achievements of human beings.

5.3.4 Further Proposals

Damassino (2020) suggests that it would be better to require testsubjects to produce an enquiry in which performance is assessed alongthree dimensions: (a) comparison with human performance; (b) successin completing the enquiry; and (c) efficiency in completing theenquiry (minimisation of the number of questions asked in completingthe enquiry). The motivation given for this proposal is that, becauseThe Turing Test attracts projects whose primary ambition is to fooljudges, it is concerned with whether or how well test subjects performon their allocated tasks. It seems to us that there is nothing herethat impugns The Turing Test. It does not count against TheTuring Test that public competitions based on it with prizesattached lead to gaming, given that everyone knows that those prizesare being awarded to entries that clearly do not pass The Turing Test.If anything is impugned here, it is the public competitions, ratherthan The Turing Test.

Kulikov (2020) suggests that there is value in consideringPreferential Engagement Tests or Meaningful Engagement Tests. Eventhough computers can now beat the best humans at chess, many peopleprefer to play chess with humans rather than with expert chess-playingcomputers. Perhaps, even if computers could pass The Turing Test,people would prefer to carry on conversations with humans rather thanwith expert conversational computers. We think that this kind ofspeculation relies upon assumptions about what could make for expertconversational partners. If our conversational partners need to beable to update information about their surroundings in realtime—for example, while watching a game of football—thenwe will not think that there is a direct path from GPT-3 to expertconversational partners. If only androids can be expert conversationalpartners, then it is less clear that Preferential Engagement Tests orMeaningful Engagement Tests will track anything other thananthropocentric bias.

5.4 Should the Turing Test be Considered Harmful?

Perhaps the best known attack on the suggestion that The Turing Testprovides an appropriate research goal for AI is due to Hayes and Ford(1995). Among the controversial claims that Hayes and Ford make, thereare at least the following:

Turing suggested the imitation game as a definite goal for programof research.
Turing intended The Turing Test to be a gender test rather than aspecies test.
The task of trying to make a machine that is successful in TheTuring Test is so extremely difficult that no one could seriouslyadopt the creation of such a machine as a research goal.
The Turing Test suffers from the basic design flaw that it setsout to confirm a “null hypothesis”, viz. that there is nodifference in behavior between certain machines and humans.
No null effect experiment can provide an adequate criterion forintelligence, since the question can always arise that the judges didnot look hard enough (and did not raise the right kinds of questions).But, if this question is left open, then there is no stable endpointof enquiry.
Null effect experiments cannot measure anything: The Turing Testcan only test for complete success. (“A man who failed to seemfeminine in 10% of what he said would almost always fail the Imitationgame.”)
The Turing Test is really a test of the ability of the humanspecies to discriminate its members from human imposters. (“Thegender test … is a test of making a mechanicaltransvestite.”)
The Turing Test is circular: what it fails to detect cannot be“intelligence” or“humanity”, since many humanswould fail The Turing Test. Indeed, “since one of the playersmust be judged to be a machine, half the human population would failthe species test”.
The perspective of The Turing Test is arrogant and parochial: itmistakenly assumes that we can understand human cognition withoutfirst obtaining a firm grasp of the basic principles ofcognition.
The Turing Test does not admit of weaker, different, or evenstronger forms of intelligence than those deemed human.

Some of these claims seem straightforwardly incorrect. Consider (h),for example. In what sense can it be claimed that 50% of the humanpopulation would fail “the species test”? If “thespecies test” requires the interrogator to decide which of twopeople is a machine, why should it be thought that the verdict of theinterrogator has any consequences for the assessment of theintelligence of the person who is judged to be a machine? (Remember,too, that one of the conditions for “the speciestest”—as it is originally described by Hayes andFord—is that one of the contestantsis a machine. Whilethe machine can “demonstrate” its intelligence by winningthe imitation game, a person cannot “demonstrate” theirlack of intelligence by failing to win.)

It seems wrong to say that The Turing Test is defective because it isa “null effect experiment”. True enough, there is a sensein which The Turing Test does look for a “null result”: ifordinary judges in the specified circumstances fail to identify themachine (at a given level of success), then there is a givenlikelihood that the machine is intelligent. But the point of insistingon “ordinary judges” in the specified circumstances isprecisely to rule out irrelevant ways of identifying the machine (i.e.ways of identifying the machine that are not relevant to the questionwhether it is intelligent). There might be all kinds of irrelevantdifferences between a given kind of machine and a humanbeing—not all of them rendered undetectable by the experimentalset-up that Turing describes—but The Turing Test will remain agood test provided that it is able to ignore these irrelevantdifferences.

It also seems doubtful that it is a serious failing of The Turing Testthat it can only test for “complete success”. On the onehand, if a man has a one in ten chance of producing a claim that isplainly not feminine, then we can compute the chance that he will bediscovered in a game in which he answersNquestions—and, ifN is sufficiently small, then itwon’t turn out that “he would almost always fail towin”. On the other hand, as we noted at the end of Section 4.4above, if one were worried about the “YES/NO” nature of“The Turing Test”, then one could always get the judges toproduce probabilistic verdicts instead. This change preserves thecharacter of The Turing Test, but gives it scope for greaterstatistical sophistication.

While there are (many) other criticisms that can be made of the claimsdefended by Hayes and Ford (1995), it should be acknowledged that theyare right to worry about the suggestion that The Turing Test providesthe defining goal for research in AI. There are various reasons whyone should be loathe to accept the proposition that the one centralambition of AI research is to produce artificial people. However it isworth pointing out that there is no reason to think that Turingsupposed that The Turing Test defined the field of AI research (andthere is not much evidence that any other serious thinkers havethought so either). Turing himself was well aware that there might benon-human forms of intelligence—cf. (j) above. However, all ofthis remains consistent with the suggestion that it is quiteappropriate to suppose that The Turing Test setsone longterm goal for AI research: one thing that we might well aim to doeventually is to produce artificial people. If—as Hayesand Ford claim—that task is almost impossibly difficult, thenthere is no harm in supposing that the goal is merely anambit goal to which few resources should be committed; but wemight still have good reason to allow that it isa goal.

Others who have argued that we need to “move beyond” TheTuring Test include Hernández-Orallo (2000) (2020) andMarcus (2020).

6. The Chinese Room

There are many different objections to The Turing Test which havesurfaced in the literature during the past fifty years, but which wehave not yet discussed. We cannot hope to canvass all of theseobjections here. However, there is one argument—Searle’s“Chinese Room” argument—that is mentioned so oftenin connection with the Turing Test that we feel obliged to end withsome discussion of it.

InMinds, Brains and Programs and elsewhere, John Searleargues against the claim that “appropriately programmedcomputers literally have cognitive states” (64). Clearly enough,Searle is here disagreeing with Turing’s claim that anappropriately programmed computer could think. There is much that iscontroversial about Searle’s argument; we shall just considerone way of understanding what it is that he is arguingfor.

The basic structure of Searle’s argument is very well known. Wecan imagine a “hand simulation” of an intelligentagent—in the case described, a speaker of a Chineselanguage—in circumstances in which we might well be veryreluctant to allow that there is any appropriate intelligence lyingbehind the simulated behavior. (Thus, what we are invited to supposeis a logical possibility is not so very different from what Blockinvites us to suppose is a logical possibility. However, the argumentthat Searle goes on to develop is rather different from the argumentthat Block defends.)Moreover—and this is really thekey point for Searle’s argument—the “handsimulation” in question is, in all relevant respects, simply aspecial kind of digital computation. So, there is a possibleworld—doubtless one quite remote from the actual world—inwhich a digital computer simulates intelligence but in which thedigital computer does not itself possess intelligence. But, if weconsider any digital computer in the actual world, it will not differfrom the computer in that remote possible world in any way which couldmake it the case that the computer in the actual world is moreintelligent than the computer in that remote possible world. Giventhat we agree that the “hand simulating” computer in theChinese Room is not intelligent, we have no option but to concludethat digital computers are simply not the kinds of things thatcan be intelligent.

So far, the argument that we have described arrives at the conclusionthat no appropriately programmed computer can think. While thisconclusion is not one that Turing accepted, it is important to notethat it is compatible with the claim that The Turing Test is a goodtest for intelligence. This is because, for all that has been argued,it may be that it is notnomically possible to provide any“hand simulation” of intelligence (and, in particular,that it is not possible to simulate intelligence using any kind ofcomputer). In order to turn Searle’s argument—at least inthe way in which we have developed it—into an objection to TheTuring Test, we need to have some reason for thinking that it is atleastnomically possible to simulate intelligence usingcomputers. (If it is nomically impossible to simulate intelligenceusing computers, then the alleged fact that digital computers cannotgenuinely possess intelligence casts no doubt at all on the usefulnessof the Turing Test, since digital computers are nomically disqualifiedfrom the range of cases in which there is mere simulation ofintelligence.) In the absence of reason to believe this, the most thatSearle’s argument yields is an objection to Turing’sconfidently held belief that digital computing machines will one daypass The Turing Test. (Here, as elsewhere, we are supposing that, forany kind of creature C, there is a version of The Turing Test in whichC takes the role of the machine in the specific test that Turingdescribes. This general format for testing for the presence ofintelligence would not necessarily be undermined by the success ofSearle’s Chinese Room argument.)

There are various responses that might be made to the argument that wehave attributed to Searle. One kind of response is to dispute theclaim that there is no intelligence present in the case of the ChineseRoom. (Suppose that the “hand simulation” is embedded in arobot that is equipped with appropriate sensors, etc. Suppose,further, that the “hand simulation” involves updating theprocess of “hand simulation,” etc. If enough details ofthis kind are added, then it becomes quite unclear whether we do wantto say that we still haven’t described an intelligent system.)Another kind of response is to dispute the claim that digitalcomputers in the actual world could not be relevantly different fromthe system that operates in the Chinese Room in that remote possibleworld. (If we suppose that the core of the Chinese Room is a kind ofgiant look-up table, then it may well be important to note thatdigital computers in the actual world do not work with look-up tablesin that kind of way.) Doubtless there are other possible lines ofresponse as well. However, it would take us out of our way to try totake this discussion further. (One good place to look for furtherdiscussion of these matters is Braddon-Mitchell and Jackson(1996).)

7. Brief Notes on Intelligence

There are radically different views about the measurement ofintelligence that have not been canvassed in this article. Our concernhas been to discuss Turing (1950) and its legacy. But, of course, amore wide-ranging discussion would also consider, for example,research on the measurement of intelligence using the mathematical andcomputational resources of Algorithmic Information Theory, KolmogorovComplexity Theory, Minimum Message Length (MML) Theory, and so forth.(For an introduction to this literature, see Hernandez-Orallo and Dowe(2010), and the list of references contained therein. For a moregeneral introduction to research into AI, see Marquis et al.(2020).)

More broadly, there are radically different views about ourconcept--or concepts--of intelligence that have not been canvassed inthis article. There is a dispute, for example, about whether Turing isbest interpreted as working with a response-dependent concept ofintelligence. (Pro: Proudfoot (2013) (2020); contra: Wheeler (2020).)Relatedly, there is a dispute about whether intelligence bears somekind of necessary relationship to symmetrical relations of recognitionbetween agents, as suggested in Mallory (2020) There is also abroader dispute about whether we should think that useful notions ofintelligence are always domain specific, or whether we should rathersuppose that there is something important in the idea of general,domain independent intelligence.

And there are radically different views about the most likely paths tobuilding general intelligence (assuming that there is such a thing asgeneral intelligence). For example, Crosby (2020) suggests that thebest way forwards may be to try to make machines that can pass animalcognition tests, i.e. that can create predictive models of theirenvironment from sensory input. (There are clear precusors to thisline of thought in, for example, Brooks (1990).)

Bibliography

Abramson, D., 2008, “Turing’s Responses to TwoObjections,”Minds and Machines, 18: 147–67.
–––, 2011a, “Descartes’ Influence onTuring,”Studies in History and Philosophy of Science,42: 544–551.
–––, 2011b, “Philosophy of Mind is (inPart) Philosophy of Computer Science,”Minds andMachines, 21: 203–219.
Arnold, T. and Scheutz, M., 2016, “Against the Moral TuringTest: Accountable Design and the Moral Reasoning of AutonomousSystems,”Ethics and Information Technology, 18:103–15.
Barone, P., et al., 2020 “A Minimal Turing Test: ReciprocalSensorimotor Contingencies for Interaction Detection,”Frontiers in Human Neuroscience 14.
Block, N., 1981, “Psychologism and Behaviorism,”Philosophical Review, 90: 5–43.
Boolos, G. and Jeffrey, R., 1980,Computability andLogic, Second Edition, Cambridge: Cambridge UniversityPress.
Bowie, L., 1982, “Lucas’s Number is Finally Up,”Journal of Philosophical Logic, 11: 279–85.
Braddon-Mitchell, D. and Jackson, F., 1996,The Philosophy ofMind and Cognition, Oxford: Blackwell.
Bringsjord, S., Bello, P. and Ferrucci, D., 2001,“Creativity, the Turing Test, and the (Better) LovelaceTest,”Minds and Machines, 11: 3–27.
Brooks, R., 1990, “Elephants Don’t PlayChess,”Robotics and Autonomous Signals, 6:3–15.
Chalmers, D., 1995, “On Implementing a Computation,”Minds and Machines, 4: 391–402.
Churchland, P. M. and Churchland, P. S., 1990, “Could aMachine Think?”Scientific American, 262 (1):32–37.
Clark, A., 1997,Being There: Putting Brain, Body and WorldTogether Again, Cambridge: MIT Press.
Cooper, S. and van Leeuwen, J. (eds.) 2013,Alan Turing: HisWork and Impact, London: Elsevier.
Copeland, J. (ed.), 1999, “A Lecture and Two RadioBroadcasts on Machine Intelligence by Alan Turing,” in K.Furukawa, D. Michie, and S. Muggleton (eds.),MachineIntelligence 15, Oxford: Oxford University Press.
–––, 2000, “The Turing Test,”Minds and Machines, 10: 519–39.
Copeland, J. and Sylvan, R., 1999, “Beyond the UniversalTuring Machine,”Australasian Journal of Philosophy,77: (1): 46–66.
Copeland, J., et al. (eds.), 2017,The Turing Guide,Oxford: Oxford University Press.
Crooke, A., 2002,Confabulating Consciousness, Ph.D.Dissertation, Philosophy Department, Monash University.
Crosby, M., 2020, “Building Thinking Machines by SolvingAnimal Cognition Tasks,”Minds and Machines, 30:589–615.
Cullen, J., 2009, “Imitation Versus Communication: Testingfor Human-Like Intelligence,”Minds and Machines, 19:237–54.
Damassino, N., 2020, “The Questioning TuringTest,”Minds and Machines, 30: 563–87.
–––, and Novelli, N., 2020, “Rethinking,Reworking and Revolutionising the Turing Test,”Minds andMachines, 30: 463–8.
Davidson, D., 1990, “Turing’s Test,” in K. Said,(ed.),Modelling the Mind, Oxford: Oxford University Press,1–11.
Dennett, D., 1985, “Can Machines Think?” in M. Shafto(ed.),How We Know, Cambridge, MA: Harper and Row.
Dietrich, E. (ed.), 1994,Thinking Computers and VirtualPersons: Essays on the Intentionality of Machines, San Diego:Academic Press.
Dreyfus, H & Dreyfus, S., 1986,Mind Over Machine,New York: Free Press.
Epstein, R. et al. 2009,Parsing the Turing TestDordrecht: Springer.
Erion, G., 2001, “The Cartesian Test for Automatism,”Minds and Machines, 11: 29–39.
Feferman, S., 1996, “Penrose’s GödelianArgument,”,Psyche, 2: 21–32.
Floridi, L., Taddeo, M., and Turilli, M., 2008,“Turing’s Imitation Game: Still an Impossible Challengefor all Machines and some Judges,”,Minds and Machines,19: 145–50.
Floridi, L. and Chiriatti, N. (2020) “GPT-3: It’s Nature,Scope, Limits and Consequences,”Minds andMachines, 30: 681–94.
French, R., 1990, “Subcognition and the Limits of the TuringTest,”Mind, 99: 53–65.
French, R., 2000, “The Turing Test: The First Fifty YearsTrends in Cognitive Sciences,” 4: 115–21.
Genova, J., 1994, “Turing’s Sexual GuessingGame,”Social Epistemology, 8: 313–26.
Gerdes, A. and Øhstrøm, P., 2015, “Issues inRobot Ethics Seen Through the Lens of a Moral TuringTest,”Journal of Information, Communication and Ethicsin Society, 13: 98–109.
Gunderson, K., 1964, “Descartes, La Mettrie, Language andMachines,”Philosophy, 39: 193–222.
–––, 1985,Mentality and Machines, 2ndedition, Minneapolis: University of Minnesota Press.
Harnad, S., 1989, “Minds, Machines and Searle,”Journal of Theoretical and Experimental ArtificialIntelligence, 1: 5–25.
–––, 1991, “Other Bodies, Other Minds: AMachine Incarnation of an Old Philosophical Problem,”Mindsand Machines, 1: 43–54.
Harnad, S. and Dror, I., 2006, “DistributedCognition, Cognising, Autonomy and the Turing Test,”Pragmatics and Cognition, 14: 209–13.
Haugeland, J., 1981, “Semantic Engines: An Introduction toMind Design,” in J. Haugeland (ed.),Mind Design:Philosophy, Psychology, Artificial Intelligence, Cambridge: MITPress, 1–34.
Hauser, L., 1993, “Reaping the Whirlwind: Reply toHarnad’s Other Bodies, Other Minds,”Minds andMachines, 3: 219–38.
Hauser, L., 2001, “Look Who’s Moving the GoalpostsNow,”Minds and Machines, 11: 41–51.
Hayes, P., and Ford, K., 1995, “Turing Test ConsideredHarmful,”Proceedings of the Fourteenth International JointConference on Artificial Intelligence, Montreal: Morgan Kaufmann,972–977.
Hernández-Orallo, J., 2000, “Beyond the TuringTest,”Journal of Logic, Language and Information, 9:447–66.
–––, 2020, “Twenty Years Beyond the TuringTest: Moving Beyond the Human Judges,”Minds andMachines, 30: 533–62.
Hernández-Orallo, J. and Dowe, D. L., 2010,“Measuring Universal Intelligence: Towards an AnytimeIntelligence Test,”Artificial Intelligence, 174:1508–39.
Hodges, A., 1983,Alan Turing: The Enigma, London:Burnett with Hutchinson.
Hofstadter, D., 1982, “The Turing Test: A Coffee-HouseConversation,” in D. Hofstadter and D. Dennett (eds.),TheMind’s I: Fantasies and Reflections on Self and Soul,London: Penguin, 69–95.
Kobosko, S., et al., 2013 “Passing an Enhanced Turing Test:Interacting with Lifelike Computer Representations of SpecificIndividuals,”Journal of Intelligent Systems, 22:365–415.
Korukonda, A., 2003, “Taking Stock of the Turing Test: AReview, Analysis and Appraisal of Issues Surrounding ThinkingMachines,”International Journal of Human-ComputerStudies 58: 240–57.
Kulikov, V., 2020, “Preferential Engagement: What can weLearn from Online Chess?,”Minds and Machines, 30:617–36.
Leavitt, D., 2007,The Man Who Knew Too Much: Alan Turing andthe Invention of the Computer London: Phoenix.
Levesque, H., 2017,Commonsense, the Turing Test, and theQuest for Real AI, Cambridge, MA: MIT Press.
Lewis, D., 1969, “Lucas against mechanism,”Philosophy, 44: 231–233.
–––, 1979, “Lucas against mechanismII,”Canadian Journal of Philosophy, 9:373–376.
Lucas, J., 1961, “Minds, Machines and Gödel,”Philosophy, 36: 120–4.
Lupowski, P., 2011, “A Formal Approach to Exploring theInterrogator’s Perspective in the Turing Test,”Logical andLogical Philosophy 20: 139–58.
Lupowski, P. and Jurowska, P., 2019, “Minimum IntelligentSignal Test as an Alternative to the Turing Test,”Diametros, 59: 35–47.
Lyre, H. 2020, “The State Space of ArtificialIntelligence”,Minds and Machines, 30: 325–47.
Mallory, F. 2020, “In Defence of a Reciprocal TuringTest,”Minds and Machines, 30: 659–80.
Marcus, G., 2020 “The Next Decade in AI: Four Steps TowardsRobust Artificial Intelligence,” arXiv:2002.06177.
Marquis, P., et al. (eds.), 2020,A Guided Tour of ArtificialIntelligence Research, Cham: Springer.
Masum, H., Christensen, S., and Oppacher, F., 2003, “TheTuring Ratio: A Framework for Open-Ended Task Metrics,”Journal of Evolution and Technology, 13(2),available online.
McDermott, D., 2014, “On the Claim that a Look-Up TableProgram could Pass the Turing Test,”Minds andMachines, 24: 143–88.
Millican, P. and Clark, A., (eds.), 1999,Machines andThought: The Legacy of Alan Turing, two volumes, Oxford:Clarendon.
Moor, J., 1976, “An Analysis of Turing’s Test,”Philosophical Studies, 30: 249–57.
–––, 2001, “The Status and Future of theTuring Test,”Minds and Machines, 11: 77–93.
___, ed., 2003The Turing Test: The Elusive Standard ofArtificial Intelligence Dordrecht: Springer.
Neufeld, E. and Finnestad, S., 2020a, “In Defense of theTuring Test,”AI and Society 35: 819–27.
Neufeld, E. and Finnestad, S., 2020b, “Imitation Game:Threshold or Watershed?,”Minds and Machines, 30:637–57.
Pautz, A. and Stoljar, D. (eds.), 2019,Blockheads! Essays onNed Block’s Philosophy of Mind and Consciousness,Cambridge, MA: MIT Press.
Penrose, R., 1989,The Emperor’s New Mind, Oxford:Oxford University Press.
Piccinini, G., 2000, “Turing’s Rules for the ImitationGame,”Minds and Machines, 10: 573–85.
Proudfoot, D., 2013, “Rethinking Turing’s Test,”Journal of Philosophy, 110: 391–411.
–––, 2020, “Rethinking Turing’s Test andthe Philosophical Implications,”Minds and Machines,30: 487–512.
Proudfoot, D. and Copeland, J. 2008 “Turing’s Test: APhilosophical and Historical Guide,” in R. Epstein et al.,(eds.),Parsing the Turing Test: Philosophical and MethodologicalIssues, Dordrecht: Springer, 119–38.
Rapaport, W., 2000, “How to Pass a Turing Test: SyntacticSemantics, Natural-Language Understanding, and First-PersonCognition,”Journal of Logic, Language and Information,9: 467–90.
Saygin, A., Cicekli, I., and Akman, V., 2000, “Turing Test:50 Years Later,”Minds and Machines, 10:463–518.
Schweizer, P., 1998, “The Truly Total Turing Test,”Minds and Machines, 8: 263–72.
–––, 2012, “The Externalist Foundation ofa Truly Total Turing Test,”Minds and Machines, 22:191–212.
Searle, J., 1981, “Minds, Brains, and Programs,”Behavioral and Brain Sciences, 3: 417–57.
Shah, H. and Warwick, K., 2010, “Hidden InterlocutorMisidentification in Practical Turing Tests,”Minds andMachines, 203: 441–54.
Shieber, S., 1994, “Lessons from a restricted TuringTest,”Communications of the Association for ComputingMachinery, 37: 70–8. [Preprint available online].
–––, (ed.), 2004,The Turing Test: VerbalBehaviour as the Mark of Intelligence, Cambridge: MIT Press.
–––, 2007, “The Turing Test as InteractiveProof,”Noûs, 41: 686–713.
–––, 2014, “There can be noTuring-Test-Passing Memory Machines,”Philosophers’Imprint, 14: 1–13.
Sparrow, R., 2004, “The Turing Triage Test,”Ethics and Information Technology, 6: 203–13.
Srinivasan, B. and Shah, K., 2019, “Towards a UnifiedFramework for Developing Ethical and Practical TuringTests,”AI and Society, 34: 145–52.
Sterrett, S., 2000, “Turing’s Two Tests forIntelligence,”Minds and Machines, 10:541–59.
Sterrett, S., 2020, “The Genius of the ‘OriginalImitation Game’ Test,”Minds and Machines, 30:469–86.
Traiger, S., 2000, “Making the Right Identification in theTuring Test,”Minds and Machines, 10:561–572.
Turing, A., 1950, “Computing Machinery andIntelligence,”Mind, 59 (236): 433–60.
Turing, A. 1992,The Collected Works of A. M. Turing,edited by P. Furbank, London: North-Holland.
Warwick, K. et al., 2013, “Some Implications of a Sample ofPractical Turing Tests,”Minds and Machines, 23:163–77.
Weizenbaum, J., 1966, “ELIZA-A Computer Program for theStudy of Natural Language Communication Between Men andMachines,”Communications of the ACM, 9:36–45.
Wheeler, M., 2020 “Deceptive Appearances: The Turing Test,Response Dependence, and Intelligence as an Emotional Concept,”Minds and Machines 30: 513–32.
Whitby, B., 1996, “The Turing Test: AI’s Biggest BlindAlley?” in P. Millican and A. Clark (eds.),Machines andThought: The Legacy of Alan Turing, Volume 1, Oxford:Clarendon.
Zdenek, S., 2001, “Passing Loebner’s Turing Test: ACase of Conflicting Discourse Functions,”Minds andMachines, 11: 53–76.

Academic Tools

How to cite this entry.
Preview the PDF version of this entry at theFriends of the SEP Society.
Look up topics and thinkers related to this entry at the Internet Philosophy Ontology Project (InPhO).
Enhanced bibliography for this entryatPhilPapers, with links to its database.

Other Internet Resources

Brown, T., et al., 2020,Language Models are Few-Shot Learners, description of GPT-3 at archive.org.
Chalmers, D., 2020, “GPT-3 and General Intelligence,” blog post a dailynous.com.
Philosophers on GPT-3, at dailynous.com.
Alan Turing Home Page (Andrew Hodges, Wadham College, Oxford).
“Computing machinery and intelligence” by Alan Turing (1950).
The Loebner Prize.
Machine Intelligence Part 1: The Turing Test and Loebner Prize (Ashley Dunn).
Why CAPTCHAS Have Gotten So Difficult(Josh Dzieza).
There is no General AI: Why Turing Machines Cannot Pass the Turing Test (Jobst Landgrebe and Barry Smith).

Acknowledgments

We would like to acknowledge the help of the editors of theEncyclopedia, Jose Hernandez-Orallo, and two anonymousreferees. The advice that we we have received has led to numerousimprovements. We look forward to receiving further suggestions forimprovements from those who’ve read what we have written.

Open access to the SEP is made possible by a world-wide funding initiative.
The Encyclopedia Now Needs Your Support
Please Read How You Can Help Keep the Encyclopedia Free

Browse

About

Support SEP

Mirror Sites

View this site from another server:

USA (Main Site)Philosophy, Stanford University

Info about mirror sites

Library of Congress Catalog Data: ISSN 1095-5054

	How to cite this entry.
	Preview the PDF version of this entry at theFriends of the SEP Society.
	Look up topics and thinkers related to this entry at the Internet Philosophy Ontology Project (InPhO).
	Enhanced bibliography for this entryatPhilPapers, with links to its database.

Movatterモバイル変換