Artificial Intelligence (referred to hereafter by its nickname, “AI”) is the subfield of Computer Science devoted to developing programs that enable computers to display behavior that can (broadly) be characterized as intelligent.[1]
Many of the most influential figures in AI’s early days had ambitious goals and views about how to obtain them. John McCarthy’s plan was to use ideas from philosophical logic to formalize commonsense reasoning. During his lifetime he and his students and associates pursued projects with a distinctly philosophical flavor. This theme has persisted, but has mostly been absorbed into work in knowledge representation. It has become more directly concerned with applications; the connections to philosophy and philosophical logic remain, but are more tenuous.
The new insights and theories that have emerged from AI are of great potential value in informing and constraining any area of philosophical inquiry where reasoning is important—reasoning about what to do, for instance, or about our own attitudes or the attitudes of others. Although logic in AI grew out of philosophical logic, in this new setting it has produced new theories and ambitious programs that could only have been nurtured by a community devoted to building working, large-scale computational models of rational agency.
This entry assumes an audience consisting primarily of philosophers who have little or no familiarity with AI. It concentrates on the issues that arise when logic is used in understanding intelligent reasoning in mechanized reasoning systems. And it provides a selective overview, without attempting to achieve anything like complete coverage. Sections 3 and 4 describe two important themes that arose early and have continued to the present: nonmonotonic logic and reasoning about action and change. The remaining sections sketch selected topics, with references to the primary literature.
Theoretical computer science developed out of logic, the theory of computation (if this is to be considered a different subject from logic), and related areas of mathematics.[2] So most computer scientists are well informed about logic even if they aren’t logicians. They are familiar with the idea that logic provides techniques for analyzing the inferential properties of languages, and with the distinction between a high-level logical analysis of a reasoning problem and its implementations. Logic can, for instance, provide a specification for a programming language by mapping programs to the computations that they should license and enabling proofs that these computations conform to certain standards.
Often, however, the connection between logic and computer programs is looser than this. Certainly, a software application can be said to implement a logical formalization when it is provably sound and complete—but also merely when logical ideas informed parts of the software development process. A program that is said to implement a logical model can be incomplete, or even unsound.
Sometimes parts of a working system are inspired by ideas from logic while other parts seem logically problematic. These challenging features may suggest improvements to the logical theory. So logical theory informs applications, and applications challenge logical theory and can lead to theoretical innovations. Logic programming provides many examples of such interactions.
Even limited-objective reasoning systems can call for large, complex bodies of declarative information. It is generally recognized in AI that it is important to treat declarative representations, along with their retrieval and maintenance and the reasoning systems they service, as separate items, each with its own research problems. The evolution of expert systems illustrates the point. The earliest expert systems were based entirely on large systems of procedural rules, with no separate representation of the background knowledge. But later generation expert systems show a greater modularity in their design. A separate knowledge representation component is useful for software engineering purposes—it is much better to have a single representation of a general fact, capable of many different uses and making the system easier to develop and to modify. This modularity is essential in enabling these systems to deliver explanations as well as mere conclusions.[3]
In response to the need to design this declarative component, a subfield of AI known as knowledge representation emerged during the 1980s. Conferences devoted to this topic have taken place since 1989; these provide an in-depth record of research in the field. See Section 12 for a list of the proceedings.
Typical presentations at the KR and Reasoning conferences deal with the following topics.
These topics have little in common with the contents of the Journal of Symbolic Logic, the premier journal of record for mathematical logic. But there is substantial overlap with The Journal of Philosophical Logic, especially in topics such as tense logic, epistemic logic, logical approaches to practical reasoning, and belief change. Of course, there also are differences; very few JPL publications deal with complexity theory or with potential applications to automated reasoning.
Founded in 1936, the JSL sought to bring together mathematicians and philosophers working in logic. Articles in the first volume were divided about equally between professional mathematicians and philosophers, and the early volumes do not show any strong differences between the two groups as to topic.
This situation changed in the 1960s. The 1969 volume of the JSL contained 39 articles by mathematicians, and only nine by philosophers. By the early 1970s, many philosophers felt that philosophical papers on logic were unlikely to be accepted by the JSL, and that those that were accepted were unlikely to be read by philosophers. At this point, the two groups had diverged markedly. Mathematicians pursued the development of an increasingly technical and complex body of methods and theorems, and many philosophers saw this trend as unilluminating and philosophically irrelevant. These divisions led to the founding of the Journal of Philosophical Logic in 1972. The list of sample topics in the first issue included:
The common thread here is a desire to apply the methods of mathematical logic to nonmathematical domains. Quantum logic and the logic of induction, for instance, apply logic to physics and empirical science. Other topics in the JPL list concern developments in logic that might be helpful in addressing nonscientific reasoning.
McCarthy 1959, an early contribution to logical AI, discusses the problem of figuring out how to get to the airport. Here McCarthy proposes a realistic reasoning problem. Its solution may involve many connected inferences, and though ultimately it may look like a proof—a proof that performing certain actions will produce an outcome in which someone is located at an airport—it will differ from a mathematical exercise because it draws on broader and less tractable resources. These include causal knowledge as well as goals and preferences. Contrastingly, research papers in philosophical logic use reasoning examples to illustrate, rather than to motivate, logical theory, and the reasoning examples they cite are simple, isolated inferences.
It would not be far wrong to characterize early work in logical AI as philosophical logic devoted to a new and ambitious application area. And in fact the first generation of AI logicists[4] read the literature in philosophical logic and were influenced by it. Subsequently, however, the specialties have diverged. New logical theories have emerged in logical AI (nonmonotonic logic is the most important example) which had not occurred to philosophers. The AI community’s interest in the theoretical analysis of algorithms and—of course—in useful technology is responsible for other differences. AI researchers are often concerned with ambitious applications using unprecedentedly large bodies of data and inferential rules. Their sheer size produces new problems and new methodologies. And on the other hand, philosophical logicians are philosophers, and as such are often interested in topics (metaphysical topics, for instance) that are of no interest to computer scientists.
If philosophical logic and logic in AI continue to diverge, it will probably be for such methodological reasons. But despite this, the fundamental research goals are the same—logical AI is philosophical logic constrained by an interest in large-scale formalization and in feasible, implementable reasoning.
The early influence of philosophical logic on logic in AI was profound. The bibliography of McCarthy & Hayes 1969, one of the most influential early papers in logical AI, illustrates the point well. There are 58 citations in the bibliography. Of these, 35 refer to the philosophical logic literature. (There are 17 computer science citations, one mathematical logic citation, one economics citation, and one psychology citation.) This paper was written at a time when there were hardly any references to logical AI in the computer science literature. Naturally, as logical AI has matured and developed as a branch of computer science, the proportion of cross-disciplinary citations has decreased. A sampling of articles from the first Knowledge Representation conference, Brachman et al. 1989, held in 1989, shows only 12 philosophical logic citations out of a total of 522 sampled citations. A sampling of articles from Cohn et al. 1998 shows 23 philosophical logic citations out of a total of 468 sampled.[5]
Despite the dramatic decrease in quantity of explicit citations, the later literature in logical AI reflects an indirect acquaintance with philosophical logic, by citing papers in CS venues that were directly influenced by philosophical work. Of course, the influence becomes increasingly tenuous as time passes, and this trend is accelerated by the fact that new theoretical topics have been invented in logical AI that were at best only dimly prefigured in the philosophical literature. In Europe, the lines between the professional divisions among logicians are harder to draw. Some European journals, especially the Journal of Logic, Language, and Information and Studia Logica, are successful in maintaining a focus in logic while attracting authors from all the disciplines in which logic is represented.
In the final analysis, logic deals with reasoning—and relatively little of the reasoning we do is mathematical, while almost all of the mathematical reasoning done by nonmathematicians is mere calculation. To have scope as well as rigor, logic needs to maintain itself as a single discipline, uniting its mathematical and philosophical sides. But the needs of Computer Science have added strong motives for this unification, providing a novel methodology and relations to new, rewarding applications.
John McCarthy was, and remains, the most influential figure in logical AI. McCarthy was one of the founders of AI, and consistently advocated logical formalization as the path to human-level AI. All but the most recent work in McCarthy’s research program can be found in Lifschitz 1990a, which also contains Lifschitz 1990b, an introduction to McCarthy’s work. For additional historical background, see Israel 1991.
McCarthy’s views were first articulated in McCarthy 1959 and elaborated and amended in McCarthy & Hayes 1969. He felt that even if AI implementations do not straightforwardly use logical reasoning techniques like theorem proving, a logical formalization will help to understand the reasoning problem itself. The claim is that without a logical account of the reasoning domain, it will not be possible to implement the reasoning itself. This is in fact controversial. Many AI researchers see no need for logical formalization in their work. For instance, the products of machine learning will typically bear no discernible relation to logic, and depend on a combination of training corpora and cumulative learning experience. There will be no obvious way to characterize or understand them at a declarative, conceptual level, and their relation to logic will be problematic.
The recommendations of McCarthy & Hayes 1969 overlap to a large extent with those of analytic philosophy, but are directed at a different goal: programmable general intelligence rather than conceptual analysis. Similar goals have occurred to a few philosophers; see, for instance, Carnap 1956 (pp. 244–247) and Pollock 1995.
Assuming that most readers of this article will be interested in the relation between logical AI and philosophical logic, the remainder of this article will ignore relations both to philosophy in general and to the feasibility of developing human-level intelligent systems.
McCarthy’s long-term objective was to formalize commonsense reasoning, the prescientific reasoning that engages human thinking about everyday problems. We have mentioned a planning problem: how to get to the airport. Other examples include:
McCarthy’s goal will probably seem outrageous to most philosophers, who are trained to think of common sense as elusive and incoherent. But philosophers invoke common sense in relation to philosophical disputes, where its employment is problematic. McCarthy was thinking of everyday, practical common sense. We couldn’t manage to navigate simple daily tasks if common sense were not reliable in these settings. Formalizing the reasoning that supports these tasks may turn out to be impracticable, but the project itself is neither misguided nor quixotic.
Whether or not formalization is the secret to human-level AI, it has been successful on a smaller scale—not only in unembodied settings[6] but in online robot planning and execution. It is used in some approaches to complete, autonomous agents.[7] It plays an important role in multiagent systems, where communicating and reasoning about knowledge are critical.[8] And it has illuminated qualitative reasoning about the behavior of physical devices.[9]
While a mathematical proof must cover every contingency, practical reasoning routinely closes its eyes to some possibilities. Consider a plan to get to the airport. It could be impeded by an earthquake, a meteor strike, or a highway accident. But it’s perfectly reasonable to ignore the first two factors, and often even the third can safely be ignored. Acceptance of a plan, unlike acceptance of a proof, is risky. In fact, risk and the possibility of unpleasant surprises are features of sound commonsense reasoning. This means that the reasoning is nonmonotonic.
Classical logics were designed with mathematics in mind and their consequence relations are monotonic. That is, if a set \(T\) of formulas implies a consequence \(B\) then a larger set \(T \cup \{A\}\) will also imply \(B\). A logic is nonmonotonic if its consequence relation lacks this property. Preferred models provide a general way to induce a nonmonotonic consequence relation. Invoke a function that for each \(T\) produces a subset \(\mathcal{M}_T\) of the models of \(T\); in general, we will expect \(\mathcal{M}_T\) to be a proper subset of these models. We then say that \(T\) implies \(B\) if \(B\) is satisfied by every model in \(\mathcal{M}_T\). As long as we do not suppose that \(\mathcal{M}_{T} \subseteq \mathcal{M}_{S}\) if \(S \subseteq T\), the implication relation will be nonmonotonic.
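The preferred-models recipe can be made concrete in a tiny propositional setting. The sketch below is only illustrative: the atoms, the choice of "penguin" as the abnormality to minimize (in the spirit of circumscription), and all the Python names are our own assumptions, not anything from the literature. It induces a nonmonotonic consequence relation by quantifying over minimal models only, and then shows that adding information can defeat an earlier conclusion.

```python
from itertools import chain, combinations

ATOMS = ["bird", "penguin", "flies"]
AB = {"penguin"}  # abnormality atoms to be minimized

def models(theory):
    """All truth assignments (frozensets of true atoms) satisfying every formula."""
    all_vals = (frozenset(s) for s in chain.from_iterable(
        combinations(ATOMS, r) for r in range(len(ATOMS) + 1)))
    return [m for m in all_vals if all(f(m) for f in theory)]

def preferred(theory):
    """M_T: models of T whose abnormal part is minimal under set inclusion."""
    ms = models(theory)
    return [m for m in ms if not any((m2 & AB) < (m & AB) for m2 in ms)]

def nm_entails(theory, formula):
    """T nonmonotonically implies B iff B holds in every preferred model of T."""
    return all(formula(m) for m in preferred(theory))

# T1: there is a bird, and non-penguin birds fly.
T1 = [lambda m: "bird" in m,
      lambda m: ("flies" in m) if ("bird" in m and "penguin" not in m) else True]
print(nm_entails(T1, lambda m: "flies" in m))   # True: minimal models omit "penguin"

# Enlarging the theory defeats the conclusion: the relation is nonmonotonic.
T2 = T1 + [lambda m: "penguin" in m,
           lambda m: not ("penguin" in m and "flies" in m)]
print(nm_entails(T2, lambda m: "flies" in m))   # False
```

Note that \(\mathcal{M}_{T2}\) is not a subset of \(\mathcal{M}_{T1}\), which is exactly the loophole through which nonmonotonicity enters.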
Improbability is not the only reason for disregarding a contingency. Other reasons include (1) a feeling for what is normal and usual; (2) epistemic excusability—immunity from any blame that may attach to ignoring a possibility; (3) the estimated costs of further deliberation; and (4) inattention and mere cognitive laziness. Some of these may be more “rational” than others, but in fact it isn’t easy to locate a boundary between rational and irrational factors. And probably no one has succeeded in clarifying and disentangling these motivating considerations.
In the early stages of its development, many researchers hoped that nonmonotonic logic would provide a general approach to efficient reasoning about uncertainty. But by the end of the 1980s, fully quantitative probabilistic reasoning had become not only implementable but clearly preferable in many sorts of applications to methods involving nonmonotonic logic. Nonmonotonicity is no magic path to efficient reasoning. It can be useful in reasoning about uncertainty. But so can probabilities.
Three influential papers on nonmonotonic logic appeared in 1980: McDermott & Doyle 1980, Reiter 1980, and McCarthy 1980. In each case, the formalisms presented in these papers emerged from a gestation period of several years or more. To set out the historical influences accurately, it would have been necessary to interview the authors, and this has not been done. However, there seem to have been two motivating factors: strategic considerations having to do with the long-range goals of AI, and much more specific, tactical considerations arising from the analysis of the reasoning systems that were being deployed in the 1970s.
Section 3.1 explained why it was generally felt that monotonicity renders classical logics unsuitable as a vehicle for formalizing commonsense reasoning. Minsky 1974, which was widely read at the time of its publication, helped to crystallize this attitude. Minsky’s paper presents an assortment of challenges for AI, focusing at the outset on the problem of natural language understanding.[10] He advocates “frame-based” knowledge representation techniques[11] and (conceiving these as an alternative to logic) he throws out a number of loosely connected challenges for the logical approach, including the following problems: building large-scale representations, reasoning efficiently, representing control knowledge, and providing for the flexible revision of defeasible beliefs. In retrospect, most AI researchers would likely agree that these problems are quite general challenges to any research program (including the one Minsky himself advocated at the time) and would add that logical techniques are an important element in addressing some, perhaps all, of the issues. For instance, a well-structured logical design can be a great help in scaling up any computationally useful body of knowledge.
Perhaps unintentionally, Minsky’s paper incentivized nonmonotonic logicians by identifying monotonicity as a source of the alleged shortcomings of logic. Although Minsky apparently meant to discredit logical methods in AI, McDermott & Doyle 1980 and McCarthy 1980 interpret his criticisms as a challenge to be met by developing logics that lack the monotonicity property.
The development of nonmonotonic logic also owes much to the needs of AI applications. In fact, this influence was at least as persuasive as McCarthy’s strategic considerations, and in many ways more influential on the shape of the formalisms that emerged. Here, we mention three applications that appear to have been important for some of the early nonmonotonic logicians: belief revision, closed-world reasoning, and planning.
Doyle 1979 presents a “truth maintenance system.” Doyle’s truth maintenance algorithm answered a general need, providing a mechanism for updating the “beliefs” of a knowledge repository. The idea is to keep track of the support of beliefs, and to use the record of these support dependencies when it is necessary to revise beliefs.
In a TMS, part of the support for a belief can consist in the absence of some other belief. This introduces nonmonotonicity. For instance, it provides for defaults: beliefs that are induced by the absence of contrary beliefs.
The TMS algorithm and its refinements had a significant impact on AI applications, and this created the need for a logical analysis. (In even fairly simple cases, it can be hard in the absence of analytic tools to see what consequences a TMS should deliver.) This presented a natural and highly specific challenge for those seeking to develop a nonmonotonic logic. The TMS also provided the idea that nonmonotonicity has to do with inferences based on unprovability; this insight was important for modal approaches to nonmonotonic logic and for default logic. And the TMS’s emphasis on interactions between arguments initiated a theme in nonmonotonic logic that remains important to this day. Abstract argumentation is a framework for default reasoning with connections to logic programming that continues to receive much attention. See, for instance, Besnard & Hunter 2008 and Rahwan & Simari 2009.
The study of databases in computer science has a logical side; see Minker 1997 for a survey. This area has interacted with logical AI. The deductive database paradigm was taking shape at about the same time that many AI researchers were thinking through the problems of nonmonotonic logic, and provided several specific examples of nonmonotonic reasoning that called for analysis. Of these, perhaps the most important is the closed-world assumption. According to this assumption—at least as far as simple claims (i.e. positive or negative literals) are concerned—the system assumes that it knows all that there is to be known. It is the closed-world assumption that justifies a negative answer to a query “Is there a direct flight from Detroit to Bologna?” when the system finds no such flight in its data. This is another case of inference from the absence of a proof. A negative is proved, in effect, by the failure of a systematic attempt to prove the positive. This idea, which was investigated in papers such as Reiter 1978 and Clark 1978, provided a well-defined challenge for nonmonotonic logicians, as well as suggestions about how to address the challenge.
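The closed-world assumption is easy to illustrate computationally. In the toy sketch below (the flight table and all function names are invented for illustration, not drawn from any actual system), a negative answer is licensed purely by the failure of the attempt to prove the positive literal; adding one fact then withdraws the earlier negative answer, the mark of nonmonotonicity.

```python
# Hypothetical flight data; under the closed-world assumption, any flight
# not recorded in the table is taken not to exist.
FLIGHTS = {("Detroit", "Amsterdam"), ("Amsterdam", "Bologna")}

def provable(a, b):
    """What the positive data proves: the pair occurs in the table."""
    return (a, b) in FLIGHTS

def cwa_query(a, b):
    """Negation as failure: a 'no' answer is licensed by the mere
    failure of a systematic attempt to prove the positive."""
    return "yes" if provable(a, b) else "no"

print(cwa_query("Detroit", "Amsterdam"))   # yes: proved from the data
print(cwa_query("Detroit", "Bologna"))     # no: inferred from absence of a proof

# The inference is nonmonotonic: adding a fact retracts the negative answer.
FLIGHTS.add(("Detroit", "Bologna"))
print(cwa_query("Detroit", "Bologna"))     # yes
```

Logic programming systems implement this same idea, negation as failure, at the level of their proof procedure rather than by table lookup.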
Rational planning is impossible without the ability to reason about the outcomes of a series of contemplated actions. Predictive reasoning of this sort is local; in a complex world with many dynamic variables, we assume that most of these will be unchanged by the performance of an action. The problem of how to formalize such “causal inertia” is known as the frame problem.
It is very natural to suppose that inertia holds by default—that variables are unchanged by the performance of an action unless there is a special reason to think that they will change. This suggests that nonmonotonic temporal formalisms should apply usefully to reasoning about action and change, and in particular might address the frame problem. Sandewall 1972 is an early attempt along these lines. Later work in this direction provides an especially important and instructive case study of the use of logic in AI; see Section 4.4 for further discussion.
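Default inertia is often approximated in planning systems by a simple progression operator: only the fluents an action explicitly affects are changed, and every other fluent persists. The sketch below is a toy illustration under our own assumptions (the action descriptions and fluent names are invented), and it deliberately sidesteps the logical subtleties that make the frame problem hard in a genuinely declarative setting.

```python
# Toy action descriptions: (fluents made true, fluents made false).
EFFECTS = {
    "load":  ({"loaded"}, set()),
    "shoot": ({"dead"}, {"loaded", "alive"}),
}

def result(state, action):
    """Progress a state (a set of fluents) through an action.
    Default inertia: fluents not mentioned in the action's effects persist."""
    adds, dels = EFFECTS[action]
    return (state - dels) | adds

s = {"alive"}
s = result(s, "load")
print(sorted(s))    # ['alive', 'loaded'] -- 'alive' persisted by default
s = result(s, "shoot")
print(sorted(s))    # ['dead']
```

Encoding this persistence behavior declaratively, in a nonmonotonic temporal logic rather than procedurally as above, is exactly where difficulties such as the Yale Shooting Anomaly arise.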
Section 3.2 mentioned three influential approaches to nonmonotonic logic: circumscription (McCarthy), modal approaches (Doyle & McDermott), and default logic (Reiter).
In McCarthy 1993a, McCarthy urged us, when considering the early history of circumscription, to take into account a group of three papers: McCarthy 1986, 1980, and 1987. The first paper connects the strategic ideas of McCarthy & Hayes 1969 with the need for a nonmonotonic logic, and sketches the logical ideas of domain circumscription, the simplest case of circumscription. The second paper provides more thorough logical foundations, and introduces the more general and powerful predicate circumscription approach. The third paper discusses challenging commonsense examples and techniques for formalizing them.
All forms of circumscription involve restricting attention to models in which certain sets are minimized; for this reason, circumscription can be grouped with the preferred models treatments of nonmonotonicity. McCarthy’s approach is conservative: it uses classical second-order logic. Therefore the circumscription literature can avoid logical foundations and concentrate on formalizations. The other varieties of nonmonotonic logic, including default logic and the modal nonmonotonic logics, raise issues that will seem familiar to philosophical logicians. These have to do with the design of new logics, the systematic investigation of questions concerning validity, and managing the proliferation of alternative logics.
It is natural to think of nonmonotonic inferences as being hedged. That is, a nonmonotonic inference may require not merely the presence of proved conclusions, but the absence of other conclusions. The general form of such a default rule is:

\[\mathbf{DR}\qquad \text{In the presence of } A_1,\ldots,A_n \text{ and in the absence of } B_1,\ldots,B_m \text{, conclude } C.\]
An important special case of \(\mathbf{DR}\) is a normal default, a simple rule to the effect that \(C\) holds by default, conditionally on assumptions \(A_1,\ldots,A_n\). This can be formalized by taking the negation of the conclusion itself to be what must be absent.
A default theory consists of two components: a set of formulas taken as axioms, and a set of default rules.
At first sight, it is perplexing how to characterize proofs in default logic, because the default account of provability is circular: proofs are defined in terms of chains of correct inferences, but correct inference is defined in terms of (non)provability. Therefore provability can’t be characterized inductively, as in the monotonic case. The early theory of Sandewall 1972 didn’t address this difficulty successfully. McDermott & Doyle 1980 and Reiter 1980 propose solutions to this problem. In both cases, the logical task is (1) to develop a formalism in which rules like \(\mathbf{DR}\) can be expressed, and (2) to define the relation between a combination \(DT\) of nonmonotonic axioms and rules and the theories \(E\) which could count as reasonable consequences of \(DT\). In the terminology that later became standard, we need to define the relation between a default theory \(DT\) and its extensions.
This is a radical departure from classical logic, which associates a single collection of consequences with an axiomatic basis. A default theory can determine many alternative consequence sets, with the logic itself providing no way to choose between them.
In retrospect, we can identify two approaches to nonmonotonic logic: those based on preference and those based on conflict. Theories of the first sort (like circumscription) involve a relatively straightforward modification of the ordinary model-theoretic definition of logical consequence, appealing to a preference relation over models. Theories of the second sort (like default logic) require a more radical reworking of logical ideas. The possibility of multiple extensions—different possible coherent, inferentially complete conclusion sets that can be drawn from a single set of premises—means that we have to think of logical consequence not as a function taking a set of axioms into its logical closure, but as a relation between a set of axioms and alternative logical closures. Since logical consequence is so fundamental, this represents a major theoretical departure. With multiple extensions, we can still retrieve a consequence relation between a theory and a formula in various ways, the simplest being to say that \(DT\) nonmonotonically implies \(C\) if \(C\) is a member of every extension of \(DT\). Still, the conflict-based account of consequence provides a much richer underlying structure than the preferential one.
Reiter approaches the formalization problem conservatively. The language of default logic is the same as the language of first-order logic and its formulas cannot express defaults. But a theory may involve a set of default rules—rules of the form \(\mathbf{DR}\). A default theory, then, is a pair \(\text{DT}=\langle W,D\rangle\) consisting of a set \(W\) of (monotonic) axioms and a set \(D\) of default rules. Reiter 1980 provides a fixpoint definition of the extensions of such a theory, and develops the theoretical groundwork for the approach, proving a number of the basic theorems.
Of these theorems, we mention one in particular, which will be used in Section 4.5, in connection with the Yale Shooting Anomaly. The idea is to take a conjectured extension (which will be a set \(T^*\)) and to use this set for consistency checks in a proof-like process that successively applies default rules in \(\langle W,D\rangle\) to stages that begin with \(W\).
We define a default proof process \(T_0,T_1,\ldots\) for \(\langle W,D\rangle\), relative to \(T^*\), as follows: \(T_0 = W\); and \(T_{i+1} = T_i \cup \{C\}\) if some default rule in \(D\) has all its premises in \(T_i\), has none of its required absences in \(T^*\), and has a conclusion \(C\) with \(C \not\in T_i\); otherwise, \(T_{i+1} = T_i\).
In other words, as long as we can nonvacuously close the stage we are working on under an applicable default, we do so; otherwise, we do nothing. A theorem of Reiter’s says that, under these circumstances:
\(T\) is an extension of \(\langle W, D\rangle\) if and only if there is a proof process \(T_0,T_1,\ldots\) for \(\langle W, D\rangle\), relative to \(T\), such that \(T = \bigcup\{T_i : 0\le i\}\).
Thus, we can show that \(T\) is an extension by (1) constructing a default reasoning process \(\{T_i\}\) from \(\langle W, D\rangle\) that uses \(T\) for consistency checks, (2) taking the limit \(T'\) of this process, and (3) verifying that in fact \(T' = T\).
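This characterization lends itself to a direct computational check. The sketch below is a deliberate simplification of Reiter's theorem under our own assumptions: formulas are restricted to propositional literals with no deductive closure, and the theory consists of normal defaults written as prerequisite/justification/conclusion triples. It verifies a conjectured extension by running the proof process relative to it, here on the familiar Nixon-diamond example with its two competing extensions.

```python
def neg(lit):
    """Negate a literal written as 'p' or '~p'."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def is_extension(W, D, T_star):
    """Run the proof process T0 = W, T_{i+1} = T_i plus conclusions of
    defaults whose prerequisites lie in T_i and whose justifications are
    consistent with the *conjectured* extension T_star; T_star is an
    extension iff the limit of the process equals T_star.
    (Literals only; full default logic would also close under consequence.)"""
    T = set(W)
    while True:
        new = {c for (pres, just, c) in D
               if all(p in T for p in pres) and neg(just) not in T_star}
        if new <= T:
            return T == set(T_star)
        T |= new

# Nixon diamond: Quakers are pacifists by default; Republicans are not.
W = {"quaker", "republican"}
D = [(["quaker"], "pacifist", "pacifist"),
     (["republican"], "~pacifist", "~pacifist")]

print(is_extension(W, D, W | {"pacifist"}))               # True
print(is_extension(W, D, W | {"~pacifist"}))              # True: a second extension
print(is_extension(W, D, W | {"pacifist", "~pacifist"}))  # False
```

The two successful checks exhibit the multiple-extension phenomenon discussed above: each default blocks the other, and the logic itself does not choose between the results.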
The modal approach invokes a modal operator \(L\), informally interpreted as ‘provable’.[12] The essence of McDermott and Doyle’s approach, like Reiter’s, is a fixpoint definition of the extensions of a nonmonotonic logic. Incorporating nonmonotonicity in the object language creates some additional complexities, which in the early modal approach show up mainly in a proliferation of the logics and difficulties in evaluating the merits of the alternatives. As better foundations for the modal account emerged, it became possible to prove, as was expected, the equivalence of the modal and default logic approaches.[13]
Unlike other early presentations of nonmonotonic logic, Reiter’s shows specific influence from earlier and independent work on nonmonotonicity in logic programming—the work seems to have been largely inspired by the need to provide logical foundations for the nonmonotonic reasoning found in deductive databases. The subsequent history of nonmonotonic logic is intimately connected with the literature on logic programming semantics.
Doyle and McDermott’s paper cites the earlier literature in logicist AI, presenting nonmonotonic logic as part of a program of formalizing commonsense rationality. But this work is also clearly influenced by the need to provide a formal account of truth maintenance.
Nonmonotonic logic is a complex, robust research field. Providing a survey of the subject is made difficult by the fact that there are many different foundational paradigms for formalizing nonmonotonic reasoning, and the relations between these paradigms are not simple. An adequate account of even a significant part of the field requires something like a book-length treatment. A number of books and handbook articles are available, including Łukaszewicz 1990, Brewka 1991, Besnard 1992, Marek & Truszczynski 1994, Gabbay et al. 1994, Antoniou 1997, Brewka et al. 1997, Schlechta 1997, Makinson 2005, Antoniou & Wang 2007, Bochman 2007, Horty 2012, Straßer 2014, and Straßer & Antonelli 2019. The collection Ginsberg 1987 is a useful source for readers interested in the early history of the subject, and has an excellent introduction.
Section 3.1 explained how preferred models can be used to characterize a nonmonotonic consequence relation. This approach to the model theory of nonmonotonicity was clarified in Shoham 1988, five years after the work discussed in Section 3.2. Shoham’s work provides a more general and abstract approach.
Preferential semantics relies on a function \(S\) taking a set \(K\) of models into a subset \(S(K)\) of \(K\). The crucial definition of preferential entailment stipulates that \(A\) is a (nonmonotonic) consequence of \(T\) if every model \(M\) in \(S(\text{Models}(T))\) satisfies \(A\). Shoham characterizes \(S(K)\) in terms of a partial order \(\preccurlyeq\) over models: \(S(K)\) is the set of models in \(K\) that are \(\preccurlyeq\)-minimal in \(K\). To ensure that no set can preferentially entail a contradiction unless it classically entails a contradiction, infinite descending \(\preccurlyeq\) chains must be disallowed.
This treatment of nonmonotonicity is similar to the earlier modal semantic theories of conditionals—the similarities are particularly evident using presentations of conditional semantics such as Chellas 1975 that associate a set of worlds with the antecedent. Of course, the consequence relation of the classical conditional logics is monotonic, and conditional semantics uses possible worlds, not models. But the left-nonmonotonicity of conditionals (the fact that \(A\,\Box{\rightarrow}\,C\) does not imply \([A\wedge B]\,\Box{\rightarrow}\,C\)) creates issues that parallel those that arise from a nonmonotonic consequence relation. Interrelations between conditionals and nonmonotonic logic became an important theme in later work in nonmonotonic logic. See, for instance, Gärdenfors & Makinson 1994, Boutilier 1992, Pearl 1994, Gabbay 1995, Delgrande 1998, Arlo-Costa & Shapiro 1992, Alchourrón 1995, Asher 1995, Kern-Isberner 2001, Giordano & Schwind 2004, Lent & Thomason 2015, and Casini & Straccia 2022.
Preference semantics offers an opportunity for formulating and proving representation theorems relating conditions over preference relations to properties of the abstract consequence relation. This line of investigation began with Lehmann & Magidor 1992.
Neither Doyle nor McDermott pursued the modal approach much beyond the initial stages. With a helpful suggestion from Robert Stalnaker (see Stalnaker 1993), however, Robert C. Moore produced a modal theory that improves in many ways on the earlier ideas. Moore gives the modal operator of his system an epistemic interpretation, based on the conception of a default rule as one that licenses a conclusion for a reasoning agent unless something that the agent knows blocks the conclusion. In Moore’s autoepistemic logic, an extension \(E\) of a theory \(T\) is a superset of \(T\) that is stable, i.e., that is deductively closed, and that satisfies the following two rules:
\[\text{(1) if } A\in E \text{, then } \Box A\in E;\]
\[\text{(2) if } A\notin E \text{, then } \lnot\Box A\in E.\]
It is also usual to impose a groundedness condition on autoepistemic extensions of \(T\), ensuring that every member of an extension has some reason tracing back to \(T\). Various such conditions have been considered; the simplest one restricts extensions to those satisfying
Autoepistemic logic remains a popular approach to nonmonotonic logic, in part because of its usefulness in providing theoretical foundations for logic programming. See Marek & Truszczynski 1991, Marek & Truszczynski 1989, Konolige 1994, Antoniou 1997, Moore 1993, and Denecker et al. 2003.
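A toy computation may help fix ideas. The following sketch (our own encoding, not Moore’s formalism) searches for stable extensions of the one-rule theory “if \(p\) is not believed, then \(q\)”: it guesses which modal atoms hold, reduces the theory to objective facts under the guess, and keeps the guesses where belief coincides with derivability.

```python
from itertools import product

# Brute-force search for the stable extensions of a tiny autoepistemic
# theory (encoding ours): T = { "if p is not believed, then q" }.
ATOMS = ["p", "q"]

def objective_consequences(believed):
    """Objective facts T yields once each modal atom is fixed by the guess."""
    facts = set()
    if not believed["p"]:   # under this guess ~[]p holds, so the rule fires
        facts.add("q")
    return facts

expansions = []
for bits in product([False, True], repeat=len(ATOMS)):
    believed = dict(zip(ATOMS, bits))
    facts = objective_consequences(believed)
    # Stability: []a belongs to the extension iff the objective part proves a.
    if all(believed[a] == (a in facts) for a in ATOMS):
        expansions.append(facts)

print(expansions)   # [{'q'}]: the unique extension believes q but not p
```

The guess “believe neither” fails stability (it proves \(q\) without believing it), and any guess that believes \(p\) fails groundedness in miniature: nothing proves \(p\).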
Epistemic logic has inspired other approaches to nonmonotonic logic. Like other modal theories of nonmonotonicity, these use modality to reflect consistency in the object language, and so allow default rules along the lines of \(\mathbf{DR}\) to be expressed. But instead of consistency, these use ignorance. See Halpern & Moses 1985 and Levesque 1987 for variations on this idea. These theories are explained and compared to other nonmonotonic logics in Meyer & van der Hoek 1995. In more recent work, Levesque’s ideas are systematically presented and applied to the theory of knowledge bases in Levesque & Lakemeyer 2000.
The contours of modern temporal logic were standardized by Arthur Prior during the 1950s and 1960s: see Prior 1956, 1967, 1968.[14] As it was developed in philosophical logic, tense logic proved to be a species of modal logic. Thus, it relativizes the truth-values of formulas to world-states or temporal stages of the world; these are the tense-theoretic analogues of the timeless possible worlds of ordinary modal logic. A research program can then be borrowed from modal logic—for instance, working out the relations between axiomatic systems and the corresponding model theoretic constraints on temporal orderings. See, for instance, Burgess 1984 and van Benthem 1983.
Priorian tense logic shares with modal logic an interest in using the first-order theory of relations to explain the logical phenomena, an expectation that the important temporal operators will be quantifiers over world-states, and a rather tenuous connection to realistic, practical specimens of temporal reasoning. Of course, these temporal logics do yield validities, such as
\[A\rightarrow \textit{PF}A\](if \(A\), then it was the case that \(A\) was going to be the case), which certainly are intuitively valid. But at most, these can only play a broadly foundational role in accounting for commonsense reasoning about time. It is hard to think of realistic examples of reasoning in which they play a leading part.
Planning problems provide one of the most fruitful showcases for combining logical analysis with AI applications. On the one hand, automated planning enjoys many applications of real practical value; on the other, logical formalizations of planning are genuinely helpful in understanding planning problems and in designing algorithms.
The classical representation of an AI planning problem, as described in Amarel 1968, evidently originates in early work of Herbert Simon’s, published in a 1966 CMU technical report, Simon 1966. In such a problem, an agent in an initial world-state is equipped with a set of actions, which are thought of as partial functions transforming world-states into world-states. Actions are feasible only in world-states that meet appropriate constraints. (These constraints are now called the “preconditions” of the action.) A planning problem then becomes a search for a series of feasible actions that successively transform the initial world-state into a desired world-state.
The Situation Calculus, developed by John McCarthy, is the origin of most of the later work on formalizing reasoning about action and change. It was first described in the 1960s; the earliest generally accessible publication on the topic is McCarthy & Hayes 1969.
Apparently, Priorian tense logic had no influence on Amarel. But there is no important difference between Amarel’s world-states and those of Priorian tense logic. The “situations” of the Situation Calculus are these same world-states, under a new name.[15] They resemble possible worlds in modal logic in providing abstract locations that support a consistent and complete collection of truths. As in tense logic, these locations are ordered, and change is represented by variations in truth conditions from one location to another. The differences, of course, are inspired by the intended use of the Situation Calculus: it is meant to formalize Simon’s representation of the planning problem, in which a single agent reasons about scenarios in which sequential actions are performed.[16] Change in the situation calculus is dynamic, driven by the performance of actions. Therefore the fundamental model theoretic component is
\[\sc{Result}(\ra,\rs,\rs'),\]the relation between an action \(\ra\), an input situation \(\rs\) in which \(\ra\) is performed, and an output situation \(\rs'\) immediately subsequent to the performance of the action. Usually (though this is not absolutely necessary) the deterministic assumption is made that \(\rs'\) is unique.
All this, of course, presupposes a discrete picture of time. As in other action-driven frameworks, such as game theory and the theory of digital computation, such a picture appears to be indispensable.
In general, actions can be successfully performed only under certain limited circumstances. This could be modeled by allowing for cases in which there is no \(\rs'\) such that \(\sc{Result}(\ra,\rs,\rs')\). Often, however, it is assumed that \(\sc{Result}\) is in fact a total function, but that in cases in which \(\rs\) does not meet the “preconditions” of \(\ra\), there are no restrictions on the \(\rs'\) satisfying \(\sc{Result}(\ra,\rs,\rs')\). This means that the causal effects of \(\ra\) will be entirely unconstrained in such cases, and in the presence of inertial laws “performing” \(\ra\) will leave things unchanged.
A planning problem starts with a limited repertoire of actions (where preconditions and effects are associated with each action), an initial situation, and a goal (which can be treated as a formula). A planning problem is a matter of finding a sequence of actions that will achieve the goal, given the initial situation. That is, given a goal \(G\) and initial situation \(\rs_0\), the problem will consist of finding a sequence \(\ra_1, \ldots,\ra_n\) of actions that will transform \(\rs_0\) into a final situation \(\rs_n\) that satisfies \(G\). The planning problem is in effect a search for such a sequence of actions. The success conditions for the search can be characterized in a formalism like the Situation Calculus, which allows information about the results of actions to be expressed.
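The search just described can be sketched directly. In the code below the domain, the action names, and the STRIPS-style encoding of actions (preconditions, added atoms, deleted atoms) are our own illustrative choices:

```python
from collections import deque

# World-states are frozensets of atoms. An action is feasible only when
# its preconditions hold, and then acts as a partial function on states.
# Actions encoded as (preconditions, add list, delete list).
ACTIONS = {
    "go_to_store": ({"at_home"}, {"at_store"}, {"at_home"}),
    "buy_milk":    ({"at_store"}, {"have_milk"}, set()),
}

def plan(initial, goal):
    """Breadth-first search for a feasible action sequence reaching goal."""
    start = frozenset(initial)
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:                       # goal satisfied here
            return steps
        for name, (pre, add, delete) in ACTIONS.items():
            if pre <= state:                    # preconditions met
                nxt = frozenset((state - delete) | add)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, steps + [name]))
    return None                                 # no plan exists

print(plan({"at_home"}, {"have_milk"}))  # ['go_to_store', 'buy_milk']
```

Breadth-first search returns a shortest plan; realistic planners replace it with heuristic search, but the success condition is exactly the one stated above.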
Nothing has been said up to now about the actual language of the Situation Calculus. The crucial thing is how change is to be expressed. With tense logic in mind, it would be natural to invoke a modality like \([a]A\), with the truth condition
\[\vDash_{\rs}[\ra]A\text{ iff }\vDash_{\rs'}A, \text{ where }\sc{Result}(\ra,\rs,\rs').\]This formalization, in the style of dynamic logic, is in fact an attractive alternative to McCarthy’s.
But McCarthy & Hayes 1969 deploys a language that is much closer to first-order logic. (This formalization style is characteristic of McCarthy’s work; see McCarthy 1979.) Actions are treated as individuals. And propositions whose truth values can change over time (propositional fluents) are also treated as individuals. Where \(\rs\) denotes a situation and \(f\) a fluent, \(\Holds(f,\rs)\) says that \(f\) is true in \(\rs\).
Since the pioneering work of the nineteenth and early twentieth century logicians, the process of formalizing mathematical domains has become routine. Although (as with set theory) there may be controversies about what axioms and logical infrastructure best serve to formalize an area of mathematics, the methods of formalization and the criteria for evaluating them are automatic and (mostly) unexamined. This methodological clarity has not been successfully extended to other domains; even the formalization of the empirical sciences presents difficult problems that have not yet been resolved.[17]
The formalization of temporal reasoning, and in particular of reasoning about actions and plans, is the best-developed successful extension of modern formalization techniques to domains other than mathematical theories. This departure has required the creation of new methodologies. One methodological innovation will emerge in Section 4.5: the development of a library of scenarios for testing the adequacy of various formalisms, and the creation of specialized domains like the blocks-world domain (mentioned above, in Section 4.2) that serve as laboratories for testing ideas. For more on the blocks world, see Genesereth & Nilsson 1987; Davis 1991. McCarthy’s ideas about elaboration tolerance (McCarthy 1999) provide one interesting attempt to provide a criterion for the adequacy of formalizations. Another idea that has emerged in the course of formalizing commonsense domains is the importance of an explicit ontology; see, for instance, Fikes 1996 and Lenat & Guha 1989. Another is the potential usefulness of explicit representations of context; see Guha 1991. Another is the use of simulation techniques: see, for instance, Johnstone & Williamson 2007.
To tell whether a plan achieves its goal, you need to see whether the goal holds in the plan’s final state. Doing this requires predictive reasoning, a type of reasoning that the tense-logical literature neglected. As in mechanics, prediction involves the inference of later states from earlier ones. But (in the case of simple planning problems at least) change is driven by actions rather than by differential equations. The investigation of this qualitative form of temporal reasoning, and of related sorts of reasoning (e.g., plan recognition, which seeks to infer goals from observed actions, and narrative explanation, which seeks to fill in implicit information in a temporal narrative) is one of the most impressive chapters in the brief history of commonsense logicism.
The essence of prediction is the problem of inferring what holds in the situation that ensues from performing an action, given information about the initial situation. The problem is much easier if the agent has complete knowledge about the initial situation—this assumption is often unrealistic, but was usual in the classical planning formalisms.[18]
A large part of action-driven dynamics has to do with what does not change. Take a simple plan to type ‘cat’ using a word processor: the natural plan is to first enter ‘c’, then enter ‘a’, then enter ‘t’. Part of one’s confidence in this plan is that the actions are independent: for instance, entering ‘a’ does not also erase the ‘c’. The required inference can be thought of as a form of inertia. The Frame Problem is the problem of how to formalize the required inertial reasoning.
The Frame Problem was named and introduced in McCarthy & Hayes 1969. Unlike most of the philosophically interesting technical problems to emerge in AI, it has attracted the interest of philosophers; most of the relevant papers, and background information, can be found in Ford & Pylyshyn 1996 and Pylyshyn 1987. Both of these volumes document interactions between AI and philosophy.
The quality of these interactions is discouraging. Like any realistic commonsense reasoning problem, the Frame Problem is open-ended, and can depend on a wide variety of circumstances. If you put $20 in a wallet, put the wallet in your pocket, and go to the store, you can safely assume that the $20 will remain in the wallet. But if you leave the $20 on the counter at the store while shopping, you can’t safely expect it to be there later. This may account for the temptation that makes some philosophers[19] want to construe the Frame Problem very broadly, so that very soon it becomes indiscernible from the problem of formalizing general common sense in arbitrary domains.[20] Such a broad construal may serve to introduce speculative discussions concerning the nature of AI, but it loses all contact with the genuine, new logical problems in temporal reasoning that have been discovered by the AI community.
The purely logical Frame Problem can be solved using monotonic logic, by simply writing explicit axioms stating what does not change when an action is performed. This technique can be successfully applied to quite complex formalization problems.[21] But nonmonotonic solutions to the Frame Problem have been extensively investigated and deployed; these lead to new and interesting lines of logical development.
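For the word-processor example above, the monotonic strategy can be made concrete: each action’s effect axiom is paired with an explicit frame axiom saying that every fluent the action does not mention keeps its truth value. The encoding and names below are our own.

```python
# Effect axioms: each action makes exactly these fluents true.
# Action and fluent names are illustrative.
EFFECTS = {
    "enter_c": {"c_on_screen"},
    "enter_a": {"a_on_screen"},
    "enter_t": {"t_on_screen"},
}

def result(action, state):
    """Successor state: effect axioms plus explicit frame axioms.
    Carrying every old fluent into the new state plays the role of
    the frame axioms: nothing changes except what the effects say."""
    return frozenset(state | EFFECTS[action])

def holds(fluent, actions, initial=frozenset()):
    """Does the fluent hold after performing the actions in sequence?"""
    state = initial
    for a in actions:
        state = result(a, state)
    return fluent in state

# Entering 'a' does not erase the 'c': the frame axiom guarantees it.
print(holds("c_on_screen", ["enter_c", "enter_a"]))   # True
```

The cost of the monotonic approach is visible even here: the frame information must be restated (or engineered into the representation) for every action-fluent pair.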
Some philosophers (Fodor 1987, Lormand 1996) have felt that contrived propositions will pose special difficulties in connection with the Frame Problem. As Shanahan points out (Shanahan 1997: 24), Fodor’s “fridgeon” example is readily formalized in the Situation Calculus and poses no special problems. However, as Lormand suggests, Goodman’s examples (Goodman 1946) do create problems if they are admitted as fluents; there will be anomalous extensions in which objects change from green to blue in order to preserve their grueness.
This is one of the few points made by philosophers about the Frame Problem that raises a genuine difficulty for AI formalization. But the difficulty is peripheral, because the example is unrealistic. Closure properties (such as closure under boolean operations) are not assumed for fluents. In fact, it is generally supposed that the fluents chosen in formalizing a planning domain will represent a very limited subset of the totality of state-dependent functions; typically, it will be a relatively small finite set of variables, representing features of the domain considered to be important. In particular cases these will be chosen in much the same way that a set of variables is chosen in statistical modeling.
I don’t know of any systematic account in the AI literature of formalization methodology, or, in particular, of how to choose an appropriate set of fluents. But it would certainly be part of such an account that all fluents should correspond to projectable predicates, in Goodman’s sense.
Nonmonotonic solutions to the Frame Problem make inertia a default; changes are assumed to occur only if there is some special reason for them to occur. In action-centered accounts of change, these special reasons are found in axioms specifying the immediate effects of actions.
We can illustrate the formalization with Reiter’s default logic. Recall that in Reiter’s theory, defaults are represented as rules, not as axioms; this means that we need to use default rule schemata to formalize inertia. For each fluent, action, and situation, the inertia schema will include the following rule:
\[\tag*{\(\mathbf{IR}\):}\leadsto [\Holds(f,s) \leftrightarrow \Holds(f, \ttResult(a, s))]\]This way of doing things makes any change in the truth value of a fluent a prima facie anomaly. But it follows from Reiter’s account of extensions that such defaults are overridden when they conflict with the (monotonic) axioms giving the state dynamics. If, for instance, there is a monotonic causal axiom for the action move-P4-to-Q4 ensuring that moving a certain pawn to Q4 will locate the pawn at Q4, the instance
\[\begin{aligned}\leadsto [&\Holds(\mathsf{At}(\mathsf{Q2}, \mathsf{Pawn4}), \mathsf{s_0}) \\&\quad\leftrightarrow\ \Holds(\mathsf{At}(\mathsf{Q2}, \mathsf{Pawn4}),\ttResult(\mathsf{move\text{-}P4\text{-}to\text{-}Q4}, \mathsf{s_0}))]\end{aligned}\]of \(\mathbf{IR}\) will be overridden, and there will be no extension in which the pawn remains where it was after performing the move-P4-to-Q4 action. Inertia will then ensure that the other pieces stay put.
The version of the Frame Problem that captured wider attention was taken out of context and considered in isolation. If one is interested in understanding the philosophically interesting problems that arise in deploying formalisms like the Situation Calculus, it is best to consider a larger range of problems. These include not only the Frame Problem itself, but also the Qualification Problem, the Ramification Problem, and an assortment of specific challenges such as the scenarios mentioned later in this section. And one has to think about how to generalize: for instance, how to deal with incomplete information, multiple agents acting concurrently, and continuous change in the environment.
The Qualification Problem arises in connection with the formalization of just about any commonsense generalization. Typically, these will involve an open-ended and seemingly unmanageable array of exceptions. The same phenomenon, under the label ‘the problem of ceteris paribus generalizations’, is familiar from analytic philosophy. It also comes up in the semantics of generic constructions found in natural languages.[22]
Nonmonotonic logics make a contribution to this problem by enabling incremental formalization. If a commonsense generalization is formulated as a default, then further qualifications can be added nondestructively. The default axiom is retained, and an exception—which itself may be a default—is added. This is helpful, even if it doesn’t address deeper problems of a philosophical nature.
The Qualification Problem was raised in McCarthy 1986, where it was motivated chiefly by generalizations concerning the consequences of actions; McCarthy considered in some detail the generalization that turning the ignition key in an automobile will start the car. Much the same point, in fact, can be made about virtually any action, including stacking one block on another—the standard example used in the early days of the Situation Calculus. A circumscriptive approach to the Qualification Problem is presented in Lifschitz 1987; this explicitly introduces the relation between an action and its preconditions into the formalism, and circumscriptively minimizes preconditions, eliminating from preferred models any “unknown preconditions” that might render an action inefficacious.
Not every nonmonotonic logic provides graceful mechanisms for qualification. Plain default logic, for instance, does not deliver the intuitively desired conclusions because it provides no way for defaults to override other defaults. To achieve this effect, one needs a fancy version of the logic in which defaults are prioritized. This can complicate the theory considerably; see, for instance, Asher & Morreau 1991 and Horty 1994. And, as Elkan 1995 points out, the Qualification Problem raises computational issues.
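The incremental pattern behind qualification can be sketched with a miniature prioritized-default mechanism; this is our own simplification, far cruder than the systems of Asher & Morreau or Horty. Each qualification is a new, higher-priority default, and the original default is left untouched.

```python
# Defaults are (priority, guard, atom, value); a higher-priority
# applicable default overrides a lower one. All names illustrative.
defaults = []

def add_default(priority, guard, atom, value):
    defaults.append((priority, frozenset(guard), atom, value))

def conclude(facts, atom):
    """Value given by the highest-priority applicable default, if any."""
    applicable = [(p, value) for p, guard, a, value in defaults
                  if a == atom and guard <= facts]
    return max(applicable)[1] if applicable else None

# McCarthy's ignition-key generalization, then a qualification added
# later, nondestructively:
add_default(1, {"turn_key"}, "car_starts", True)
add_default(2, {"turn_key", "battery_dead"}, "car_starts", False)

print(conclude({"turn_key"}, "car_starts"))                   # True
print(conclude({"turn_key", "battery_dead"}, "car_starts"))   # False
```

Further exceptions (and exceptions to exceptions) are just further `add_default` calls: the formalization is revised by addition, not by surgery on the existing axioms.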
Relatively little attention has been given to the Qualification Problem for characterizing actions, in comparison with other problems in temporal reasoning. In particular, the standard accounts of unsuccessful actions are somewhat unintuitive. In the formalization of Lifschitz 1987, for instance, actions with unsatisfied preconditions are distinguished from actions whose preconditions are all satisfied only in that the conventional effects of the action will be ensured only when the preconditions are met. It is as if an action of spending $1,000,000 can be performed at any moment—although if you don’t have the money, no effects in particular will be guaranteed.[23] And there is no distinction between actions that cannot even be attempted (like boarding a plane in London when you are in Sydney), actions that can be attempted, but in which the attempt can be expected to go wrong (like making a withdrawal when you have insufficient funds), actions that can be attempted with reasonable hope of success, and actions that can be attempted with guaranteed success. As J.L. Austin made clear in Austin 1961, the ways in which actions can be attempted, and in which attempted actions can fail, are a well-developed part of commonsense reasoning. Obviously, in contemplating a plan containing actions that may fail, one may need to reason about the consequences of failure. Formalizing the pathology of actions, providing a systematic theory of ways in which actions and the plans that contain them can go wrong, would be a useful addition to planning formalisms, and one that would illuminate important themes in philosophy.
The challenge posed by the Ramification Problem (characterized first in Finger 1987) is to formalize the indirect consequences of actions, where “indirect” effects are synchronous[24] but causally derivative. If one walks into a room, the direct effect is that one is now in the room. There are also many indirect effects: for instance, that one’s shirt is now also in the room.
You can see from this formulation that a distinction is presupposed between direct consequences of actions (ones that attach intrinsically to an action and are ensured by its successful performance) and other consequences. This assumption is generally accepted without question in the AI literature on action formalisms. You can make a good case for its commonsense plausibility—for instance, many of our words for actions (‘to warm’, ‘to lengthen’, ‘to fill’) are derived from the effects that are conventionally associated with them. And in these cases, success is entailed: if someone has warmed something, this entails that it became warm. But there are complications. Lin 1995 discusses a simple example: a certain suitcase has two locks, and is open if and only if both locks are open. Then (assuming that actions are not performed concurrently) opening one lock will open the suitcase if and only if the other lock is open. Lin’s formalization treats opening each lock as an action, with direct consequences. But opening the suitcase is not an action; it is an indirect effect.
Obviously, the Ramification Problem is intimately connected with the Frame Problem. In approaches that adopt nonmonotonic solutions to the Frame Problem, inertial defaults will need to be overridden by ramifications in order to obtain correct results. In Lin’s example, suppose that the left lock of the suitcase is open and the action of opening the right lock is performed. Then the default conclusion that the suitcase remains closed needs somehow to be suppressed.
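The required interaction can be sketched as follows (our encoding, not Lin’s formalism): inertia is applied first by copying the state, and the static law then recomputes the derived fluent, overriding the inertial default for it.

```python
def successor(state, action):
    """One transition in the suitcase scenario. Fluent names are ours."""
    s = dict(state)                 # inertia: by default, nothing changes
    if action == "open_left":       # direct effect
        s["left_open"] = True
    elif action == "open_right":    # direct effect
        s["right_open"] = True
    # Static law (ramification): the suitcase is open iff both locks are.
    # Recomputing it here is what suppresses the inertial conclusion
    # that the suitcase stays closed.
    s["suitcase_open"] = s["left_open"] and s["right_open"]
    return s

s0 = {"left_open": True, "right_open": False, "suitcase_open": False}
s1 = successor(s0, "open_right")
print(s1["suitcase_open"])   # True: the indirect effect is obtained
```

Hard-coding the order (effects, then static laws) is a procedural stand-in for what the declarative causal theories discussed below must earn by logical means.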
Some approaches to the Ramification Problem depend on the development of theories of commonsense causation and therefore are closely related to the causal approaches to reasoning about time and action mentioned in Section 4.6. See, for instance, Giunchiglia et al. 1997, Thielscher 1989, Lin 1995.
Philosophical logicians have been content to illustrate their ideas with relatively small-scale examples. The formalization of even large-scale mathematical theories is relatively unproblematic. Logicist AI is the first branch of logic to undertake the task of formalizing realistic and nontrivial commonsense reasoning. In doing so, the field has had to invent new methods. An important part of the methodology that has emerged in formalizing action and change is the prominence that is given to challenges, posed in the form of scenarios. These scenarios represent formalization problems, which usually involve relatively simple, realistic examples designed to challenge the logical theories in specific ways. Typically, there will be clear commonsense intuitions about the inferences that should be drawn in these cases. The challenge is to design a logical formalism that will provide general, well-motivated solutions to these benchmark problems.
Among the many scenarios that have been discussed in the literature are the Baby Scenario, the Bus Ride Scenario, the Chess Board Scenario, the Ferryboat Connection Scenario, the Furniture Assembly Scenario, the Hiding Turkey Scenario, the Kitchen Sink Scenario, the Russian Turkey Scenario, the Stanford Murder Mystery, the Stockholm Delivery Scenario, the Stolen Car Scenario, the Stuffy Room Scenario, the Ticketed Car Scenario, the Walking Turkey Scenario, and the Yale Shooting Anomaly. Accounts of these can be found in Shanahan 1997 and Sandewall 1994; see especially Sandewall 1994 [Chapters 2 and 7].
Many of these scenarios are designed to test advanced problems that will not be discussed here—for instance, challenges dealing with multiple agents, or with continuous changes. Here, we concentrate on one of the earliest, and probably the most subtle, of these scenarios: the Yale Shooting Anomaly, first reported in Hanks & McDermott 1985 and published in Hanks & McDermott 1986 and Hanks & McDermott 1987.
The Yale Shooting Anomaly involves three actions: load, shoot, and wait. A propositional fluent Loaded tracks whether a certain pistol is loaded; another fluent, Alive, tracks whether a certain turkey, Fred, is alive. The load action has no preconditions; its only effect is Loaded. The shoot action has Loaded as its only precondition and Not-Alive as its only effect; the wait action has no preconditions and no effects.
Causal information regarding the actions is formalized in the following axioms.
| Load: | \(\forall s\, \Holds(\Loaded, \ttResult(\load, s))\) |
| Shoot 1: | \(\forall s (\Holds(\Loaded, s) \rightarrow \Holds(\lnot \Alive, \ttResult(\shoot, s)))\) |
| Shoot 2: | \(\forall s (\Holds(\Loaded, s) \rightarrow \Holds(\lnot \Loaded, \ttResult(\shoot, s)))\) |
There is no Wait Axiom.
We will formalize the inertial reasoning in this scenario using Reiter’s default logic. The set \(D\) of defaults for this theory consists of all instances of the inertial schema \(\mathbf{IR}\). In the initial situation, Fred is alive and the pistol is unloaded.
| IC1: | \(\Holds(\Alive, s_0)\) |
| IC2: | \(\lnot\Holds(\Loaded, s_0)\) |
The monotonic theory \(W\) of the scenario consists of: (1) the action axioms Load, Shoot 1, and Shoot 2 and (2) the initial conditions \(\mathbf{IC1}\) and \(\mathbf{IC2}\).
Let \(s_1=\ttResult(\load, s_0),\) \(s_2=\ttResult(\mathsf{wait},s_1)\), and \(s_3=\ttResult(\shoot, s_2)\).
The Yale Shooting Anomaly involves the action sequence load; wait; shoot, passing from \(\rs_0\) to \(\rs_3\), as follows.
| \(\rs_0\) | \(\ttload\) \(\rightarrow\) | \(\rs_1\) | \(\ttwait\) \(\rightarrow\) | \(\rs_2\) | \(\ttshoot\) \(\rightarrow\) | \(\rs_3\) |
It is an anomaly—a challenge to a naive theory of inertia—because default logic allows an extension according to which the pistol is unloaded and Fred is alive in the final situation \(s_3\). The anomalous extension is pictured as follows.
| \(\ttAlive\) \(\lnot\ttLoaded\) \(\rs_0\) | \(\ttload\) \(\rightarrow\) | \(\ttAlive\) \(\ttLoaded\) \(\rs_1\) | \(\ttwait\) \(\rightarrow\) | \(\ttAlive\) \(\lnot\ttLoaded\) \(\rs_2\) | \(\ttshoot\) \(\rightarrow\) | \(\ttAlive\) \(\lnot\ttLoaded\) \(\rs_3\) |
In narrative form, what happens in this extension is this. At first,Fred is alive and the pistol is unloaded. After loading, the pistol isloaded and Fred remains alive. After waiting, the pistol becomesunloaded and Fred remains alive. Shooting is then vacuous because thepistol is unloaded. So finally, after shooting, Fred remains alive andthe pistol remains unloaded.
The best way to see that this is an extension is to work through the proof. Less formally, though, you can see that the expected extension in which Fred ends up dead violates just one default: the frame default for Alive is violated when Fred changes state in the last step. But the anomalous extension also violates only one default: the frame default for Loaded is violated when the pistol spontaneously becomes unloaded while waiting. If you just go by the number of defaults that are violated, both extensions are equally good.
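The two extensions can also be exhibited mechanically. The sketch below uses our own propositional encoding and a literal-persistence variant of the frame default with a prerequisite (if a fluent holds at one situation, it holds by default at the next), rather than the biconditional schema \(\mathbf{IR}\); extensions are found by brute-force application of Reiter’s fixed-point test.

```python
from itertools import product, combinations

# Propositional encoding of the Yale Shooting theory over the
# situations s0..s3 above. An atom is (fluent, step); a literal adds
# a sign. This encoding is ours, not Hanks & McDermott's syntax.
FLUENTS = ["Alive", "Loaded"]
ATOMS = [(f, i) for f in FLUENTS for i in range(4)]

def w_models():
    """Models of the monotonic theory W: IC1, IC2, Load, Shoot 1, Shoot 2."""
    for bits in product([False, True], repeat=len(ATOMS)):
        v = dict(zip(ATOMS, bits))
        if (v[("Alive", 0)] and not v[("Loaded", 0)]        # IC1, IC2
                and v[("Loaded", 1)]                        # Load
                and not (v[("Loaded", 2)] and v[("Alive", 3)])     # Shoot 1
                and not (v[("Loaded", 2)] and v[("Loaded", 3)])):  # Shoot 2
            yield v

W = list(w_models())

def sat(v, lit):
    f, i, pos = lit
    return v[(f, i)] == pos

def consistent(lits):
    return any(all(sat(v, l) for l in lits) for v in W)

def entails(lits, lit):
    return all(sat(v, lit) for v in W if all(sat(v, l) for l in lits))

# Frame defaults with a prerequisite, for positive fluent literals
# (which suffice in this scenario):
#   Holds(f, s_i) : Holds(f, s_{i+1}) / Holds(f, s_{i+1})
DEFAULTS = [((f, i, True), (f, i + 1, True))
            for f in FLUENTS for i in range(3)]

def closure(candidate):
    """Consequents of defaults applicable relative to the candidate
    extension Th(W + candidate), added in grounded order."""
    applied, changed = [], True
    while changed:
        changed = False
        for pre, cons in DEFAULTS:
            if (cons not in applied and entails(applied, pre)
                    and consistent(list(candidate) + [cons])):
                applied.append(cons)
                changed = True
    return applied

# Reiter's fixed-point test: a consequent set C generates an extension
# iff the closure computed against Th(W + C) gives back exactly C.
consequents = [c for _, c in DEFAULTS]
extensions = []
for r in range(len(consequents) + 1):
    for C in combinations(consequents, r):
        if consistent(list(C)) and sorted(closure(C)) == sorted(C):
            extensions.append(list(C))

for C in extensions:
    dead = entails(C, ("Alive", 3, False))
    print("Fred is", "dead" if dead else "alive", "in this extension")
```

Run, this prints one line per extension; exactly two are found, the intended one in which Fred is dead and the anomalous one in which the pistol becomes unloaded while waiting and Fred survives.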
A planning algorithm based on a straightforward default logic formalization of causal inertia will be unable to perform as expected. It will be unable to verify a perfectly reasonable commonsense plan to kill Fred and will fail similarly in all but the simplest planning scenarios. So the Yale Shooting Anomaly represents a major obstacle in developing an inertia-based theory of predictive reasoning. A plausible, well-motivated logical solution to the Frame Problem has run afoul of a simple, crisp example in which it clearly delivers the wrong results.
Naturally, the literature concerning the Yale Shooting Anomaly is extensive. Surveys of some of this work, with bibliographical references, can be found in Shanahan 1997 and Morgenstern 1996.
It is commonly agreed that good solutions need to perform satisfactorily over a large suite of scenarios and to be generalizable: in particular, they should be deployable even when continuous time, concurrent actions, and various kinds of ignorance are introduced. And it is agreed that they should support multiple reasoning tasks, including not only prediction and plan verification but explanation of historical information or a narrative in terms of actions performed and agent goals.
Here, we mention four approaches: (1) Features and Fluents (Sandewall), (2) Motivated Action Theory (Morgenstern and Stein), (3) State Minimization in the Event Calculus (Shanahan), and (4) Causal Theories (Lifschitz and others). The fourth approach is the most likely to be interesting to philosophers and to contain elements that will be of lasting importance regardless of future developments in this area, and is discussed in more detail.
This approach, described in Sandewall 1994, uses preference semantics as a way to organize nonmonotonic solutions to the problems of reasoning about action and change. Rather than introducing a single logical framework, Sandewall considers a number of temporal logics, including ones that use discrete, continuous, and branching time. The properties of the logics are systematically tested against a large suite of test scenarios.
This theory grew out of direct consideration of the problems in temporal reasoning described above in Section 4.5, and especially the Yale Shooting scenario. Morgenstern & Stein 1994 seeks to find a general, intuitively motivated logical framework that solves the difficulties. Morgenstern and Stein settle on the idea that unmotivated actions are to be minimized, where an action can be motivated directly, e.g., by an axiom, or indirectly, through causal chains. The key technical idea is a (rather complicated) definition of motivation in an interval-based temporal logic.
Morgenstern 1996 presents a summary of the theory, along with reasons for rejecting its causal rivals. The most important of these is that accounts based on the Situation Calculus do not appear to generalize to cases allowing for concurrency and ignorance. She also cites the failure of early causal theories to deal with retrodiction.
Baker 1989 works with circumscriptive versions of the Yale Shooting Anomaly. Recall that circumscription uses preferred models in which the extensions of abnormality predicates are minimized. In the course of this minimization, certain parameters (including, of course, the predicates to be minimized) are allowed to vary; the rest are held constant. Which parameters vary and which are held constant is determined by the application.
In the earliest circumscriptive solutions to the Frame Problem, the inertial rule \(\textbf{CIR}\) is stated using an abnormality predicate.
\[\tag*{\textbf{CIR}:}\forall f, s, a [\lnot \texttt{Ab}(f, a, s)\rightarrow [\Holds(f, s)\leftrightarrow \Holds(f, \ttResult(a, s))]]\]This axiom uses a biconditional, so that it can be used for retrodiction; this is typical of the more recent formulations of commonsense inertia. An unsophisticated solution to the frame problem minimizes the abnormality predicate while allowing the Holds predicate to vary and keeping all other parameters fixed. This succumbs to the Yale Shooting Anomaly in much the same way that default logic does. Circumscription does not involve multiple extensions, so the anomaly appears as an inability to conclude that Fred is dead after the shooting.
In Baker’s reformulation of the problem, separate axioms ensure the existence of a situation corresponding to each Boolean combination of fluents, and the \(\ttResult\) function is allowed to vary, while the \(\Holds\) predicate is held constant. In this setting, the \(\ttResult\) function needs to be specified for “counterfactual” actions—in particular, for shooting and for waiting in the Yale Shooting Anomaly. It is this feature that eliminates the incorrect model for that scenario; for details, see Baker 1989 and Shanahan 1997, Chapter 6.
This idea, which Shanahan calls “State-Based Minimization,” is developed and extended in Shanahan 1997, in the context of a temporal logic deriving from the Event Calculus of Kowalski & Sergot 1986. Shanahan’s version has the advantage of being closely connected to logic programming.
Recall that in the anomalous model of the Yale Shooting scenario the gun becomes unloaded after the performance of the wait action, an action which has no conventional effects. The unloading, then, is uncaused. This suggests a solution that minimizes outcomes that have no cause.
This strategy was pursued in Geffner 1990 and 1992. A similar approach beginning with Lifschitz 1987 develops a sustained line of research along these lines, carried out not only by Lifschitz and his students and colleagues in the Texas Action Group but by some others. For this work and further references, see Thielscher 1989, Gustafsson & Doherty 1996, Baral 1995, Nakashima et al. 1997, Lifschitz 1997, Giunchiglia & Lifschitz 1998, Lin 1995, Haugh 1987, Lifschitz 1998b, Turner 1999, McCain & Turner 1995, Elkan 1991, McCain & Turner 1997, Thielscher 1996, and Gelfond & Lifschitz 1998.
Here, we describe the causal solution presented in Turner 1999. Turner returns to the ideas of Geffner 1992, but places them in a simpler logical setting and applies them to the formalization of more complex scenarios that illustrate the interactions of causal inertia with other considerations, especially the Ramification Problem.
Ramification is induced by the presence of static laws which relate the direct consequences of actions to other changes. A car-starting scenario illustrates the difficulties. There is one action, turn-on, which turns on the ignition; let’s suppose that this action has no preconditions. There is a fluent \(Ig\) tracking whether the ignition is on, a fluent \(Dead\) tracking whether the battery is dead, and a fluent \(Run\) tracking whether the engine is running. A static law says that if the ignition is on and the battery isn’t dead, the engine is running. (Let’s suppose that every other source of failure has already been eliminated in this scenario; the only possible reason for not starting is the battery.) We want to consider a transition in which turn-on is performed when the ignition isn’t on, the battery is not dead, and the car isn’t running.
Of course, we want to infer in such a case that a performance of turn-on will result in a situation in which the ignition is on, the battery isn’t dead, and the engine is running. But contraposed causal laws frustrate this conclusion. The difficulty is this: we can conclude by contraposing our only static law that if the ignition is on and the engine isn’t running, then the battery is dead. This law not only is true in our scenario, but would be used to explain a failed attempt to start the car. But if it is used for prediction, then performing turn-on will produce a “Murphy’s law” outcome in which the ignition is on, the battery is dead, and the engine isn’t running. Everything has a cause in this unwanted outcome: the battery is dead because of causal inertia and the engine isn’t running because of the contraposed causal law.
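The two competing outcomes can be exhibited by brute force. The sketch below is our own encoding (the fluent names follow the text; the inclusion-minimal-change policy is one naive option, not Turner's solution): it enumerates the successor states of turn-on that satisfy the static law, keeps those that change an inclusion-minimal set of fluents, and finds that both the intended outcome and the “Murphy’s law” outcome survive.

```python
from itertools import product

FLUENTS = ("Ig", "Dead", "Run")
init = {"Ig": False, "Dead": False, "Run": False}

def legal(state):
    """Static law: ignition on and battery live -> engine running."""
    return not (state["Ig"] and not state["Dead"]) or state["Run"]

# candidate successor states after turn-on (direct effect: Ig becomes true)
succs = [dict(zip(FLUENTS, bits)) for bits in product([False, True], repeat=3)]
succs = [s for s in succs if s["Ig"] and legal(s)]

def diff(s):
    """The set of fluents that changed relative to the initial state."""
    return frozenset(f for f in FLUENTS if s[f] != init[f])

# naive minimal change: keep successors whose changed-fluent set is
# inclusion-minimal ("persist as much as possible")
minimal = [s for s in succs
           if not any(diff(t) < diff(s) for t in succs)]
for s in sorted(minimal, key=lambda s: sorted(diff(s))):
    print({f: s[f] for f in FLUENTS})
```

The intended outcome changes \(\{Ig, Run\}\) and the Murphy outcome changes \(\{Ig, Dead\}\); the two sets are incomparable, so plain minimal change cannot choose between them.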
Readers who want to explore in some detail the problems of embedding a nonmonotonic solution to the Frame Problem in relatively expressive action languages can look to Gelfond & Lifschitz 1998. This paper presents an increasingly powerful and sophisticated series of action languages incorporating a somewhat ad hoc solution to the Ramification Problem. Turner 1999 is an improvement along these lines.
Turner’s idea is to treat \(\texttt{Caused}\) as a modal operator \([c]\), which is provided with a nonmonotonic preferred models interpretation. Universal causality prevails in a preferred model: the caused propositions and the true propositions must coincide. Moreover, this model must be unique; it must be the only possibility consistent with the extensional part of the language.
To understand this idea, it’s helpful to recall that in the possible worlds interpretation of \(\mathbf{S5}\), worlds can be identified with state descriptions, i.e., with complete, consistent sets \(I\) of literals (atomic formulas and their negations). This allows us to think of a model as a pair \(\langle I, S\rangle\), where \(S\) is a set of interpretations including \(I\). The modal operator \([c]\) is given the standard semantics: where \(S\) is a set of interpretations and \(I\in S\), \(S\vDash_I[c]A\) if and only if \(S\vDash_{I'}A\) for all \(I'\in S\). \(\langle I, S\rangle\) satisfies a set of formulas \(T\) if and only if \(S\vDash_I A\) for all \(A\in T\).
Turner’s preferred models of \(T\) are the pairs \(\langle I, S\rangle\) such that: (1) \(\langle I, S\rangle\) satisfies \(T\), (2) \(S = \{I\}\), and (3) \(\langle I, S\rangle\) is the unique interpretation of the form \(\langle I, S'\rangle\) meeting condition (1): no other choice of \(S'\) yields a model of \(T\) with the same \(I\). Condition (2) guarantees the “universality of causation;” it validates \(A \leftrightarrow [c]A\). Condition (3) “grounds” causality in noncausal information (in the models in which we are interested, this will be a matter of which fluents hold in which situations), in the strongest sense: it is uniquely determined by this information.
Although it is not evident, Turner’s account of preferred models turns out to be related to more general nonmonotonic logics, such as default logic. Consult Turner 1999 for details.
The axioms that specify the effects of actions treat these effects as caused; for instance, the axiom schema for loading would read as follows:
Causal-Load: \([c]\Holds(\Loaded, \ttResult(\load, s))\)[25]
Ramifications of the immediate effects of actions are also treated as caused. And there are two nonmonotonic inertial axiom schemata:
\[\begin{aligned}([c] \Holds(f, s)\land {}&\Holds(f, \ttResult(a, s))) \\ &\rightarrow [c]\Holds(f, \ttResult(a, s))\end{aligned}\] and
\[\begin{aligned}([c] \lnot\Holds(f, s)\land {}&\lnot\Holds(f, \ttResult(a, s))) \\&\rightarrow [c]\lnot\Holds(f, \ttResult(a, s))\end{aligned}\] Thus, a true proposition can be caused either because it is the direct or indirect effect of an action, or because it involves the persistence of a caused proposition. Initial conditions are also considered to be caused, by stipulation.
To illustrate the workings of this approach, consider the simplest case: a language with just one fluent-denoting constant, \(\mathsf{f}\), and one action-denoting constant, wait. As in the Yale Shooting problem, there are no axioms for wait; the action can always be performed and has no associated effects. Let \(\mathsf{s}_1\) be the result of performing the wait action in \(\mathsf{s}_0\).
The theory \(T\) contains an initial condition
\[\Holds( \mathsf{f}, \mathsf{s}_0)\] and a statement that the initial condition is caused,
\[[c]\Holds( \mathsf{f}, \mathsf{s}_0).\] Two models of \(T\) satisfy conditions (1) and (2):
\[M_1 = \langle I_1,\{I_1\}\rangle \text{ and }M_2 = \langle I_2,\{I_2\}\rangle,\]where
\[I_1=\{\Holds( \mathsf{f}, \mathsf{s}_0), \Holds( \mathsf{f}, \mathsf{s}_1)\} \text{ and } I_2=\{\Holds( \mathsf{f}, \mathsf{s}_0), \lnot\Holds( \mathsf{f}, \mathsf{s}_1)\}. \] \(M_1\) is the intended model, in which nothing changes. It satisfies Condition (3), since if \(\langle I_1, S\rangle\) satisfies \(T\) then it satisfies \([c]\Holds(\mathsf{f}, \mathsf{s}_1)\) by the inertial axiom
\[([c]\Holds(\mathsf{f}, \mathsf{s}_0)\land \Holds(\mathsf{f}, \mathsf{s}_1))\rightarrow [c]\Holds(\mathsf{f}, \mathsf{s}_1).\] Therefore, \(S = \{I_1\}\).
\(M_2\) is an anomalous model, in which the fluent ceases spontaneously. This model does not satisfy Condition (3), since \(M_3 = \langle I_2, \{I_1,I_2\}\rangle\) also satisfies \(T\); in particular, it satisfies the inertial axiom for \(\mathsf{f}\) because it fails to satisfy \(\Holds(\mathsf{f}, \mathsf{s}_1)\). So, while \(M_1\) is a preferred model, \(M_2\) is not.
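This calculation is small enough to check mechanically. The sketch below is our own encoding of the two-atom language; it implements preference as uniqueness of \(S\) for a fixed \(I\), following the reasoning about \(M_2\) and \(M_3\) just given, and confirms that the interpretation in which \(\mathsf{f}\) persists is the only preferred one.

```python
from itertools import product

# an interpretation assigns truth values to Holds(f,s0), Holds(f,s1)
INTERPS = list(product([False, True], repeat=2))  # pairs (h0, h1)

def c(prop, S):
    """[c]prop holds iff prop holds at every interpretation in S."""
    return all(prop(J) for J in S)

def all_subsets():
    """All nonempty sets of interpretations."""
    for bits in product([False, True], repeat=len(INTERPS)):
        S = [J for J, b in zip(INTERPS, bits) if b]
        if S:
            yield S

def satisfies(I, S):
    """<I, S> |= T for the wait-only theory of the text."""
    h0 = lambda J: J[0]
    h1 = lambda J: J[1]
    return (h0(I)                                   # Holds(f, s0)
            and c(h0, S)                            # [c]Holds(f, s0)
            # causal inertia, positive and negative schemata:
            and (not (c(h0, S) and h1(I)) or c(h1, S))
            and (not (c(lambda J: not J[0], S) and not h1(I))
                 or c(lambda J: not J[1], S)))

def preferred(I):
    """I is causally explained: {I} is the one and only S with <I,S> |= T."""
    if not satisfies(I, [I]):
        return False
    return all(set(S) == {I}
               for S in all_subsets() if I in S and satisfies(I, S))

print([I for I in INTERPS if preferred(I)])  # only (True, True): f persists
```

The interpretation \(I_2 = (\mathrm{True}, \mathrm{False})\) is rejected exactly as in the text: the larger \(S = \{I_1, I_2\}\) also satisfies the theory, so \(\{I_2\}\) is not unique.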
Turner’s approach avoids the problem of contraposition by givingcausal relations the form
\[[\textit{Background-Conditions} \land \textit{Cause}]\rightarrow [c]\textit{Effect}\] When contraposed, this becomes
\[[\textit{Cause}\land \lnot [c] \textit{Effect}]\rightarrow \lnot \textit{Background-Conditions}\] which does not have the form of a causal law.
The apparent usefulness of a “principle of universal causality” in accounting for a range of problems in qualitative commonsense reasoning should be of interest to philosophers. And the causal theory, as initiated by Geffner and developed by Turner, has many interesting detailed features. For instance, while philosophical work on causality has concentrated on the causal relation, Turner’s approach shows that much can be done with only a nonrelational causal predicate.
Action-driven dynamics can be used to construct models for conditional logics. Lent and Thomason 2015 uses Turner’s causal approach to provide such models in the restricted case where the antecedent is the conjunction of an action expression and simple nonmodal conditions. An explicit solution to the frame problem yields counterfactual predictions and automatically delivers a conditional semantics.
Morgenstern 1996 offers two chief criticisms of the causal approach to reasoning about actions: that it does not give an adequate account of explanation[26] and that the Situation Calculus itself is limited in scope. Neither criticism is fatal; both can be taken as challenges for future research.
For another approach to nonmonotonic causal reasoning, based on input-output logics (Makinson & van der Torre 2000), see Bochman 2004.
Of course, causal reasoning is an important topic in its own right. For instance, it figures in qualitative reasoning about devices. Herbert Simon’s work in this area goes back to the 1950s: see Simon 1952; 1977; Iwasaki & Simon 1986. Judea Pearl and his students and associates are responsible for the most sustained and successful investigation of causal models and causal reasoning. Pearl and many of his co-authors are computer scientists, but statisticians and philosophers have also contributed to this research program. We will not discuss causal networks further here. See Halpern 2016 and Hitchcock 2022.
The precomputational literature in philosophical logic relating to spatial reasoning is relatively sparse. But the need to support computational reasoning about space in application areas such as motion planning and manipulation in physical space, the indexing and retrieval of images, geographic information systems, diagrammatic reasoning, and the design of high-level graphics programs has led to new interest in spatial representations and spatial reasoning. Of course, the geometrical tradition provides an exceptionally strong mathematical resource for this enterprise. But as in many other AI-related areas, it is not clear that these theories are appropriate for informing these applications, and many computer scientists have felt it worthwhile to develop new foundations. Some of this work is closely related to the research in qualitative reasoning mentioned above in Section 2.2, and in some cases has been carried out by the same individuals. And of course, there also are connections to the mereology literature in philosophical logic.
The AI literature in spatial reasoning is extensive; for references to some areas not discussed here, see Stock 1997, Kapur & Mundy 1988, Hammer 1995, Wilson 1998, Osherson & Lasnik 1990, Renz & Nebel 1999, Yeap & Jeffries 1999, Chen 1990, Burger & Bhanu 1992, Allwein & Barwise 1996, Glasgow et al. 1995, and Kosslyn 1990. Here, we discuss only one trend, which is closely connected with parallel work in philosophical logic.
Qualitative approaches to space were introduced into the logical literature early in the twentieth century by Stanisław Leśniewski; see Leśniewski 1916, which presents the idea of a mereology, or qualitative theory of the part-whole relation between physical individuals. This idea of a logical theory of relations among regions remained active in philosophical logic, even though it attracted relatively few researchers. More recent work in the philosophical literature, especially Casati & Varzi 1999, Simons 1987, Casati & Varzi 1996, Clarke 1981, and Clarke 1985, has influenced the current computational work.
The Regional Connection Calculus (RCC), developed by computer scientists at the University of Leeds, is based on a primitive \(C\) relating regions of space: the intended interpretation of \(C(x, y)\) is that the intersection of the closures of the values of \(x\) and \(y\) is nonempty. See Cohn et al. 1997 and Cohn 1996 for details and references. The extent of what can be defined with this simple primitive is surprising, but the technicalities quickly become complex; see, for instance, Gotts 1994 and Gotts 1996. The work cited in Cohn et al. 1997 describes constraint propagation techniques and encodings in intuitionistic propositional logic as ways of supporting implemented reasoning based on RCC and some of its extensions. More recent work based on RCC addresses representation and reasoning about motion, which of course combines spatial and temporal issues; see Wolter & Zakharyaschev 2000. For more information about qualitative theories of movement, with references to other approaches, see Galton 1997.
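The flavor of the \(C\) primitive can be conveyed with a toy interpretation in which regions are closed intervals on the real line. This is our own simplification: RCC itself defines the derived relations from \(C\) alone by quantifying over arbitrary regions, whereas the sketch below checks the intended interval readings directly.

```python
# Regions modeled as closed intervals [a, b] on the real line.
from typing import Tuple

Region = Tuple[float, float]

def C(x: Region, y: Region) -> bool:
    """Connection: the closures of x and y share at least one point."""
    return x[0] <= y[1] and y[0] <= x[1]

def DC(x, y):
    """Disconnected: not connected at all."""
    return not C(x, y)

def O(x, y):
    """Overlap: x and y share an interior point."""
    return x[0] < y[1] and y[0] < x[1]

def EC(x, y):
    """External connection: touching, but no shared interior point."""
    return C(x, y) and not O(x, y)

def P(x, y):
    """Part: every point of x is in y."""
    return y[0] <= x[0] and x[1] <= y[1]

def PO(x, y):
    """Partial overlap: overlap, but neither is part of the other."""
    return O(x, y) and not P(x, y) and not P(y, x)

a, b = (0.0, 2.0), (2.0, 5.0)
print(EC(a, b))              # True: they share only the boundary point 2
print(PO(a, (1.0, 3.0)))     # True
print(P((2.5, 4.0), b))      # True
print(DC(a, (3.0, 4.0)))     # True
```

Even this toy version illustrates the qualitative character of the calculus: the relations partition pairs of regions without ever mentioning coordinates in the answers.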
Hintikka 1962, the classical source for epistemic logic, takes its cue from modal logic. Thus, the work concentrates on how to model the attitudes of a single agent with modal operators. Because possible-worlds semantics accommodates alternative modal operators, Hintikka discusses at length the question of exactly which alternatives are appropriate for knowledge and belief, opting for the modal logic \(\mathbf{S4}\). For more background and information about later developments, see Rendsvig & Symons 2022. And Laux & Wansing 1995 discusses both the philosophical and computational traditions up to 1994.
Epistemic attitudes figure in game theory, as well as logical AI, and work in both of these application areas either parallels or was influenced by Hintikka’s modal approach. In several papers (including McCarthy 1979), John McCarthy recommended an approach to formalizing knowledge that uses first-order logic, but that quantifies explicitly over such things as individual concepts. Here, however, we discuss the approach taken by most computer scientists, who—unlike McCarthy—use modal logic, but—unlike Hintikka—concentrate on the multiagent case.
Fagin et al. 1995 simplifies the underlying modality, using \(\mathbf{S5}\) for knowledge (or deontic \(\mathbf{S5}\) for belief), but concentrates on agents’ attitudes about one another’s attitudes. Such logics have direct applications in the analysis of distributed systems, dynamic systems in which change is effected by message actions, which modify the knowledge of agents according to rules determined by a communications protocol. Multiagent epistemic logic is another example of how the needs of applications have inspired significant contributions to logic. Fagin et al. 1995 is essential reading for anyone seriously interested in this topic. Other applied work in epistemic logic is reported in the proceedings of a series of conferences initiated in 1986 with Halpern 1986. These conferences record one of the most successful collaborations of philosophers with logicians in Computer Science, although the group of involved philosophers has been relatively small. The focus of the conferences has gradually shifted from Computer Science to Economics.
Computer scientists are used to thinking of reasoning as the manipulation of symbolic representations. And it is mainly due to AI that limited rationality has become a topic of serious interest, providing a counterbalance to the idealizations of philosophy and economics.[27] You would think, then, that closure of epistemic attitudes under logical consequence would be highly unpopular in AI. But this is not so; the possible worlds approach to attitudes is not only the leading theory in the areas discussed in Fagin et al. 1995, but has even been advocated in robotics applications; see Rosenschein & Kaelbling 1995; Rosenschein 1989. Nevertheless, the issue of hyperintensionality has been investigated in the AI literature; see Perlis 1985; Konolige 1986; Lakemeyer 1997; Levesque 1984. Though the work on this topic in AI provides new theories and some new results, no leading approach has yet emerged.
John McCarthy’s explicit long-term goal—the formalization of commonsense knowledge—was adopted and pursued by a relatively small subcommunity of AI researchers. The work of a much larger group (those involved in knowledge representation, cognitive robotics, and qualitative physics) contributes to specialized projects that support the larger goal. Anything remotely like a formalization of common sense is so far from being accomplished that—if it is achievable at all—guessing when we could expect the task to be completed is hopeless. But at least the effort has yielded a better sense of how to develop a workable methodology for formalizing commonsense examples and domains, and of how to divide the larger problem up into more manageable parts.
The first book-length treatment of this topic, Davis 1991, divides the general problem into seven subtopics.
The first four of these topics overlap with qualitative physics. For more information on this related subfield, consult Weld & de Kleer 1990, Davis 2008, and Forbus 2008.
Item 6 is the most extensively studied of Davis’s seven. Section 4 discussed the early phases of this work. There is a robust subsequent history of research on planning and goal formation, with the later work blending into work on planning architectures for autonomous agents. Items 5 and 7 are underresearched. Although artificial societies and architectures for artificial minds have been intensively studied, there has been relatively little work on the formalization of commonsense psychology and commonsense interpersonal reasoning. However, see Davis 1991 and Hobbs & Gordon 2005.
For a book-length treatment of the commonsense challenge, see Mueller 2006. More than half of the book is devoted to reasoning about actions and change. There are short chapters on space and mental states, and a longer treatment of nonmonotonic reasoning.
Research in computer science is almost entirely driven by the availability of funding. The formalization of commonsense reasoning was never heavily funded, but until John McCarthy’s death in 2011 small amounts of funding were available. There were regular meetings of the commonsense interest group in 1998, 2001, 2003, 2005, 2007, and 2009. Many of the papers presented at the 2003 conference were collected in expanded form in 2004, in Volume 153 of Artificial Intelligence. Davis & Morgenstern 2004, the introduction to this collection, provides a useful survey and appreciation of research in the formalization of common sense and the mechanization of commonsense reasoning. The Common Sense Problem Page is still maintained, but activity in this field has been slow from 2010 until the present, except for related knowledge representation research.
Borrowing the idea from other areas of computer science, the commonsense community has sought to develop suites of “benchmark problems”: to publicize problems that are difficult but not impossibly difficult and to encourage the creation of solutions and their comparison. Probably the best-documented problem to date is Ernest Davis’ “egg-cracking problem.” This is formulated as follows in the Common Sense Problem Page.
A cook is cracking a raw egg against a glass bowl. Properly performed, the impact of the egg against the edge of the bowl will crack the eggshell in half. Holding the egg over the bowl, the cook will separate the two halves of the shell with his fingers, enlarging the crack, and the contents of the egg will fall gently into the bowl. The end result is that the entire contents of the egg will be in the bowl, with the yolk unbroken, and that the two halves of the shell are in the cook’s fingers.
Variants: What happens if: The cook brings the egg to impact very quickly? Very slowly? The cook lays the egg in the bowl and exerts steady pressure with his hand? The cook, having cracked the egg, attempts to peel it off its contents like a hard-boiled egg? The bowl is made of looseleaf paper? Of soft clay? The bowl is smaller than the egg? The bowl is upside down? The cook tries this procedure with a hard-boiled egg? With a coconut? With an M&M?
Along with the problem itself three solutions are posted: Shanahan 2004, Lifschitz 1998a, and a version of Morgenstern 2001. Comparing the solutions is instructive—similarities outweigh differences. All the authors think of this as a planning problem, and use versions of the Situation Calculus or the Event Calculus in the formalization. Each axiomatization is modular, with, for instance, separate modules devoted to the relevant geometrical and material properties. Each author provides a “proof of concept” for the formalization by showing that the axioms support a proof of the correctness of a plan to crack the egg in the simple case. None of the authors considers all of Davis’ elaborations of the problem, but the axioms are framed with elaboration in mind and some elaborations are considered. It isn’t clear whether any of the authors actually implemented their formalization (for instance, using a theorem prover, an animation, or a robot controller).
The egg-cracking example raises the problem of how to evaluate moderately large formalizations of commonsense problems. Morgenstern and Shanahan address this issue explicitly. Morgenstern suggests that the important criteria are (1) Epistemological adequacy (correspondence to intuitive reasoning, as experienced by people who engage in it), (2) Faithfulness to the real world, (3) Reusability, and (4) Elaboration tolerance. The first two of these criteria may be too subjective to be very useful. To these, Shanahan adds (5) Usability. More important in the long run, however, would be the automation of testing and evaluation, by generating scenarios and testing them with real-world or simulated robotic agents.
Any even moderately successful attempt to formalize common sense will soon encounter unprecedented problems of scale, creating challenges similar to those that software engineering tries to address. Even fairly small programs and systems of axioms are difficult to comprehend and can produce unexpected results. Creating and maintaining them may require teams of developers, precipitating organizational issues, as well as issues having to do with the integration of modules, the maintenance and testing of large systems, and the generation of axioms from disparate knowledge sources. Although the need for large-scale software systems has produced best practices for enterprises of this kind, it might well turn out that, even with ample funding, human expertise would be inadequate for this task.
Two ways to automate the creation of formalizations can be imagined: (1) the large-scale ontologies created by the knowledge representation community could be mined for axioms, or (2) axioms could be created directly from corpora using machine learning techniques. The first method would entail unprecedented difficulties having to do with knowledge integration. Techniques for rendering the products of machine learning explainable[28] provide some hope for the second method, but the outputs of these techniques are not at all like logical axioms, and the task of converting them appears to be challenging, to say the least.
All this contrasts sharply with the methodology of philosophical analysis. Analyses are far smaller in scale, are not formalized with implementations in mind, and little or no attention is paid to their integration. Philosophers have never chosen a specific domain comparable to the planning domain and mounted a sustained attempt to formalize it, along with a companion effort to develop appropriate logics.
It is easy to suspect that many of the topics that have preoccupied analytic philosophy exhibit the sort of complexity that emerged, for instance, from the attempts of AI researchers to formalize reasoning about actions and their effects. If AI researchers were able to develop and partially automate a formalization methodology for problems like those listed in the Common Sense Problem Page, this would certainly be a tremendous advance over what analytic philosophers have been able to achieve. But perhaps philosophers can congratulate themselves that this has proved to be such a difficult challenge.
Traditionally, the task of representing large amounts of domain information for general-purpose reasoning has been one of the most important areas of knowledge representation. Systems that exploit the intuitive taxonomic organization of domains are useful for this purpose; taxonomic hierarchies not only help to organize the process of knowledge acquisition, but provide a useful connection to rule-based reasoning.[29]
For domains in which complex definitions are a natural way to organize information, knowledge engineering services based on definitions of concepts have been extremely successful. Like variable-free versions of first-order logic (see, for instance, Quine 1960), these systems are centered on concepts or first-order predicates, and provide a number of mechanisms for their definition. The fundamental algorithm associated with these taxonomic logics is a classifier, which takes as input a system of definitions and outputs the entailment relations between defined and primitive concepts. For background on these systems, see Woods & Schmolze 1992 and Brachman et al. 1991.
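For purely conjunctive definitions, a classifier can be sketched in a few lines: unfold each definition into the primitives it entails, and read subsumption off as set inclusion. The concept names below are illustrative and the sketch assumes acyclic definitions; real taxonomic (description) logics add roles, number restrictions, and much more.

```python
# A minimal classifier for conjunctive concept definitions: each defined
# concept is a conjunction of primitives and other defined concepts.
PRIMITIVES = {"person", "female", "has_child", "has_grandchild"}

DEFS = {
    "parent":      {"person", "has_child"},
    "mother":      {"parent", "female"},
    "grandmother": {"mother", "has_grandchild"},
}

def unfold(concept):
    """Expand a concept into the set of primitives it entails
    (assumes the definitions are acyclic)."""
    if concept in PRIMITIVES:
        return {concept}
    out = set()
    for part in DEFS[concept]:
        out |= unfold(part)
    return out

def subsumes(c, d):
    """c subsumes d iff everything c requires, d requires too."""
    return unfold(c) <= unfold(d)

# the classifier: compute the subsumption order over defined concepts
taxonomy = {(c, d) for c in DEFS for d in DEFS if c != d and subsumes(c, d)}
print(sorted(taxonomy))
```

Here the classifier derives, for instance, that every mother is a parent and every grandmother is a mother, even though neither fact was stated directly; that computed ordering is exactly the service such systems provide.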
The simplest taxonomic logics can be regarded as subsystems of first-order logic with complex predicates. But they have been extended in many ways, and the issues raised by these extensions often overlap with topics in philosophical logic.
Much more complex logical issues arise when the organization of a domain into hierarchies is allowed to have exceptions. One way to approach this topic is to explore how to make a taxonomic logic nonmonotonic; but nonmonotonic inheritance is a topic in its own right. Although there are strong affinities to nonmonotonic logic, nonmonotonic inheritance relies more heavily on graph-based representations than on traditional logical ideas. It seems to provide a much finer-grained approach to nonmonotonic reasoning, one that raises entirely new issues and quickly becomes problematic. For this reason, systems of nonmonotonic inheritance tend to be expressively weak, and their relations to the more powerful nonmonotonic logics have never been fully clarified. For background on this topic, see Thomason 1992 and Horty 1994.
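The graph-based flavor of such systems can be illustrated with the familiar penguin example. The sketch below is a crude policy of our own devising (shortest path wins); the serious proposals surveyed in Horty 1994 use subtler preemption criteria, and cases this policy mishandles are precisely where the "entirely new issues" arise.

```python
# Defeasible inheritance network: positive and negative links, with the
# usual restriction that only the final link of a path may be negative.
POS = {("tweety", "penguin"), ("penguin", "bird"), ("bird", "flies")}
NEG = {("penguin", "flies")}

def paths(src, dst):
    """All acyclic inheritance paths from src to dst, tagged with
    whether the path supports a negative conclusion."""
    def walk(node, seen):
        for a, b in POS | NEG:
            if a != node or b in seen:
                continue
            if b == dst:
                yield seen + [b], (a, b) in NEG
            elif (a, b) in POS:           # only positive links may continue
                yield from walk(b, seen + [b])
    yield from walk(src, [src])

def concludes(src, dst):
    """Crude conflict resolution: the shortest path wins.
    Returns True/False for a conclusion, None if no path exists."""
    best = min(paths(src, dst), key=lambda pn: len(pn[0]), default=None)
    return None if best is None else not best[1]

print(concludes("tweety", "flies"))  # False: the penguin path is shorter
print(concludes("bird", "flies"))    # True
```

Shortest-path resolution gets this network right because "penguin" is more specific than "bird", but it breaks down on networks where path length and specificity come apart, which is one reason the literature moved to path-preemption definitions.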
In the tradition in philosophical logic dealing with contextual effects on the interpretation of expressions, as well as in the more recent tradition in dynamic logic, context is primarily formalized as an assignment of values to variables, and the language is designed to make explicit reasoning about context either very limited or outright impossible.
Concern in AI about the representation of large and apparently heterogeneous domains and about the integration of disparate knowledge sources, as well as interests in formalizing common sense of the sort discussed in Section 2.2, above, have led to interest in the AI community in formalizing languages that take context into account more explicitly.
In McCarthy 1993b, McCarthy recommends the study of languages containing a construct
\[\textit{ist}(c, \phi),\] where \(\textit{ist}\) is read “is-true.” This is analogous to the \(\Holds\) construct of the situation calculus—but now \(c\) stands for a context, and \(\phi\) is a possibly complex propositional representation, which many (including McCarthy) take to refer to a sentence.
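The basic construct can be mimicked in a toy setting. The encoding below is entirely our own and flattens McCarthy's proposal considerably: contexts here are just finite fact sets, whereas in the intended logic contexts are first-class objects and \(\textit{ist}\)-formulas can themselves be nested inside other contexts.

```python
# Contexts modeled as finite sets of atomic facts; ist(c, phi) checks
# the sentence phi against the facts of context c.  The context names
# and facts are illustrative.
CONTEXTS = {
    "us_law": {("drinking_age", 21)},
    "uk_law": {("drinking_age", 18)},
}

def ist(c, phi):
    """is-true: does the sentence phi hold in context c?"""
    return phi(CONTEXTS[c])

def drinking_age_is(n):
    """A sentence, represented as a test on a context's fact set."""
    return lambda facts: ("drinking_age", n) in facts

print(ist("us_law", drinking_age_is(21)))   # True
print(ist("uk_law", drinking_age_is(21)))   # False
```

Even in this flattened form, the point of the construct is visible: the same sentence receives different truth values in different contexts, and the context parameter is explicit enough to be reasoned about.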
There are analogies here both to modal logic and to languages with an explicit truth predicate. But the applications that are envisioned for a logic of context create opportunities and problems that are in many ways new. Work on the logic of context subsequent to McCarthy’s original suggestion includes McCarthy & Buvac 1998, Guha 1991, and some of the papers in the conference volumes Akman et al. 2001 and Bouquet et al. 1999. For extensions of Richard Montague’s Intensional Logic motivated by McCarthy’s suggestions, see Thomason 2003 and 2005.
For some reason, work on the explicit formalization of context hasn’t been pursued intensively by the computational community beyond this point, but for an application to information integration, see Snidaro 2019.
Philosophical interest in context, and especially in the interaction of context with propositional attitudes and modals, continues to be strong; but the very general logical frameworks for context that McCarthy envisioned have not yet been taken up by philosophers.
There is reason to hope that the combination of logical methods withplanning applications in AI can enable the development of a far morecomprehensive and adequate theory of practical reasoning than hasheretofore been possible. As with many problems having to do withcommon sense reasoning, the scale and complexity of the formalizationsthat are required are beyond the traditional techniques ofphilosophical logic. However, with computational methods ofimplementing and testing the formalizations and with areas such ascognitive robotics providing laboratories for developing and testingideas, we can hope to radically advance a problem that has seen littleprogress since it was first proposed by Aristotle: how to devise aformalization of practical reasoning that is genuinely applicable torealistic problems.
The classical work in deontic logic that was begun by von Wright (see von Wright 1983) is one source of ideas; see Horty 2001 and van der Torre 1997. In fact, as the more recent work in deontic logic shows, nonmonotonic logic provides a natural and useful supplement to classical deontic logic. One recent work, Horty 2012, seeks to base deontic logic on a prioritized version of Reiter’s default logic.
An even more robust account of practical reasoning begins to emerge when these ideas are supplemented with work on the foundations of planning and reasoning about action that were discussed in Section 4, above. But this development can be pursued even further, by extending the formalism to include preferences and intentions.[30]
Ultimately, what is needed is a model of an intelligent reasoning and acting agent. Developing such a model need not be entirely a matter of logic, but according to one school of thought, logic has a central role to play in it; see, for instance, Baral & Gelfond 2000, Wobcke et al. 1998, Burkhard et al. 1998, Wooldridge 2000, Thielscher 2005, and Levesque & Lakemeyer 2008.
Minker 2000b is a comprehensive collection of survey papers and original contributions to the field of logic-based AI, with extensive references to the literature. Jack Minker’s introduction, Minker 2000a, is a useful orientation to the field. This volume is a good beginning point for readers who wish to pursue this topic further. Brachman & Levesque 2004 provides an introduction to the field of knowledge representation in textbook form. Davis 1991 and Mueller 2006 are book-length treatments of the challenging problem of formalizing commonsense reasoning. Straßer & Antonelli 2012 is a good entry point for readers interested in nonmonotonic logic, and Shanahan 2009 is a useful discussion of the frame problem. Wooldridge 2000 deals with logical formalizations of rational agents.
The proceedings of the Knowledge Representation and Reasoning conferences provide the best detailed record of logical research in AI from 1989 to the present: Brachman et al. 1989, Allen et al. 1991, Nebel et al. 1992, Doyle et al. 1994, Aiello et al. 1996, Cohn et al. 1998, Cohn et al. 2000, Fensel et al. 2002, Dubois et al. 2004, Doherty et al. 2006, Brewka & Lang 2008, Lin et al. 2010, Eiter et al. 2012, Baral et al. 2014, Baral et al. 2016, Thielscher et al. 2018, Calvanese et al. 2020, Bienvenu et al. 2021, and Kern-Isberner et al. 2022.
conditionals | frame problem | logic: non-monotonic | reasoning: automated
I am grateful to John McCarthy, who read an early draft of thisarticle and provided extensive and helpful comments.
Copyright © 2024 by
Richmond Thomason<rthomaso@umich.edu>
The Stanford Encyclopedia of Philosophy iscopyright © 2024 byThe Metaphysics Research Lab, Department of Philosophy, Stanford University
Library of Congress Catalog Data: ISSN 1095-5054