Movatterモバイル変換

Analogical Reasoning

Abstract. Logical and analogical reasoning are sometimes viewed as mutuallyexclusive alternatives, but formal logic is actually a highlyconstrained and stylized method of using analogies.Before any subject can be formalized to the stage where logic can beapplied to it, analogies must be used to derive an abstractrepresentation from a mass of irrelevant detail.After the formalization is complete, every logical step— of deduction, induction, or abduction —involves the application of some version of analogy.This paper analyzes the relationships between logical and analogicalreasoning and describes a highly efficient analogy enginethat uses conceptual graphs as the knowledge representation.The same operations used to process analogies can be combinedwith Peirce's rules of inference to support an inference engine.Those operations, called thecanonical formation rulesfor conceptual graphs, are widely used in CG systemsfor language understanding and scene recognition as wellas analogy finding and theorem proving. The same algorithmsused to optimize analogy finding can be used to speed upall the methods of reasoning based on the canonical formation rules.

This paper was published in the proceedings of the InternationalConference on Conceptual Structures in Dresden, Germany,in July 2003: A. Aldo, W. Lex, & B. Ganter, eds. (2003)Conceptual Structures for Knowledge Creation and Communication,LNAI 2746, Springer-Verlag, pp. 16-36.

1. Analogy and Perception

Before discussing the use of analogy in reasoning, it is importantto analyze the concept of analogy and its relationship to othercognitive processes. General-purpose dictionaries are usuallya good starting point for conceptual analysis, but they seldom gointo sufficient depth to resolve subtle distinctions. A typicaldictionary lists synonyms for the wordanalogy, such assimilarity,resemblance, andcorrespondence.Then it adds more specialized word senses, such asa similarityin some respects of things that are otherwise dissimilar,a comparison that determines the degree of similarity,oran inference based on resemblance or correspondence.In AI, analogy-finding programs have been writtensince the 1960s, but they often use definitions of analogythat are specialized to a particular application.

The VivoMind Analogy Engine (VAE), which is described in Section 3,is general enough to be used in any application domain. Therefore,VAE leads to fundamental questions about the nature of analogythat have been debated in the literature of cognitive science.One three-party debate has addressed many of those issues:

Thesis: For the Structure Mapping Engine (SME), Falkenheimer, Forbus, andGentner (1989) defined analogy as the recognition that "one thingis like another" if there is a mapping from a conceptual structurethat describes the first one to a conceptual structure that describesthe second. Their implementation in SME has been applied to a widevariety of practical applications and to psychological studies thatcompare the SME approach to the way people address the same problems.
Antithesis: In their critique of SME, Chalmers, French, and Hofstadter (1992)consider analogy to be an aspect of a more general cognitive functioncalledhigh-level perception (HLP), by which an organism constructsa conceptual representation of a situation. They "argue that perceptualprocesses cannot be separated from other cognitive processes even inprinciple, and therefore that traditional artificial-intelligence modelscannot be defended by supposing the existence of a 'representationmodule' that supplies representations ready-made." They criticizethe "hand-coded rigid representations" of SME and insist that"content-dependent, easily adaptable representations" must be"an essential part of any accurate model of cognition."
Synthesis: In summarizing the debate, Morrison and Dietrich (1995) observedthat the two positions represent different perspectives on related,but different aspects of cognition: SME employs structure mappingas "a general mechanism for all kinds of possible comparison domains"while "HLP views analogy as a process from the bottom up; as arepresentation-building process based on low-level perceptual processesinteracting with high-level concepts." In their response to thecritics, Forbus et al. (1998) admitted that a greater integrationwith perceptual mechanisms is desirable, but they repeated their claimthat psychological evidence is "overwhelmingly" in favor of structuremapping "as a model of human analogical processing."

The VAE approach supports Point #3: a comprehensive theoryof cognition must integrate the structure-building processesof perception with the structure-mapping processes of analogy.For finding analogies, VAE uses a high-speed implementation ofstructure mapping, but its algorithms are based on low-level operationsthat are also used to build the structures. In the first implementation,the conceptual structures were built during the process of parsing andinterpreting natural language. More recently, the same low-leveloperations have been used to build conceptual structures from sensorydata and from percept-like patterns used in scene recognition.VAE demonstrates that perception, language understanding, and structuremapping can be based on the same kinds of operations.

This paper discusses the interrelationships between logical andanalogical reasoning, analyzes the underlying cognitive processesin terms of Peirce's semiotics and his classification of reasoning,and shows how those processes are supported by VAE.The same graph operations that support analogical reasoningcan also be used to support formal reasoning.Instead of being mutually exclusive, logical reasoningis just a more cultivated variety of analogical reasoning.For many purposes, especially in language understanding, the analogicalprocesses provide greater flexibility than the more constrainedand less adaptable variety used in logic. But since logical andanalogical reasoning share a common basis, they can be effectivelyused in combination.

2. Logical and Analogical Reasoning

In developing formal logic, Aristotle took Greek mathematics as his model.Like his predecessors Socrates and Plato, Aristotle was impressed withthe rigor and precision of geometrical proofs.His goal was to formalize and generalize those proof procedures andapply them to philosophy, science, and all other branches of knowledge.Yet not all subjects are equally amenable to formalization.Greek mathematics achieved its greatestsuccesses in astronomy, where Ptolemy's calculations remained thestandard of precision for centuries. But other subjects, such asmedicine and law, depend more on deep experience thanon brilliant mathematical calculations. Significantly, two of the mostpenetrating criticisms of logic were writtenby the physician Sextus Empiricus in the second century AD andby the legal scholar Ibn Taymiyya in the fourteenth century.

Sextus Empiricus, as his nickname suggests, was an empiricist.By profession, he was a physician; philosophically, he was anadherent of the school known as the Skeptics.Sextus maintained that all knowledge must come from experience.As an example, he cited the following syllogism:

Every human is an animal.
Socrates is human.
Therefore, Socrates is an animal.

Sextus admitted that this syllogism represents a valid inference pattern,but he questioned the source of evidence for the major premiseEvery human is an animal. A universal proposition that purportsto cover every instance of some category must be derived by inductionfrom particulars. If the induction is incomplete, then the universalproposition is not certain, and there might be some human who is notan animal. But if the induction is complete, then the particularinstance Socrates must have been examimed already, and the syllogism isredundant or circular. Since every one of Aristotle's valid formsof syllogisms contains at least one universal affirmative or universalnegative premise, the same criticisms apply to all of them: the conclusion must be either uncertain or circular.

The Aristotelians answered Sextus by claiming that universal propositionsmay be true by definition: since the type Human is definedas rational animal, the essence of human includes animal;therefore, no instance of human that was not an animal could exist.This line of defense was attacked by the Islamic jurist and legal scholarTaqi al-Din Ibn Taymiyya. Like Sextus, Ibn Taymiyya agreed that theform of a syllogism is valid, but he did not accept Aristotle'sdistinction between essence and accident (Hallaq 1993). Accordingto Aristotle, the essence of human includes both rational and animal.Other attributes, such as laughing or being a featherless biped, might beunique to humans, but they areaccidental attributes thatcould be different without changing the essence. Ibn Taymiyya, however,maintained that the distinction between essence and accident wasarbitrary. Human might just as well be defined as laughing animal,with rational as an accidental attribute.

Denouncing logic would be pointless if no other method of reasoningwere possible. But Ibn Taymiyya had an alternative:the legal practice of reasoning by cases and analogy.In Islamic law, a new case isassimilated to one ormore previous cases that serve as precedents. The mechanism ofassimilation is analogy, but the analogy must be guided by a causethat is common to the new case as well as the earlier cases.If the same cause is present in all the cases, then the earlierjudgment can be transferred to the new case. As an example,it is written in the Koran that grape wine is prohibited,but nothing is said about date wine. The judgment for date winewould be derived in four steps:

Given case: Grape wine is prohibited.
New case: Is date wine prohibited?
Cause: Grape wine is prohibited because it is intoxicating;date wine is also intoxicating.
Judgment: Date wine is also prohibited.

In practice, the reasoning may be more complex. Several previous casesmay have a common cause but different judgments. Then the analysismust determine whether there are mitigating circumstances that affectthe operation of the cause. But the principles remain the same:analogy guided by rules of evidence and relevance determines the commoncause, the effect of the mitigating circumstances, and the judgment.

Besides arguing in favor of analogy, Ibn Taymiyya also replied tothe logicians who claimed that syllogistic reasoning is certain,but analogy is merely probable. He admitted that logicaldeduction is certain when applied to purely mental constructionsin mathematics. But in any reasoning about the real world, universalpropositions can only be derived by induction, and induction must beguided by the same principles of evidence and relevance used in analogy.Figure 1 illustrates Ibn Taymiyya's argument: Deduction proceeds from atheory containing universalpropositions. But those propositions must have earlier been derived byinduction using the methods of analogy. The only difference isthat induction produces a theory as intermediate result, which is thenused in a subsequent process of deduction. By using analogy directly,legal reasoning dispenses with the intermediate theory and goes straightfrom cases to conclusion. If the theory and the analogy are basedon the same evidence, they must lead to the same conclusions.

Figure 1: Comparison of logical and analogical reasoning

The question in Figure 1 represents some known aspects of a new case,which has unknown aspects to be determined. In deduction,the known aspects are compared (by a version of structure mappingcalledunification) with the premises of some implication.Then the unknown aspects, which answer the question,are derived from the conclusion of the implication.In analogy, the known aspects of the new case are compared with thecorresponding aspects of the older cases. The case that gives thebest match may be assumed as the best source of evidence for estimatingthe unknown aspects of the new case. The other cases show alternativepossibilities for those unknown aspects; the closer the agreementamong the alternatives, the stronger the evidence for the conclusion.

Both Sextus Empiricus and Ibn Taymiyya admitted that logical reasoningis valid, but they doubted the source of evidence for universalpropositions about the real world. What they overlooked was thepragmatic value of a good theory: a small group of scientists canderive a theory by induction, and anyone else can apply it withoutredoing the exhaustive analysis of cases. The two-step processof induction followed by deduction has proved to be most successfulin the physical sciences, which include physics, chemistry,molecular biology, and the engineering practices they support.The one-step process of case-based reasoning, however, is moresuccessful in fields outside the so-called "hard" sciences,such as business, law, medicine, and psychology.Even in the "soft" sciences, which are rife with exceptions, a theorythat is successful most of the time can still be useful. Many casesin law or medicine can be settled by the direct application of somegeneral principle, and only the exceptions require an appealto a long history of cases. And even in physics, the hardestof the hard sciences, the theories may be well established,but the question of which theory to apply to a given problemusually requires an application of analogy. In both scienceand daily life, there is no sharp dichotomy between subjects amenableto strict logic and those that require analogical reasoning.

The informal arguments illustrated in Figure 1 are supportedby an analysis of the algorithms used for logical reasoning.Following is Peirce's classification of the three kinds of logicalreasoning and the way that the structure-mapping operationsof analogy are used in each of them:

Deduction. A typical rule used in deduction ismodus ponens: given an assertionp and an axiom of the formp impliesq, deduce the conclusionq.In most applications, the assertionp is not identicalto thep in the axiom, and structure mapping is necessarytounify the twops before the rule can be applied.The most time-consuming task is not the application of a single rule,but the repeated use of analogies for finding patternsthat may lead to successful rule applications.
Induction. When every instance ofp is followed by an instance ofq,induction is peformed by assuming thatp impliesq.Since theps andqs are rarely identical in everyoccurrence, a form of analogy calledgeneralization is usedto derive the most general implication that subsumes all the instances.
Abduction. The operation of guessing or forming an initial hypothesis is whatPeirce called abduction. Given an assertionq and an axiomof the formp impliesq, the guess thatpis a likely cause or explanation forq is an act of abduction.The operation of guessingp uses the least constrainedversion of analogy, in which some parts of the matching graphsmay be more generalized while other parts are more specialized.

As this discussion indicates, analogy is a prerequisite for logicalreasoning, which is a highly disciplined method of using repeatedanalogies. In both human reasoning and computer implementations,the same underlying operations can be used to support both.

3. Analogy Engine

The VivoMind Analogy Engine (VAE), which was developed by Majumdar,is a high-performance analogy finder that uses conceptual graphsfor the knowledge representation. Like SME, structure mappingis used to find analogies. Unlike SME, the VAE algorithmscan find analogies in time proportional to (N log N),where N is the number of nodes in the current knowledge base or context.SME, however, requires time proportional to N³(Forbus et al. 1995). A later version called MAC/FAC reducedthe time by using a search engine to extract the most likely databefore using SME to find analogies (Forbus et al. 2002).With its greater speed, VAE can find analogiesin the entire WordNet knowledge base in just a few seconds,even though WordNet contains over 10⁵ nodes.For that size, one second with an (N log N) algorithmwould correspond to 30 years with an N³ algorithm.

VAE can process CGs from any source: natural languages, programming languages, and any kind of informationthat can be represented in graphs, such as organic moleculesor electric-power grids. In an application to distributedinteracting agents, VAE processes both English messages andsignals from sensors that monitor the environment.To determine an agent's actions, VAE searches for analogies to whathumans did in response to similar patterns of messages and signals.To find analogies, VAE uses three methods of comparison,which can be used separately or in combination:

Matching type labels. Method #1 compares nodes that have identical labels,labels that are related as subtype and supertype such asCat andAnimal, or labels that have a commonsupertype such asCat andDog.
Matching subgraphs. Method #2 compares subgraphs with possibly different labels. This matchsucceeds when two graphs are isomorphic (indepedent of the labels)or when they can be made isomorphic by combining adjacent nodes.
Matching transformations. If the first two methods fail, Method #3 searchesfor transformations that can relate subgraphs of one graphto subgraphs of the other.

These three methods of matching graphs were inspired by Peirce'scategories of Firstness, Secondness, and Thirdness (Sowa 2000).The first compares two nodes by what they contain in themselvesindependent of any other nodes; the second compares nodes by theirrelationships to other nodes; and the third compares the mediatingtransformations that may be necessary to make the graphs comparable.To illustrate the first two methods, the following table showsan analogy found by VAE when comparing the background knowledgein WordNet for the concept typesCat andCar:

Analogy of Cat to Car
Cat	Car
head	hood
eye	headlight
cornea	glass plate
mouth	fuel cap
stomach	fuel tank
bowel	combustion chamber
anus	exhaust pipe
skeleton	chassis
heart	engine
paw	wheel
fur	paint

Figure 2: An analogy discovered by VAE

As Figure 2 illustrates, there is an enormous amount of backgroundknowledge stored in lexical resources such as WordNet.It is not organized in a form that is precise enough for deduction,but it is adequate for the more primitive method of analogy.

Since there are many possible paths through all the definitions andexamples of WordNet, most comparisons generate multiple analogies.To evaluate the evidence for any particular mapping,aweight of evidence is computed by using heuristicsthat estimate the closeness of the match.For Method #1 of matching type labels, the closest match resultsfrom identical labels. If the labels are not identical,the weight of evidence decreases with the distance between the labelsin the type hierarchy:

Identical type labels, such asCat toCat.
Subtype to supertype, such asCat toAnimal.
Siblings of the same supertype, such asCat toDog.
More distant cousins, such asCat toTree.

For Method #2 of matching subgraphs, the closest match resultsfrom finding that both graphs, in their entirety, are isomorphic.The weight of evidence decreases as their common subgraphs become smalleror if the graphs have to be modified in order to force a match:

Match isomorphic graphs.
Match two graphs that have isomorphic subgraphs (the largerthe subgraphs, the stronger the evidence for the match).
Combine adjacent nodes to make the subgraphs isomorphic.

The analogy shown in Figure 2 received a high weight of evidencebecause VAE found many matching labels and large matching subgraphsin corresponding parts of a cat and parts of a car:

Some of the corresponding parts have similar functions: fur and paint are outer coverings;heart and engine are internal parts that have a regular beat;skeleton and chassis are structures to which other parts are attached;paw and wheel perform a similar function, and there are four of each.
The longest matching subgraph is the path from mouth to stomachto bowel to anus of a cat, which matches the path from fuel capto fuel tank to combustion chamber to exhaust pipe of a car.The stomach of a cat and the fuel tank of a car areanalogous because they are both subtypes ofContainer.The bowel and the combustion chamber perform analogous functions.The mouth and the fuel cap are consideredinput orifices, and the anus and the exhaust pipe are outputs.The weight of evidence is somewhat reduced because adjustmentsmust be made to ignore nodes that do not match: the esophagusof a cat does not match anything in WordNet's description of a car,and the muffler of a car does not match anything in its descriptionof a cat.
A shorter subgraph is the path from head to eyes to corneaof a cat, which matches the path from hood to headlightsto glass plate of a car. The head and the hood are bothin the front. The eyes are analogous to the headlights becausethere are two of each and they are related to light, even thoughthe relationships are different. The cornea and the glass plate arein the front, and they are both transparent.

Each matching label and each structural correspondence contributesto the weight of evidence for the analogy, dependingon the closeness of the match and the exactnessof the correspondence.

As the cat-car comparison illustrates, analogy is a versatile methodfor using informal, unstructured background knowledge.But analogies are also valuable for comparing the highly formalizedknowledge of one axiomatized theory to another.In the process of theory revision, Niels Bohr used an analogybetween gravitational force and electrical force to derivea theory of the hydrogen atom as analogous to the earth revolvingaround the sun. Method #3 of analogy,which finds matching transformations, can also be usedto determine the precise mappings required for transformingone theory or representation into another. As an example,Figure 3 shows a physical structure that could be represented by manydifferent data structures.

Figure 3: A physical structure to be represented by data

Programmers who use different tools, databases, or programming languagesoften use different, but analogous representations for the samekinds of information. LISP programmers, for example, prefer to uselists, while FORTRAN programmers prefer vectors. Conceptual graphsare a highly general representation, which can represent any kindof data stored in a digital computer, but the types of conceptsand relations usually reflect the choices made by the originalprogrammer, which in turn reflect the options availablein the original programming tools. Figure 4 showsa representation for Figure 3 that illustrates the typical choicesused with relational databases.

Figure 4: Two structures represented in a relational database

On the left of Figure 4 are two structures: a copy of Figure 3and an arch constructed from three blocks. On the right are twotables: the one labeledObjects lists the identifiersof all the objects in both tables with their shapes and colors;the one labeledSupports lists each object that supports(labeledSupporter) and the object supported(labeledSupportee).As Figure 4 illustrates, a relational database typically scattersthe information about a single object or structure of objectsinto multiple tables. For the structure of pyramids and blocks,each object is listed once in theObjects table,and one or more times in either or both columnsof theSupports table.Furthermore, information about the two disconnected structures shownon the left is intermixed in both tables. When all the informationabout the structure at the top left is extracted from both tablesof Figure 4, it can be mapped to the conceptual graph of Figure 5.

Figure 5: A CG derived from the relational DB

In Figure 5, each row of the table labeledObjectsis represented by a conceptual relation labeledObjects,and each row of the table labeledSupports is representedby a conceptual relation labeledSupports.The type labels of the concepts are mostly derived from the labelson the columns of the two tables in Figure 4. The only exceptionis the labelEntity, which is used instead ofID.The reason for that exception is thatID is a metalevelterm about the representation language; it is not a term thatis derived from the entities in the domain of discourse.The concept[Entity: E], for example, saysthatE is an instance of typeEntity.The concept[ID: "E"], however, would say thatthe character string"E" is an instance of typeID.The use of the labelEntity instead ofID avoidsmixing the metalevel with the object level. Such mixing of levelsis common in most programs, since the computer ignores any meaningthat might be associated with the labels. In logic, however,the fine distinctions are important, and CGs mark them consistently.

When natural languages are translated to CGs, the distinctionsmust be enforced by the semantic interpreter. Figure 6 showsa CG that represents the English sentence,A red pyramid A,a green pyramid B, and a yellow pyramid C support a blue block D,which supports an orange pyramid E. The conceptual relationslabeledThme andInst represent the case relationstheme and instrument. The relations labeledAttrrepresent the attribute relation between a concept of some entityand a concept of some attribute of that entity. The type labelsof concepts are usually derived from nouns, verbs, adjectives,and adverbs in English.

Figure 6: A CG derived from an English sentence

Although the two conceptual graphs represent equivalent information,they look very different. In Figure 5, the CG derived from therelational database has 15 concept nodes and 9 relation nodes.In Figure 6, the CG derived from English has 12 concept nodesand 11 relation nodes. Furthermore, no type label on any nodein Figure 5 is identical to any type label on any nodein Figure 6. Even though some character strings are similar,their positions in the graphs cause them to be treated as distinct.In Figure 5,orange is the name of an instance oftypeColor; and in Figure 6,Orange is the labelof a concept type. In Figure 5,Supports is the labelof a relation type; and in Figure 6,Support is notonly the label of a concept type, it also lacks the final S.

Because of these differences, the strict method of unificationcannot show that the graphs are identical or even related.Even the more relaxed methods of matching labels or matching subgraphsare unable to show that the two graphs are analogous.Method #3 of analogy, however, can find matching transformationsthat can translate Figure 5 into Figure 6 or vice-versa.When VAE was asked to compare those two graphs, it foundthe two transformations shown in Figure 7. Each transformationdetermines a mapping between a type of subgraph in Figure 5and another type of subgraph in Figure 6.

Figure 7: Two transformations discovered by VAE

The two transformations shown in Figure 7 define a versionofgraph grammar for parsing one kind of graph and mappingit to the other. The transformation at the top of Figure 7can be applied to the five subgraphs containing the relationsof typeObjects in Figure 5 and relate them to thefive subgraphs containing the relations of typeAttrin Figure 6. That same transformation could be appliedin reverse to relate the five subgraphs of Figure 6 to thefive subgraphs of Figure 5. The transformation at the bottomof Figure 7 could be applied from right to left in order to mapFigure 6 to Figure 5. When applied in that direction,it would map three different subgraphs, which happen to containthree common nodes: the subgraph extendingfrom[Pyramid: A] to[Block: D];the one from[Pyramid: B] to[Block: D];and the one from[Pyramid: C] to[Block: D].When applied in the reverse direction, it would map three subgraphsof Figure 5 that contained only one common node.

The transformations shown in Figure 7 have a high weightof evidence because they are used repeatedly in exactly the same way.A single transformation of one subgraph to another subgraph withno matching labels would not contribute anything to the weightof evidence. But if the same transformation is applied twice,then its likelihood is greatly increased. Transformationsthat can be applied three times or five times to relateall the nodes of one graph to all the nodes of another graphhave a likelihood that comes close to being a certainty.

Of the three methods of analogy used in VAE, the first two— matching labels and matching subgraphs —are also used in SME. Method #3 of matching transformations,which only VAE is capable of performing, is more complex becauseit depends on analogies of analogies.Unlike the first two methods, which VAE can perform in (N log N) time,Method #3 takes polynomial time,and it can only be applied to much smaller amounts of data.In practice, Method #3 is usually applied to small partsof an analogy in which most of the mapping is done by the first twomethods and only a small residue of unmatched nodes remains to be mapped.In such cases, the number N is small, and the mapping can be donequickly. Even for mapping Figure 5 (with N=9) to Figure 6 (with N=11),the Method #3 took a few seconds, whereas the time for Methods#1 and #2 on graphs of such size would be less than a millisecond.

Each of the three methods of analogy determines a mapping of one CGto another. The first two methods determine a node-by-node mappingof CGs, where some or all of the nodes of the first CG may have differenttype labels from the corresponding nodes of the other. Method #3determines a more complex mapping, which comprises multiple mappingsof subgraphs of one CG to subgraphs of the other. These methods can beapplied to CGs derived from any source, including natural languages,logic, or programming languages.

In one major application, VAE was used to analyze the programsand documentation of a large corporation, which had systemsin daily use that were up to forty years old (LeClerc & Majumdar 2002).Although the documentation specified how the programs were supposedto work, nobody knew what errors, discrepancies, andobsolete business procedures might be buried in the code.The task required an analysis of 100 megabytes of English,1.5 million lines of COBOL programs, and several hundredcontrol-language scripts, which called the programs and specifiedthe data files and formats. Over time, the English terminology,computer formats, and file names had changed. Some of the formatchanges were caused by new computer systems and business practices,and others were required by different versions of federal regulations.In three weeks of computation on a 750 MHz Pentium III,VAE combined with the Intellitex parser was able to analyzethe documentation and programs, translate all statementsthat referred to files, data, or processes in any of the three languages(English, COBOL, and JCL) to conceptual graphs, and use the CGsto generate an English glossary of all processes and data,to define the specifications for a data dictionary,to create dataflow diagrams of all processes, and to detectinconsistencies between the documentation and the implementation.

4. Inference Engine

Most theorem provers use a tightly constrained version of structuremapping calledunification, which forces two structures to becomeidentical. Relaxing constaints in one direction converts unificationto generalization, and relaxing them in another direction leadsto specialization. With arbitrary combinations of generalizationand specialization, there is a looser kind of similarity, which,if there is no limit on the extent, could map any graph to any other.When Peirce's rules of inference are redefined in termsof generalization and specialization, they support an inference procedurethat can use exactly the same algorithms and data structures designedfor the VivoMind Analogy Engine. The primary difference betweenthe analogy engine and the inference engine is in the strategy thatschedules the algorithms and determines which constraints to enforce.

When Peirce invented the implication operator for Boolean algebra,he observed that the truth value of the antecedent is always less thanor equal to the truth value of the consequent.Therefore, the symbol ≤ may be used to represent implication: p≤q means that the truth value ofp is lessthan or equal to the truth value ofq. That same symbolmay be used for generalization: if a graph or formulapis true in fewer cases than another graph or formulaq,thenp is more specialized andq is more generalized.Figure 8 shows a generalization hierarchyin which the most general CG is at the top.Each dark line in Figure 8 represents the ≤ operator: the CG above is a generalization, and the CG below is a specialization.

Figure 8: A generalization hierarchy of CGs

The top CG says that an animate being is the agentof some act that has an entity as the theme of the act.Below it are two specializations: a CG for a robot washinga truck, and a CG for an animal chasing an entity. The CG for an animalchasing an entity has three specializations: a human chasing a human,a cat chasing a mouse, and the dog Macula chasing a Chevrolet.The two graphs at the bottom represent the most specializedsentences: The cat Yojo is vigorously chasing a brown mouse,andthe cat Tigerlily is chasing a gray mouse.

The operations on conceptual graphs are based on combinationsof sixcanonical formation rules, which performthe structure-building operations of perception andthe structure-mapping operations of analogy.Logically, each rule has one of threepossible effects on a CG: the rule can make it more specialized,more generalized, or logically equivalent but with a modified shape.Each rule has an inverse rule that restores a CG to its original form.The inverse of specialization is generalization, the inverse ofgeneralization is specialization, and the inverse of equivalenceis another equivalence.

All the graphs in Figure 8 belong to theexistential-conjunctivesubset of logic, whose only operators are the existential quantifier∃ and the conjunction ∧. For this subset, the canonicalformation rules take the forms illustrated in Figures 5, 6, and 7.These rules are fundamentally graphical: they are easier to show than to describe.Sowa (2000) presented the formal definitions, which specifythe details of how the nodes and arcs are affected by each rule.

Figure 9: Copy and simplify rules

Figure 9 shows the first two rules: copy andsimplify.At the top is a CG for the sentence "The cat Yojo is chasing a mouse."The down arrow represents two applications of the copy rule.The first copies the Agnt relation, and the second copies the subgraph→(Thme)→[Mouse].The two copies of the concept[Mouse] at the bottom of Figure 9are connected by a dotted line called acoreference link;that link, which corresponds to an equal sign = in predicate calculus,indicates that both concepts must refer to the same individual.Since the new copies do not add any information, they may be erasedwithout losing information. The up arrow represents the simplify rule,which performs the inverse operation of erasing redundant copies.The copy and simplify rules are calledequivalence rulesbecause any two CGs that can be transformed from one to the otherby any combination of copy and simplify rules are logically equivalent.The two formulas in predicate calculus that are derived fromFigure 9 are also logically equivalent. The top CG maps to thefollowing formula:

(∃x:Cat)(∃y:Chase)(∃z:Mouse)(name(x,'Yojo')∧ agnt(y,x) ∧ thme(y,z)),

In the formula that corresponds to the bottom CG, the equalityz=w represents the coreference link that connectsthe two copies of [Mouse]:

(∃x:Cat)(∃y:Chase)(∃z:Mouse)(∃w:Mouse)(name(x,'Yojo')∧ agnt(y,x) ∧ agnt(y,x) ∧ thme(y,z) ∧ thme(y,w)∧ z=w).

By the inference rules of predicate calculus, either of these twoformulas can be derived from the other.

Example of restrict and unrestrict rules.

Figure 10: Restrict and unrestrict rules

Figure 10 illustrates therestrict andunrestrict rules.At the top is a CG for the sentence "A cat is chasing an animal."Two applications of the restrict rule transform itto the CG for "The cat Yojo is chasing a mouse."The first step is arestriction by referent of the concept[Cat], which represents some indefinite cat, to the more specificconcept[Cat: Yojo], which represents a particular cat namedYojo. The second step is arestriction by type of the concept[Animal] to a concept of the subtype[Mouse].Two applications of the unrestrict rule perform the inversetransformation of the bottom graph to the top graph. The restrict ruleis aspecialization rule, and the unrestrict rule isageneralization rule. The more specialized graph impliesthe more general one: if the cat Yojo is chasing a mouse, it followsthat a cat is chasing an animal. The same implication holds for thecorresponding formulas in predicate calculus. The more general formula

(∃x:Cat)(∃y:Chase)(∃z:Animal)(agnt(y,x)∧ thme(y,z))

is implied by the more specialized formula

(∃x:Cat)(∃y:Chase)(∃z:Mouse)(name(x,'Yojo')∧ agnt(y,x) ∧ thme(y,z)).

Figure 11: Join and detach rules

Figure 11 illustrates thejoin anddetach rules.At the top are two CGs for the sentences "Yojo is chasing a mouse"and "A mouse is brown." The join rule overlays the two identical copiesof the concept[Mouse], to form a single CG for the sentence"Yojo is chasing a brown mouse." The detach rule performs the inverseoperation. The result of join is a more specialized graph thatimplies the one derived by detach. The same implication holds for thecorresponding formulas in predicate calculus. The conjunction ofthe formulas for the top two CGs

(∃x:Cat)(∃y:Chase)(∃z:Mouse)(name(x,'Yojo')∧ agnt(y,x) ∧ thme(y,z))
(∃w:Mouse)(∃v:Brown)attr(w,v)

is implied by the formula for the bottom CG

(∃x:Cat)(∃y:Chase)(∃z:Mouse)(∃v:Brown)(name(x,'Yojo')∧ agnt(y,x) ∧ thme(y,z) ∧ attr(z,v)).

These rules can be applied to full first-order logicby specifying how they interact with negation. In CGs,each negation is represented by a context that has an attached relationof typeNeg or its abbreviation by the symbol ¬ or ~.Apositive context is nested in an even number of negations(possibly zero). Anegative context is nestedin an odd number of negations. The following four principlesdetermine how negations affect the rules:

Equivalence rules. An equivalence rule remains an equivalence rule in any context,positive or negative.
Specialization rules. In a negative context, a specialization rule becomes a generalizationrule; but in a positive context, it remains a specialization rule.
Generalization rules. In a negative context, a generalization rule becomes a specializationrule; but in a positive context, it remains a generalization rule.
Double negation. Adouble negation is a nest of two negations in which no conceptor relation node occurs between the inner and the outer negation.(It is permissibe for an arc of a relation or a coreference link to crossthe space between the two negations, but only if one endpoint is insidethe inner negation and the other endpoint is outside the outer negation.)Then drawing or erasing a double negation around any CG or any subgraphof a CG is an equivalence operation.

In short, a single negation reverses the effect of generalization andspecialization rules, but it has no effect on equivalence rules.Since drawing or erasing a double negation adds or subtractstwo negations, it has no effect on any rule.

By handling the syntactic details of conceptual graphs, thecanonical formation rules enable the rules of inferenceto be stated in a form that is independent of the graph notation.For each of the six rules, there is an equivalent rule for predicatecalculus or any other notation for classical FOL.To derive equivalent rules for other notations,start by showing the effect of each rule on the existential-conjunctivesubset (no operators other than ∃ and ∧).To handle negation, add one to the negation count for eachsubgraph or subformula that is governed by a ~ symbol.For other operators (∀, ⊃, and ∨),count the number of negations in their definitions.For example,p⊃q is defined as~(p∧~q); therefore,the subformulap is nested inside one additional negation, andthe subformulaq is nested inside two additional negations.

When the CG rules are applied to other notations, some extensionsmay be necessary. For example, theblankor empty graph is a well-formed EG or CG, which is always true.In predicate calculus, the blank may be represented by a constantformula T, which is defined to be true. The operation of erasinga graph would correspond to replacing a formula by T.When formulas are erased or inserted, an accompanying conjunctionsymbol must also be erased or inserted in some notations.Other notations, such as the Knowledge Interchange Format (KIF),are closer to CGs because they only require one conjunction symbolfor an arbitrarily long list of conjuncts.In KIF, the formula(and), which is an empty list of conjuncts,may be used as a synonym for the blank graph or T.Discourse representation structures (DRSs) are even closer to EGsand CGs because they do not use any symbol for conjunction; therefore,the blank may be considered a DRS that is always true.

Peirce's rules, which he stated in terms of existential graphs, forma sound and complete system of inference for first-order logicwith equality. If the wordgraph is considered a synonymforformula orstatement, the following adaptationof Peirce's rules can be applied to any notation for FOL, includingEGs, CGs, DRS, KIF, or the many variations of predicate calculus.These rules can also be applied to subsets of FOL,such as description logics and Horn-clause rules.

Erasure. In a positive context, any graph or subgraphu may be replacedby a generalization ofu; in particular,u may be erased(i.e. replaced by the blank, which is a generalization of every graph).
Insertion. In a negative context, any graph or subgraphu may be replacedby a specialization ofu; in particular, any graph may beinserted (i.e. it may replace the blank).
Iteration. If a graph or subgraphu occurs in a context C,another copy ofu may be inserted in the same context C orin any context nested in C.
Deiteration. Any graph or subgraphu that could have beenderived by iteration may be erased.
Equivalence. Any equivalence rule (copy, simplify, or double negation)may be performed on any graph or subgraph in any context.

These rules, which Peirce formulated in several equivalent variantsfrom 1897 to 1909, form an elegant and powerful generalizationof the rules ofnatural deduction by Gentzen (1935).Like Gentzen's version, the only axiom is the blank.What makes Peirce's rules more powerful is the option of applyingthem in any context nested arbitrarily deep.That option shortens many proofs, and it eliminatesGentzen's bookkeeping for making and discharging assumptions.For further discussion and comparison, see MS 514 (Peirce 1909)and the commentary that shows how other rules of inferencecan be derived from Peirce's rules.

Unlike most proof procedures, which are tightly bound to a particularsyntax, this version of Peirce's rules is stated in notation-independentterms of generalization and specialization.In this form, they can even be applied to natural languages.For any language, the first step is to show how each syntax ruleaffects generalization, specialization, and equivalence.In counting the negation depth for natural languages, it is importantto recognize the large number of negation words, such asnot, never,none, nothing, nobody, ornowhere. But many other wordsalso contain implicit negations, which affectany context governed by those words.Verbs likeprevent ordeny, for example,introduce a negation into any clause or phrase in their complement.Many adjectives also have implicit negations: a stuffed bear,for example, lacks essential properties of a bear, such as being alive.After the effects of these features on generalization and specializationhave been taken into account in the syntactic definition of the language,Peirce's rules can be applied to a natural language as easily asto a formal language.

5. A Semeiotic Foundation for Cognition

Peirce developed his theory of signs orsemeioticas a species-independent theory of cognition.He considered it true of any "scientific intelligence," by whichhe meant any intelligence that is "capable of learning from experience."Peirce was familiar with Babbage's mechanical computer; he was thefirst person to suggest that such machines should be based on electicalcircuits rather than mechanical linkages; and in 1887, hepublished an article on "logical machines" in theAmerican Journalof Psychology, which even today would be a respectable commentaryon the possibilities and difficulties of artificial intelligence.His definition ofsign is independent of any implementationin proteins or silicon:

I define a sign as something, A, which brings something, B,its interpretant, into the same sort of correspondence with something, C,its object, as that in which itself stands to C.In this definition I make no more reference to anythinglike the human mind than I do when I define a line as the placewithin which a particle lies during a lapse of time. (1902, p. 235)

As a close friend of William James, Peirce was familiar with theexperimental psychology of his day, which he considered a valuable studyof what possibilities are realized in any particular species.But he considered semeiotic to be a more fundamental,implementation-independent characterization of cognition.

Peirce defined logic as "the study of the formal laws of signs"(1902, p. 235), which implies that it is based on the same kindsof semeiotic operations as all other cognitive processes.He reserved the wordanalogy for what is called"analogical reasoning" in this paper. For the kinds of structuremapping performed by SME and VAE, Peirce used the termdiagrammatic reasoning, which he described as follows:

The first things I found out were that all mathematical reasoning isdiagrammatic and that all necessary reasoning is mathematical reasoning,no matter how simple it may be. By diagrammatic reasoning, I meanreasoning which constructs a diagram according to a precept expressed ingeneral terms, performs experiments upon this diagram, notes theirresults, assures itself that similar experiments performed upon anydiagram constructed according to the same precept would have thesame results, and expresses this in general terms. This was a discoveryof no little importance, showing, as it does, that all knowledge withoutexception comes from observation. (1902, pp. 91-92)

This short paragraph summarizes many themes that Peirce developedin more detail in his other works. In fact, it summarizes the majorthemes of this article:

By a diagram, Peirce meant any abstract pattern of signs.He included the patterns of algebra and Aristotle's syllogismsas diagrammatic, but he also said that his existential graphs were"more diagrammatic".
His "experiments" upon a diagram correspond to various AIprocedures, such as generate and test, backtracking, breadth-firstparallel search, and path following algorithms, all of whichare performed on data structures that correspond to Peirce'snotion of "diagram".
The first sentence of the paragraph applies the term "mathematical"to any kind of deduction, including Aristotle's syllogismsand any informal logic that may be used in a casual conversation.
The last sentence, which relates observation to diagrammaticreasoning, echoes the themes of the first section of this paper,which emphasized the need for an integration of perceptionwith the mechanisms of analogy. Peirce stated that point even moreforcefully: "Nothing unknown can ever become known except throughits analogy with other things known" (1902, p. 287).

In short, the operations of diagrammatic reasoning or structure mappingform the bridge from perception to all forms of reasoning, rangingfrom the most casual to the most advanced. Forbus et al. (2002)applied the termreasoning from first principles to logic,not to analogical reasoning. But Peirce, who invented two of the mostwidely used notations for logic, recognized that the underlying semeioticmechanisms were more fundamental. The VivoMind implementationconfirms Peirce's intuitions.

For natural language understanding, the constrainedoperations of unification and generalization are important,but exceptions, metaphors, ellipses, novel word senses,and the inevitable errors require less constrained analogies.When the VAE algorithms are used in the semantic interpreter, there areno exceptions: analogies are used at every step, and the onlydifference between unification, generalization, and looser similaritiesis the nature of the constraints on the analogy.

References

Chalmers, D. J., R. M. French, & D. R. Hofstadter (1992) "High-level perception, representation, and analogy: A critique of artificialintelligence methodology,"Journal of Experimental & TheoreticalArtificial Intelligence4, 185-211.

Falkenhainer, B., Kenneth D. Forbus, Dedre Gentner (1989)"The Structure mapping engine: algorithm and examples,"Artificial Intelligence41, 1-63.

Forbus, Kenneth D., Dedre Gentner, & K. Law (1995) "MAC/FAC:A Model of Similarity-Based Retrieval,"Cognitive Science19:2, 141-205.

Forbus, Kenneth D., Dedre Gentner, Arthur B. Markman, &Ronald W. Ferguson (1998) "Analogy just looks like high levelperception: Why a domain-general approach to analogical mappingis right,"Journal of Experimental & TheoreticalArtificial Intelligence10:2, 231-257.

Forbus, Kenneth D., T. Mostek, & R. Ferguson (2002) "An analogyontology for integrating analogical processing and first-principlesreasoning,"Proc. IAAI-02 pp. 878-885.

Gentzen, Gerhard (1935) "Untersuchungen über das logischeSchließen," translated as "Investigations into logicaldeduction" inThe Collected Papers of Gerhard Gentzen,ed. and translated by M. E. Szabo, North-Holland Publishing Co.,Amsterdam, 1969, pp. 68-131.

Hallaq, Wael B. (1993)Ibn Taymiyya Against the GreekLogicians, Clarendon Press, Oxford.

LeClerc, André, & Arun Majumdar (2002) "Legacy revaluation andthe making of LegacyWorks,"Distributed Enterprise Architecture5:9, Cutter Consortium, Arlington, MA.

Morrison, Clayton T., & Eric Dietrich (1995) "Structure-mappingvs. high-level perception: the mistaken fight over the explanationof Analogy,"Proc. 17th Annual Conference of the Cognitive ScienceSociety, pp. 678-682. Available athttp://babs.cs.umass.edu/~clayton/CogSci95/SM-v-HLP.html

Peirce, Charles Sanders (1887) "Logical machines,"AmericanJournal of Psychology, vol. 1, Nov. 1887, pp. 165-170.

Peirce, Charles S. (1902)Logic, Considered as Semeiotic, MS L75,edited by Joseph Ransdell,http://members.door.net/arisbe/menu/library/bycsp/l75/l75.htm

Peirce, Charles Sanders (1909) Manuscript 514, with commentaryby J. F. Sowa, available at http://www.jfsowa.com/peirce/ms514.htm

Sowa, John F. (2000)Knowledge Representation:Logical, Philosophical, and Computational Foundations,Brooks/Cole Publishing Co., Pacific Grove, CA.

[8]ページ先頭