The name "Cyc" (from "encyclopedia") is a registered trademark owned by Cycorp.CycL has a publicly released specification, and dozens of HL (Heuristic Level) modules were described in Lenat and Guha's textbook,[1] but the Cyc inference engine code and the full list of HL modules are Cycorp-proprietary.[2]
The project began in July 1984 byDouglas Lenat as a project of theMicroelectronics and Computer Technology Corporation (MCC), a research consortium started by two United States–based corporations "to counter a then ominous Japanese effort in AI, the so-called 'fifth-generation' project."[3] The US passed theNational Cooperative Research Act of 1984, which for the first time allowedUS companies to "collude" on long-term research. Since January 1995, the project has been under active development by Cycorp, where Douglas Lenat was theCEO.
Cyc's ontology grew to about 100,000 terms in 1994, and as of 2017, it contained about 1,500,000 terms. The Cyc knowledge base involving ontological terms was largely created by hand axiom-writing; it was at about 1 million in 1994, and as of 2017, it is at about 24.5 million.
In 2008, Cyc resources were mapped to manyWikipedia articles.[7] Cyc is presently connected toWikidata.
Theknowledge base is divided intomicrotheories. Unlike the knowledge base as a whole, each microtheory must be free from monotonic contradictions. Each microtheory is a first-class object in the Cyc ontology; it has a name that is a regular constant. The concept names in Cyc are CycLterms orconstants.[6] Constants start with an optional#$ and are case-sensitive. There are constants for:
Individual items known asindividuals, such as#$BillClinton or#$France.
Collections, such as#$Tree-ThePlant (containing all trees) or#$EquivalenceRelation (containing allequivalence relations). A member of a collection is called aninstance of that collection.[1]
Functions, which produce new terms from given ones. For example,#$FruitFn, when provided with an argument describing a type (or collection) of plants, will return the collection of its fruits. By convention, function constants start with an upper-case letter and end with the stringFn.
Truth functions, which can apply to one or more other concepts and return either true or false. For example,#$siblings is the sibling relationship, true if the two arguments aresiblings. By convention, truth function constants start with a lowercase letter.
For every instance of the collection#$ChordataPhylum (i.e., for everychordate), there exists a female animal (instance of#$FemaleAnimal), which is its mother (described by the predicate#$biologicalMother).[1]
The Cyc inference engine separates theepistemological problem from theheuristic problem. For the latter, Cyc used acommunity-of-agents architecture in which specialized modules, each with its own algorithm, became prioritized if they could make progress on the sub-problem.
The first version of OpenCyc was released in spring 2002 and contained only 6,000 concepts and 60,000 facts. The knowledge base was released under theApache License. Cycorp stated its intention to release OpenCyc under parallel, unrestricted licences to meet the needs of its users. TheCycL and SubL interpreter (the program that allows users to browse and edit the database as well as to draw inferences) was released free of charge, but only as a binary, withoutsource code. It was made available forLinux andMicrosoft Windows. The open source Texai[9] project released theRDF-compatible content extracted from OpenCyc.[10] The user interface was in Java 6.
ASemantic Web version of OpenCyc was available starting in 2008, but ending sometime after 2016.[12]
OpenCyc 4.0 was released in June 2012.[13] OpenCyc 4.0 contained 239,000 concepts and 2,093,000 facts; however, these are mainlytaxonomic assertions.
4.0 was the last released version, and around March of 2017, OpenCyc was shutdown for the purported reason that "because such “fragmenting” led to divergence, and led to confusion amongst its users and the technical community generally that that OpenCyc fragmentwas Cyc.".[14]
In July 2006, Cycorp released theexecutable of ResearchCyc 1.0, a version of Cyc aimed at the research community, at no charge. (ResearchCyc was in beta stage of development during all of 2004; a beta version was released in February 2005.) In addition to the taxonomic information, ResearchCyc includes more semantic knowledge; it also includes a large lexicon,English parsing and generation tools, andJava-based interfaces for knowledge editing and querying. It contains a system forontology-based data integration.
In 2001,GlaxoSmithKline was funding the Cyc, though for unknown applications.[15] In 2007, theCleveland Clinic has used Cyc to develop anatural-language query interface of biomedical information oncardiothoracic surgeries.[16] A query is parsed into a set ofCycL fragments with open variables.[17] TheTerrorism Knowledge Base was an application of Cyc that tried to contain knowledge about "terrorist"-related descriptions. The knowledge is stored as statements in mathematical logic. The project lasted from 2004 to 2008.[18][19]Lycos used Cyc for search term disambiguation, but stopped in 2001.[20] CycSecure was produced in 2002,[21] a network vulnerability assessment tool based on Cyc, with trials at the USSTRATCOM Computer Emergency Response Team.[22]
One Cyc application has the stated aim to help students doing math at a 6th grade level.[23] The application, called MathCraft,[24] was supposed to play the role of a fellow student who is slightly more confused than the user about the subject. As the user gives good advice, Cyc allows the avatar to make fewer mistakes.
The Cyc project has been described as "one of the most controversial endeavors of the artificial intelligence history".[25]Catherine Havasi, CEO of Luminoso, says that Cyc is the predecessor project toIBM's Watson.[26] Machine-learning scientistPedro Domingos refers to the project as a "catastrophic failure" for the unending amount of data required to produce any viable results and the inability for Cyc to evolve on its own.[27]
Gary Marcus, a cognitive scientist and the cofounder of an AI company called Geometric Intelligence, says "it represents an approach that is very different from all the deep-learning stuff that has been in the news."[28] This is consistent with Doug Lenat's position that "Sometimes theveneer of intelligence is not enough".[29]
This is a list of some of the notable people who work or have worked on Cyc either while it was a project at MCC (where Cyc was first started) or Cycorp.
Alan Belasco et al. (2004)."Representing Knowledge Gaps Effectively". In: D. Karagiannis, U. Reimer (Eds.):Practical Aspects of Knowledge Management, Proceedings of PAKM 2004, Vienna, Austria, December 2–3, 2004. Springer-Verlag, Berlin Heidelberg.
Douglas Lenat and R. V. Guha (1990).Building Large Knowledge-Based Systems: Representation and Inference in the Cyc Project. Addison-Wesley.ISBN0-201-51752-3.
James Masters and Z. Güngördü (2003). ."Structured Knowledge Source Integration: A Progress Report" In:Integration of Knowledge Intensive Multiagent Systems. Cambridge, Massachusetts, USA, 2003.
Cynthia Matuszek et al. (2006)."An Introduction to the Syntax and Content of Cyc.". In:Proc. of the 2006 AAAI Spring Symposium on Formalizing and Compiling Background Knowledge and Its Applications to Knowledge Representation and Question Answering. Stanford, 2006
Stephen Reed and D. Lenat (2002)."Mapping Ontologies into Cyc". In:AAAI 2002 Conference Workshop on Ontologies For The Semantic Web. Edmonton, Canada, July 2002.
Michael Witbrock et al. (2004)."Automated OWL Annotation Assisted by a Large Knowledge Base". In:Workshop Notes of the 2004 Workshop on Knowledge Markup and Semantic Annotation at the 3rd International Semantic Web Conference ISWC2004. Hiroshima, Japan, November 2004, pp. 71–80.