LOLITA is a natural language processing system developed by Durham University between 1986 and 2000. The name is an acronym for "Large-scale, Object-based, Linguistic Interactor, Translator and Analyzer".
LOLITA was developed by Roberto Garigliano and colleagues between 1986 and 2000. It was designed as a general-purpose tool for processing unrestricted text that could be the basis of a wide variety of applications. At its core was a semantic network containing some 90,000 interlinked concepts. Text could be parsed and analysed, then incorporated into the semantic net, where it could be reasoned about (Long and Garigliano, 1993). Fragments of the semantic net could also be rendered back to English or Spanish.
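The idea of a semantic network of interlinked concepts that can be extended and then reasoned about can be illustrated with a minimal Haskell sketch. This is not LOLITA's actual representation: the type names (Concept, Relation, SemNet), the functions (addLink, related, ancestors), the "is_a" relation label and the example data are all assumptions made purely for illustration.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical names throughout; not LOLITA's actual data structures.
type Concept  = String
type Relation = String

-- Each concept maps to its outgoing labelled links.
type SemNet = Map.Map Concept [(Relation, Concept)]

-- Add one labelled link between two concepts.
addLink :: Concept -> Relation -> Concept -> SemNet -> SemNet
addLink from rel to = Map.insertWith (++) from [(rel, to)]

-- Concepts directly reachable from a concept via a given relation.
related :: Relation -> Concept -> SemNet -> [Concept]
related rel c net = [t | (r, t) <- Map.findWithDefault [] c net, r == rel]

-- All ancestors via "is_a": a very simple form of inheritance reasoning.
-- The 'seen' set guards against cycles in the network.
ancestors :: Concept -> SemNet -> [Concept]
ancestors c net = go Set.empty (related "is_a" c net)
  where
    go _ [] = []
    go seen (x:xs)
      | x `Set.member` seen = go seen xs
      | otherwise           = x : go (Set.insert x seen) (xs ++ related "is_a" x net)

-- A tiny example network (invented data).
example :: SemNet
example =
  addLink "dog" "is_a" "mammal" $
  addLink "mammal" "is_a" "animal" $
  addLink "dog" "chases" "cat" Map.empty
```

In this sketch, ancestors "dog" example follows the "is_a" links transitively, which is one simple way a network of concepts can support inference beyond the facts stored directly.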
Several applications were built using the system, including financial information analysers and information extraction tools for DARPA’s “Message Understanding Conference Competitions” (MUC-6 and MUC-7). The latter involved processing original Wall Street Journal articles to perform tasks such as identifying key job changes in businesses and summarising articles. LOLITA was one of a small number of systems worldwide to compete in all sections of the tasks. A system description and an analysis of the MUC-6 results were written by Callaghan (1998).
LOLITA was an early example of a substantial application written in a functional language: it consisted of around 50,000 lines of Haskell, with around 6,000 lines of C. It was also a complex and demanding application, in which many aspects of Haskell were invaluable in development.
LOLITA was designed to handle unrestricted text, so ambiguity at various levels was unavoidable and significant. Laziness was essential in handling the explosion of syntactic ambiguity resulting from a large grammar, and it was much used with semantic ambiguity too. The system used multiple domain-specific embedded languages for semantic and pragmatic processing and for generation of natural language text from the semantic net. Also important was the ability to work with complex abstractions and to prototype new analysis algorithms quickly.[1]
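The role of laziness in coping with syntactic ambiguity can be illustrated with a toy Haskell sketch. It is not LOLITA's parser: the Tree type, the parses function and the rightBranching filter are hypothetical stand-ins. Here every binary bracketing of a sentence counts as a "parse", so the number of candidates grows very quickly with sentence length, and a later stage picks the first acceptable one.

```haskell
-- A binary parse tree over words (hypothetical toy representation).
data Tree = Leaf String | Node Tree Tree
  deriving (Eq, Show)

-- Every binary bracketing of a non-empty word list. The number of results
-- follows the Catalan numbers, so it explodes with sentence length.
parses :: [String] -> [Tree]
parses []  = []
parses [w] = [Leaf w]
parses ws  =
  [ Node l r
  | k <- [1 .. length ws - 1]
  , let (ls, rs) = splitAt k ws
  , l <- parses ls
  , r <- parses rs
  ]

-- A stand-in for a later filter (for example a semantic check):
-- accept only right-branching trees.
rightBranching :: Tree -> Bool
rightBranching (Leaf _)          = True
rightBranching (Node (Leaf _) r) = rightBranching r
rightBranching _                 = False

-- Because lists are lazy, only as many candidate parses are constructed
-- as are needed to find the first acceptable one.
firstAcceptable :: [String] -> Maybe Tree
firstAcceptable ws = case filter rightBranching (parses ws) of
  (t:_) -> Just t
  []    -> Nothing
```

Under lazy evaluation, firstAcceptable stops as soon as one candidate passes the filter, so the full list of parses is never materialised. A strict language would need explicit machinery (such as iterators or backtracking search) to get the same behaviour from such a direct definition.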
Later systems based on the same design include Concepts and SenseGraph.