Indexed languages are a class offormal languages discovered byAlfred Aho ;[ 1] they are described byindexed grammars and can be recognized bynested stack automata .[ 2]
Indexed languages are aproper subset ofcontext-sensitive languages .[ 1] They qualify as anabstract family of languages (furthermore a full AFL) and hence satisfy many closure properties. However, they are not closed under intersection or complement.[ 1]
The class of indexed languages haspractical importance innatural language processing as a computationally affordable [citation needed ] generalization ofcontext-free languages , sinceindexed grammars can describe many of the nonlocal constraints occurring in natural languages.
Gerald Gazdar (1988)[ 3] and Vijayashanker (1987)[ 4] introduced amildly context-sensitive language class now known as linear indexed grammars (LIG).[ 5] Linear indexed grammars have additional restrictions relative to indexed grammars. LIGs areweakly equivalent (generate the same language class) astree adjoining grammars .[ 6]
The following languages are indexed, but are not context-free:
{ a n b n c n d n | n ≥ 1 } {\displaystyle \{a^{n}b^{n}c^{n}d^{n}|n\geq 1\}} [ 3] { a n b m c n d m | m , n ≥ 0 } {\displaystyle \{a^{n}b^{m}c^{n}d^{m}|m,n\geq 0\}} [ 2] These two languages are also indexed, but are not even mildly context sensitive under Gazdar's characterization:
{ a 2 n | n ≥ 0 } {\displaystyle \{a^{2^{n}}|n\geq 0\}} [ 2] { w w w | w ∈ { a , b } + } {\displaystyle \{www|w\in \{a,b\}^{+}\}} [ 3] On the other hand, the following language is not indexed:[ 7]
{ ( a b n ) n | n ≥ 0 } {\displaystyle \{(ab^{n})^{n}|n\geq 0\}} Hopcroft andUllman tend to consider indexed languages as a "natural" class, since they are generated by several formalisms, such as:[ 9]
Hayashi[ 14] generalized thepumping lemma to indexed grammars.Conversely, Gilman[ 7] gives a "shrinking lemma" for indexed languages.
^a b c d Aho, Alfred (1968)."Indexed grammars—an extension of context-free grammars" .Journal of the ACM .15 (4):647– 671.doi :10.1145/321479.321488 .S2CID 9539666 .^a b c Partee, Barbara ;ter Meulen, Alice ; Wall, Robert E. (1990).Mathematical Methods in Linguistics . Kluwer Academic Publishers. pp. 536– 542.ISBN 978-90-277-2245-4 .^a b c Gazdar, Gerald (1988). "Applicability of Indexed Grammars to Natural Languages". In Reyle, U.; Rohrer, C. (eds.).Natural Language Parsing and Linguistic Theories . Studies in Linguistics and Philosophy. Vol. 35. Springer Netherlands. pp. 69– 94.doi :10.1007/978-94-009-1337-0_3 .ISBN 978-94-009-1337-0 . ^ Vijayashanker, K. (1987).A study of tree adjoining grammars (Thesis).ProQuest 303610666 . ^ Kallmeyer, Laura (2010).Parsing Beyond Context-Free Grammars . Springer. p. 31.ISBN 978-3-642-14846-0 . ^ Kallmeyer, Laura (16 August 2010).Parsing Beyond Context-Free Grammars . Springer. p. 32.ISBN 978-3-642-14846-0 . ^a b Gilman, Robert H. (1996). "A Shrinking Lemma for Indexed Languages".Theoretical Computer Science .163 (1– 2):277– 281.arXiv :math/9509205 .doi :10.1016/0304-3975(96)00244-7 .S2CID 14479068 . ^ Hopcroft, John ;Ullman, Jeffrey (1979).Introduction to automata theory, languages, and computation . Addison-Wesley. p. 390.ISBN 978-0-201-02988-8 .^ Introduction to automata theory, languages, and computation,[ 8] Bibliographic notes, p.394-395 ^ Aho, Alfred V. (July 1969)."Nested Stack Automata" .Journal of the ACM .16 (3):383– 406.doi :10.1145/321526.321529 .S2CID 685569 . ^ Fischer, Michael J. (October 1968). "Grammars with macro-like productions".9th Annual Symposium on Switching and Automata Theory (Swat 1968) . pp. 131– 142.doi :10.1109/SWAT.1968.12 . ^ Greibach, Sheila A. (March 1970). "Full AFLs and nested iterated substitution".Information and Control .16 (1):7– 35.doi :10.1016/s0019-9958(70)80039-0 . ^ Maibaum, T.S.E. (June 1974)."A generalized approach to formal languages" .Journal of Computer and System Sciences .8 (3):409– 439.doi :10.1016/s0022-0000(74)80031-0 . ^ Hayashi, Takeshi (1973)."On derivation trees of indexed grammars: an extension of the {$uvwxy$}-theorem" .Publications of the Research Institute for Mathematical Sciences .9 (1):61– 92.doi :10.2977/prims/1195192738 .
Each category of languages, except those marked by a* , is aproper subset of the category directly above it. Any language in each category is generated by a grammar and by an automaton in the category in the same line.