Devanagari transliteration is the process of representing text written in Devanagari script (an Indic script used for Classical Sanskrit and many other Indic languages, including Hindi, Marathi and Nepali) in Roman script while preserving pronunciation and spelling conventions. There are several somewhat similar methods of transliteration from Devanagari to the Roman script (a process sometimes called romanisation), including the influential and lossless IAST notation.[1] Romanised Devanagari is also called Romanagari.[2]
The International Alphabet of Sanskrit Transliteration (IAST) is a subset of the ISO 15919 standard, used for the transliteration of Sanskrit, Prakrit and Pāḷi into Roman script with diacritics. IAST is a widely used standard. It uses diacritics to disambiguate phonetically similar but not identical Sanskrit glyphs. For example, dental and retroflex consonants are disambiguated with an underdot: dental द=d and retroflex ड=ḍ. An important feature of IAST is that it is losslessly reversible, i.e., an IAST transliteration may be converted back to correct Devanāgarī or to some other South Asian scripts without ambiguity (except for scripts used for Dravidian languages, which require distinguishing long and short mid vowels). Many Unicode fonts fully support IAST display and printing.
Although the Roman script has long been the basis of standard systems of transliteration of Indian languages, and has been advocated for general use on various grounds by linguists such as J R Firth and S. K. Chatterji, it has been seriously employed only for one language, Konkani.[3]
The Hunterian system is the "national system of romanisation in India" and the one officially adopted by the Government of India.[4][5][6]
The Hunterian system was developed in the nineteenth century by William Wilson Hunter, then Surveyor General of India.[7] When it was proposed, it immediately met with opposition from supporters of the earlier, non-systematic and often distorting "Sir Roger Dowler method" (an early corruption of Siraj ud-Daulah) of phonetic transcription. The dispute climaxed in a dramatic showdown in an India Council meeting on 28 May 1872, where the new Hunterian method carried the day. The Hunterian method was inherently simpler and extensible to several Indic scripts because it systematised grapheme transliteration, and it came to prevail and gain government and academic acceptance.[7] Opponents of the grapheme transliteration model continued to mount unsuccessful attempts at reversing government policy until the turn of the century, with one critic appealing to "the Indian Government to give up the whole attempt at scientific (i.e. Hunterian) transliteration, and decide once and for all in favour of a return to the old phonetic spelling."[8]
Over time, the Hunterian method extended in reach to cover several Indic scripts, including Burmese and Tibetan.[9][10] Provisions for schwa deletion in Indo-Aryan languages were also made where applicable, e.g. the Hindi कानपुर is transliterated as kānpur (and not kānapura) but the Sanskrit क्रम is transliterated as krama (and not kram). The system has undergone some evolution over time. For instance, long vowels were marked with an accent diacritic in the original version, but this was later replaced in the 1954 Government of India update with a macron.[11] Thus, जान (life) was previously romanised as ján but began to be romanised as jān. The Hunterian system has faced criticism over the years for not producing phonetically accurate results and being "unashamedly geared towards an English-language receiver audience."[11] Specifically, the lack of differentiation between retroflex and dental consonants (e.g. द and ड are both represented by d) has come in for repeated criticism and inspired several proposed modifications of Hunterian, including using a diacritic below retroflexes (e.g. making द=d and ड=ḍ, which is more readable but requires diacritic printing) or capitalising them (e.g. making द=d and ड=D, which requires no diacritic printing but is less readable because it mixes small and capital letters in words).[12]
The National Library at Kolkata romanisation, intended for the romanisation of all Indic scripts, is an extension of IAST. It differs from IAST in the use of the symbols ē and ō for ए and ओ (e and o are used for the short vowels present in many Indian languages), the use of 'ḷ' for the consonant (in Kannada) ಳ, and the absence of symbols for ॠ, ऌ and ॡ.
A standard transliteration convention not just for Devanagari,[13] but for all South-Asian languages was codified in the ISO 15919 standard of 2001, providing the basis for modern digital libraries that conform to International Organization for Standardization (ISO) norms. ISO 15919 defines the common Unicode basis for Roman transliteration of South-Asian texts in a wide variety of languages/scripts.
ISO 15919 transliterations are platform-independent texts, so they can be used identically on all modern operating systems and software packages that comply with ISO norms. Because such compliance is a prerequisite for all modern platforms, ISO 15919 has become the new standard for digital libraries and archives for transliterating all South Asian texts.[original research?]
ISO 15919[14] uses diacritics to map the much larger set of Brahmic graphemes to the Latin script. The Devanagari-specific portion is nearly identical to the academic standard, IAST ("International Alphabet of Sanskrit Transliteration"), and to ALA-LC, the United States Library of Congress standard.[15]
Another standard, United Nations Romanization Systems for Geographical Names (UNRSGN), was developed by the United Nations Group of Experts[16] on Geographical Names (UNGEGN)[17] and covers many Brahmic scripts. There are some differences[18] between ISO 15919 and UNRSGN.
Compared to IAST, Harvard-Kyoto looks much simpler. It does not contain any of the diacritic marks that IAST contains. Instead of diacritics, Harvard-Kyoto uses capital letters. The use of capital letters makes typing in Harvard-Kyoto much easier than in IAST but produces words with capital letters inside them.
ITRANS is an extension of Harvard-Kyoto. The ITRANS transliteration scheme was developed for the ITRANS software package, a pre-processor for Indic scripts. The user inputs in Roman letters and the ITRANS preprocessor converts the Roman letters into Devanāgarī (or other Indic scripts). The latest version of ITRANS is version 5.30 released in July 2001.[citation needed]
The disadvantage of the above ASCII schemes is case-sensitivity, implying that transliterated names may not be capitalised. This difficulty is avoided with the system developed in 1996 by Frans Velthuis for TeX, loosely based on IAST, in which case is irrelevant.
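The capital-letter convention can be illustrated with a minimal sketch that converts Harvard-Kyoto input to IAST. The mapping below covers only a subset of the vowels and consonants listed in the comparison table in this article; the function name `hk_to_iast` and the longest-match strategy are illustrative assumptions rather than part of any Harvard-Kyoto specification.

```python
# Partial Harvard-Kyoto -> IAST mapping (subset of the comparison table).
# Letters not listed here (a, i, u, k, g, ...) are identical in both schemes.
HK_TO_IAST = {
    "A": "ā", "I": "ī", "U": "ū",      # long vowels
    "RR": "ṝ", "R": "ṛ",               # vocalic r, long and short
    "G": "ṅ", "J": "ñ",                # velar and palatal nasals
    "T": "ṭ", "D": "ḍ", "N": "ṇ",      # retroflex consonants
    "z": "ś", "S": "ṣ",                # palatal and retroflex sibilants
}

# Try longer keys first so that "RR" is not read as two separate "R"s.
_KEYS = sorted(HK_TO_IAST, key=len, reverse=True)


def hk_to_iast(text):
    """Convert a Harvard-Kyoto string to IAST using longest-match lookup."""
    out = []
    i = 0
    while i < len(text):
        for key in _KEYS:
            if text.startswith(key, i):
                out.append(HK_TO_IAST[key])
                i += len(key)
                break
        else:
            # No mapping applies: copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


print(hk_to_iast("kRSNa"))     # kṛṣṇa
print(hk_to_iast("rAmAyaNa"))  # rāmāyaṇa
```

Because the scheme is case-sensitive, the input must not be capitalised for other reasons: `Rama` would be read as beginning with the vocalic ṛ.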
WX notation is a transliteration scheme for representing Indian languages in ASCII. This scheme originated at IIT Kanpur for computational processing of Indian languages, and is widely used among the natural language processing (NLP) community in India. The notation (though not identified by name) is used, for example, in a textbook on NLP from IIT Kanpur.[1] The salient features of this transliteration scheme are as follows. Every consonant and every vowel has a single mapping into Roman. Hence it is a prefix code,[2] which is advantageous from a computational point of view. Typically, lowercase letters are used for unaspirated consonants and short vowels, while uppercase letters are used for aspirated consonants and long vowels. While the retroflex voiceless and voiced consonants are mapped to 't, T, d and D', the dentals are mapped to 'w, W, x and X'. Hence the name of the scheme, "WX", which refers to this idiosyncratic mapping. Ubuntu Linux provides keyboard support for WX notation.
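The practical benefit of a one-character-per-sound mapping can be sketched as follows: the conversion needs no lookahead, so a plain per-character translation table is enough. The mapping is a partial one taken from the WX column of the comparison table in this article; the function name `wx_to_iast` is an illustrative assumption, and multi-character WX sequences such as `LY`, `lY`, `az` and the nuqta forms in `Z` are deliberately left out.

```python
# Partial WX -> IAST mapping; each WX symbol is exactly one ASCII character.
# Characters not listed (a, i, u, e, o, k, g, c, j, n, p, b, m, y, r, l, v,
# s, h) are the same in both notations and pass through unchanged.
WX_TO_IAST = {
    # long vowels, diphthongs and vocalic r
    "A": "ā", "I": "ī", "U": "ū", "E": "ai", "O": "au", "q": "ṛ", "Q": "ṝ",
    # aspirates and nasals of the velar and palatal series
    "K": "kh", "G": "gh", "f": "ṅ", "C": "ch", "J": "jh", "F": "ñ",
    # retroflex series uses t, T, d, D, N
    "t": "ṭ", "T": "ṭh", "d": "ḍ", "D": "ḍh", "N": "ṇ",
    # dental series uses w, W, x, X (hence the name "WX")
    "w": "t", "W": "th", "x": "d", "X": "dh",
    # labial aspirates and sibilants
    "P": "ph", "B": "bh", "S": "ś", "R": "ṣ",
}

# str.maketrans accepts single-character keys mapped to arbitrary strings.
_TABLE = str.maketrans(WX_TO_IAST)


def wx_to_iast(text):
    """Translate WX notation to IAST one character at a time."""
    return text.translate(_TABLE)


print(wx_to_iast("BArawa"))  # bhārata
print(wx_to_iast("kqRNa"))   # kṛṣṇa
```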
SLP1 (Sanskrit Library Phonetic) is a case-sensitive scheme initially used by the Sanskrit Library.[19] It was developed by Peter Scharf and the late Malcolm Hyman, who first described it in appendix B of their book Linguistic Issues in Encoding Sanskrit.[20] The advantage of SLP1 over other encodings is that a single ASCII character is used for each Devanagari letter, a peculiarity that eases reverse transliteration.[21]
Hinglish refers to the non-standardised Romanised Hindi used online, and especially on social media. In India, Romanised Hindi is the dominant form of expression online. In an analysis of YouTube comments, Palakodety et al. identified that 52% of comments were in Romanised Hindi, 46% in English, and 1% in Devanagari Hindi.[22]
Other less popular ASCII schemes include WX notation, Vedatype and the 7-bit ISO 15919. WX notation is described in NLP Panini (Appendix B). It is similar to, but not as versatile as, SLP1, as far as the coverage of Vedic Sanskrit is concerned. A comparison of WX with other schemes is found in Huet (2009), App. A. Vedatype is another scheme used for encoding Vedic texts at Maharishi University of Management. An online transcoding utility across all these schemes is provided at the Sanskrit Library. ISO 15919 includes a so-called "limited character set" option to replace the diacritics by prefixes, so that it is ASCII-compatible. A pictorial explanation is available from Anthony Stone.
The following is a comparison[23] of the major transliteration[24] methods used for Devanāgarī.
| Devanāgarī | IAST | ISO 15919 | Monier-Williams72 | Harvard-Kyoto | ITRANS | Velthuis | SLP1 | WX | Hunterian |
|---|---|---|---|---|---|---|---|---|---|
| अ | a | a | a | a | a | a | a | a | a |
| आ | ā | ā | ā | A | A/aa | aa | A | A | a |
| इ | i | i | i | i | i | i | i | i | i |
| ई | ī | ī | ī | I | I/ii | ii | I | I | i |
| उ | u | u | u | u | u | u | u | u | u |
| ऊ | ū | ū | ū | U | U/uu | uu | U | U | u |
| ए | e | ē | e | e | e | e | e | e | e |
| ऐ | ai | ai | ai | ai | ai | ai | E | E | ai |
| ओ | o | ō | o | o | o | o | o | o | o |
| औ | au | au | au | au | au | au | O | O | au |
| ऋ | ṛ | r̥ | ṛi | R | RRi/R^i | .r | f | q | ri |
| ॠ | ṝ | r̥̄ | ṛī | RR | RRI/R^I | .rr | F | Q | ri |
| ऌ | ḷ | l̥ | lṛi | lR | LLi/L^i | .l | x | L | |
| ॡ | ḹ | l̥̄ | lṛī | lRR | LLI/L^I | .ll | X | LY | |
| अं | ṁ | ṁ | ṉ/ṃ | M | M/.n/.m | .m | M | M | n, m |
| अः | ḥ | ḥ | ḥ | H | H | .h | H | H | h |
| अँ | m̐ | m̐ | | | .N | | ~ | az | |
| ऽ | ' | ’ | | ' | .a | .a | ' | Z | |
The Devanāgarī standalone consonant letters are followed by an implicit schwa (/ə/). In all of the transliteration systems, that /ə/ must be represented explicitly using an 'a' or any equivalent of schwa.
| Devanāgarī | IAST | ISO 15919 | Monier-Williams72 | Harvard-Kyoto | ITRANS | Velthuis | SLP1 | WX | Hunterian |
|---|---|---|---|---|---|---|---|---|---|
| क | ka | ka | ka | ka | ka | ka | ka | ka | k |
| ख | kha | kha | kha | kha | kha | kha | Ka | Ka | kh |
| ग | ga | ga | ga | ga | ga | ga | ga | ga | g |
| घ | gha | gha | gha | gha | gha | gha | Ga | Ga | gh |
| ङ | ṅa | ṅa | n·a | Ga | ~Na | "na | Na | fa | n |
| च | ca | ca | ća | ca | cha | ca | ca | ca | ch |
| छ | cha | cha | ćha | cha | Cha | chha | Ca | Ca | chh |
| ज | ja | ja | ja | ja | ja | ja | ja | ja | j |
| झ | jha | jha | jha | jha | jha | jha | Ja | Ja | jh |
| ञ | ña | ña | ṅa | Ja | ~na | ~na | Ya | Fa | n |
| ट | ṭa | ṭa | ṭa | Ta | Ta | .ta | wa | ta | t |
| ठ | ṭha | ṭha | ṭha | Tha | Tha | .tha | Wa | Ta | th |
| ड | ḍa | ḍa | ḍa | Da | Da | .da | qa | da | d |
| ढ | ḍha | ḍha | ḍha | Dha | Dha | .dha | Qa | Da | dh |
| ण | ṇa | ṇa | ṇa | Na | Na | .na | Ra | Na | n |
| त | ta | ta | ta | ta | ta | ta | ta | wa | t |
| थ | tha | tha | tha | tha | tha | tha | Ta | Wa | th |
| द | da | da | da | da | da | da | da | xa | d |
| ध | dha | dha | dha | dha | dha | dha | Da | Xa | dh |
| न | na | na | na | na | na | na | na | na | n |
| प | pa | pa | pa | pa | pa | pa | pa | pa | p |
| फ | pha | pha | pha | pha | pha | pha | Pa | Pa | ph |
| ब | ba | ba | ba | ba | ba | ba | ba | ba | b |
| भ | bha | bha | bha | bha | bha | bha | Ba | Ba | bh |
| म | ma | ma | ma | ma | ma | ma | ma | ma | m |
| य | ya | ya | ya | ya | ya | ya | ya | ya | y |
| र | ra | ra | ra | ra | ra | ra | ra | ra | r |
| ल | la | la | la | la | la | la | la | la | l |
| व | va | va | va | va | va/wa | va | va | va | v, w |
| श | śa | śa | ṡa | za | sha | "sa | Sa | Sa | sh |
| ष | ṣa | ṣa | sha | Sa | Sha | .sa | za | Ra | sh |
| स | sa | sa | sa | sa | sa | sa | sa | sa | s |
| ह | ha | ha | ha | ha | ha | ha | ha | ha | h |
| ळ | ḻa | ḷa | | La | La | .la | La | lY | |
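
The treatment of the implicit schwa, and the reversibility claimed for IAST, can be illustrated with a minimal round-trip sketch. It assumes a reduced inventory (the plain consonants and the common vowels from the tables in this article, without anusvāra, visarga, nuqta letters or numerals); the names `deva_to_iast` and `iast_to_deva` are hypothetical, and known edge cases such as a vowel hiatus "a" + "i" being read as "ai" are not handled.

```python
import unicodedata


def nfc(s):
    return unicodedata.normalize("NFC", s)


VIRAMA = "\u094d"  # sign that suppresses a consonant's inherent "a"

# Devanagari consonant -> IAST, without the inherent "a".
CONSONANTS = {
    "क": "k", "ख": "kh", "ग": "g", "घ": "gh", "ङ": "ṅ",
    "च": "c", "छ": "ch", "ज": "j", "झ": "jh", "ञ": "ñ",
    "ट": "ṭ", "ठ": "ṭh", "ड": "ḍ", "ढ": "ḍh", "ण": "ṇ",
    "त": "t", "थ": "th", "द": "d", "ध": "dh", "न": "n",
    "प": "p", "फ": "ph", "ब": "b", "भ": "bh", "म": "m",
    "य": "y", "र": "r", "ल": "l", "व": "v",
    "श": "ś", "ष": "ṣ", "स": "s", "ह": "h",
}
# IAST vowel -> (independent letter, dependent sign); "a" has no sign.
VOWELS = {nfc(k): v for k, v in {
    "a": ("अ", ""), "ā": ("आ", "ा"), "i": ("इ", "ि"), "ī": ("ई", "ी"),
    "u": ("उ", "ु"), "ū": ("ऊ", "ू"), "ṛ": ("ऋ", "ृ"),
    "e": ("ए", "े"), "ai": ("ऐ", "ै"), "o": ("ओ", "ो"), "au": ("औ", "ौ"),
}.items()}

SIGN_TO_IAST = {sign: v for v, (_, sign) in VOWELS.items() if sign}
LETTER_TO_IAST = {letter: v for v, (letter, _) in VOWELS.items()}
IAST_TO_CONS = {nfc(v): k for k, v in CONSONANTS.items()}
# Longest tokens first, so "kh" wins over "k" and "ai" over "a".
TOKENS = sorted(list(IAST_TO_CONS) + list(VOWELS), key=len, reverse=True)


def deva_to_iast(text):
    """Romanise Devanagari, writing the inherent schwa as an explicit 'a'."""
    out, pending_a = [], False  # pending_a: last consonant still needs "a"
    for ch in text:
        if ch in CONSONANTS:
            if pending_a:
                out.append("a")
            out.append(CONSONANTS[ch])
            pending_a = True
        elif ch in SIGN_TO_IAST:      # vowel sign replaces the inherent "a"
            out.append(SIGN_TO_IAST[ch])
            pending_a = False
        elif ch == VIRAMA:            # virama removes the inherent "a"
            pending_a = False
        else:
            if pending_a:
                out.append("a")
            out.append(LETTER_TO_IAST.get(ch, ch))
            pending_a = False
    if pending_a:
        out.append("a")
    return nfc("".join(out))


def iast_to_deva(text):
    """Convert IAST back to Devanagari for the reduced inventory above."""
    text = nfc(text)
    out, after_cons, i = [], False, 0
    while i < len(text):
        tok = next((t for t in TOKENS if text.startswith(t, i)), None)
        if tok in IAST_TO_CONS:
            if after_cons:            # consonant cluster: join with virama
                out.append(VIRAMA)
            out.append(IAST_TO_CONS[tok])
            after_cons = True
        elif tok in VOWELS:
            letter, sign = VOWELS[tok]
            out.append(sign if after_cons else letter)
            after_cons = False
        else:                         # unknown character: copy unchanged
            if after_cons:
                out.append(VIRAMA)
            tok = text[i]
            out.append(tok)
            after_cons = False
        i += len(tok)
    if after_cons:
        out.append(VIRAMA)
    return "".join(out)


print(deva_to_iast("क्रम"))      # krama
print(deva_to_iast("कानपुर"))    # kānapura
print(iast_to_deva("krama"))    # क्रम
```

The second example yields the schwa-retaining form kānapura; a schwa-deleting romanisation such as kānpur needs language-specific rules that this sketch does not attempt.
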
| Devanāgarī | ISO 15919 | Harvard-Kyoto | ITRANS | Velthuis | SLP1 | WX | Hunterian |
|---|---|---|---|---|---|---|---|
| क्ष | kṣa | kSa | kSa/kSha/xa | k.sa | kza | kRa | ksh |
| त्र | tra | tra | tra | tra | tra | wra | tr |
| ज्ञ | jña | jJa | GYa/j~na | j~na | jYa | jFa | gy, jñ |
| श्र | śra | zra | shra | "sra | Sra | Sra | shr |
| Devanāgarī | ISO 15919 | ITRANS | WX | Hunterian |
|---|---|---|---|---|
| क़ | qa | qa | kZa | q |
| ख़ | k͟ha | Ka | KZa | kh |
| ग़ | ġa | Ga | gZa | gh |
| ज़ | za | za | jZa | z |
| फ़ | fa | fa | PZa | f |
| ड़ | ṛa | .Da/Ra | dZa | r |
| ढ़ | ṛha | .Dha/Rha | DZa | rh |
The table below shows just the differences between ISO 15919 and IAST for Devanagari transliteration.
| Devanagari | ISO 15919 | IAST | Comment |
|---|---|---|---|
| ए / े | ē | e | To distinguish between long and short 'e' in Dravidian languages, 'e' now represents ऎ / ॆ (short). Note that the use of ē is considered optional in ISO 15919, and using e for ए (long) is acceptable for languages that do not distinguish long and short e. |
| ओ / ो | ō | o | To distinguish between long and short 'o' in Dravidian languages, 'o' now represents ऒ / ॊ (short). Note that the use of ō is considered optional in ISO 15919, and using o for ओ (long) is acceptable for languages that do not distinguish long and short o. |
| ऋ / ृ | r̥ | ṛ | In ISO 15919, ṛ is used to represent ड़. |
| ॠ / ॄ | r̥̄ | ṝ | For consistency with r̥ |
| ऌ / ॢ | l̥ | ḷ | In ISO 15919, ḷ is used to represent ळ. |
| ॡ / ॣ | l̥̄ | ḹ | For consistency with l̥ |
| ◌ं | ṁ (or ṅ ñ ṇ n m) | ṁ | ISO 15919 has two options for anusvāra. (1) In the simplified nasalisation option, an anusvāra is always transliterated as ṁ. (2) In the strict nasalisation option, anusvāra before a class consonant is transliterated as the class nasal: ṅ before k, kh, g, gh, ṅ; ñ before c, ch, j, jh, ñ; ṇ before ṭ, ṭh, ḍ, ḍh, ṇ; n before t, th, d, dh, n; m before p, ph, b, bh, m. ṃ is sometimes used to specifically represent Gurmukhi Tippi ੰ. |
| ◌ँ | m̐ | m̐ | Vowel nasalisation is transliterated as a tilde above the transliterated vowel (over the second vowel in the case of a digraph such as aĩ, aũ), except in Sanskrit. |
| ळ | ḷ | ḻ | Used in Vedic Sanskrit only and not found in the Classical variant |
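
The strict nasalisation option described in the anusvāra row can be sketched as a post-processing step on text that was first transliterated with the simplified option. This is an illustrative reading of the rule as stated in the table, not an implementation taken from the standard; the function name `strict_nasalisation` is hypothetical.

```python
import unicodedata


def nfc(s):
    return unicodedata.normalize("NFC", s)


ANUSVARA = nfc("ṁ")

# First letter of each class consonant -> class nasal. Aspirates such as
# "kh" or "ḍh" begin with the same letter as their unaspirated partner,
# so checking the single following character is sufficient.
CLASS_NASAL = {}
for nasal, initials in [("ṅ", "kgṅ"), ("ñ", "cjñ"), ("ṇ", "ṭḍṇ"),
                        ("n", "tdn"), ("m", "pbm")]:
    for letter in nfc(initials):
        CLASS_NASAL[letter] = nfc(nasal)


def strict_nasalisation(text):
    """Replace ṁ by the class nasal when a class consonant follows."""
    text = nfc(text)
    out = []
    for i, ch in enumerate(text):
        following = text[i + 1] if i + 1 < len(text) else ""
        if ch == ANUSVARA:
            # Keep ṁ before non-class consonants (y, r, l, v, ś, ṣ, s, h),
            # before vowels, and at the end of the text.
            out.append(CLASS_NASAL.get(following, ch))
        else:
            out.append(ch)
    return "".join(out)


print(strict_nasalisation("śaṁkara"))    # śaṅkara
print(strict_nasalisation("saṁskr̥ta"))  # saṁskr̥ta (unchanged before s)
```
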
Devanāgarī consonants include an "inherent a" sound, called the schwa, that must be explicitly represented with an "a" character in the transliteration. Many words and names transliterated from Devanāgarī end with "a", to indicate the pronunciation in the original Sanskrit. This schwa is obligatorily deleted in several modern Indo-Aryan languages, like Hindi, Punjabi, Marathi and others. This results in differing transliterations for Sanskrit and schwa-deleting languages that retain or eliminate the schwa as appropriate; for example, the Sanskrit क्रम is transliterated as krama, while the Hindi कानपुर is transliterated as kānpur rather than kānapura.
Some words may keep the final a, generally because they would be difficult to say without it.
Because of this, some words ending in consonant clusters are altered in various modern Indic languages, for example: mantra = mantar, shabda = shabad, sushumna = sushumana.
Most Indian languages make a distinction between retroflex and dental consonants. In formal transliteration schemes, the standard Roman letters are used to indicate the dental form, and the retroflex form is indicated by special marks, or the use of other letters. E.g., in IAST transliteration, the retroflex forms are ṇ, ṭ, ḍ and ṣ.
In most informal transcriptions the distinction between retroflex and dental consonants is not indicated. However, many users capitalise retroflex consonants when typing on QWERTY keyboards in informal messaging, which generally obviates the need for formal transliteration.
Where the letter "h" appears after a plosive consonant in Devanāgarī transliteration, it always indicates aspiration. Thus "ph" is pronounced as the p in "pit" (with a small puff of air released as it is said), never as the ph in "photo" (IPA /f/). (On the other hand, "p" is pronounced as the p in "spit" with no release of air.) Similarly "th" is an aspirated "t", neither the th of "this" (voiced, IPA /ð/) nor the th of "thin" (unvoiced, IPA /θ/).
The aspiration is generally indicated in both formal and informal transliteration systems.
As English is widely used as a professional and higher-education language in India, the availability of Devanagari keyboards is dwarfed by that of English keyboards. Similarly, software and user interfaces released and promoted in India are in English, as is much of the computer education available there. Due to low awareness of Devanagari keyboard layouts, many Indian users type Hindi in the Roman script.
Before Devanagari was added toUnicode, many workarounds were used to display Devanagari on the Internet, and many sites and services have continued using them despite widespread availability of Unicode fonts supporting Devanagari. Although there are several transliteration conventions on transliterating Hindi to Roman, most of these are reliant on diacritics. As most Indians are familiar with the Roman script through the English language (which traditionally does not use diacritics), these transliteration systems are much less widely known. Most such "Romanagari" is transliterated arbitrarily to imitate English spelling, and thus results in numerous inconsistencies.
It is also detrimental to search engines, which do not classify Hindi text in the Roman script as Hindi. The same text may also not be classified as English.
Regardless of the physical keyboard's layout, it is possible to install Unicode-based Hindi keyboard layouts on most modern operating systems. There are many online services available that transliterate text written in Roman to Devanagari accurately, using Hindi dictionaries for reference, such as Google transliteration or Microsoft Indic Language Input Tool. This solution is similar to input method editors, which are traditionally used to input text in languages that use complex characters, most notably those that use logographies.
Early Sanskrit texts were originally transmitted by memorisation and repetition. Post-Harappan India had no system for writing Indic languages until the creation (in the 4th-3rd centuries BCE) of the Kharoshti and Brahmi scripts. These writing systems, though adequate for Middle Indic languages, were not well-adapted to writing Sanskrit. However, later descendants of Brahmi were modified so that they could record Sanskrit in exacting phonetic detail. The earliest physical text in Sanskrit is a rock inscription by the Western Kshatrapa ruler Rudradaman, written c. 150 CE in Junagadh, Gujarat. Due to the remarkable proliferation of different varieties of Brahmi in the Middle Ages, there is today no single script used for writing Sanskrit; rather, Sanskrit scholars can write the language in a form of whatever script is used to write their local language. However, since the late Middle Ages, there has been a tendency to use Devanagari for writing Sanskrit texts for a widespread readership.
Western scholars in the 19th century adopted Devanagari for printed editions of Sanskrit texts. The editio princeps of the Rigveda by Max Müller was in Devanagari. Müller's London typesetters competed with their Petersburg peers working on Böhtlingk's and Roth's dictionary in cutting all the required ligature types.
From its beginnings, Western Sanskrit philology also felt the need for a romanised spelling of the language.[citation needed] Franz Bopp in 1816 used a romanisation scheme, alongside Devanagari, differing from IAST in expressing vowel length by a circumflex (â, î, û), and aspiration by a spiritus asper (e.g. bʽ for IAST bh). The sibilants IAST ṣ and ś he expressed with spiritus asper and lenis, respectively (sʽ, sʼ). Monier-Williams in his 1899 dictionary used ć, ṡ and sh for IAST c, ś and ṣ, respectively.
From the late 19th century, Western interest in typesetting Devanagari decreased.[citation needed] Theodor Aufrecht published his 1877 edition of the Rigveda in romanised Sanskrit, and Arthur Macdonell's 1910 Vedic grammar (and 1916 Vedic grammar for students) likewise do without Devanagari (while his introductory Sanskrit grammar for students retains Devanagari alongside romanised Sanskrit). Contemporary Western editions of Sanskrit texts appear mostly in IAST.
... With the passage of time, there has emerged a practically uniform system of transliteration of Devanagari and allied alphabets. Nevertheless, no single system of Romanization has yet developed ...
... ISO 15919 ... There is no evidence of the use of the system either in India or in international cartographic products ... The Hunterian system is the actually used national system of romanization in India ...
... In India the Hunterian system is used, whereby every sound in the local language is uniformly represented by a certain letter in the Roman alphabet ...
... The Hunterian system of transliteration, which has international acceptance, has been used ...
... phonetic or 'Sir Roger Dowler method' ... The Secretary of State and the great majority of his counselors gave an unqualified support to the Hunterian system ...
... the Indian Government to give up the whole attempt at scientific (i.e. Hunterian) transliteration, and decide once and for all in favour of a return to the old phonetic spelling ...
... There does exist a system of transcribing Burmese words in roman letters, one that is called the 'Government', or the 'Hunterian' method ...
... The Hunterian system has rules for transliteration into English the names form Hindi, Urdu, Arabic, Burmese, Chinese and Tibetan origin. These rules are described in Chapter VI, Survey of India, Handbook of Topographical Mapping ...
... In the late 19th century sources, the system marks long vowels with an acute accent, and renders the letters k and q both as k. However, when the system was again published in 1954, alterations had been made. Long vowels were now marked with a macron and the q-k distinction was maintained ...
... Suggested by Mr. GS Oberoi, Director, Survey of India, in lieu of the existing table 'Hunterian System of Transliteration' which does not distinguish between द and ड, र and ड़, त and ट ...