The Arabic script has numerous diacritics, which include consonant pointing known as iʻjām (إِعْجَام, IPA: [ʔiʕdʒæːm]), and supplementary diacritics known as tashkīl (تَشْكِيل, IPA: [t̪æʃkiːl]). The latter include the vowel marks termed ḥarakāt (حَرَكَات, IPA: [ħæɾækæːt̪]; sg. حَرَكَة, ḥarakah, IPA: [ħæɾækæ]).
The Arabic script is a modified abjad, where all letters are consonants, leaving it up to the reader to fill in the vowel sounds. Short consonants and long vowels are represented by letters, but short vowels and consonant length are not generally indicated in writing. Tashkīl is optional and is used to represent the missing vowels and consonant length. Modern Arabic is always written with the i‘jām (consonant pointing), but only religious texts, children's books and works for learners are written with the full tashkīl (vowel guides and consonant length). It is, however, not uncommon for authors to add diacritics to a word or letter when the grammatical case or the meaning would otherwise be ambiguous. In addition, classical works and historical documents presented to the general public are often rendered with the full tashkīl, to compensate for the gap in understanding resulting from stylistic changes over the centuries.
Moreover, tashkīl can change the meaning of an entire word. For example, دِين (dīn) means 'religion', while دَين (dayn) means 'debt'. The two words are written with the same letters, and only the tashkīl distinguishes them. In sentences without tashkīl, readers infer the intended word from context.
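The point that two differently vocalised words share the same letters can be checked mechanically. The following is a minimal sketch, assuming that "tashkīl" covers the combining marks U+064B to U+0652 plus the dagger alif U+0670; the set `TASHKIL` and the helper `strip_tashkil` are illustrative names, not part of any standard library.

```python
# Assumed scope: the combining marks U+064B..U+0652 plus the dagger alif U+0670.
TASHKIL = {chr(cp) for cp in range(0x064B, 0x0653)} | {"\u0670"}


def strip_tashkil(text: str) -> str:
    """Return the text with the marks listed in TASHKIL removed."""
    return "".join(ch for ch in text if ch not in TASHKIL)


din = "\u062F\u0650\u064A\u0646"   # 'religion': dal + kasrah, ya, nun
dayn = "\u062F\u064E\u064A\u0646"  # 'debt':     dal + fathah, ya, nun

print(din == dayn)                                 # False
print(strip_tashkil(din) == strip_tashkil(dayn))   # True
```

Once the marks are removed the two strings are identical, which is why an unvocalised text leaves the choice to context.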
The literal meaning of تَشْكِيل tashkīl is 'formation'. As ordinary Arabic text does not provide enough information about the correct pronunciation, the main purpose of tashkīl (and ḥarakāt) is to provide a phonetic guide or aid, i.e. to show the correct pronunciation to children who are learning to read and to foreign learners.
The bulk of Arabic script is written without ḥarakāt (or short vowels). However, they are commonly used in texts that demand strict adherence to exact pronunciation. This is true, primarily, of the Qur'an ⟨ٱلْقُرْآن⟩ (al-Qurʾān) and poetry. It is also quite common to add ḥarakāt to hadiths ⟨ٱلْحَدِيث⟩ (al-ḥadīth; plural: al-aḥādīth) and the Bible. Another use is in children's literature. Moreover, ḥarakāt are used in ordinary texts in individual words when an ambiguity of pronunciation cannot easily be resolved from context alone. Arabic dictionaries with vowel marks provide information about the correct pronunciation to both native and foreign Arabic speakers. In art and calligraphy, ḥarakāt might be used simply because their writing is considered aesthetically pleasing.
An example of fully vocalised (vowelised or vowelled) Arabic, from the Bismillah:
بِسْمِ ٱللَّٰهِ ٱلرَّحْمَٰنِ ٱلرَّحِيمِ
bismi l-lāhi r-raḥmāni r-raḥīm
In the name of God, the All-Merciful, the Especially-Merciful.
Some Arabic textbooks for foreigners now use ḥarakāt as a phonetic guide to make learning to read Arabic easier. The other method used in textbooks is phonetic romanisation of unvocalised texts. Fully vocalised Arabic texts (i.e. Arabic texts with ḥarakāt/diacritics) are sought after by learners of Arabic. Some online bilingual dictionaries also provide ḥarakāt as a phonetic guide, much as English dictionaries provide transcription.
The ḥarakāt حَرَكَات, which literally means 'motions', are the short vowel marks. There is some ambiguity as to which tashkīl are also ḥarakāt; the tanwīn, for example, are markers for both vowels and consonants.
The fatḥah ⟨فَتْحَة⟩ is a small diagonal line placed above a letter, and represents a short /a/ (like the /a/ sound in the English word "cat"). The word fatḥah itself (فَتْحَة) means opening and refers to the opening of the mouth when producing an /a/. For example, with dāl (henceforth, the base consonant in the following examples): ⟨دَ⟩ /da/.
When a fatḥah is placed before a plain letter ⟨ا⟩ (alif) (i.e. one having no hamza or vowel of its own), it represents a long /aː/ (close to the sound of "a" in the English word "dad", with an open front vowel /æː/, not back /ɑː/ as in "father"). For example: ⟨دَا⟩ /daː/. The fatḥah is not usually written in such cases. When a fatḥah is placed before the letter ⟨ﻱ⟩ (yā’), it creates an /aj/ (as in "lie"); and when placed before the letter ⟨و⟩ (wāw), it creates an /aw/ (as in "cow").
Although a fatḥah paired with a plain letter produces an open front vowel (/a/), often realized as near-open (/æ/), the standard also allows for variations, especially under certain surrounding conditions. Usually, the more central (/ä/) or back (/ɑ/) pronunciation occurs when the word features a nearby back consonant, such as the emphatics, as well as qāf or rā’. Other vowels take on a similar "back" quality in the presence of such consonants, though the effect is not as drastic as in the case of fatḥah.[1][2][3]
Fatḥahs are encoded U+0618 ؘ ARABIC SMALL FATHA, U+064E َ ARABIC FATHA, U+FE76 ﹶ ARABIC FATHA ISOLATED FORM, or U+FE77 ﹷ ARABIC FATHA MEDIAL FORM.
A similar diagonal line below a letter is called a kasrah ⟨كَسْرَة⟩ and designates a short /i/ (as in "me", "be") and its allophones [i, ɪ, e, e̞, ɛ] (as in "Tim", "sit"). For example: ⟨دِ⟩ /di/.[4]
When a kasrah is placed before a plain letter ⟨ﻱ⟩ (yā’), it represents a long /iː/ (as in the English word "steed"). For example: ⟨دِي⟩ /diː/. The kasrah is usually not written in such cases, but if yā’ is pronounced as a diphthong /aj/, fatḥah should be written on the preceding letter to avoid mispronunciation. The word kasrah means 'breaking'.[1]
Kasrahs are encoded U+061A ؚ ARABIC SMALL KASRA, U+0650 ِ ARABIC KASRA, U+FE7A ﹺ ARABIC KASRA ISOLATED FORM, or U+FE7B ﹻ ARABIC KASRA MEDIAL FORM.
The ḍammah ⟨ضَمَّة⟩ is a small curl-like diacritic placed above a letter to represent a short /u/ (as in "duke", shorter "you") and its allophones [u, ʊ, o, o̞, ɔ] (as in "put", or "bull"). For example: ⟨دُ⟩ /du/.[4]
When a ḍammah is placed before a plain letter ⟨و⟩ (wāw), it represents a long /uː/ (like the 'oo' sound in the English word "swoop"). For example: ⟨دُو⟩ /duː/. The ḍammah is usually not written in such cases, but if wāw is pronounced as a diphthong /aw/, fatḥah should be written on the preceding consonant to avoid mispronunciation.[1]
The word ḍammah (ضَمَّة) in this context means rounding, since it is the only rounded vowel in the vowel inventory of Arabic.
Ḍammahs are encoded U+0619 ؙ ARABIC SMALL DAMMA, U+064F ُ ARABIC DAMMA, U+FE78 ﹸ ARABIC DAMMA ISOLATED FORM, or U+FE79 ﹹ ARABIC DAMMA MEDIAL FORM.
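The code points listed for the three short vowel marks can be cross-checked against the Unicode character database that ships with Python. This is only an illustrative sketch: the dictionary `SHORT_VOWEL_MARKS` and the helper `describe` are names chosen here, and the names printed depend on the Unicode version bundled with the interpreter.

```python
import unicodedata

# Code points copied from the encoding notes in the text.
SHORT_VOWEL_MARKS = {
    "fathah": [0x0618, 0x064E, 0xFE76, 0xFE77],
    "kasrah": [0x061A, 0x0650, 0xFE7A, 0xFE7B],
    "dammah": [0x0619, 0x064F, 0xFE78, 0xFE79],
}


def describe(code_point: int) -> str:
    """Format one code point as 'U+XXXX NAME' using the bundled Unicode data."""
    return f"U+{code_point:04X} {unicodedata.name(chr(code_point))}"


for mark, code_points in SHORT_VOWEL_MARKS.items():
    for cp in code_points:
        print(f"{mark}: {describe(cp)}")
```

In ordinary text the combining forms U+064E, U+0650 and U+064F are the ones normally encountered; the U+FE7x code points are presentation forms.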
The superscript (or dagger) alif ⟨أَلِف خَنْجَرِيَّة⟩ (alif khanjarīyah) is written as a short vertical stroke on top of a letter. It indicates a long /aː/ sound for which alif is normally not written. For example: ⟨هَٰذَا⟩ (hādhā) or ⟨رَحْمَٰن⟩ (raḥmān).
The dagger alif occurs in only a few words, but they include some common ones; it is seldom written, however, even in fully vocalised texts. Most keyboards do not have a dagger alif. The word Allah ⟨الله⟩ (Allāh) is usually produced automatically by entering alif lām lām hāʾ. The word consists of alif + ligature of doubled lām with a shaddah and a dagger alif above lām, followed by hāʾ.
The maddah ⟨مَدَّة⟩ is a tilde-shaped diacritic, which can only appear on top of an alif (آ) and indicates a glottal stop /ʔ/ followed by a long /aː/.
In theory, the same sequence /ʔaː/ could also be represented by two alifs, as in *⟨أَا⟩, where a hamza above the first alif represents the /ʔ/ while the second alif represents the /aː/. However, consecutive alifs are never used in the Arabic orthography. Instead, this sequence must always be written as a single alif with a maddah above it, the combination known as an alif maddah. For example: ⟨قُرْآن⟩ /qurˈʔaːn/.
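In encoded text the alif maddah exists both as a single precomposed character and as an alif followed by a combining maddah. The sketch below, which uses only Python's standard `unicodedata` module, shows that Unicode normalisation treats the two spellings as equivalent; the constant names are chosen here for readability.

```python
import unicodedata

ALIF = "\u0627"          # ARABIC LETTER ALEF
MADDAH_ABOVE = "\u0653"  # combining ARABIC MADDAH ABOVE
ALIF_MADDAH = "\u0622"   # ARABIC LETTER ALEF WITH MADDA ABOVE

# NFC composes the two-character sequence into the single character;
# NFD splits the single character back into letter + combining mark.
composed = unicodedata.normalize("NFC", ALIF + MADDAH_ABOVE)
decomposed = unicodedata.normalize("NFD", ALIF_MADDAH)

print(composed == ALIF_MADDAH)            # True
print(decomposed == ALIF + MADDAH_ABOVE)  # True
```

Normalising text before comparing or searching it avoids treating the two encodings of the same written form as different strings.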
In Quranic writings, a maddah may also be placed on other letters to denote the name of the letter, though some letters may take on a dagger alif. For example: ⟨لٓمٓصٓ⟩ (lām-mīm-ṣād) or ⟨يـٰسٓ⟩ (yāʼ-sīn).
The waṣlah ⟨وَصْلَة⟩, alif waṣlah ⟨أَلِف وَصْلَة⟩ or hamzat waṣl ⟨هَمْزَة وَصْل⟩ looks like the head of a small ṣād on top of an alif ⟨ٱ⟩ (also indicated by an alif ⟨ا⟩ without a hamzah). It means that the alif is not pronounced when its word does not begin a sentence. For example: ⟨بِٱسْمِ⟩ (bismi), but ⟨ٱمْشُوا۟⟩ (imshū, not mshū). This is because in Arabic, the first consonant in a word must always be followed by a vowel sound: if the second letter from the waṣlah has a kasrah, the alif waṣlah makes the sound /i/. However, when the second letter from it has a ḍammah, it makes the sound /u/.
It occurs only at the beginning of words, but it can occur after prepositions and the definite article. It is commonly found in imperative verbs, the perfective aspect of verb stems VII to X and their verbal nouns (maṣdar). The alif of the definite article is considered a waṣlah.
It occurs in phrases and sentences (connected speech, not isolated/dictionary forms):
Like the superscript alif, it is not written in fully vocalized scripts, except for sacred texts, like the Quran and Arabized Bible.
The sukūn ⟨سُكُونْ⟩ is a circle-shaped diacritic placed above a letter ( ْ). It indicates that the letter to which it is attached is not followed by a vowel, i.e., zero-vowel.
It is a necessary symbol for writing consonant-vowel-consonant syllables, which are very common in Arabic. For example: ⟨دَدْ⟩ (dad).
The sukūn may also be used to help represent a diphthong. A fatḥah followed by the letter ⟨ﻱ⟩ (yā’) with a sukūn over it (ـَيْ) indicates the diphthong ay (IPA /aj/). A fatḥah followed by the letter ⟨ﻭ⟩ (wāw) with a sukūn (ـَوْ) indicates /aw/.
Sukūns are encoded U+0652 ْ ARABIC SUKUN, U+FE7E ﹾ ARABIC SUKUN ISOLATED FORM, or U+FE7F ﹿ ARABIC SUKUN MEDIAL FORM.
The sukūn may also have an alternative form, the small high head of ḥāʾ (U+06E1 ۡ ARABIC SMALL HIGH DOTLESS HEAD OF KHAH), particularly in some Qurans. Other shapes may exist as well (for example, like a small comma above ⟨ʼ⟩ or like a circumflex ⟨ˆ⟩ in nastaʿlīq).[5]
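The way the short vowel marks and the sukūn combine with a base consonant can be illustrated with a toy romaniser. This is a deliberately tiny sketch, not a general transliteration scheme: the tables `LETTERS` and `MARKS` and the function `romanise` are hypothetical names, and they cover only dāl and the four marks used in the examples.

```python
# Hypothetical, minimal tables covering only the examples in the text.
LETTERS = {"\u062F": "d"}  # dal, the base consonant used in the examples
MARKS = {
    "\u064E": "a",  # fathah
    "\u0650": "i",  # kasrah
    "\u064F": "u",  # dammah
    "\u0652": "",   # sukun: the consonant is not followed by a vowel
}


def romanise(word: str) -> str:
    """Romanise a fully vocalised word made of the characters defined above."""
    out = []
    for ch in word:
        if ch in LETTERS:
            out.append(LETTERS[ch])
        elif ch in MARKS:
            out.append(MARKS[ch])
        else:
            raise ValueError(f"unsupported character U+{ord(ch):04X}")
    return "".join(out)


# dal + fathah, dal + sukun: the consonant-vowel-consonant syllable from the text
print(romanise("\u062F\u064E\u062F\u0652"))  # dad
```

The sukūn contributes nothing to the output, which is exactly its role: it marks the closing consonant of the syllable as vowelless.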
The three vowel diacritics may be doubled at the end of a word to indicate that the vowel is followed by the consonant n. They may or may not be considered ḥarakāt and are known as tanwīn ⟨تَنْوِين⟩, or nunation. The signs indicate, from left to right, -an, -in, -un.
These endings are used as non-pausal grammatical indefinite case endings in Literary Arabic or classical Arabic (triptotes only). In a vocalised text, they may be written even if they are not pronounced (see pausa). See i‘rāb for more details. In many spoken Arabic dialects, the endings are absent. Many Arabic textbooks introduce standard Arabic without these endings. The grammatical endings may not be written in some vocalized Arabic texts, as knowledge of i‘rāb varies from country to country, and there is a trend towards simplifying Arabic grammar.
The sign ⟨ـً⟩ is most commonly written in combination with alif ⟨ـًا⟩, tā’ marbūṭah ⟨ةً⟩, alif hamzah ⟨أً⟩, or stand-alone hamzah ⟨ءً⟩. Alif should always be written (except for words ending in tā’ marbūṭah, hamzah or diptotes) even if an is not. Grammatical cases and tanwīn endings in indefinite triptote forms:
The shadda or shaddah ⟨شَدَّة⟩ (shaddah), or tashdid ⟨تَشْدِيد⟩ (tashdīd), is a diacritic shaped like a small written Latin "w".
It is used to indicate gemination (consonant doubling or extra length), which is phonemic in Arabic. It is written above the consonant which is to be doubled. It is the only ḥarakah that is commonly used in ordinary spelling to avoid ambiguity. For example: ⟨دّ⟩ /dd/; madrasah ⟨مَدْرَسَة⟩ ('school') vs. mudarrisah ⟨مُدَرِّسَة⟩ ('teacher', female). Note that when the doubled letter bears a vowel, it is the shaddah that the vowel is attached to, not the letter itself: ⟨دَّ⟩ /dda/, ⟨دِّ⟩ /ddi/.
Shaddahs are encoded U+0651 ّ ARABIC SHADDA, U+FE7C ﹼ ARABIC SHADDA ISOLATED FORM, or U+FE7D ﹽ ARABIC SHADDA MEDIAL FORM.
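Visually the vowel sits on the shaddah, but in encoded text both are combining marks that follow the base letter, and Unicode normalisation fixes their relative order by canonical combining class. The sketch below uses only Python's standard `unicodedata` module; it illustrates the encoding behaviour and is not a statement about how any particular font draws the marks.

```python
import unicodedata

DAL, FATHAH, SHADDAH = "\u062F", "\u064E", "\u0651"

# A writer may enter the shaddah first and the vowel second.
typed = DAL + SHADDAH + FATHAH

# Normalisation sorts adjacent combining marks by canonical combining class,
# so the fathah (lower class) ends up before the shaddah (higher class).
normalised = unicodedata.normalize("NFC", typed)

print([f"U+{ord(c):04X}" for c in typed])
print([f"U+{ord(c):04X}" for c in normalised])
print(unicodedata.combining(FATHAH), unicodedata.combining(SHADDAH))
```

Because both orders can occur in real text, comparing or searching vocalised Arabic is more reliable after normalising both sides to the same form.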
The i‘jām (إِعْجَام; sometimes also called nuqaṭ)[6] are the diacritic points that distinguish various consonants that have the same form (rasm), such as ⟨ص⟩ /sˤ/, ⟨ض⟩ /dˤ/. Typically i‘jām are not considered diacritics but part of the letter.
Early manuscripts of the Quran did not use diacritics either for vowels or to distinguish the different values of the rasm. Vowel pointing was introduced first, as a red dot placed above, below, or beside the rasm, and later consonant pointing was introduced, as thin, short black single or multiple dashes placed above or below the rasm. These i‘jām became black dots about the same time as the ḥarakāt became small black letters or strokes.
Typically, Egyptians do not use dots under final yā’ (ي), which looks exactly like alif maqṣūrah (ى) in handwriting and in print. This practice is also used in copies of the muṣḥaf (Qurʾān) scribed by ‘Uthman Ṭāhā. The same unification of yā’ and alif maqṣūrah has happened in Persian, resulting in what the Unicode Standard calls "Arabic Letter Farsi Yeh", which looks exactly the same as yā’ in initial and medial forms, but exactly the same as alif maqṣūrah in final and isolated forms.
At the time when the i‘jām was optional, unpointed letters were ambiguous. To clarify that a letter would lack i‘jām in pointed text, the letter could be marked with a small v- or seagull-shaped diacritic above, also a superscript semicircle (crescent), a subscript dot (except in the case of ⟨ح⟩; three dots were used with ⟨س⟩), or a subscript miniature of the letter itself. A superscript stroke known as jarrah, resembling a long fatḥah, was used for a contracted (assimilated) sīn. Thus ⟨ڛ سۣ سۡ سٚ⟩ were all used to indicate that the letter in question was truly ⟨س⟩ and not ⟨ش⟩.[7] These signs, collectively known as ‘alāmātu-l-ihmāl, are still occasionally used in modern Arabic calligraphy, either for their original purpose (i.e. marking letters without i‘jām), or often as purely decorative space-fillers. The small ک above the kāf in its final and isolated forms ⟨ك ـك⟩ was originally an ‘alāmatu-l-ihmāl that became a permanent part of the letter. Previously this sign could also appear above the medial form of kāf, when that letter was written without the stroke on its ascender. When kāf was written without that stroke, it could be mistaken for lām, thus kāf was distinguished with a superscript kāf or a small superscript hamza (nabrah), and lām with a superscript l-a-m (lām-alif-mīm).[8]
Although not always considered a letter of the alphabet, the hamza هَمْزة (hamzah, glottal stop) often stands as a separate letter in writing, is written in unpointed texts and is not considered a tashkīl. It may appear as a letter by itself or as a diacritic over or under an alif, wāw, or yā’.
Which letter is to be used to support the hamzah depends on the quality of the adjacent vowels and its location in the word.
Consider the following words: ⟨أَخ⟩ /ʔax/ ("brother"), ⟨إسْماعِيل⟩ /ʔismaːʕiːl/ ("Ismael"), ⟨أُمّ⟩ /ʔumm/ ("mother"). All three of the above words "begin" with a vowel opening the syllable, and in each case, alif is used to designate the initial glottal stop (the actual beginning). But if we consider middle syllables "beginning" with a vowel: ⟨نَشْأة⟩ /naʃʔa/ ("origin"), ⟨أَفْئِدة⟩ /ʔafʔida/ ("hearts"; notice the /ʔi/ syllable; singular ⟨فُؤاد⟩ /fuʔaːd/), ⟨رُؤُوس⟩ /ruʔuːs/ ("heads", singular ⟨رَأْس⟩ /raʔs/), the situation is different, as noted above. See the comprehensive article on hamzah for more details.
Diacritics not used in Modern Standard Arabic but in other languages that use the Arabic script, and sometimes to write Arabic dialects, include (the list is not exhaustive):
Description | Unicode | Example | Language(s) | Notes |
---|---|---|---|---|
Bars and lines | ||||
diagonal bar above | | گ | Arabic (Iraq), Balti, Burushaski, Kashmiri, Kazakh, Khowar, Kurdish, Kyrgyz, Persian, Sindhi, Urdu, Uyghur | |
horizontal bar above | | | Pashto | |
vertical line above | | ئۈ | Uyghur | |
Dots | ||||
2 dots (vertical) | | ݭ ݙ | | |
4 dots | | ڐ ٿ ڐ ڙ | Sindhi, Old Hindustani | |
dot below | U+065C ٜ ARABIC VOWEL SIGN DOT BELOW | ٜ بٜ | African languages[10] | |
Variants of standard Arabic diacritics | ||||
wavy hamza | | ٲ اٟ | Kashmiri | |
curly dammah above | | ◌ࣥ | Rohingya | |
 | | | Rohingya | |
double dammah above | | ◌ࣱ | Rohingya | |
inverted and regular curly dammahs above | | ◌ࣨ | Rohingya | |
Tildes | ||||
diagonal tilde shape above | | ◌ࣤ | Rohingya | |
diagonal tilde shape below | | ◌ࣦ | Rohingya | |
Arabic letters | ||||
miniature Arabic letter hah (initial form) ﺣ above | | ◌ۡ | Rohingya | |
miniature Arabic letter tah ط above | | ݲ | Urdu | |
Eastern Arabic numerals[13] | ||||
Eastern Arabic numeral 2: ٢ above | U+0775, U+0778, U+077A | ݵ ݸ ݺ | Burushaski | |
Eastern Arabic numeral 3: ٣ above | U+0776, U+0779, U+077B | ݶ ݹ ݻ | Burushaski | |
Urdu number 4: ۴ above or below | U+0777, U+077C, U+077D | ݷ ݼ ݽ | Burushaski | |
Other shapes | ||||
Nūn ġuṇnā, "u" shape above | | ن٘ | Urdu | |
"v" shape above | | ۆ ێ ئۆ | Azerbaijani, Turkmen, Kurdish, Kazakh, Uyghur | |
inverted "v" shape above | | یٛ | Azerbaijani, Turkmen | |
dotted fatha | | ◌ࣵ | Wolof | Latin à |
circle with fatha | | ◌ࣴ | Wolof | Latin ë |
less than sign - below | | ◌ࣹ | Wolof | Latin e |
greater than sign - below | | ◌ࣺ | Wolof | Latin é |
less than sign - above | | ◌ࣷ | Wolof | Latin o |
greater than sign - above | | ◌ࣸ | Wolof | Latin ó |
ring | | ګ | Pashto | |
Other shapes | ||||
"fish" shape above | | دࣤ࣬ دࣥ࣬ دࣦ࣯ | Rohingya | Ṭāna, e.g. دࣤ࣬ / دࣥ࣬ / دࣦ࣯, written above or below other diacritics to mark a long rising tone (/˨˦/).[14][15] |
Various | | | Urdu | |
Historically, the Arabic script has been adopted and used by many tonal languages; examples include Xiao'erjing for Mandarin Chinese as well as the Ajami script adopted for writing various languages of Western Africa. However, the Arabic script never had an inherent way of representing tones until it was adapted for the Rohingya language. The Rohingya Fonna are 3 tone markers which are part of the standardized and accepted orthographic convention of Rohingya. It remains the only known instance of tone markers within the Arabic script.[14][15]
Tone markers act as "modifiers" of vowel diacritics. In simpler words, they are "diacritics for the diacritics". They are written "outside" of the word, meaning that they are written above the vowel diacritic if the diacritic is written above the word, and they are written below the diacritic if the diacritic is written below the word. They are only ever written where there are vowel diacritics. This is important to note, as without the diacritic present, there is no way to distinguish between tone markers and i‘jām, i.e. the dots that are used to distinguish consonants phonetically.
Hārbāy
The Hārbāy, as it is called in Rohingya, is a single dot that is placed on top of Fatḥah and Ḍammah, or curly Fatḥah and curly Ḍammah (vowel diacritics unique to Rohingya), or their respective Fatḥatan and Ḍammatan versions, and underneath Kasrah or curly Kasrah, or their respective Kasratan versions (e.g. دً࣪ / دٌ࣪ / دࣨ࣪ / دٍ࣭). This tone marker indicates a short high tone (/˥/).[14][15]
Ṭelā
The Ṭelā, as it is called in Rohingya, is two dots that are placed on top of Fatḥah and Ḍammah, or curly Fatḥah and curly Ḍammah, or their respective Fatḥatan and Ḍammatan versions, and underneath Kasrah or curly Kasrah, or their respective Kasratan versions (e.g. دَ࣫ / دُ࣫ / دِ࣮). This tone marker indicates a long falling tone (/˥˩/).[14][15]
Ṭāna
The Ṭāna, as it is called in Rohingya, is a fish-like looping line that is placed on top of Fatḥah and Ḍammah, or curly Fatḥah and curly Ḍammah, or their respective Fatḥatan and Ḍammatan versions, and underneath Kasrah or curly Kasrah, or their respective Kasratan versions (e.g. دࣤ࣬ / دࣥ࣬ / دࣦ࣯). This tone marker indicates a long rising tone (/˨˦/).[14][15]
According to tradition, the first to commission a system of ḥarakāt was Ali, who appointed Abu al-Aswad al-Du'ali for the task. Abu al-Aswad devised a system of dots to signal the three short vowels (along with their respective allophones) of Arabic. This system of dots predates the i‘jām, the dots used to distinguish between different consonants.
Abu al-Aswad's system of ḥarakāt was different from the system we know today. The system used red dots, with each arrangement or position indicating a different short vowel.
A dot above a letter indicated the vowel a, a dot below indicated the vowel i, a dot on the side of a letter stood for the vowel u, and two dots stood for the tanwīn.
However, the early manuscripts of the Qur'an did not use the vowel signs for every letter requiring them, but only for letters where they were necessary for a correct reading.
The precursor to the system we know today is al-Farāhīdī's system. Al-Farāhīdī found that the task of writing using two different colours was tedious and impractical. Another complication was that the i‘jām had been introduced by then, which, while they were short strokes rather than the round dots seen today, meant that without a colour distinction the two could become confused.
Accordingly, he replaced the ḥarakāt with small superscript letters: small alif, yā’, and wāw for the short vowels corresponding to the long vowels written with those letters, a small s(h)īn for shaddah (geminate), and a small khā’ for khafīf (short consonant; no longer used). His system is essentially the one we know today.[17]
The process of automatically restoring diacritical marks is called diacritization or diacritic restoration. It is useful to avoid ambiguity in applications such as Arabic machine translation, text-to-speech, and information retrieval. Automatic diacritization algorithms have been developed.[18][19] For Modern Standard Arabic, the state-of-the-art algorithm has a word error rate (WER) of 4.79%. The most common mistakes involve proper nouns and case endings.[20] Similar algorithms exist for other varieties of Arabic.[21]
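To make the word error rate figure concrete, the sketch below computes one plausible version of the metric: the fraction of words whose restored form differs from the reference in any character. This definition is an assumption made for illustration; the cited evaluation may count errors differently (for example, by ignoring case endings), and `word_error_rate` is a hypothetical helper, not part of any diacritization library.

```python
def word_error_rate(reference: str, predicted: str) -> float:
    """Fraction of whitespace-separated words that differ from the reference.

    Assumes both strings contain the same words in the same order and
    differ only in their diacritics.
    """
    ref_words = reference.split()
    pred_words = predicted.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    if len(ref_words) != len(pred_words):
        raise ValueError("reference and prediction must have the same number of words")
    errors = sum(r != p for r, p in zip(ref_words, pred_words))
    return errors / len(ref_words)


# Reference: 'religion' (with kasrah) followed by 'debt' (with fathah).
reference = "\u062F\u0650\u064A\u0646 \u062F\u064E\u064A\u0646"
# Prediction: the second word was wrongly restored with a kasrah.
predicted = "\u062F\u0650\u064A\u0646 \u062F\u0650\u064A\u0646"

print(word_error_rate(reference, predicted))  # 0.5
```

Under this definition a single wrong mark makes the whole word count as an error, which is why case endings, carried by one final diacritic, can contribute heavily to the total.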