A diacritic (also diacritical mark, diacritical point, diacritical sign, or accent) is a glyph added to a letter or to a basic glyph. The term derives from the Ancient Greek διακριτικός (diakritikós, "distinguishing"), from διακρίνω (diakrínō, "to distinguish"). The word diacritic is a noun, though it is sometimes used in an attributive sense, whereas diacritical is only an adjective. Some diacritics, such as the acute ⟨ó⟩, grave ⟨ò⟩, and circumflex ⟨ô⟩ (all shown above an 'o'), are often called accents. Diacritics may appear above or below a letter or in some other position such as within the letter or between two letters.
The main use of diacritics in Latin script is to change the sound-values of the letters to which they are added. Historically, English has used the diaeresis diacritic to indicate the correct pronunciation of ambiguous words, such as "coöperate", without which the ⟨oo⟩ letter sequence could be misinterpreted to be pronounced /ˈkuːpəreɪt/. Other examples are the acute and grave accents, which can indicate that a vowel is to be pronounced differently than is normal in that position, for example not reduced to /ə/ or silent as in the case of the two uses of the letter e in the noun résumé (as opposed to the verb resume) and the help sometimes provided in the pronunciation of some words such as doggèd, learnèd, blessèd, and especially words pronounced differently than normal in poetry (for example movèd, breathèd).
Most other words with diacritics in English are borrowings from languages such as French to better preserve the spelling, such as the diaeresis on naïve and Noël, the acute from café, the circumflex in the word crêpe, and the cedilla in façade. All these diacritics, however, are frequently omitted in writing, and English is the only major modern European language that does not have diacritics in common usage.[a]
In Latin-script alphabets in other languages, diacritics may distinguish between homonyms, such as the French là ("there") versus la ("the"), which are both pronounced /la/. In Gaelic type, a dot over a consonant indicates lenition of the consonant in question. In other writing systems, diacritics may perform other functions. Vowel pointing systems, namely the Arabic harakat and the Hebrew niqqud systems, indicate vowels that are not conveyed by the basic alphabet. The Indic virama ( ् etc.) and the Arabic sukūn ( ـْـ ) mark the absence of vowels. Cantillation marks indicate prosody. Other uses include the Early Cyrillic titlo stroke ( ◌҃ ) and the Hebrew gershayim ( ״ ), which, respectively, mark abbreviations or acronyms, and Greek diacritical marks, which showed that letters of the alphabet were being used as numerals. In Vietnamese and the Hanyu Pinyin official romanization system for Mandarin in China, diacritics are used to mark the tones of the syllables in which the marked vowels occur.
In orthography and collation, a letter modified by a diacritic may be treated either as a new, distinct letter or as a letter–diacritic combination. This varies from language to language and may vary from case to case within a language.
In some cases, letters are used as "in-line diacritics", with the same function as ancillary glyphs, in that they modify the sound of the letter preceding them, as in the case of the "h" in the English pronunciation of "sh" and "th".[2] Such letter combinations are sometimes even collated as a single distinct letter. For example, the spelling sch was traditionally often treated as a separate letter in German. Words with that spelling were listed after all other words spelled with s in card catalogs in the Vienna public libraries, for example (before digitization).
The tilde, dot, comma, titlo, apostrophe, bar, and colon are sometimes diacritical marks, but also have other uses.
Not all diacritics occur adjacent to the letter they modify. In the Wali language of Ghana, for example, an apostrophe indicates a change of vowel quality, but occurs at the beginning of the word, as in the dialects ’Bulengee and ’Dolimi. Because of vowel harmony, all vowels in a word are affected, so the scope of the diacritic is the entire word. In abugida scripts, like those used to write Hindi and Thai, diacritics indicate vowels, and may occur above, below, before, after, or around the consonant letter they modify.
The tittle (dot) on the letter ⟨i⟩ or the letter ⟨j⟩ of the Latin alphabet originated as a diacritic to clearly distinguish ⟨i⟩ from the minims (downstrokes) of adjacent letters. It first appeared in the 11th century in the sequence ii (as in ingeníí), then spread to i adjacent to m, n, u, and finally to all lowercase is. The ⟨j⟩, originally a variant of i, inherited the tittle. The shape of the diacritic developed from initially resembling today's acute accent to a long flourish by the 15th century. With the advent of Roman type it was reduced to the round dot we have today.[3]
Several languages of eastern Europe use diacritics on both consonants and vowels, whereas in western Europe digraphs are more often used to change consonant sounds. Most languages in Europe use diacritics on vowels, aside from English where there are typically none (with some exceptions).
(ــًــٍــٌـ) tanwīn (تنوين) symbols: Serve a grammatical role in Arabic. The sign ـً is most commonly written in combination with alif, e.g. ـًا.
(ــّـ) shadda: Gemination (doubling) of consonants.
(ٱ) waṣla: Comes most commonly at the beginning of a word. Indicates a type of hamza that is pronounced only when the letter is read at the beginning of an utterance.
(آ) madda: A written replacement for a hamza that is followed by an alif, i.e. (ءا). Read as a glottal stop followed by a long /aː/, e.g. ءاداب، ءاية، قرءان، مرءاة are written out respectively as آداب، آية، قرآن، مرآة. This writing rule does not apply when the alif that follows a hamza is not a part of the stem of the word, e.g. نتوءات is not written out as نتوآت as the stem نتوء does not have an alif that follows its hamza.
(ــٰـ) superscript alif (also "short" or "dagger alif"): A replacement for an original alif that is dropped in the writing out of some rare words, e.g. لاكن is not written out with the original alif found in the word pronunciation; instead, it is written out as لٰكن.
ḥarakāt (in Arabic: حركات, also called تشكيل tashkīl):
(ــَـ) fatḥa (a)
(ــِـ) kasra (i)
(ــُـ) ḍamma (u)
(ــْـ) sukūn (no vowel)
The ḥarakāt or vowel points serve two purposes:
They serve as a phonetic guide. They indicate the presence of short vowels (fatḥa, kasra, or ḍamma) or their absence (sukūn).
For nouns, the ḍamma is for the nominative, fatḥa for the accusative, and kasra for the genitive.
For verbs, the ḍamma is for the imperfective, fatḥa for the perfective, and the sukūn is for verbs in the imperative or jussive moods.
Vowel points or tashkīl should not be confused with consonant points or iʿjam (إعجام): one, two or three dots written above or below a consonant to distinguish between letters of the same or similar form.
The diacritics 〮 and 〯, known as Bangjeom (방점; 傍點), were used to mark pitch accents in Hangul for Middle Korean. They were written to the left of a syllable in vertical writing and above a syllable in horizontal writing.
In Devanagari script (of the Brahmic family), compound letters, which combine a consonant with a vowel, carry diacritics. Here, क (k) is shown with vowel diacritics. That is: ा, ि, े, ु, ौ, ़, ः, etc.
A dot above and a dot below a letter represent [a], transliterated as a or ă,
Two diagonally-placed dots above a letter represent [ɑ], transliterated as ā or â or å,
Two horizontally-placed dots below a letter represent [ɛ], transliterated as e or ĕ; often pronounced [ɪ] and transliterated as i in the East Syriac dialect,
Two diagonally-placed dots below a letter represent [e], transliterated as ē,
A dot underneath the Beth represents a soft [v] sound, transliterated as v,
A tilde (~) placed under Gamel represents a [dʒ] sound, transliterated as j,
The letter Waw with a dot below it represents [u], transliterated as ū or u,
The letter Waw with a dot above it represents [o], transliterated as ō or o,
The letter Yōḏ with a dot beneath it represents [i], transliterated as ī or i,
A semicircle under Peh represents an [f] sound, transliterated as f or ph.
In addition to the above vowel marks, transliteration of Syriac sometimes includes ə, e̊ or superscript e (or often nothing at all) to represent an original Aramaic schwa that became lost later on at some point in the development of Syriac.[4] Some transliteration schemes find its inclusion necessary for showing spirantization or for historical reasons.[5][6]
Some non-alphabetic scripts also employ symbols that function essentially as diacritics.
Non-pure abjads (such as Hebrew and Arabic script) and abugidas use diacritics for denoting vowels. Hebrew and Arabic also indicate consonant doubling and change with diacritics; Hebrew and Devanagari use them for foreign sounds. Devanagari and related abugidas also use a diacritical mark called a virama to mark the absence of a vowel. In addition, Devanagari uses the moon-dot chandrabindu ( ँ) for vowel nasalization.
Unified Canadian Aboriginal Syllabics use several types of diacritics, including the diacritics with alphabetic properties known as Medials and Finals. Although long vowels originally were indicated with a negative line through the Syllabic glyphs, making the glyph appear broken, in the modern forms, a dot above is used to indicate vowel length. In some of the styles, a ring above indicates a long vowel with a [j] off-glide. Another diacritic, the "inner ring", is placed at the glyph's head to modify [p] to [f] and [t] to [θ]. Medials such as the "w-dot" placed next to the Syllabics glyph indicate a [w] being placed between the syllable onset consonant and the nucleus vowel. Finals indicate the syllable coda consonant; some of the syllable coda consonants in word medial positions, such as with the "h-tick", indicate the fortification of the consonant in the syllable following it.
Different languages use different rules to put diacritic characters in alphabetical order. For example, French and Portuguese treat letters with diacritical marks the same as the underlying letter for purposes of ordering and dictionaries. The Scandinavian languages and the Finnish language, by contrast, treat the characters with diacritics ⟨å⟩, ⟨ä⟩, and ⟨ö⟩ as distinct letters of the alphabet, and sort them after ⟨z⟩. Usually ⟨ä⟩ (a-umlaut) and ⟨ö⟩ (o-umlaut) [used in Swedish and Finnish] are sorted as equivalent to ⟨æ⟩ (ash) and ⟨ø⟩ (o-slash) [used in Danish and Norwegian]. Also, aa, when used as an alternative spelling to ⟨å⟩, is sorted as such. Other letters modified by diacritics are treated as variants of the underlying letter, with the exception that ⟨ü⟩ is frequently sorted as ⟨y⟩.
Languages that treat accented letters as variants of the underlying letter usually alphabetize words with such symbols immediately after similar unmarked words. For instance, in German where two words differ only by an umlaut, the word without it is sorted first in German dictionaries (e.g. schon and then schön, or fallen and then fällen). However, when names are concerned (e.g. in phone books or in author catalogues in libraries), umlauts are often treated as combinations of the vowel with a suffixed ⟨e⟩; Austrian phone books now treat characters with umlauts as separate letters (immediately following the underlying vowel).
In Spanish, the grapheme ⟨ñ⟩ is considered a distinct letter, different from ⟨n⟩ and collated between ⟨n⟩ and ⟨o⟩, as it denotes a different sound from that of a plain ⟨n⟩. But the accented vowels ⟨á⟩, ⟨é⟩, ⟨í⟩, ⟨ó⟩, ⟨ú⟩ are not separated from the unaccented vowels ⟨a⟩, ⟨e⟩, ⟨i⟩, ⟨o⟩, ⟨u⟩, as the acute accent in Spanish only modifies stress within the word or denotes a distinction between homonyms, and does not modify the sound of a letter.
For a comprehensive list of the collating orders in various languages, see Collating sequence.
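The two broad strategies described above (treating a marked letter as a variant of its base letter, or treating it as a distinct letter with its own place in the alphabet) can be illustrated with a minimal Python sketch. The names `strip_marks`, `variant_key` and `swedish_key`, and the hard-coded alphabet, are assumptions made for illustration only; a real application would normally use a locale-aware collation library rather than hand-written sort keys.

```python
import unicodedata

def strip_marks(text):
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def variant_key(word):
    """Variant-style ordering: compare base letters first, then use the
    decomposed form as a tie-breaker so the unmarked word sorts first."""
    lowered = word.lower()
    return (strip_marks(lowered), unicodedata.normalize("NFD", lowered))

# Assumed alphabet for the distinct-letter strategy: å, ä, ö follow z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyz\u00e5\u00e4\u00f6"

def swedish_key(word):
    """Distinct-letter ordering: å, ä, ö rank after z. Any other marked
    letter falls back to its base letter (a simplification: this sketch
    does not sort ü as y)."""
    ranks = []
    for ch in unicodedata.normalize("NFC", word.lower()):
        if ch not in SWEDISH_ALPHABET:
            ch = strip_marks(ch)
        if not ch:
            continue  # a stray combining mark contributes nothing
        rank = SWEDISH_ALPHABET.find(ch)
        ranks.append(rank if rank >= 0 else len(SWEDISH_ALPHABET))
    return ranks

german = sorted(["schön", "fällen", "schon", "fallen"], key=variant_key)
swedish = sorted(["öl", "zebra", "ål", "apa", "ärt"], key=swedish_key)
print(german)   # unmarked word first within each pair
print(swedish)  # å, ä, ö words come after "zebra"
```

Plain `sorted()` without a key compares code points, which matches neither convention in general; that is why a language-specific key (or collation library) is needed.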
Modern computer technology was developed mostly in countries that speak Western European languages (particularly English), and many early binary encodings were developed with a bias favoring English, a language written without diacritical marks. With computer memory and computer storage at a premium, early character sets were limited to the Latin alphabet, the ten digits and a few punctuation marks and conventional symbols. The American Standard Code for Information Interchange (ASCII), first published in 1963, encoded just 95 printable characters. It included just four free-standing diacritics (acute, grave, circumflex and tilde), which were to be used by backspacing and overprinting the base letter. The ISO/IEC 646 standard (1967) defined national variations that replace some American graphemes with precomposed characters (such as ⟨é⟩, ⟨è⟩ and ⟨ë⟩), according to language, but remained limited to 95 printable characters.
Unicode was conceived to solve this problem by assigning every known character its own code; if this code is known, most modern computer systems provide a method to input it. For historical reasons, almost all the letter-with-accent combinations used in European languages were given unique code points and these are called precomposed characters. For other languages, it is usually necessary to use a combining character diacritic together with the desired base letter. Unfortunately, even as of 2024, many applications and web browsers remain unable to handle combining diacritics properly.
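The difference between a precomposed character and a base letter followed by a combining diacritic can be seen with Python's standard `unicodedata` module. The following is a small sketch rather than a complete treatment of Unicode normalization; the helper name `same_text` is an assumption introduced for the example.

```python
import unicodedata

precomposed = "\u00e9"   # é as a single precomposed code point
combining = "e\u0301"    # e followed by COMBINING ACUTE ACCENT

def same_text(a, b):
    """Compare two strings after normalizing both to NFC (composed form)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

print(precomposed == combining)           # False: different code point sequences
print(same_text(precomposed, combining))  # True: canonically equivalent

# NFD splits the precomposed letter into base letter + combining mark.
decomposed = unicodedata.normalize("NFD", precomposed)
print(len(decomposed))                    # 2

# q with a tilde has no precomposed code point, so even NFC keeps
# the base letter and the combining mark as two code points.
q_tilde = unicodedata.normalize("NFC", "q\u0303")
print(len(q_tilde))                       # 2
```

Software that compares or searches text without normalizing first may treat the two spellings of é as different strings, which is one source of the handling problems mentioned above.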
Depending on the keyboard layout and keyboard mapping, it is more or less easy to enter letters with diacritics on computers and typewriters. Keyboards used in countries where letters with diacritics are the norm have keys engraved with the relevant symbols. In other cases, such as when the US international or UK extended mappings are used, the accented letter is created by first pressing the key with the diacritic mark, followed by the letter to place it on. This method is known as the dead key technique, as the diacritic key produces no output of its own but modifies the output of the key pressed after it.
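The dead key behaviour described above can be modelled as a tiny state machine: a dead key is remembered instead of printed, and the next key is combined with it. The sketch below is an illustration under stated assumptions, not a description of any particular keyboard driver; the `DEAD_KEYS` table and the function `type_keys` are hypothetical, and the fallback for letters that have no accented form is just one plausible choice.

```python
import unicodedata

# Hypothetical dead-key table: key pressed -> combining mark it stands for.
DEAD_KEYS = {
    "'": "\u0301",  # acute
    "`": "\u0300",  # grave
    "^": "\u0302",  # circumflex
    "~": "\u0303",  # tilde
    '"': "\u0308",  # diaeresis / umlaut
}

def type_keys(keys):
    """Simulate typing a sequence of keys on a layout with dead keys."""
    output = []
    pending = None  # dead key waiting for the next keystroke
    for key in keys:
        if pending is None:
            if key in DEAD_KEYS:
                pending = key          # produce no output yet
            else:
                output.append(key)
            continue
        # A dead key is pending: try to compose it with this key.
        composed = unicodedata.normalize("NFC", key + DEAD_KEYS[pending])
        if len(composed) == 1:
            output.append(composed)    # e.g. ' then e gives é
        else:
            output.append(pending + key)  # assumed fallback: emit both keys
        pending = None
    if pending is not None:
        output.append(pending)         # trailing dead key is emitted as-is
    return "".join(output)

print(type_keys("caf'e"))
print(type_keys("pi~nata"))
```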
Latvian has the following letters: ⟨ā⟩, ⟨ē⟩, ⟨ī⟩, ⟨ū⟩, ⟨č⟩, ⟨ģ⟩, ⟨ķ⟩, ⟨ļ⟩, ⟨ņ⟩, ⟨š⟩, ⟨ž⟩
Lithuanian. In general usage, where letters appear with the caron (⟨č⟩, ⟨š⟩ and ⟨ž⟩), they are considered as separate letters from ⟨c⟩, ⟨s⟩ or ⟨z⟩ and collated separately; letters with the ogonek (⟨ą⟩, ⟨ę⟩, ⟨į⟩ and ⟨ų⟩), the macron (⟨ū⟩) and the overdot (⟨ė⟩) are considered as separate letters as well, but not given a unique collation order.
Welsh uses the circumflex, diaeresis, acute, and grave accents on its seven vowels ⟨a⟩, ⟨e⟩, ⟨i⟩, ⟨o⟩, ⟨u⟩, ⟨w⟩, ⟨y⟩ (hence the composites ⟨â⟩, ⟨ê⟩, ⟨î⟩, ⟨ô⟩, ⟨û⟩, ⟨ŵ⟩, ⟨ŷ⟩, ⟨ä⟩, ⟨ë⟩, ⟨ï⟩, ⟨ö⟩, ⟨ü⟩, ⟨ẅ⟩, ⟨ÿ⟩, ⟨á⟩, ⟨é⟩, ⟨í⟩, ⟨ó⟩, ⟨ú⟩, ⟨ẃ⟩, ⟨ý⟩, ⟨à⟩, ⟨è⟩, ⟨ì⟩, ⟨ò⟩, ⟨ù⟩, ⟨ẁ⟩, ⟨ỳ⟩). However, all except the circumflex (which is used as a macron) are fairly rare.
Following spelling reforms since the 1970s, Scottish Gaelic uses graves only, which can be used on any vowel (⟨à⟩, ⟨è⟩, ⟨ì⟩, ⟨ò⟩, ⟨ù⟩). Formerly acute accents could be used on ⟨á⟩, ⟨ó⟩ and ⟨é⟩, which were used to indicate a specific vowel quality. With the elimination of these accents, the new orthography relies on the reader having prior knowledge of pronunciation of a given word.
Manx uses the cedilla diacritic ⟨ç⟩ combined with h to give the digraph ⟨çh⟩ (pronounced /tʃ/) to mark the distinction between it and the digraph ⟨ch⟩ (pronounced /h/ or /x/). Other diacritics used in Manx included the circumflex and diaeresis, as in ⟨â⟩, ⟨ê⟩, ⟨ï⟩, etc. to mark the distinction between two similarly spelled words but with slightly differing pronunciation.
Irish uses only acute accents to mark long vowels, following the 1948 spelling reform. Lenition is indicated using an overdot in Gaelic type (⟨ċ⟩, ⟨ḋ⟩, ⟨ḟ⟩, ⟨ġ⟩, ⟨ṁ⟩, ⟨ṗ⟩, ⟨ṡ⟩, ⟨ṫ⟩); in Roman type, a suffixed ⟨h⟩ is used. Thus, a ṁáṫair is equivalent to a mháthair.
Breton does not have a single orthography (spelling system), but uses diacritics for a number of purposes. The diaeresis is used to mark that two vowels are pronounced separately and not as a diphthong/digraph. The circumflex is used to mark long vowels, but usually only when the vowel length is not predictable by phonology. Nasalization of vowels may be marked with a tilde, or by following the vowel with the letter ⟨ñ⟩. The plural suffix -où is used as a unified spelling to represent a suffix with a number of pronunciations in different dialects, and to distinguish this suffix from the digraph ⟨ou⟩ which is pronounced as /u:/. An apostrophe is used to distinguish ⟨c'h⟩, pronounced /x/ as the digraph ⟨ch⟩ is used in other Celtic languages, from the French-influenced digraph ch, pronounced /ʃ/.
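The Irish correspondence noted above, where an overdot in Gaelic type is equivalent to a suffixed ⟨h⟩ in Roman type, is regular enough to sketch in a few lines of Python. This is only an illustration: the function name `overdot_to_h` is an assumption, and the set of consonants is limited to the eight letters listed in the Irish entry.

```python
import unicodedata

DOT_ABOVE = "\u0307"  # COMBINING DOT ABOVE
# Base consonants of the dotted letters listed in the Irish entry above.
LENITABLE = set("cdfgmpstCDFGMPST")

def overdot_to_h(text):
    """Rewrite Gaelic-type overdot lenition as a suffixed h."""
    # Decompose so each dotted consonant becomes base letter + dot above.
    decomposed = unicodedata.normalize("NFD", text)
    out = []
    for ch in decomposed:
        if ch == DOT_ABOVE and out and out[-1] in LENITABLE:
            out.append("h")   # replace the dot with a following h
        else:
            out.append(ch)
    # Recompose so that accented vowels such as á are single code points again.
    return unicodedata.normalize("NFC", "".join(out))

print(overdot_to_h("a \u1e41\u00e1\u1e6bair"))  # a mháthair
```

The acute accents marking long vowels pass through unchanged, since only a dot above one of the listed consonants is rewritten.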
Estonian has a distinct letter ⟨õ⟩, which contains a tilde. Estonian vowels with double-dot diacritics ⟨ä⟩, ⟨ö⟩, ⟨ü⟩ are similar to German, but these are also distinct letters, unlike German umlauted letters. All four have their own place in the alphabet, between ⟨w⟩ and ⟨x⟩. Carons in ⟨š⟩ or ⟨ž⟩ appear only in foreign proper names and loanwords. Also these are distinct letters, placed in the alphabet between s and t.
Finnish uses double-dotted vowels (⟨ä⟩ and ⟨ö⟩). As in Swedish and Estonian, these are regarded as individual letters, rather than 'vowel + diacritic' combinations (as happens in German). It also uses the characters ⟨å⟩, ⟨š⟩ and ⟨ž⟩ in foreign names and loanwords. In the Finnish and Swedish alphabets, ⟨å⟩, ⟨ä⟩ and ⟨ö⟩ collate as separate letters after ⟨z⟩, the others as variants of their base letter.
Hungarian uses the double-dot, the acute and double acute diacritics (the last is unique to Hungarian): (⟨ö⟩, ⟨ü⟩), (⟨á⟩, ⟨é⟩, ⟨í⟩, ⟨ó⟩, ⟨ú⟩) and (⟨ő⟩, ⟨ű⟩). The acute accent indicates the long form of a vowel (in case of ⟨i⟩/⟨í⟩, ⟨o⟩/⟨ó⟩, ⟨u⟩/⟨ú⟩) while the double acute performs the same function for ⟨ö⟩ and ⟨ü⟩. The acute accent can also indicate a different sound (more open, as in case of ⟨a⟩/⟨á⟩, ⟨e⟩/⟨é⟩). Both long and short forms of the vowels are listed separately in the Hungarian alphabet, but members of the pairs ⟨a⟩/⟨á⟩, ⟨e⟩/⟨é⟩, ⟨i⟩/⟨í⟩, ⟨o⟩/⟨ó⟩, ⟨ö⟩/⟨ő⟩, ⟨u⟩/⟨ú⟩ and ⟨ü⟩/⟨ű⟩ are collated in dictionaries as the same letter.
Livonian has the following letters: ⟨ā⟩, ⟨ä⟩, ⟨ǟ⟩, ⟨ḑ⟩, ⟨ē⟩, ⟨ī⟩, ⟨ļ⟩, ⟨ņ⟩, ⟨ō⟩, ⟨ȯ⟩, ⟨ȱ⟩, ⟨õ⟩, ⟨ȭ⟩, ⟨ŗ⟩, ⟨š⟩, ⟨ț⟩, ⟨ū⟩, ⟨ž⟩.
Dutch uses acute, circumflex, grave and two-dots diacritics with most vowels and cedilla with c, as in French. This results in ⟨á⟩, ⟨à⟩, ⟨ä⟩, ⟨é⟩, ⟨è⟩, ⟨ê⟩, ⟨ë⟩, ⟨í⟩, ⟨î⟩, ⟨ï⟩, ⟨ó⟩, ⟨ô⟩, ⟨ö⟩, ⟨ú⟩, ⟨û⟩, ⟨ü⟩ and ⟨ç⟩. This is mostly on words (and names) originating from French (like crème, café, gêne, façade). The acute accent is also used to stress the vowel (like één). The two-dots diacritic is used as a linguistic diaeresis (a vowel hiatus) that splits the two vowels (e.g., reële, reünie, coördinatie), rather than to indicate a linguistic umlaut as used in German.
Afrikaans uses 16 additional vowel forms, both uppercase and lowercase: ⟨á⟩, ⟨ä⟩, ⟨é⟩, ⟨è⟩, ⟨ê⟩, ⟨ë⟩, ⟨í⟩, ⟨î⟩, ⟨ï⟩, ⟨ó⟩, ⟨ô⟩, ⟨ö⟩, ⟨ú⟩, ⟨û⟩, ⟨ü⟩, ⟨ý⟩.
Faroese uses acutes and some additional letters. All are considered separate letters and have their own place in the alphabet: ⟨á⟩, ⟨í⟩, ⟨ó⟩, ⟨ú⟩, ⟨ý⟩ and ⟨ø⟩.
Icelandic uses acutes and other additional letters. All are considered separate letters, and have their own place in the alphabet: ⟨á⟩, ⟨é⟩, ⟨í⟩, ⟨ó⟩, ⟨ú⟩, ⟨ý⟩ and ⟨ö⟩.
Danish and Norwegian use additional characters like the o-slash ⟨ø⟩ and the a-overring ⟨å⟩. These letters come after ⟨z⟩ and ⟨æ⟩ in the order ⟨ø⟩, ⟨å⟩. Historically, the ⟨å⟩ has developed from a ligature by writing a small superscript ⟨a⟩ over a lowercase ⟨a⟩; if an ⟨å⟩ character is unavailable, some Scandinavian languages allow the substitution of a doubled a, thus ⟨aa⟩. The Scandinavian languages collate these letters after ⟨z⟩, but have different national collation standards.
Swedish uses a-diaeresis (⟨ä⟩) and o-diaeresis (⟨ö⟩) in the place of ash (⟨æ⟩) and slashed o (⟨ø⟩) in addition to the a-overring (⟨å⟩). Historically, the two-dots diacritic for the Swedish letters ⟨ä⟩ and ⟨ö⟩ developed from a small Gothic ⟨e⟩ written above the letters. These letters are collated after ⟨z⟩, in the order ⟨å⟩, ⟨ä⟩, ⟨ö⟩.
Portuguese uses a tilde with the vowels ⟨a⟩ and ⟨o⟩ and a cedilla with c.
Romanian uses a breve on the letter a (⟨ă⟩) to indicate the sound schwa /ə/, as well as a circumflex over the letters a (⟨â⟩) and i (⟨î⟩) for the sound /ɨ/. Romanian also writes a comma below the letters s (⟨ș⟩) and t (⟨ț⟩) to represent the sounds /ʃ/ and /t͡s/, respectively. These characters are collated after their non-diacritic equivalent.
Spanish uses acute accents (⟨á⟩, ⟨é⟩, ⟨í⟩, ⟨ó⟩, ⟨ú⟩) to indicate stress falling on a different syllable than the one it would fall on based on default rules, and to distinguish certain one-syllable homonyms (e.g. el (masculine singular definite article) and él [he]). The acute accent is also used to break up sequences of vowels that would normally be pronounced as a diphthong into two syllables, as in the word reír. Diaeresis is used on u only, to distinguish the combinations gue, gui /ge/, /gi/ from güe, güi /gwe/, /gwi/, e.g. vergüenza, lingüística. The tilde on ⟨ñ⟩ is not considered a diacritic as ⟨ñ⟩ is considered a distinct letter from ⟨n⟩, not a mutated form of it.
Gaj's Latin alphabet, used in Croatian and latinized Serbian, has the symbols ⟨č⟩, ⟨ć⟩, ⟨đ⟩, ⟨š⟩ and ⟨ž⟩, which are considered separate letters and are listed as such in dictionaries and other contexts in which words are listed according to alphabetical order. It also has one digraph including a diacritic, dž, which is also alphabetized independently, and follows ⟨d⟩ and precedes ⟨đ⟩ in the alphabetical order.
The Czech alphabet uses the acute (á é í ó ú ý), caron (č ď ě ň ř š ť ž), and for one letter (ů) the ring. (In ď and ť the caron is modified to look rather like an apostrophe.) Letters with a caron are considered separate letters, whereas accented vowels are considered only as longer variants of the unaccented letters. The acute does not affect alphabetical order; letters with a caron are ordered after their original counterparts.
Polish has the following letters: ą ć ę ł ń ó ś ź ż. These are considered to be separate letters: each of them is placed in the alphabet immediately after its Latin counterpart (e.g. ⟨ą⟩ between ⟨a⟩ and ⟨b⟩); ⟨ź⟩ and ⟨ż⟩ are placed after ⟨z⟩ in that order.
The Serbian Cyrillic alphabet has no diacritics; instead it has a grapheme (glyph) for every letter of its Latin counterpart (including Latin letters with diacritics and the digraphs dž, lj and nj).
The Slovak alphabet uses the acute (á é í ó ú ý ĺ ŕ), caron (č ď ľ ň š ť ž dž), umlaut (ä) and circumflex accent (ô). All of those are considered separate letters and are placed directly after the original counterpart in the alphabet.[8]
The basic Slovenian alphabet has the symbols ⟨č⟩, ⟨š⟩, and ⟨ž⟩, which are considered separate letters and are listed as such in dictionaries and other contexts in which words are listed according to alphabetical order. Letters with a caron are placed right after the letters as written without the diacritic. The letter ⟨đ⟩ ('d with bar') may be used in non-transliterated foreign words, particularly names, and is placed after ⟨č⟩ and before ⟨d⟩.
Azerbaijani includes the distinct Turkish alphabet letters Ç, Ğ, I, İ, Ö, Ş and Ü.
Crimean Tatar includes the distinct Turkish alphabet letters Ç, Ğ, I, İ, Ö, Ş and Ü. Unlike Turkish, Crimean Tatar also has the letter Ñ.
Gagauz includes the distinct Turkish alphabet letters Ç, Ğ, I, İ, Ö and Ü. Unlike Turkish, Gagauz also has the letters Ä, Ê, Ș and Ț. Ș and Ț are derived from the Romanian alphabet for the same sounds. Sometimes the Turkish Ş may be used instead of Ș.
Turkish uses a ⟨G⟩ with a breve (⟨Ğ⟩), two letters with two dots (⟨Ö⟩ and ⟨Ü⟩, representing two rounded front vowels), two letters with a cedilla (⟨Ç⟩ and ⟨Ş⟩, representing the affricate /tʃ/ and the fricative /ʃ/), and also possesses a dotted capital ⟨İ⟩ (and a dotless lowercase ⟨ı⟩ representing a high unrounded back vowel). In Turkish, each of these is a separate letter rather than a version of another letter; dotted capital ⟨İ⟩ and lowercase ⟨i⟩ are the same letter, as are dotless capital ⟨I⟩ and lowercase ⟨ı⟩. Typographically, ⟨Ç⟩ and ⟨Ş⟩ are sometimes rendered with an underdot, as in ⟨Ṣ⟩. The new Azerbaijani, Crimean Tatar, and Gagauz alphabets are based on the Turkish alphabet and its same diacriticized letters, with some additions.
Turkmen includes the distinct Turkish alphabet letters Ç, Ö, Ş and Ü. In addition, Turkmen uses A with diaeresis (Ä) to represent /æ/, N with caron (⟨Ň⟩) to represent the velar nasal /ŋ/, Y with acute (⟨Ý⟩) to represent the palatal approximant /j/, and Z with caron (⟨Ž⟩) to represent /ʒ/.
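The Turkish pairing of dotted ⟨İ⟩/⟨i⟩ and dotless ⟨I⟩/⟨ı⟩ described above has a practical consequence for software: Python's built-in `str.upper()` and `str.lower()` are not locale-aware, so they map `i` to `I` and `I` to `i`. The following sketch shows one way to respect the Turkish pairs; the helper names `tr_upper` and `tr_lower` are assumptions for illustration, and the code handles only precomposed characters.

```python
DOTTED_CAPITAL_I = "\u0130"  # İ
DOTLESS_SMALL_I = "\u0131"   # ı

def tr_upper(text):
    """Uppercase using the Turkish pairs i -> İ and ı -> I."""
    # Handle the two special pairs first, then let str.upper() do the rest.
    return (text.replace("i", DOTTED_CAPITAL_I)
                .replace(DOTLESS_SMALL_I, "I")
                .upper())

def tr_lower(text):
    """Lowercase using the Turkish pairs I -> ı and İ -> i."""
    return (text.replace("I", DOTLESS_SMALL_I)
                .replace(DOTTED_CAPITAL_I, "i")
                .lower())

print("istanbul".upper())     # default casing loses the dot
print(tr_upper("istanbul"))   # keeps the dotted capital
print(tr_lower("ISPARTA"))    # dotless capital maps to dotless lowercase
```

Doing the two replacements before the generic case conversion keeps the rest of the alphabet, including ⟨Ç⟩, ⟨Ğ⟩, ⟨Ö⟩, ⟨Ş⟩ and ⟨Ü⟩, on the standard case mappings.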
Albanian has two special letters, Ç and Ë, in upper and lowercase. They are placed next to the most similar letters in the alphabet, c and e respectively.
Esperanto has the symbols ŭ, ĉ, ĝ, ĥ, ĵ and ŝ, which are included in the alphabet, and considered separate letters.
Filipino also has the character ñ as a letter, which is collated between n and o.
Modern Greenlandic does not use any diacritics, although ø and å are used to spell loanwords, especially from Danish and English.[9][10] From 1851 until 1973, Greenlandic was written in an alphabet invented by Samuel Kleinschmidt, where long vowels and geminate consonants were indicated by diacritics on vowels (in the case of consonant gemination, the diacritics were placed on the vowel preceding the affected consonant). For example, the name Kalaallit Nunaat was spelled Kalâdlit Nunât. This scheme uses the circumflex (◌̂) to indicate a long vowel (e.g. ⟨ât, ît, ût⟩; modern: ⟨aat, iit, uut⟩), an acute accent (◌́) to indicate gemination of the following consonant (i.e. ⟨ák, ík, úk⟩; modern: ⟨akk, ikk, ukk⟩) and, finally, a tilde (◌̃) or a grave accent (◌̀), depending on the author, to indicate both vowel length and gemination of the following consonant (e.g. ⟨ãt/àt, ĩt/ìt, ũt/ùt⟩; modern: ⟨aatt, iitt, uutt⟩). ⟨ê, ô⟩, used only before ⟨r, q⟩, are now written ⟨ee, oo⟩ in Greenlandic.
Hawaiian uses the kahakō (macron) over vowels, although there is some disagreement over considering them as individual letters. The kahakō over a vowel can completely change the meaning of a word that is spelled the same but without the kahakō.
Kurdish uses the symbols Ç, Ê, Î, Ş and Û with the other 26 standard Latin alphabet symbols.
The Lakota alphabet uses the caron for the letters č, ȟ, ǧ, š, and ž. It also uses the acute accent for stressed vowels á, é, í, ó, ú, áŋ, íŋ, úŋ.
Malay uses some diacritics such as á, ā, ç, í, ñ, ó, š, ú. The use of diacritics continued until the late 19th century, except for ā and ē.
Maltese uses a C, G, and Z with a dot over them (Ċ, Ġ, Ż), and also has an H with an extra horizontal bar. For uppercase H, the extra bar is written slightly above the usual bar. For lowercase H, the extra bar is written crossing the vertical, like a t, and not touching the lower part (Ħ, ħ). The above characters are considered separate letters. The letter 'c' without a dot has fallen out of use due to redundancy. 'Ċ' is pronounced like the English 'ch' and 'k' is used as a hard c as in 'cat'. 'Ż' is pronounced just like the English 'Z' as in 'Zebra', while 'Z' is used to make the sound of 'ts' in English (like 'tsunami' or 'maths'). 'Ġ' is used as a soft 'G' like in 'geometry', while the 'G' sounds like a hard 'G' like in 'log'. The digraph 'għ' (called għajn after the Arabic letter name ʻayn for غ) is considered separate, and sometimes ordered after 'g', whilst in other volumes it is placed between 'n' and 'o' (the Latin letter 'o' originally evolved from the shape of Phoenician ʻayin, which was traditionally collated after Phoenician nūn).
Vietnamese uses the horn diacritic for the letters ơ and ư; the circumflex for the letters â, ê, and ô; the breve for the letter ă; and a bar through the letter đ. Separately, it also has á, à, ả, ã and ạ, the five tones used for vowels besides the flat tone 'a'.
Belarusian, Bulgarian, Russian and Ukrainian have the letter ⟨й⟩.
Belarusian and Russian have the letter ⟨ё⟩. In Russian, this letter is usually replaced by ⟨е⟩, although it has a different pronunciation. The use of ⟨е⟩ instead of ⟨ё⟩ does not affect the pronunciation. Ё is always used in children's books and in dictionaries. A minimal pair is все (vs'e, "everybody" pl.) and всё (vs'o, "everything" n. sg.). In Belarusian the replacement by ⟨е⟩ is a mistake; in Russian, it is permissible to use either ⟨е⟩ or ⟨ё⟩ for ⟨ё⟩ but the former is more common in everyday writing (as opposed to instructional or juvenile writing).
In Bulgarian and Macedonian the possessive pronoun ѝ (ì, "her") is spelled with a grave accent in order to distinguish it from the conjunction и (i, "and").
The acute accent ◌́ above any vowel in Cyrillic alphabets is used in dictionaries, books for children and foreign learners to indicate the word stress; it can also be used for disambiguation of similarly spelled words with different lexical stresses.
English is one of the few European languages that does not have many words that contain diacritical marks. Instead, digraphs are the main way the Modern English alphabet adapts the Latin to its phonemes. Exceptions are unassimilated foreign loanwords, including borrowings from French (and, increasingly, Spanish, like jalapeño and piñata); however, the diacritic is also sometimes omitted from such words. Loanwords that frequently appear with the diacritic in English include café, résumé or resumé (a usage that helps distinguish it from the verb resume), soufflé, and naïveté (see English terms with diacritical marks). In older practice (and even among some orthographically conservative modern writers), one may see examples such as élite, mêlée and rôle.
English speakers and writers once used the diaeresis more often than now in words such as coöperation (from Fr. coopération), zoölogy (from Grk. zoologia), and seeër (now more commonly see-er or simply seer) as a way of indicating that adjacent vowels belonged to separate syllables, but this practice has become far less common. The New Yorker magazine is a major publication that continues to use the diaeresis in place of a hyphen for clarity and economy of space.[12]
A few English words, often when used out of context, especially in isolation, can only be distinguished from other words of the same spelling by using a diacritic or modified letter. These include exposé, lamé, maté, öre, øre, résumé and rosé. In a few words, diacritics that did not exist in the original have been added for disambiguation, as in maté (from Sp. and Port. mate), saké (the standard Romanization of the Japanese has no accent mark), and Malé (from Dhivehi މާލެ), to clearly distinguish them from the English words mate, sake, and male.
The acute and grave accents are occasionally used in poetry and lyrics: the acute to indicate stress overtly where it might be ambiguous (rébel vs. rebél) or nonstandard for metrical reasons (caléndar), the grave to indicate that an ordinarily silent or elided syllable is pronounced (warnèd, parlìament).
In certain personal names such as Renée and Zoë, often two spellings exist, and the person's own preference will be known only to those close to them. Even when the name of a person is spelled with a diacritic, like Charlotte Brontë, this may be dropped in English-language articles, and even in official documents such as passports, due either to carelessness, the typist not knowing how to enter letters with diacritical marks, or technical reasons (California, for example, does not allow names with diacritics, as the computer system cannot process such characters). Diacritics also appear in some worldwide company names and/or trademarks, such as Nestlé and Citroën.
The following languages have letter-diacritic combinations that are not considered independent letters.
Afrikaans uses a diaeresis to mark vowels that are pronounced separately and not as one would expect where they occur together, for examplevoel (to feel) as opposed tovoël (bird). The circumflex is used inê, î, ô andû generally to indicate longclose-mid, as opposed toopen-mid vowels, for example in the wordswêreld (world) andmôre (morning, tomorrow). The acute accent is used to add emphasis in the same way as underlining or writing in bold or italics in English, for exampleDit is jóú boek (It isyour book). The grave accent is used to distinguish between words that are different only in placement of the stress, for exampleappel (apple) andappèl (appeal) and in a few cases where it makes no difference to the pronunciation but distinguishes between homophones. The two most usual cases of the latter are in the sayingsòf... òf (either... or) andnòg... nòg (neither... nor) to distinguish them fromof (or) andnog (again, still).
Aymara uses a diacritical horn over p, q, t, k, ch.
Catalan has the following composite characters: à, ç, é, è, í, ï, ó, ò, ú, ü, l·l. The acute and the grave indicate stress and vowel height, the cedilla marks the result of a historical palatalization, the diaeresis indicates either a hiatus, or that the letter u is pronounced when the graphemes gü, qü are followed by e or i, and the interpunct (·) distinguishes the different values of ll/l·l.
Dutch uses the diaeresis. For example, in ruïne it means that the u and the i are separately pronounced in their usual way, and not in the way that the combination ui is normally pronounced. Thus it works as a separation sign and not as an indication for an alternative version of the i. Diacritics can be used for emphasis (érg koud for very cold) or for disambiguation between a number of words that are spelled the same when context does not indicate the correct meaning (één appel = one apple, een appel = an apple; vóórkomen = to occur, voorkómen = to prevent). Grave and acute accents are used on a very small number of words, mostly loanwords. The ç also appears in some loanwords.[13]
Faroese. Non-Faroese accented letters are not added to the Faroese alphabet. These include é, ö, ü, å and recently also letters like š, ł, and ć.
Filipino has the following composite characters: á, à, â, é, è, ê, í, ì, î, ó, ò, ô, ú, ù, û. Everyday use of diacritics in Filipino is, however, uncommon, and is meant only to distinguish between homonyms, one with the usual penultimate stress and one with a different stress placement. This aids both comprehension and pronunciation if both are relatively adjacent in a text, or if a word is itself ambiguous in meaning. The letter ñ ("eñe") is not an n with a diacritic, but is rather collated as a separate letter, one of eight borrowed from Spanish. Diacritics appear in Spanish loanwords and names observing Spanish orthography rules.
Finnish. Carons in š and ž appear only in foreign proper names and loanwords, but may be substituted with sh or zh if and only if it is technically impossible to produce accented letters in the medium. Unlike in Estonian, š and ž are not considered distinct letters in Finnish.
French uses five diacritics. The grave (accent grave) marks the sound /ɛ/ when over an e, as in père ("father"), or is used to distinguish words that are otherwise homographs such as a/à ("has"/"to") or ou/où ("or"/"where"). The acute (accent aigu) is only used in "é", modifying the "e" to make the sound /e/, as in étoile ("star"). The circumflex (accent circonflexe) generally denotes that an S once followed the vowel in Old French or Latin, as in fête ("party"), the Old French being feste and the Latin being festum. Whether the circumflex modifies the vowel's pronunciation depends on the dialect and the vowel. The cedilla (cédille) indicates that a normally hard "c" (before the vowels "a", "o", and "u") is to be pronounced /s/, as in ça ("that"). The diaeresis diacritic (French: tréma) indicates that two adjacent vowels that would normally be pronounced as one are to be pronounced separately, as in Noël ("Christmas").
Galician vowels can bear an acute (á, é, í, ó, ú) to indicate stress or to differentiate two words that are otherwise written the same (é, 'is' vs. e, 'and'), but the diaeresis is only used with ï and ü to show two separate vowel sounds in pronunciation. Only in foreign words may Galician use other diacritics such as ç (common during the Middle Ages), ê, or à.
German uses the three umlauted characters ä, ö and ü. These diacritics indicate vowel changes. For instance, the word Ofen [ˈoːfən] "oven" has the plural Öfen [ˈøːfən]. The mark originated as a superscript e; a handwritten blackletter e resembles two parallel vertical lines, like a diaeresis. Due to this history, "ä", "ö" and "ü" can be written as "ae", "oe" and "ue" respectively, if the umlaut letters are not available.
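The "ae", "oe", "ue" fallback described above can be sketched in a few lines of Python. The function name `replace_umlauts` is hypothetical, and the handling of capital letters (Ä to Ae, and so on) is an assumption of this sketch rather than something stated in the text.

```python
import unicodedata

# Lowercase mappings follow the text; the capitalized forms are an
# assumption made for this sketch.
UMLAUT_FALLBACK = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
}


def replace_umlauts(text: str) -> str:
    """Rewrite German umlaut letters using the e-digraph fallback."""
    # NFC first, so that "u" followed by a combining diaeresis (U+0308)
    # is treated the same as the precomposed letter "ü".
    composed = unicodedata.normalize("NFC", text)
    return composed.translate(str.maketrans(UMLAUT_FALLBACK))


if __name__ == "__main__":
    print(replace_umlauts("Öfen"))   # Oefen
    print(replace_umlauts("für"))    # fuer
```

The mapping is not reversible in general, because "ae", "oe" and "ue" also occur in German words that never contained an umlaut.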
Hebrew has a variety of diacritic marks known as niqqud that are placed above and below the letters to represent vowels. These must be distinguished from cantillation marks, which are keys to pronunciation and syntax.
The International Phonetic Alphabet uses diacritic symbols and characters to indicate phonetic features or secondary articulations.
Irish uses the acute to indicate that a vowel is long: á, é, í, ó, ú. It is known as síneadh fada "long sign" or simply fada "long" in Irish. In the older Gaelic type, overdots are used to indicate lenition of a consonant: ḃ, ċ, ḋ, ḟ, ġ, ṁ, ṗ, ṡ, ṫ.
Italian mainly has the acute and the grave (à, è/é, ì, ò/ó, ù), typically to indicate a stressed syllable that would not be stressed under the normal rules of pronunciation but sometimes also to distinguish between words that are otherwise spelled the same way (e.g. "e", and; "è", is). Despite its rare use, Italian orthography allows the circumflex (î) too, in two cases: it can be found in old literary contexts (roughly up to the 19th century) to signal a syncope (fêro→fecero, they did), or in modern Italian to signal the contraction of "-ii" that arises when the plural ending -i is added to a root that itself ends in -i; e.g., s. demonio, p. demonii→demonî. In this case the circumflex also signals that the word intended is not demoni, plural of "demone", which differs in the position of the accent (demònî, "devils"; dèmoni, "demons").
Maltese also uses the grave on its vowels to indicate stress at the end of a word with two syllables or more: lowercase letters à, è, ì, ò, ù; capital letters À, È, Ì, Ò, Ù.
Occitan has the following composite characters: á, à, ç, é, è, í, ï, ó, ò, ú, ü, n·h, s·h. The acute and the grave indicate stress and vowel height, the cedilla marks the result of a historical palatalization, the diaeresis indicates either a hiatus, or that the letter u is pronounced when the graphemes gü, qü are followed by e or i, and the interpunct (·) distinguishes the different values of nh/n·h and sh/s·h (i.e., that the letters are supposed to be pronounced separately, not combined into "ny" and "sh").
Portuguese has the following composite characters: à, á, â, ã, ç, é, ê, í, ó, ô, õ, ú. The acute and the circumflex indicate stress and vowel height, the grave indicates crasis, the tilde represents nasalization, and the cedilla marks the result of a historical lenition.
Acutes are also used in Slavic language dictionaries and textbooks to indicate lexical stress, placed over the vowel of the stressed syllable. This can also serve to disambiguate meaning: in Russian, писа́ть (pisáť) means "to write" but пи́сать (písať) means "to piss", and бо́льшая часть means "the biggest part" whereas больша́я часть means "the big part".
Spanish uses the acute and the diaeresis. The acute is used on a vowel in a stressed syllable in words with irregular stress patterns. It can also be used to "break up" a diphthong as in tío (pronounced [ˈti.o], rather than [ˈtjo] as it would be without the accent). Moreover, the acute can be used to distinguish words that otherwise are spelled alike, such as si ("if") and sí ("yes"), and also to distinguish interrogative and exclamatory pronouns from homophones with a different grammatical function, such as donde/¿dónde? ("where"/"where?") or como/¿cómo? ("as"/"how?"). The acute may also be used to avoid typographical ambiguity, as in 1 ó 2 ("1 or 2"; without the acute this might be interpreted as "1 0 2"). The diaeresis is used only over u (ü) for it to be pronounced [w] in the combinations gue and gui, where u is normally silent, for example ambigüedad. In poetry, the diaeresis may be used on i and u as a way to force a hiatus. As foreshadowed above, in nasal ñ the tilde (squiggle) is not considered a diacritic sign at all, but a composite part of a distinct glyph, with its own chapter in the dictionary: a glyph that denotes the 15th letter of the Spanish alphabet.
Swedish uses the acute to show non-standard stress, for example in kafé (café) and resumé (résumé). This occasionally helps resolve ambiguities, such as ide (hibernation) versus idé (idea). In these words, the acute is not optional. Some proper names use non-standard diacritics, such as Carolina Klüft and Staël von Holstein. For foreign loanwords the original accents are strongly recommended, unless the word has been infused into the language, in which case they are optional. Hence crème fraîche but ampere. Swedish also has the letters å, ä, and ö, but these are considered distinct letters, not a and o with diacritics.
Tamil does not have any diacritics in itself, but uses the Arabic numerals 2, 3 and 4 as diacritics to represent aspirated, voiced, and voiced-aspirated consonants when Tamil script is used to write long passages in Sanskrit.
Vietnamese uses the acute (dấu sắc), the grave (dấu huyền), the tilde (dấu ngã), the underdot (dấu nặng) and the hook above (dấu hỏi) on vowels as tone indicators.
Welsh uses the circumflex, diaeresis, acute, and grave on its seven vowels a, e, i, o, u, w, y. The most common is the circumflex (which it calls to bach, meaning "little roof", or acen grom "crooked accent", or hirnod "long sign") to denote a long vowel, usually to disambiguate it from a similar word with a short vowel or a semivowel. The rarer grave accent has the opposite effect, shortening vowel sounds that would usually be pronounced long. The acute accent and diaeresis are also occasionally used, to denote stress and vowel separation respectively. The w-circumflex ŵ and the y-circumflex ŷ are among the most commonly accented characters in Welsh, but unusual in languages generally, and were until recently very hard to obtain in word-processed and HTML documents.
Several languages that are not written with the Roman alphabet are transliterated, or romanized, using diacritics. Examples:
Arabic has several romanisations, depending on the type of application, region, intended audience, country, etc. Many of them use diacritics extensively; for example, some methods use an underdot for rendering emphatic consonants (ṣ, ṭ, ḍ, ẓ, ḥ). The macron is often used to render long vowels. š is often used for /ʃ/, ġ for /ɣ/.
Chinese has several romanizations that use the umlaut, but only on u (ü). In Hanyu Pinyin, the four tones of Mandarin Chinese are denoted by the macron (first tone), acute (second tone), caron (third tone) and grave (fourth tone) diacritics. Example: ā, á, ǎ, à.
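The correspondence between tone numbers and diacritics can be illustrated with a short Python sketch that attaches the matching Unicode combining mark to a vowel and then composes the result. The helper name `add_tone` and the table `TONE_MARKS` are hypothetical; the sketch handles a single vowel only and does not implement the Pinyin rules for choosing which vowel of a syllable carries the mark.

```python
import unicodedata

# Combining marks for the four Mandarin tones as written in Hanyu Pinyin.
TONE_MARKS = {
    1: "\u0304",  # combining macron
    2: "\u0301",  # combining acute accent
    3: "\u030C",  # combining caron
    4: "\u0300",  # combining grave accent
}


def add_tone(vowel: str, tone: int) -> str:
    """Return the vowel carrying the diacritic for the given tone (1-4)."""
    # NFC composes base letter plus mark into a single precomposed
    # character where Unicode defines one.
    return unicodedata.normalize("NFC", vowel + TONE_MARKS[tone])


if __name__ == "__main__":
    print([add_tone("a", t) for t in (1, 2, 3, 4)])
    print([add_tone("ü", t) for t in (1, 2, 3, 4)])
```

Applied to "a", the four tones give the example sequence ā, á, ǎ, à quoted above.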
Sanskrit, as well as many of its descendants, like Hindi and Bengali, uses a lossless romanization system, IAST. This includes several letters with diacritical markings, such as the macron (ā, ī, ū), over- and underdots (ṛ, ḥ, ṃ, ṇ, ṣ, ṭ, ḍ) as well as a few others (ś, ñ).
Possibly the greatest number of combining diacritics required to compose a valid character in any language encoded in Unicode is 8, for the "well-known grapheme cluster in Tibetan and Ranjana scripts", HAKṢHMALAWARAYAṀ.[14]
It consists of:
U+0F67 ཧ TIBETAN LETTER HA
U+0F90 ྐ TIBETAN SUBJOINED LETTER KA
U+0FB5 ྵ TIBETAN SUBJOINED LETTER SSA
U+0FA8 ྨ TIBETAN SUBJOINED LETTER MA
U+0FB3 ླ TIBETAN SUBJOINED LETTER LA
U+0FBA ྺ TIBETAN SUBJOINED LETTER FIXED-FORM WA
U+0FBC ྼ TIBETAN SUBJOINED LETTER FIXED-FORM RA
U+0FBB ྻ TIBETAN SUBJOINED LETTER FIXED-FORM YA
U+0F82 ྂ TIBETAN SIGN NYI ZLA NAA DA
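The composition listed above can be checked with Python's standard `unicodedata` module. The sketch below is only an illustration: it builds the nine-code-point sequence, prints each code point's general category and name, and counts the characters whose category is a mark (any category beginning with "M"). The names `CLUSTER` and `count_combining_marks` are hypothetical.

```python
import unicodedata

# The nine code points listed above: one base letter plus eight marks.
CLUSTER = "\u0F67\u0F90\u0FB5\u0FA8\u0FB3\u0FBA\u0FBC\u0FBB\u0F82"


def count_combining_marks(text: str) -> int:
    """Count characters whose Unicode general category is a mark (Mn, Mc, Me)."""
    return sum(1 for ch in text if unicodedata.category(ch).startswith("M"))


if __name__ == "__main__":
    for ch in CLUSTER:
        print(f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))
    print("combining marks:", count_combining_marks(CLUSTER))
```

Under this definition the base letter HA is not counted, which is how the figure of 8 arises from a sequence of 9 code points.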
An example of rendering, which may appear broken depending on the browser: ཧྐྵྨླྺྼྻྂ
Some users have explored the limits of rendering in web browsers and other software by "decorating" words with excessive nonsensical diacritics per character to produce so-called Zalgo text.
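Software that accepts user-supplied text sometimes limits such stacking. The following Python sketch shows one possible approach, which keeps at most a fixed number of consecutive combining marks after each base character. The function name `limit_combining_marks` and the default limit of 2 are arbitrary assumptions of this sketch, not an established standard.

```python
import unicodedata


def limit_combining_marks(text: str, max_marks: int = 2) -> str:
    """Keep at most max_marks consecutive combining marks per base character."""
    kept = []
    run = 0  # number of marks seen since the last non-mark character
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.category(ch).startswith("M"):
            run += 1
            if run > max_marks:
                continue  # drop the excess mark
        else:
            run = 0
        kept.append(ch)
    # Recompose so that ordinary accented letters come back precomposed.
    return unicodedata.normalize("NFC", "".join(kept))


if __name__ == "__main__":
    zalgo = "Z\u0301\u0302\u0303\u0304\u0305a"
    print(len(zalgo), "->", len(limit_combining_marks(zalgo)))
    print(limit_combining_marks("café"))
```

A low limit also damages legitimate text: the Tibetan cluster described above needs eight marks on a single base letter, so any such threshold is a trade-off.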
^ Baum, Dan (16 December 2010). "The New Yorker's odd mark — the diaeresis". dscriber. Archived from the original on 16 December 2010. Among the many mysteries of The New Yorker is that funny little umlaut over words like coöperate and reëlect. The New Yorker seems to be the only publication on the planet that uses it, and I always found it a little pretentious until I did some research. Turns out, it's not an umlaut. It's a diaeresis.
^ Sweet, Henry (1877). A Handbook of Phonetics. Cambridge: Cambridge University Press. pp. 174–175. Even letters with accents and diacritics [...] being only cast for a few founts, act practically as new letters. [...] We may consider the h in sh and th simply as a diacritic written for convenience on a line with the letter it modifies.
^ Nestle, Eberhard (1888). Syrische Grammatik mit Litteratur, Chrestomathie und Glossar. Berlin: H. Reuther's Verlagsbuchhandlung. [translated to English as Syriac grammar with bibliography, chrestomathy and glossary, by R. S. Kennedy. London: Williams & Norgate 1889].
^ Coakley, J. F. (2002). Robinson's Paradigms and Exercises in Syriac Grammar (5th ed.). Oxford University Press. ISBN 978-0-19-926129-1.
^ Michaelis, Ioannis Davidis (1784). Grammatica Syriaca.