Chapter 17

Southeast Asia-II

Indonesia and the Philippines

Four traditional Philippine scripts are described here: Tagalog (Baybayin), Hanunóo, Buhid, and Tagbanwa. They have limited current use. Each is a very simplified abugida which makes use of a few nonspacing vowel signs.

Although the official language of Indonesia, Bahasa Indonesia, is written in the Latin script, Indonesia has many local, traditional scripts, which are ultimately derived from Brahmi. Some of these scripts are documented in this chapter. Balinese and Javanese are closely related, highly ornate scripts; Balinese is primarily used for the Balinese language on the island of Bali, and Javanese for the Javanese language on the island of Java. Sundanese is used to write the Sundanese language on the island of Java. The Rejang script is used to write the Rejang language in southwest Sumatra, and the Batak script is used to write several Batak dialects, also on the island of Sumatra. Buginese (Lontara) and Makasar are two similar scripts that developed on the island of Sulawesi and are used to write Buginese, Makasar, and other languages.

Kawi, a historical script derived from Brahmi, is the common ancestor of several or perhaps all of the scripts described in this chapter. Kawi was used to write the Old Javanese, Sanskrit, Old Malay, Old Balinese, and Old Sundanese languages in insular Southeast Asia between the 8th and 16th centuries.

#17.1 Philippine Scripts: Tagalog, Hanunóo, Buhid, and Tagbanwa

#17.1.1 Tagalog: U+1700–U+171F

#Hanunóo: U+1720–U+173F

#Buhid: U+1740–U+175F

#Tagbanwa: U+1760–U+177F

The Tagalog (Baybayin), Hanunóo, Buhid, and Tagbanwa scripts are traditional scripts of the Philippines, and are in limited use today. South Indian scripts of the Pallava dynasty made their way to the Philippines, although the exact route is uncertain. They may have been transported by way of the Kavi scripts of Western Java between the tenth and fourteenth centuries CE.

Written accounts of the Tagalog script by Spanish missionaries and documents in Tagalog date from the mid-1500s. The first book in this script was printed in Manila in 1593. While the Tagalog script (also known as Baybayin) was used to write Tagalog, Bisaya, Ilocano, and other languages, it fell out of normal use by the mid-1700s. The modern Tagalog language (also known as Filipino) is now primarily written in the Latin script.

The Hanunóo, Buhid, and Tagbanwa scripts are related to Tagalog but may not be directly descended from it. The Hanunóo and the Buhid peoples live in Mindoro, while the Tagbanwa live in Palawan. Hanunóo enjoys the most use; it is widely used to write love poetry, a popular pastime among the Hanunóo. Tagbanwa is used less often.

#17.1.2 Principles of the Philippine Scripts

The Philippine scripts share features with the other Brahmi-derived scripts to which they are related.

#Consonant Letters. Philippine scripts have consonants containing an inherent -a vowel, which may be modified by the addition of vowel signs or canceled (killed) by the use of a virama-type mark. No conjunct consonants are employed in the Philippine scripts.

Two forms of the Tagalog letter ra are encoded: U+170D TAGALOG LETTER RA represents the preferred modern form that derived from the letter da. In contrast, U+171F TAGALOG LETTER ARCHAIC RA represents a distinct historical form, also known as the Zambales ra.

#Independent Vowel Letters. Philippine scripts use independent vowels to write syllables that do not begin with one of the consonant letters.

#Dependent Vowel Signs. The vowel -i is written with a mark above the associated consonant, and the vowel -u with an identical mark below. The mark is known as kudlit “diacritic,” tuldik “accent,” or tuldok “dot” in Tagalog, and as ulitan “diacritic” in Tagbanwa. The Philippine scripts employ only the two vowel signs i and u, which are also used to stand for the vowels e and o, respectively.

#Virama. Although all languages normally written with the Philippine scripts have syllables ending in consonants, not all of the scripts have a mechanism for expressing the canceled -a. As a result, in those orthographies, the final consonants are unexpressed.

Francisco Lopez introduced a cross-shaped virama for the Tagalog script in his 1620 catechism, but this innovation did not seem to find favor with native users, who seem to have considered the script adequate without it (they preferred ᜃᜃᜉᜒ kakapi to ᜃᜃᜋ᜔ᜉᜒ kakampi). A similar reform for the Hanunóo script seems to have been better received. The Hanunóo pamudpod was devised by Antoon Postma, who went to the Philippines from the Netherlands in the mid-1950s. In traditional orthography, ᜰᜲ ᜠᜩᜳ ᜪ ᜢᜩᜧ si apu ba upada is, with the pamudpod, rendered more accurately as ᜰᜲ ᜠᜬ᜴ᜩᜳᜧ᜴ ᜪᜬ᜴ ᜢᜩᜧᜨ᜴ si aypud bay upadan; the Hanunóo pronunciation is si aypod bay upadan. U+1715 TAGALOG SIGN PAMUDPOD represents the pamudpod sign borrowed from Hanunóo for use in contemporary texts of the Tagalog script.

The Tagalog virama, Hanunóo pamudpod, and Tagalog pamudpod only cancel the inherent -a; they do not conjoin letters.
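As a rough illustration of how a canceled final consonant is encoded, the sketch below builds the traditional and reformed spellings of kakapi/kakampi as code point sequences. The code points for KA, MA, PA, the vowel sign I, and U+1714 TAGALOG SIGN VIRAMA are read off the Tagalog code chart rather than stated in this section, and the helper name `closed` is hypothetical.

```python
# Code points from the Tagalog block (U+1700-U+171F).
# The helper name below is illustrative only, not part of any library.
KA, MA, PA = "\u1703", "\u170B", "\u1709"
VOWEL_SIGN_I = "\u1712"
VIRAMA = "\u1714"    # cross-shaped virama
PAMUDPOD = "\u1715"  # pamudpod borrowed from Hanunoo

def closed(consonant: str, killer: str = VIRAMA) -> str:
    """Return a consonant whose inherent -a is canceled by the given mark."""
    return consonant + killer

# Traditional spelling: the syllable-final m is simply left unwritten.
kakapi = KA + KA + PA + VOWEL_SIGN_I
# Reformed spelling: the final m is written and its inherent -a is canceled.
kakampi = KA + KA + closed(MA) + PA + VOWEL_SIGN_I
# Same word using the pamudpod instead of the virama.
kakampi_pamudpod = KA + KA + closed(MA, PAMUDPOD) + PA + VOWEL_SIGN_I

for word in (kakapi, kakampi, kakampi_pamudpod):
    print(" ".join(f"U+{ord(ch):04X}" for ch in word))
```

In each case the killer follows the consonant it modifies in logical order, and no conjunct is implied between MA and PA.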

#Directionality. The Philippine scripts are read from left to right in horizontal lines running from top to bottom. They may be written or carved either in that manner or in vertical lines running from bottom to top, moving from left to right. In the latter case, the letters are written sideways so they may be read horizontally. This method of writing is probably due to the medium and writing implements used. Text is often scratched with a sharp instrument onto beaten strips of bamboo, which are held pointing away from the body and worked from the proximal to distal ends, in columns from left to right.

#Rendering. In Tagalog and Tagbanwa, the vowel signs simply rest over or under the consonants. In Hanunóo and Buhid, ligatures are often formed, as shown in Table 17-1.

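#Table 17-1. Hanunóo and Buhid Vowel Sign Combinations

| Hanunóo x | x + ᜲ | x + ᜳ | Buhid x | x + ᝒ | x + ᝓ |
|---|---|---|---|---|---|
| ᜣ | ᜣᜲ | ᜣᜳ | ᝃ | ᝃᝒ | ᝃᝓ |
| ᜤ | ᜤᜲ | ᜤᜳ | ᝄ | ᝄᝒ | ᝄᝓ |
| ᜥ | ᜥᜲ | ᜥᜳ | ᝅ | ᝅᝒ | ᝅᝓ |
| ᜦ | ᜦᜲ | ᜦᜳ | ᝆ | ᝆᝒ | ᝆᝓ |
| ᜧ | ᜧᜲ | ᜧᜳ | ᝇ | ᝇᝒ | ᝇᝓ |
| ᜨ | ᜨᜲ | ᜨᜳ | ᝈ | ᝈᝒ | ᝈᝓ |
| ᜩ | ᜩᜲ | ᜩᜳ | ᝉ | ᝉᝒ | ᝉᝓ |
| ᜪ | ᜪᜲ | ᜪᜳ | ᝊ | ᝊᝒ | ᝊᝓ |
| ᜫ | ᜫᜲ | ᜫᜳ | ᝋ | ᝋᝒ | ᝋᝓ |
| ᜬ | ᜬᜲ | ᜬᜳ | ᝌ | ᝌᝒ | ᝌᝓ |
| ᜭ | ᜭᜲ | ᜭᜳ | ᝍ | ᝍᝒ | ᝍᝓ |
| ᜮ | ᜮᜲ | ᜮᜳ | ᝎ | ᝎᝒ | ᝎᝓ |
| ᜯ | ᜯᜲ | ᜯᜳ | ᝏ | ᝏᝒ | ᝏᝓ |
| ᜰ | ᜰᜲ | ᜰᜳ | ᝐ | ᝐᝒ | ᝐᝓ |
| ᜱ | ᜱᜲ | ᜱᜳ | ᝑ | ᝑᝒ | ᝑᝓ |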
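The character sequences behind the rows of Table 17-1 can be generated mechanically. The following is a minimal sketch, assuming the consonant ranges U+1723..U+1731 (Hanunóo) and U+1743..U+1751 (Buhid) as read off the code charts; the dictionary layout and function name are hypothetical.

```python
# Consonant ranges and vowel signs as read off the Hanunoo and Buhid code charts.
# The structure and names here are illustrative assumptions.
SCRIPTS = {
    "Hanunoo": {"consonants": range(0x1723, 0x1732), "i": 0x1732, "u": 0x1733},
    "Buhid": {"consonants": range(0x1743, 0x1752), "i": 0x1752, "u": 0x1753},
}

def vowel_combinations(script: str) -> list[tuple[str, str, str]]:
    """Return (x, x + i, x + u) for every consonant x of the given script."""
    info = SCRIPTS[script]
    rows = []
    for cp in info["consonants"]:
        base = chr(cp)
        rows.append((base, base + chr(info["i"]), base + chr(info["u"])))
    return rows

for name in SCRIPTS:
    for base, with_i, with_u in vowel_combinations(name):
        print(name, base, with_i, with_u)
```

The strings only carry the encoded sequences; whether a ligature is actually displayed depends on the font and rendering engine.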

#Punctuation. Punctuation has been unified for the Philippine scripts. In the Hanunóo block, U+1735 PHILIPPINE SINGLE PUNCTUATION and U+1736 PHILIPPINE DOUBLE PUNCTUATION are encoded.
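One practical consequence of the unified punctuation is that a block lookup alone will label these two marks as Hanunóo even when they occur in Tagalog, Buhid, or Tagbanwa text. The sketch below, using the block ranges from the headings of this section, shows one way a tool might account for that; the function names are hypothetical.

```python
# Block ranges as given in the section headings; names are plain labels.
PHILIPPINE_BLOCKS = {
    "Tagalog": (0x1700, 0x171F),
    "Hanunoo": (0x1720, 0x173F),
    "Buhid": (0x1740, 0x175F),
    "Tagbanwa": (0x1760, 0x177F),
}
# U+1735 and U+1736 are shared by all four scripts.
SHARED_PUNCTUATION = {0x1735, 0x1736}

def philippine_block(ch: str):
    """Return the name of the Philippine block containing ch, or None."""
    cp = ord(ch)
    for name, (lo, hi) in PHILIPPINE_BLOCKS.items():
        if lo <= cp <= hi:
            return name
    return None

def may_appear_in(ch: str) -> list:
    """Return the scripts whose text may contain ch (hypothetical helper)."""
    if ord(ch) in SHARED_PUNCTUATION:
        return sorted(PHILIPPINE_BLOCKS)
    block = philippine_block(ch)
    return [block] if block else []

print(philippine_block("\u1735"), may_appear_in("\u1735"))
```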

#17.2 Buginese

#17.2.1 Buginese: U+1A00–U+1A1F

The Buginese script is used on the island of Sulawesi, mainly in the southwest. A variety of traditional literature has been printed in it. As of 1971, as many as 2.3 million speakers of Buginese were reported in the southern part of Sulawesi. The Buginese script is one of the easternmost of the Brahmi scripts and is perhaps related to Javanese. It is attested as early as the fourteenth century CE. Buginese bears some affinity to Tagalog and, like Tagalog, does not traditionally record final consonants. The Buginese language, an Austronesian language with a rich traditional literature, is one of the foremost languages of Indonesia. The script was previously also used to write the Makasar, Bimanese, and Madurese languages.

#Repertoire. The repertoire contained in the Buginese block is intended to represent the core set of Buginese characters in standard printing fonts developed in the mid-19th century for the Bugis and Makasar languages. Variant letterforms and other extensions seen in palm leaf manuscripts or additional letters used in some languages are not yet encoded in this block. A visible virama symbol has also been attested, but is not needed for this core repertoire for Buginese.

#Structure. Buginese vowel signs are used in a manner similar to that seen in other Brahmi-derived scripts. Consonants have an inherent /a/ vowel sound. Consonant conjuncts are not formed.

#Ligature. One ligature is found in the Buginese script. It is formed by the ligation of <a, -i> + ya to represent îya, as shown in the first line of Figure 17-1. The ligature takes the shape of the Buginese letter ya, but with a dot applied at the far left side. Contrast that with the normal representation of the syllable yi, in which the dot indicating the vowel sign occurs in a centered position, as shown in the second line of Figure 17-1. The ligature for îya is not obligatory; it would be requested by inserting a zero width joiner.

#Figure 17-1. Buginese Ligature
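Since the figure is not reproduced here, the following sketch spells out the sequences involved. It assumes that the zero width joiner is placed between <a, -i> and ya, which is an interpretation of the text above rather than something it states explicitly; the variable names are illustrative.

```python
import unicodedata

# Code points from the Buginese block (U+1A00-U+1A1F).
A, YA = "\u1A15", "\u1A10"
VOWEL_SIGN_I = "\u1A17"
ZWJ = "\u200D"  # ZERO WIDTH JOINER

# Assumption: the joiner sits between <a, -i> and ya to request the ligature.
iya_plain = A + VOWEL_SIGN_I + YA
iya_ligature_request = A + VOWEL_SIGN_I + ZWJ + YA
# The ordinary syllable yi, for contrast.
yi = YA + VOWEL_SIGN_I

for label, text in [("iya", iya_plain), ("iya+ZWJ", iya_ligature_request), ("yi", yi)]:
    print(label, [unicodedata.name(ch) for ch in text])
```

Whether the ligature is displayed still depends on font support; without it, the joiner sequence should fall back to the unligated form.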

#Order. Several orderings are possible for Buginese. The Unicode Standard encodes the Buginese characters in the Matthes order.

#Punctuation. Buginese uses spaces between certain units. One punctuation symbol, U+1A1E BUGINESE PALLAWA, is functionally similar to the full stop and comma of the Latin script. There is also another separation mark, U+1A1F BUGINESE END OF SECTION.

U+A9CF JAVANESE PANGRANGKEP or a doubling of the vowel sign (especially U+1A19 BUGINESE VOWEL SIGN E and U+1A1A BUGINESE VOWEL SIGN O) is sometimes used to denote word reduplication. The shape of the Buginese reduplication sign is based on the Arabic digit two. The functionally similar U+A9CF JAVANESE PANGRANGKEP, which has the same shape, is recommended for this sign in Buginese, rather than U+0662 ARABIC-INDIC DIGIT TWO, to avoid potential problems for text layout.

#Numerals. There are no known digits specific to the Buginese script.

#17.3 Balinese

#17.3.1 Balinese: U+1B00–U+1B7F

The Balinese script, or aksara Bali, is used for writing the Balinese language, the native language of the people of Bali, known locally as basa Bali. It is a descendant of the ancient Brahmi script of India, and therefore it has many similarities with modern scripts of South Asia and Southeast Asia, which are also members of that family. The Balinese script is used to write Kawi, or Old Javanese, which strongly influenced the Balinese language in the eleventh century CE. The script is also used to write the Sasak language, which is spoken on the island of Lombok to the east of Bali. Some Balinese words have been borrowed from Sanskrit, which may also be written in the Balinese script.

#Structure. Balinese consonants have an inherent -a vowel sound. Consonants combine with following consonants in the usual Brahmic fashion: the inherent vowel is “killed” by U+1B44 ◌᭄ BALINESE ADEG ADEG (virama), and the following consonant is subjoined, often with a change in shape. Table 17-2 shows the base consonants and their conjunct forms.

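#Table 17-2. Balinese Base Consonants and Conjunct Forms

| Consonant | Base Form | Conjunct Form |
|---|---|---|
| ka | ᬓ | ◌᭄ᬓ |
| kha | ᬔ | ◌᭄ᬔ |
| ga | ᬕ | ◌᭄ᬕ |
| gha | ᬖ | ◌᭄ᬖ |
| nga | ᬗ | ◌᭄ᬗ |
| ca | ᬘ | ◌᭄ᬘ |
| cha | ᬙ | ◌᭄ᬙ |
| ja | ᬚ | ◌᭄ᬚ |
| jha | ᬛ | ◌᭄ᬛ |
| nya | ᬜ | ◌᭄ᬜ |
| tta | ᬝ | ◌᭄ᬝ |
| ttha | ᬞ | ◌᭄ᬞ |
| dda | ᬟ | ◌᭄ᬟ |
| ddha | ᬠ | ◌᭄ᬠ |
| nna | ᬡ | ◌᭄ᬡ |
| ta | ᬢ | ◌᭄ᬢ |
| tha | ᬣ | ◌᭄ᬣ |
| da | ᬤ | ◌᭄ᬤ |
| dha | ᬥ | ◌᭄ᬥ |
| na | ᬦ | ◌᭄ᬦ |
| pa | ᬧ | ◌᭄ᬧ |
| pha | ᬨ | ◌᭄ᬨ |
| ba | ᬩ | ◌᭄ᬩ |
| bha | ᬪ | ◌᭄ᬪ |
| ma | ᬫ | ◌᭄ᬫ |
| ya | ᬬ | ◌᭄ᬬ |
| ra | ᬭ | ◌᭄ᬭ |
| la | ᬮ | ◌᭄ᬮ |
| wa | ᬯ | ◌᭄ᬯ |
| sha | ᬰ | ◌᭄ᬰ |
| ssa | ᬱ | ◌᭄ᬱ |
| sa | ᬲ | ◌᭄ᬲ |
| ha | ᬳ | ◌᭄ᬳ |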

The seven letters U+1B45 BALINESE LETTER KAF SASAK through U+1B4B BALINESE LETTER ASYURA SASAK were proposed in the late 20th century as extensions for the Sasak language to replace use of the nukta, U+1B34 ◌᬴ BALINESE SIGN REREKAN, but have seen little use.

Balinese dependent vowel signs are used in a manner similar to that employed by other Brahmic scripts.

Independent vowels are used in a manner similar to that seen in other Brahmic scripts, with a few differences. For example, U+1B05 BALINESE LETTER AKARA and U+1B0B BALINESE LETTER RA REPA can be treated as consonants; that is, they can be followed by adeg adeg. In Sasak, the vowel letter akara can be followed by an explicit adeg adeg ᬅ᭄ in word- or syllable-final position, where it indicates the glottal stop; other consonants can also be subjoined to it.

#Behavior of ra. U+1B03 ◌ᬃ BALINESE SIGN SURANG typically represents a final consonant -r. This sign is derived from the cluster-initial sign r- (also known as repha) of the parent script Kawi; it still represents a repha when transliterating Kawi, but it has been reanalyzed to represent a final -r in the Balinese orthography. As shown in Figure 17-2, the same written form, pronounced as dhamar in the Balinese language, represents dharma in transliterated Kawi. Because a surang used as a final -r cannot be visually distinguished from a surang used as repha, they are encoded in the same way. When combined with another above-base sign, a surang used as repha may be rendered to the left of the other sign rather than to the right.

#Figure 17-2. Writing dharma in Balinese

For searching and sorting, surang should be treated as equivalent to ra. When the processed text is transliterated Kawi, surang also needs to be reordered to precede its orthographic syllable. Two other combining signs are also equivalent to base letters for searching and sorting: U+1B02 ◌ᬂ BALINESE SIGN CECEK (anusvara) is equivalent to nga, and U+1B04 ◌ᬄ BALINESE SIGN BISAH (visarga) is equivalent to ha.
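One simple way to approximate these equivalences is to fold the three signs to their base letters when building a search key. The sketch below is only that: it assumes a plain character substitution is acceptable (a real collation tailoring could express the equivalence differently), it does not perform the Kawi reordering of surang, and the names `SEARCH_FOLD` and `search_key` are hypothetical.

```python
import unicodedata

# Sign -> base letter, following the equivalences described above.
# A plain substitution is an assumption; it ignores the Kawi reordering.
SEARCH_FOLD = {
    0x1B03: "\u1B2D",  # SIGN SURANG -> LETTER RA
    0x1B02: "\u1B17",  # SIGN CECEK  -> LETTER NGA
    0x1B04: "\u1B33",  # SIGN BISAH  -> LETTER HA
}

def search_key(text: str) -> str:
    """Return text with surang, cecek, and bisah folded to base letters."""
    return text.translate(SEARCH_FOLD)

for sign, letter in SEARCH_FOLD.items():
    print(unicodedata.name(chr(sign)), "->", unicodedata.name(letter))
```

Two strings would then be compared by their keys, for example `search_key(a) == search_key(b)`, while the stored text is left unchanged.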

#Behavior of ra repa. The unique behavior of U+1B0B BALINESE LETTER RA REPA (vocalic ṛ) results from a reanalysis of the independent vowel letter as a consonant. In a compound word in which the first element ends in a consonant and the second element begins with an original ra + pepet, such as Pak Rërëh ᬧᬓ᭄ᬋᬋᬄ “Mr Rërëh”, the subjoined form of ra repa is used; this particular sequence is encoded ka + adeg adeg + ra repa. However, in other contexts where the ra repa represents the original Sanskrit vowel, U+1B3A ◌ᬺ BALINESE VOWEL SIGN RA REPA is used, as in Krësna ᬓᬺᬱ᭄ᬡ.

#Rendering. The vowel signs u and uu take different forms when combined with subscripted consonant clusters, as shown in Table 17-3. The upper limit of consonants in a cluster is three, the last of which can be y, w, or r.

#Table 17-3. Balinese Consonant Clusters with u and uu

| Syllable | Glyph |
|---|---|
| kyu | ᬓ᭄ᬬᬸ |
| kyuu | ᬓ᭄ᬬᬹ |
| kwu | ᬓ᭄ᬯᬸ |
| kwuu | ᬓ᭄ᬯᬹ |
| kru | ᬓ᭄ᬭᬸ |
| kruu | ᬓ᭄ᬭᬹ |
| kryu | ᬓ᭄ᬭ᭄ᬬᬸ |
| kryuu | ᬓ᭄ᬭ᭄ᬬᬹ |
| skru | ᬲ᭄ᬓ᭄ᬭᬸ |
| skruu | ᬲ᭄ᬓ᭄ᬭᬹ |
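The encoded sequences behind Table 17-3 follow directly from the conjunct rule: consonants are joined with adeg adeg, and the vowel sign comes last. The sketch below assumes the vowel signs u and uu are U+1B38 BALINESE VOWEL SIGN SUKU and U+1B39 BALINESE VOWEL SIGN SUKU ILUT, as read off the code chart; the `cluster` helper is hypothetical.

```python
import unicodedata

# Code points from the Balinese block; the helper name is illustrative.
KA, YA, RA, WA, SA = "\u1B13", "\u1B2C", "\u1B2D", "\u1B2F", "\u1B32"
ADEG_ADEG = "\u1B44"
SUKU, SUKU_ILUT = "\u1B38", "\u1B39"  # vowel signs u and uu

def cluster(consonants: list, vowel_sign: str = "") -> str:
    """Join one to three consonants with adeg adeg, then add a vowel sign."""
    if not 1 <= len(consonants) <= 3:
        raise ValueError("a cluster holds at most three consonants")
    return ADEG_ADEG.join(consonants) + vowel_sign

kyu = cluster([KA, YA], SUKU)
kwuu = cluster([KA, WA], SUKU_ILUT)
kryuu = cluster([KA, RA, YA], SUKU_ILUT)
skru = cluster([SA, KA, RA], SUKU)

for name, text in [("kyu", kyu), ("kwuu", kwuu), ("kryuu", kryuu), ("skru", skru)]:
    print(name, " ".join(f"U+{ord(ch):04X}" for ch in text))
```

The special shapes that u and uu take under a subscripted cluster are a matter for the font; the encoding is the same regardless.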

#Nukta. The combining mark U+1B34 ◌᬴ BALINESE SIGN REREKAN (nukta) is used to extend the character repertoire for foreign sounds.

#Archaic Jnya. The character U+1B4C BALINESE LETTER ARCHAIC JNYA is occasionally used in older texts in place of ja + subjoined nya. Both forms may be present in the same text, but the archaic form is not found in modern Balinese texts. A subjoined form of this character is unattested.

#Ordering. The traditional order ha na ca ra ka | da ta sa wa la | ma ga ba nga | pa ja ya nya is taught in schools, although van der Tuuk followed the Javanese order pa ja ya nya | ma ga ba nga for the second half. The arrangement of characters in the code charts follows the Brahmic ordering.

#Punctuation. U+1B5E BALINESE CARIK SIKI and U+1B5F BALINESE CARIK PAREREN are used as comma and full stop, respectively. Their inverted versions U+1B4E BALINESE INVERTED CARIK SIKI and U+1B4F BALINESE INVERTED CARIK PAREREN have been used in some manuscripts to indicate finer subdivisions. U+1B5D BALINESE CARIK PAMUNGKAH is used as a colon.

Both U+1B5A BALINESE PANTI and U+1B5B BALINESE PAMADA are used to begin a section of text. A shorter version of panti, U+1B7F ᭿ BALINESE PANTI BAWAK, may be used to indicate finer subdivisions.

A variety of punctuation marks are used to indicate the end of a section. These usually consist of U+1B5C BALINESE WINDU enclosed within two other punctuation marks, which vary depending on which sign began the section. Examples include: carik siki ᭞᭜᭞, carik pareren ᭟᭜᭟ (sometimes called pasalinan), panti ᭚᭜᭚, and carik agung ᭛᭜᭛.

At the end of a text, U+1B7D BALINESE PANTI LANTANG and U+1B7E BALINESE PAMADA LANTANG may be used, depending on the secular or religious nature of the text. These may also be used together with U+1B5C BALINESE WINDU or their short counterparts in combinations such as ᭽᭜᭽ and ᭚᭜᭽.

#Line Breaking. Line breaks may occur after any orthographic syllable. Traditional Balinese texts are written on palm leaves; books of these leaves bound together are called lontar. U+1B60 BALINESE PAMENENG may be inserted in lontar texts at the end of a line to fill the line.

#Musical Symbols. Bali is well known for its rich musical heritage. A number of related notation systems are used to write music. To represent degrees of a scale, the syllables ding dong dang deng dung are used (encoded at U+1B61..U+1B64, U+1B66), in the same way that do re mi fa so la ti is used in Western tradition. The symbols representing these syllables are based on the vowel matras, together with some other symbols. However, unlike the regular vowel matras, these stand-alone spacing characters take diacritical marks. They also have different positions and sizes relative to the baseline. These matra-like symbols are encoded in the range U+1B61..U+1B6A, along with a modified aikara. Some notation systems use other spacing letters, such as U+1B09 BALINESE LETTER UKARA and U+1B27 BALINESE LETTER PA, which are not separately encoded for musical use. The U+1B01 ◌ᬁ BALINESE SIGN ULU CANDRA (candrabindu) can also be used with U+1B62 BALINESE MUSICAL SYMBOL DENG and U+1B68 BALINESE MUSICAL SYMBOL DEUNG, and possibly others. BALINESE SIGN ULU CANDRA can be used to indicate modre symbols as well.

A range of diacritical marks is used with these musical notation base characters to indicate metrical information. Some additional combining marks indicate the instruments used; this set is encoded at U+1B6B..U+1B73. A set of symbols describing certain features of performance is encoded at U+1B74..U+1B7C. These symbols describe the use of the right or left hand, the open or closed hand position, the “male” or “female” drum (of the pair) which is struck, and the quality of the striking.
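The three ranges mentioned above can be captured in a small lookup, which may be handy when filtering or validating notation text. This is a sketch only: the role labels and the function name are hypothetical, and the ranges are exactly those quoted in the preceding paragraphs.

```python
# Ranges quoted in the text above; the role labels are illustrative.
BALINESE_MUSIC_RANGES = [
    ((0x1B61, 0x1B6A), "note symbol"),
    ((0x1B6B, 0x1B73), "combining mark"),
    ((0x1B74, 0x1B7C), "performance symbol"),
]

def music_role(ch: str):
    """Return the role of a Balinese musical character, or None."""
    cp = ord(ch)
    for (lo, hi), role in BALINESE_MUSIC_RANGES:
        if lo <= cp <= hi:
            return role
    return None

for cp in (0x1B61, 0x1B6B, 0x1B74, 0x1B13):
    print(f"U+{cp:04X}", music_role(chr(cp)))
```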

More information about Balinese musical notations is available in Unicode Technical Note 51, “Musical Symbols and Sasak Characters in the Balinese Script”.

#Modre Symbols. The Balinese script also includes a range of “holy letters” called modre symbols. Most of these letters can be composed from the constituent parts currently encoded, including U+1B01 ◌ᬁ BALINESE SIGN ULU CANDRA.

#17.4 Javanese

#17.4.1 Javanese: U+A980–U+A9DF

The Javanese script, or aksara Jawa, is used for writing the Javanese language, known locally as basa Jawa. The script is a descendant of the ancient Brahmi script of India, and so has many similarities with the modern scripts of South Asia and Southeast Asia which are also members of that family. The Javanese script is also used for writing Sanskrit, Jawa Kuna (a kind of Sanskritized Javanese), and transcriptions of Kawi, as well as the Sundanese language, also spoken on the island of Java, and the Sasak language, spoken on the island of Lombok.

The Javanese script was in current use in Java until about 1945; in 1928 Bahasa Indonesia was made the national language of Indonesia and its influence eclipsed that of other languages and their scripts. Traditional Javanese texts are written on palm leaves; books of these bound together are called lontar, a word which derives from ron “leaf” and tal “palm”.

#Consonants. Consonants have an inherent -a vowel sound. Consonants combine with following consonants in the usual Brahmic fashion: the inherent vowel is “killed” by U+A9C0 JAVANESE PANGKON, and the following consonant is subjoined, often with a change in shape.

In Javanese, Sanskrit vocalic liquids (short and long versions of ṛ and ḷ) are treated as consonant letters with an alternate inherent vowel: rĕ, reu, lĕ, and leu; they are not independent vowels with dependent vowel equivalents, as is the case in Balinese or Devanagari. Short and long versions of the vocalic-ḷ are separately encoded, as U+A98A JAVANESE LETTER NGA LELET and U+A98B JAVANESE LETTER NGA LELET RASWADI. In contrast, the long version of the vocalic-ṛ is represented by a sequence of the short vowel U+A989 JAVANESE LETTER PA CEREK followed by the dependent vowel sign -aa, U+A9B4 JAVANESE VOWEL SIGN TARUNG, serving as a length mark in this case.

U+A9B3 JAVANESE SIGN CECAK TELU is a diacritic used with various consonantal base letters to represent foreign sounds. Typically these diacritic-marked consonants are used for sounds borrowed from Arabic.

#Independent Vowels. Independent vowel letters are used essentially as in other Brahmic scripts. Modern Javanese uses U+A986 JAVANESE LETTER I and U+A987 JAVANESE LETTER II for short and long i, but the Kawi orthography instead uses U+A985 JAVANESE LETTER I KAWI and U+A986 JAVANESE LETTER I for short and long i, respectively.

The long versions of the u and o vowels are written as sequences, using U+A9B4 JAVANESE VOWEL SIGN TARUNG as a length mark.

#Dependent Vowels. Javanese, unlike Balinese, represents multipart dependent vowels with sequences of characters, in a manner similar to the Myanmar script. The Balinese community considers it important to be able to directly transliterate Sanskrit into Balinese, so multipart dependent vowels are encoded as single, composite forms in Balinese, as is done in Devanagari. In contrast, for the Javanese script, the correspondence with Sanskrit letters is not so critical, and a different approach to the encoding has been taken. Similar to the treatment of long versions of Javanese independent vowels, the two-part dependent vowels are explicitly represented with a sequence of two characters, using U+A9B4 JAVANESE VOWEL SIGN TARUNG, as shown in Figure 17-3.

#Figure 17-3. Representation of Javanese Two-Part Vowels
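Because the figure is not reproduced here, the sketch below spells out some sequences of the kind it describes. It assumes that the two-part vowels in question are taling + tarung (for o) and dirga mure + tarung (for au), which reflects common Javanese orthography rather than anything stated in this section; the variable names are illustrative.

```python
import unicodedata

# Code points from the Javanese block (U+A980-U+A9DF).
KA = "\uA98F"
PA_CEREK = "\uA989"   # short vocalic r
LETTER_U, LETTER_O = "\uA988", "\uA98E"
TALING, DIRGA_MURE, TARUNG = "\uA9BA", "\uA9BB", "\uA9B4"

# Assumed two-part dependent vowels: first part + tarung.
ko = KA + TALING + TARUNG
kau = KA + DIRGA_MURE + TARUNG
# Long independent vowels and long vocalic r use tarung as a length mark.
long_u = LETTER_U + TARUNG
long_o = LETTER_O + TARUNG
long_vocalic_r = PA_CEREK + TARUNG

for label, text in [("ko", ko), ("kau", kau), ("long r", long_vocalic_r)]:
    print(label, [unicodedata.name(ch) for ch in text])
```

In every case the tarung is stored after the other vowel part in logical order, even though the first part of a two-part vowel is displayed before the consonant.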

Tarung is not used alone when writing the Javanese language, but it represents the vowel aa when writing Sanskrit and o when writing Sundanese. An alternative glyph of tarung has been separately encoded as U+A9B5 JAVANESE VOWEL SIGN TOLONG, which is not normally needed, except when used in contrast with the ordinary tarung.

#Consonant Signs. The characters U+A980 JAVANESE SIGN PANYANGGA, U+A981 JAVANESE SIGN CECAK, and U+A983 JAVANESE SIGN WIGNYAN are analogous to U+0901 DEVANAGARI SIGN CANDRABINDU, U+0902 DEVANAGARI SIGN ANUSVARA, and U+0903 DEVANAGARI SIGN VISARGA and behave in much the same way.

There are three medial consonant signs, U+A9BD JAVANESE CONSONANT SIGN KERET, U+A9BE JAVANESE CONSONANT SIGN PENGKAL, and U+A9BF JAVANESE CONSONANT SIGN CAKRA, which represent -rĕ, -ya, and -ra respectively. These medial consonant signs contrast with the subjoined forms of the letters rĕ, ya, and ra. The subjoined forms may indicate a syllabic boundary, whereas keret, pengkal, and cakra are used in ordinary consonant clusters.

#Rendering. There are many conjunct forms in Javanese, though most are fairly regular and easy to identify. Subjoined consonants and vowel signs rendered below them usually interact typographically. For example, the vowel signs [u] and [u:] take different forms when combined with subscripted consonant clusters. Consonant clusters may have up to three elements. In three-element clusters, the last element is always one of the medial glides: -ya, -wa, or -ra.

#Digits. The Javanese script has its own set of digits, seven of which (1, 2, 3, 6, 7, 8, 9) look just like letters of the alphabet. Implementations with concerns about security issues need to take this into account. The punctuation mark U+A9C7 JAVANESE PADA PANGKAT is often used with digits in order to help to distinguish numbers from sequences of letters.
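Because the digits and letters can look alike, code that needs to tell them apart should rely on character properties rather than appearance. The following minimal sketch uses the General_Category from Python's `unicodedata` module; the function names are hypothetical, and this is not a complete confusable-detection scheme.

```python
import unicodedata

def classify(ch: str) -> str:
    """Classify a character as 'digit', 'letter', or 'other' by property."""
    category = unicodedata.category(ch)
    if category == "Nd":
        return "digit"
    if category.startswith("L"):
        return "letter"
    return "other"

def is_number(text: str) -> bool:
    """True if every character is a decimal digit, regardless of its shape."""
    return bool(text) and all(classify(ch) == "digit" for ch in text)

JAVANESE_DIGIT_ONE = "\uA9D1"
JAVANESE_LETTER_KA = "\uA98F"

print(classify(JAVANESE_DIGIT_ONE), unicodedata.decimal(JAVANESE_DIGIT_ONE))
print(classify(JAVANESE_LETTER_KA))
```

A security-sensitive application would typically go further, for example by rejecting identifiers that mix Javanese digits and letters.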

#Punctuation. A large number of punctuation marks are used in Javanese. Titles may be flanked by the pair of ornamental characters, U+A9C1 JAVANESE LEFT RERENGGAN and U+A9C2 JAVANESE RIGHT RERENGGAN; glyphs used for these may vary widely.

U+A9C8 JAVANESE PADA LINGSA is a danda mark that corresponds functionally to the use of a comma. The doubled form, U+A9C9 JAVANESE PADA LUNGSI, corresponds functionally to the use of a full stop. It is also used as a “ditto” mark in vertical lists. U+A9C7 JAVANESE PADA PANGKAT is used much like the European colon.

U+A9C7 JAVANESE PADA PANGKAT is used to abbreviate personal names and is placed at the end of the abbreviation.

The doubled U+A9CB JAVANESE PADA ADEG ADEG typically begins a paragraph or section, while the simple U+A9CA JAVANESE PADA ADEG is used as a common divider, though it can be used in pairs marking text for attention. The two characters, U+A9CC JAVANESE PADA PISELEH and U+A9CD JAVANESE TURNED PADA PISELEH, are used similarly, either both together or with U+A9CC JAVANESE PADA PISELEH simply repeated.

The punctuation ring, U+A9C6 JAVANESE PADA WINDU, is not used alone, a situation similar to the pattern of use for its Balinese counterpart U+1B5C BALINESE WINDU. When used with U+A9CB JAVANESE PADA ADEG ADEG this windu sign is called pada guru, pada bab, or uger-uger, and is used to begin correspondence where the writer does not desire to indicate a rank distinction as compared to his audience. More formal letters may begin with one of the three signs: U+A9C3 JAVANESE PADA ANDAP (for addressing a higher-ranked person), U+A9C4 JAVANESE PADA MADYA (for addressing an equally-ranked person), or U+A9C5 JAVANESE PADA LUHUR (for addressing a lower-ranked person).

#Reduplication. U+A9CF JAVANESE PANGRANGKEP is used to show the reduplication of a syllable. The character derives from U+0662 ARABIC-INDIC DIGIT TWO but in Javanese it does not have a numeric use. The Javanese reduplication mark is encoded as a separate character from the Arabic digit, because it differs in its Bidi_Class property value.
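The property difference can be inspected with Python's standard `unicodedata` module, as in the short sketch below. The values printed depend on the Unicode data bundled with the Python build in use.

```python
import unicodedata

PANGRANGKEP = "\uA9CF"       # JAVANESE PANGRANGKEP
ARABIC_INDIC_TWO = "\u0662"  # ARABIC-INDIC DIGIT TWO

for ch in (PANGRANGKEP, ARABIC_INDIC_TWO):
    print(
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        "bidi:", unicodedata.bidirectional(ch),
        "category:", unicodedata.category(ch),
        "numeric:", unicodedata.numeric(ch, None),
    )
```

The reduplication mark behaves as a strong left-to-right character with no numeric value, whereas the Arabic-Indic digit takes part in the number handling of the bidirectional algorithm.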

#Line Breaking. Opportunities for line breaking occur after any full orthographic syllable. Hyphens are not used.

In some printed texts, an epenthetic spacing U+A9BA JAVANESE VOWEL SIGN TALING is placed at the end of a line when the next line begins with the glyph for U+A9BA JAVANESE VOWEL SIGN TALING, which is reminiscent of a specialized hyphenation (or of quire marking). This practice is nearly impossible to implement in a free-flowing text environment. Typographers wishing to duplicate a printed page may manually insert U+00A0 NO-BREAK SPACE before U+A9BA JAVANESE VOWEL SIGN TALING at the end of a line, but this would not be orthographically correct.

#17.5 Rejang

#17.5.1 Rejang: U+A930–U+A95F

The Rejang language is spoken by about 200,000 people living on the Indonesian island of Sumatra, mainly in the southwest. There are five major dialects: Lebong, Musi, Kebanagung, Pesisir (all in Bengkulu Province), and Rawas (in South Sumatra Province). Most Rejang speakers live in fairly remote rural areas, and slightly less than half of them are literate.

The Rejang script was in use prior to the introduction of Islam to the Rejang area. The earliest attested document appears to date from the mid-18th century CE. The traditional Rejang corpus consists chiefly of ritual texts, medical incantations, and poetry.

#Structure. Rejang is a Brahmi-derived script. It is related to other scripts of the Indonesian region, such as Batak and Buginese.

Consonants in Rejang have an inherent /a/ vowel sound. Vowel signs are used in a manner similar to that employed by other Brahmi-derived scripts. There are no consonant conjuncts. The basic syllabic structure is C(V)(F): a consonant, followed by an optional vowel sign and an optional final consonant sign or virama.
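The C(V)(F) structure lends itself to a simple pattern match. The sketch below is a rough approximation only: the character ranges for consonants (U+A930..U+A946), vowel signs (U+A947..U+A94E), and final consonant signs plus virama (U+A94F..U+A953) are read off the Rejang code chart rather than stated in this section, and the names are hypothetical.

```python
import re

# Ranges read off the Rejang code chart (assumptions, not from this section).
REJANG_SYLLABLE = re.compile(
    "[\uA930-\uA946]"   # C: consonant letter
    "[\uA947-\uA94E]?"  # (V): optional vowel sign
    "[\uA94F-\uA953]?"  # (F): optional final consonant sign or virama
)

def syllables(text: str) -> list:
    """Split Rejang text into orthographic syllables, skipping other characters."""
    return REJANG_SYLLABLE.findall(text)

# BA + vowel sign I + consonant sign NG, KA + virama, PA
sample = "\uA937\uA947\uA94F\uA930\uA953\uA936"
for syllable in syllables(sample):
    print(" ".join(f"U+{ord(ch):04X}" for ch in syllable))
```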

#Rendering. Rejang texts tend to have a slanted appearance typified by the appearance of U+A937 REJANG LETTER BA. This sense that the script is tilted to the right affects the placement of the combining marks for vowel signs. Vowel signs above a letter are offset to the right, and vowel signs below a letter are offset to the left, as the “above” and “below” positions for letters are perceived in terms of the overall slant of the letters.

#Ordering. The ordering of the consonants and vowel signs for Rejang in the code charts follows a generic Brahmic script pattern. The Brahmic ordering of Rejang consonants is attested in numerous sources. There is little evidence one way or the other for preferences in the relative order of Rejang vowel signs and consonant signs.

#Digits. There are no known script-specific digits for the Rejang script.

#Punctuation. European punctuation marks such as comma, full stop, and colon are used in modern writing. U+A95F REJANG SECTION MARK may be used at the beginning and end of paragraphs.

Traditional Rejang texts tend not to use spaces between words, but their use does occur in more recent texts. There is no known use of hyphenation.

#17.6 Batak

#17.6.1 Batak: U+1BC0–U+1BFF

The Batak script is used on the island of Sumatra to write the five Batak dialects: Karo, Mandailing, Pakpak, Simalungun, and Toba. The script is called si-sia-sia or surat na sampulu sia, which means “the nineteen letters.” The script is taught in schools mainly for cultural purposes, and is used on some signs for shops and government offices.

#Structure. Batak is a Brahmi-derived script. It is written left to right.

Consonants in Batak have an inherent /a/ vowel sound. Batak uses a vowel killer which is called pangolat in Mandailing, Pakpak, and Toba. In Karo the killer is called penengen, and in Simalungun it is known as panongonan. The appearance of the killer differs between some of the dialects.

Batak has three independent vowels and makes use of a number of vowel signs and two consonant signs. Some vowel signs are only used by certain language communities. There are no consonant conjuncts. The basic syllabic structure is C(V)(Cs|Cd): a consonant, followed by an optional vowel sign, which may be followed either by a consonant sign Cs (-ng or -h) or a killed final consonant Cd.

#Rendering. Most vowel signs and the two killers, U+1BF2 BATAK PANGOLAT and U+1BF3 BATAK PANONGONAN, are spacing marks. U+1BEE BATAK VOWEL SIGN U can ligate with its base consonant.

The two consonant signs, U+1BF0 BATAK CONSONANT SIGN NG and U+1BF1 BATAK CONSONANT SIGN H, are nonspacing marks, usually rendered above the spacing vowel signs. When U+1BF0 BATAK CONSONANT SIGN NG occurs together with the nonspacing mark, U+1BE9 BATAK VOWEL SIGN EE, both are rendered above the base consonant, with the glyph for the ee at the top left and the glyph for the ng at the top right.

The main peculiarity of Batak rendering concerns the reordering of the glyphs for vowel signs when one of the two killers, pangolat or panongonan, is used to close the syllable by killing the inherent vowel of a final consonant. This reordering for display is entirely regular. So, while the representation of the syllable /tip/ is done in logical order: <ta, vowel sign i, pa, pangolat>, when rendered for display the glyph for the vowel sign is visually applied to the final consonant, pa, rather than to the ta. The glyph for the pangolat always stays at the end of the syllable.
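The reordering can be modeled as a simple substitution on the logical sequence. The sketch below only illustrates the glyph order; in practice the reordering is carried out by the font and shaping engine, and the stored text stays in logical order. The letter range U+1BC0..U+1BE3 and the vowel sign range U+1BE7..U+1BEF are assumptions read off the code chart, and the function name is hypothetical.

```python
import re

# Character classes; the letter and vowel-sign ranges are assumptions
# taken from the Batak code chart, not from this section.
LETTER = "\u1BC0-\u1BE3"
VOWEL_SIGN = "\u1BE7-\u1BEF"
KILLER = "\u1BF2\u1BF3"  # pangolat, panongonan

# <vowel sign, final consonant, killer> -> <final consonant, vowel sign, killer>
_REORDER = re.compile(f"([{VOWEL_SIGN}])([{LETTER}])([{KILLER}])")

def display_order(logical: str) -> str:
    """Return the glyph order used for display of a logical Batak string."""
    return _REORDER.sub(r"\2\1\3", logical)

TA, PA = "\u1BD6", "\u1BC7"
VOWEL_SIGN_I, PANGOLAT = "\u1BEA", "\u1BF2"

tip = TA + VOWEL_SIGN_I + PA + PANGOLAT
print(" ".join(f"U+{ord(ch):04X}" for ch in display_order(tip)))
```

Open syllables and syllables closed by a consonant sign contain no killer, so they pass through unchanged.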

#Punctuation. Punctuation is not normally used; instead all letters simply run together. However, a number of bindu characters are occasionally used to disambiguate similar words or phrases. U+1BFF BATAK SYMBOL BINDU PANGOLAT is trailing punctuation, following a word, surrounding the previous character somewhat.

The minor mark used to begin paragraphs and stanzas is U+1BFC BATAK SYMBOL BINDU NA METEK, which means “small bindu.” It has a shape-based variant, U+1BFD BATAK SYMBOL BINDU PINARBORAS (“rice-shaped bindu”), which is likewise used to separate sections of text. U+1BFE BATAK SYMBOL BINDU JUDUL (“title bindu”) is sometimes used to separate a title from the main text, which normally begins on the same line.

#Line Breaking. Traditionally, line breaks can occur before any spacing character. However, the vowel reordering described above is required even when a line break occurs between the characters involved. In typical Unicode-based implementations, this requires keeping the characters involved on the same line.

#17.7 Sundanese

#17.7.1 Sundanese: U+1B80–U+1BBF

The Sundanese script, or aksara Sunda, is used for writing the Sundanese language, one of the languages of the island of Java in Indonesia. It is a descendant of the ancient Brahmi script of India, and so has similarities with the modern scripts of South Asia and Southeast Asia which are also members of that family. The script has official support. It is taught in schools and used on road signs.

The Sundanese language has been written using a number of different scripts over the years. Pallawa or Pra-Nagari was first used in West Java to write Sanskrit from the fifth to the eighth centuries CE. Sunda Kuna or Old Sundanese was derived from Pallawa and was used in the Sunda Kingdom from the 14th to the 18th centuries. The earliest example of Old Sundanese is the Prasasti Kawali stone. The Javanese script was used to write Sundanese from the 17th to the 19th centuries, and the Arabic script was used from the 17th to the 20th centuries. The Latin script has been in wide use since the 20th century. The modern Sundanese script, called Sunda Baku or Official Sundanese, became official in 1996. This modern script was derived from Old Sundanese.

#Structure. Sundanese consonants have an inherent vowel /a/. This inherent vowel can be modified by the addition of dependent vowel signs (matras). The script also has independent vowels.

In the modern orthography, an explicit vowel killer character, U+1BAA SUNDANESE SIGN PAMAAEH, is used to indicate the absence, or “killing,” of the inherent vowel, but does not build consonant conjuncts. In Old Sundanese, however, consonant conjuncts do appear, and are formed with U+1BAB SUNDANESE SIGN VIRAMA.

#Medials. In the modern orthography, initial Sundanese consonants can be followed by one of the three consonant signs for medial consonants, -ya, -ra, or -la. These medial consonants are graphically displayed as subjoined elements to their base consonants, and are not considered conjuncts proper, because they are not formed using a virama. In Old Sundanese, a subjoined ma, U+1BAC SUNDANESE CONSONANT SIGN PASANGAN MA, and a subjoined wa, U+1BAD SUNDANESE CONSONANT SIGN PASANGAN WA, occur. They contrast with the conjunct forms created with the virama.

#Final Consonants. Sundanese historical texts employ a final consonant, U+1BBE SUNDANESE LETTER FINAL K, which is distinct from the modern representation with the explicit vowel killer U+1BAA ᮪ SUNDANESE SIGN PAMAAEH: ᮊ᮪ <1B8A, 1BAA>. U+1BBF ᮿ SUNDANESE LETTER FINAL M was used in a 21st-century document, based on a scribal error in an old Sundanese manuscript, and should not be used in current practice. Rather, both old and modern representations of final m use ᮙ᮪ <1B99, 1BAA>.
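
The recommendation for final m can be illustrated with a short Python sketch. The function name `replace_final_m` is hypothetical, and whether existing data should be rewritten at all is a policy choice this chapter does not make; the sketch only shows the substitution of U+1BBF by the <1B99, 1BAA> sequence named above.

```python
FINAL_M = "\u1BBF"            # SUNDANESE LETTER FINAL M (not for current practice)
MA_PAMAAEH = "\u1B99\u1BAA"   # <1B99, 1BAA>: ma followed by the pamaaeh (vowel killer)


def replace_final_m(text: str) -> str:
    """Rewrite each U+1BBF as the sequence <1B99, 1BAA> (hypothetical helper)."""
    return text.replace(FINAL_M, MA_PAMAAEH)
```

Text that does not contain U+1BBF passes through unchanged.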

#Combining Marks. Three final consonants are separately encoded as combining marks: -ng, -r, -h. These are analogues of Brahmic anusvara, repha, and visarga, respectively.

#Historic Characters. Additional historic consonants appear only in old texts: reu, leu, and archaic i. The archaic i is represented by U+1BBD SUNDANESE LETTER BHA because it was misinterpreted as bha in early transcriptions; the erroneous name has been corrected with the formal name alias SUNDANESE LETTER ARCHAIC I.

Another historic character, U+1BBA SUNDANESE AVAGRAHA, has two functions. In one, it kills the inherent vowel of the preceding consonant and causes hiatus before an initial a. In the other, it doubles the preceding consonant, from which it may be separated in writing by a dependent vowel.

#Additional Consonants. Two supplemental consonant letters are used in the modern script: U+1BAE SUNDANESE LETTER KHA and U+1BAF SUNDANESE LETTER SYA. These are used to represent the borrowed sounds denoted by the Arabic letters kha and sheen, respectively.

#Digits. Sundanese has its own script-specific digits, which are separately encoded in this block.

#Punctuation. Sundanese uses European punctuation marks, such as comma, full stop, question mark, and quotation marks. Spaces are used in text. Opportunities for hyphenation occur after any full orthographic syllable.

#Ordering. The order of characters in the code charts follows the Brahmic ordering. The ha-na-ca-ra-ka order found in Javanese and Balinese does not seem to be used in Sundanese.

#Ordering of Syllable Components. Dependent vowels and other signs are encoded after the consonant to which they apply. The ordering of elements for the modern Sundanese orthography is shown in more detail in Table 17-4.

#Table 17-4. Modern Sundanese Syllabic Structure
Class | Examples | Encoding
consonant or independent vowel | | [U+1B83..U+1BA0, U+1BAE, U+1BAF]
consonant sign -ya, -ra, -la | ᮡ, ᮢ, ᮣ | [U+1BA1..U+1BA3]
dependent vowel, killer | ᮤ, ᮪ | [U+1BA4..U+1BA9, U+1BAA]
final consonant | ᮀ | [U+1B80..U+1B82]

The killer (pamaaeh) occupies the same logical position as a dependent vowel, but indicates the absence, rather than the presence of a vowel. It cannot be followed by a combining mark for a final consonant, nor can it be preceded by a consonant sign.

The left-side dependent vowel U+1BA6 SUNDANESE VOWEL SIGN PANAELAENG occurs in logical order after the consonant (and any medial consonant sign), but in visual presentation its glyph appears before (to the left of) the consonant.
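
The structure in Table 17-4, together with the two restrictions on the killer, can be written as a regular expression. The following Python sketch is one reading of the table, not a normative grammar: it assumes at most one element from each class per syllable, and the names `SYLLABLE`, `is_modern_syllable`, and `split_syllables` are hypothetical.

```python
import re

# Character classes taken from Table 17-4.
BASE = "[\u1B83-\u1BA0\u1BAE\u1BAF]"   # consonant or independent vowel
MEDIAL = "[\u1BA1-\u1BA3]"             # consonant sign -ya, -ra, -la
VOWEL = "[\u1BA4-\u1BA9]"              # dependent vowel sign
KILLER = "\u1BAA"                      # pamaaeh
FINAL = "[\u1B80-\u1B82]"              # combining final consonant

# Assumption: the killer stands alone after the base (no preceding consonant
# sign, no following final consonant); otherwise the optional parts appear
# in logical order.
SYLLABLE = re.compile(f"{BASE}(?:{KILLER}|{MEDIAL}?{VOWEL}?{FINAL}?)")


def is_modern_syllable(s: str) -> bool:
    """True if s is exactly one syllable under this reading of Table 17-4."""
    return SYLLABLE.fullmatch(s) is not None


def split_syllables(text: str) -> list[str]:
    """Return the syllables found in text, skipping anything that does not match."""
    return SYLLABLE.findall(text)
```

Because U+1BA6 is matched in its logical position after the base and any medial sign, the pattern works on the encoded order and leaves the visual reordering to the rendering system.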

#Rendering. When more than one sign appears above or below a consonant, the two are rendered side-by-side, rather than being stacked vertically.

#17.7.2 Sundanese Supplement: U+1CC0–U+1CCF

The Sundanese Supplement block contains eight bindu punctuation marks found in historical materials.

#17.8 Makasar

#17.8.1 Makasar: U+11EE0–U+11EFF

The Makasar script was used historically in South Sulawesi, Indonesia for writing the Makasar language. It is sometimes spelled “Makassar,” and is also referred to as “Old Makassarese” or “Makassarese bird script.” The script was maintained for official purposes in the kingdoms of Makasar in the 17th century, and it was used for writing a number of historical accounts, such as the “Chronicles of Gowa and Tallo’,” but it was superseded by the Buginese script in the 19th century and is no longer used. Although Makasar is thought to have evolved from Rejang, it shares several similarities with Buginese.

#Structure. Makasar is a Brahmi-derived abugida. It is written horizontally, from left to right. Consonant letters carry an inherent /a/ vowel. Alternative vowel sounds are expressed by applying one of four combining characters to a consonant. Each vowel sign appears on a different side of the base consonant: right, left, top, and bottom. They are all encoded as combining characters following the consonant.

As in Buginese, geminated and clustered consonants are not indicated, nor are syllable-final consonants. However, Makasar differs from the Buginese script in that it does not have the pre-nasalized clusters, such as /ŋka/, that occur in Buginese, and it includes special features for consonant repetition.

There is only one independent vowel sign, U+11EF1 𑻱 MAKASAR LETTER A. Vowel signs can be attached to this character to produce other vowel sounds when a syllable has no consonant, such as at the beginning of a word.

#Consonant Repetition. Adjacent syllables that use the same consonant can be written by appending two vowel signs to a single consonant, as shown in the following example. Usually both vowels are the same in this case, and a consonant can take a maximum of two vowel signs.

U+11EE7 𑻧 da + U+11EF4 𑻴 vowel sign u + U+11EF4 𑻴 vowel sign u → 𑻧𑻴𑻴 [dudu]

U+11EF2 𑻲 MAKASAR ANGKA can also be used to repeat the consonant used in the previous syllable. This is particularly useful when one or both syllables use the inherent vowel, but angka may also be followed by a different vowel sound from that of the previous syllable. Angka is associated with the inherent vowel or a vowel sign in the same way as any normal consonant character. For example:

U+11EED 𑻭 ra + U+11EF4 𑻴 vowel sign u + U+11EF2 𑻲 angka → 𑻭𑻴𑻲 [rura]

U+11EE5 𑻥 ma + U+11EF2 𑻲 angka + U+11EF3 𑻳 vowel sign i → 𑻥𑻲𑻳 [mami]
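
The two repetition devices can be made concrete by expanding an encoded string into one (consonant, vowel sign) pair per spoken syllable. The Python sketch below is illustrative only: the function name `expand_syllables` is hypothetical, and the ranges used for letters (U+11EE0..U+11EF1) and vowel signs (U+11EF3..U+11EF6) are assumptions read off the Makasar code chart rather than stated in this section. An empty vowel sign stands for the inherent /a/.

```python
ANGKA = "\U00011EF2"
# Assumed from the code chart: letters U+11EE0..U+11EF1, vowel signs U+11EF3..U+11EF6.
LETTERS = {chr(cp) for cp in range(0x11EE0, 0x11EF2)}
VOWEL_SIGNS = {chr(cp) for cp in range(0x11EF3, 0x11EF7)}


def expand_syllables(text: str) -> list[tuple[str, str]]:
    """Expand doubled vowel signs and angka into explicit (letter, sign) pairs."""
    syllables = []   # each entry is [letter, vowel sign or ""]
    base = None      # letter currently being written or repeated
    signs = 0        # vowel signs seen since the last letter or angka
    for ch in text:
        if ch in LETTERS:
            base, signs = ch, 0
            syllables.append([base, ""])
        elif ch == ANGKA:
            if base is None:
                raise ValueError("angka without a preceding consonant")
            signs = 0
            syllables.append([base, ""])   # angka repeats the previous consonant
        elif ch in VOWEL_SIGNS:
            if base is None:
                raise ValueError("vowel sign without a base")
            if signs == 0:
                syllables[-1][1] = ch       # first sign replaces inherent /a/
            elif signs == 1:
                syllables.append([base, ch])  # second sign implies a repeated consonant
            else:
                raise ValueError("more than two vowel signs on one consonant")
            signs += 1
        else:
            base = None                     # spaces, punctuation, other scripts
    return [tuple(s) for s in syllables]
```

Applied to the three examples above, the sketch yields two pairs each: (da, u)(da, u) for dudu, (ra, u)(ra, inherent) for rura, and (ma, inherent)(ma, i) for mami.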

#Letter va. U+11EEF MAKASAR LETTER VA is named “VA” even though the consonant is pronounced /w/ in the Makasar language. The name for this character aligns with the name for the related letter U+1A13 BUGINESE LETTER VA.

#Digits. The available Makasar manuscript sources show two distinct sets of digits. The first set strongly resembles European digits and can be represented with U+0030..U+0039. The second set strongly resembles Arabic-Indic digits, and can be represented with U+0660..U+0669. Therefore, script-specific digits for Makasar are not separately encoded. Digits are frequently used, and both sets occur concurrently in the sources.

The Arabic-Indic digits are restricted to Arabic-language environments—particularly for expressing dates of the Hijri era. The European digits are used for general purposes, but occur within Arabic-language contexts for writing non-Hijri dates, specifically those of the Gregorian calendar.
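
Since both digit sets are represented with existing characters, software handling Makasar text may need to tell them apart. The following Python sketch (the function name `parse_number` is hypothetical) classifies a digit string by the two ranges given above and relies on the built-in `int()`, which accepts Unicode decimal digits.

```python
def parse_number(digits: str) -> tuple[int, str]:
    """Return (value, digit set) for a run of European or Arabic-Indic digits."""
    if digits and all("\u0030" <= ch <= "\u0039" for ch in digits):
        return int(digits), "european"
    if digits and all("\u0660" <= ch <= "\u0669" for ch in digits):
        return int(digits), "arabic-indic"
    raise ValueError("expected digits from exactly one of the two sets")
```

Rejecting mixed runs is a design choice of this sketch, not a rule stated in the sources.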

Digits may occur above U+0600 ؀ ARABIC NUMBER SIGN or U+0601 ؁ ARABIC SIGN SANAH; see Figure 9-7 for an example.

#Punctuation. Sentences are delimited with U+11EF7 𑻷 MAKASAR PASSIMBANG, and sections are terminated with U+11EF8 𑻸 MAKASAR END OF SECTION. Words are often, but not always, separated by spaces. Line breaks normally appear after syllable boundaries. Hyphens or other marks indicating continuance are not used.

The end of a text is often marked using a stylized rendering of the Arabic word tammat, meaning “it is complete.” There is no atomic character encoded for this symbol, so the sequence should be represented using Arabic letters <ta + meem + shadda + ta>, where the shadda is optional.

#17.9 Kawi

#17.9.1 Kawi: U+11F00–U+11F5F

The Kawi script is a historical Brahmi-derived script that was used between the 8th and 16th centuries in insular southeast Asia to write the Old Javanese, Sanskrit, Old Malay, Old Balinese, and Old Sundanese languages. A large portion of its corpus is found in Java, but Kawi materials have also been found in Sumatra, the Malay Peninsula, Bali, and the Philippines. Letter shapes evolved significantly over its 800 years of use, and later Kawi shows many variations over its wide geographic distribution; eventually, these variants evolved into the many modern Brahmic scripts of insular southeast Asia. The 21st century has brought renewed interest in the script, including some use in social media to write the modern Javanese or Indonesian languages.

The typeface used here is primarily based on early Kawi inscriptions, with some glyphs adapted from later attestations.

#Structure. The Kawi script is an abugida and is written from left to right. The inherent vowel of a consonant can be overridden by attaching a dependent vowel sign. It can also be suppressed by attaching a virama sign, U+11F41 ◌𑽁 KAWI SIGN KILLER, or the conjunct form of another consonant or vocalic liquid, which is encoded by preceding the consonant or vocalic liquid with U+11F42 ◌𑽂 KAWI CONJOINER. A vowelless consonant r- that starts an orthographic syllable may be represented by a repha, which is encoded as U+11F02 𑼂 KAWI SIGN REPHA. Consonant stacks with up to four consonants are known.

#Consonants. Table 17-5 shows the base consonants and their conjunct forms.

#Table 17-5. Kawi Base Consonants and Conjunct Forms
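Consonant | Base Form | Conjunct Form
ka | 𑼒 | ◌𑽂𑼒
kha | 𑼓 | ◌𑽂𑼓
ga | 𑼔 | ◌𑽂𑼔
gha | 𑼕 | ◌𑽂𑼕
nga | 𑼖 | ◌𑽂𑼖
ca | 𑼗 | ◌𑽂𑼗
cha | 𑼘 | ◌𑽂𑼘
ja | 𑼙 | ◌𑽂𑼙
jha | 𑼚 | ◌𑽂𑼚
nya | 𑼛 | ◌𑽂𑼛
tta | 𑼜 | ◌𑽂𑼜
ttha | 𑼝 | ◌𑽂𑼝
dda | 𑼞 | ◌𑽂𑼞
ddha | 𑼟 | ◌𑽂𑼟
nna | 𑼠 | ◌𑽂𑼠
ta | 𑼡 | ◌𑽂𑼡
tha | 𑼢 | ◌𑽂𑼢
da | 𑼣 | ◌𑽂𑼣
dha | 𑼤 | ◌𑽂𑼤
na | 𑼥 | ◌𑽂𑼥
pa | 𑼦 | ◌𑽂𑼦
pha | 𑼧 | ◌𑽂𑼧
ba | 𑼨 | ◌𑽂𑼨
bha | 𑼩 | ◌𑽂𑼩
ma | 𑼪 | ◌𑽂𑼪
ya | 𑼫 | ◌𑽂𑼫
ra | 𑼬 | ◌𑽂𑼬, ◌𑽂𑼬
la | 𑼭 | ◌𑽂𑼭
wa | 𑼮 | ◌𑽂𑼮
sha | 𑼯 | ◌𑽂𑼯
ssa | 𑼰 | ◌𑽂𑼰, ◌𑽂𑼰
sa | 𑼱 | ◌𑽂𑼱
ha | 𑼲 | ◌𑽂𑼲, ◌𑽂𑼲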

The below-base conjunct form of ra is commonly used when the pre-base form would collide with other marks, but can also be used as a stylistic variant. The second conjunct forms of ssa and ha are stylistic variants.

A vowelless r- that starts an orthographic syllable is normally written with a repha above the following consonant, but occasionally with the base form of ra with a subjoined consonant, for example, rwa 𑼂𑼮 versus 𑼬𑽂𑼮. In some late Kawi varieties, the repha glyph may be used for a final -r consonant.

U+11F33 𑼳 KAWI LETTER JNYA is a graphic simplification of the consonant cluster 𑼙𑽂𑼛 jnya; it has no conjunct form. Additional marks can be attached to it.
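
The encoding of conjuncts and of an initial r- can be sketched in Python as follows. The helper names `stack` and `with_initial_r` are hypothetical; the sequences they build are the ones described above, with U+11F42 KAWI CONJOINER placed before each non-initial consonant and U+11F02 KAWI SIGN REPHA placed before the consonant it sits on. Full syllable ordering, including vowel signs, is covered by the technical note on implementing Kawi and is not attempted here.

```python
KAWI_REPHA = "\U00011F02"
KAWI_CONJOINER = "\U00011F42"
KAWI_RA = "\U00011F2C"


def stack(*consonants: str) -> str:
    """Encode a consonant stack: the base first, each later consonant in conjunct form."""
    if not consonants:
        raise ValueError("at least one consonant is required")
    return KAWI_CONJOINER.join(consonants)


def with_initial_r(following: str, use_repha: bool = True) -> str:
    """Encode vowelless r- before a consonant, as repha or as base ra plus conjunct."""
    if use_repha:
        return KAWI_REPHA + following
    return stack(KAWI_RA, following)
```

For instance, `stack("\U00011F19", "\U00011F1B")` yields the jnya cluster <11F19, 11F42, 11F1B>, and `with_initial_r("\U00011F2E")` yields the usual rwa <11F02, 11F2E>.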

#Independent Vowels. The Kawi script has a set of independent vowels and vocalic liquid letters. Dependent vowel signs and other signs can be attached to them. Letters au, eu, and euu are visually composites of other letters and dependent vowels, and are encoded as such. Letters aa, ii, and uu occur in both composite and visually distinct forms; the latter are encoded separately. See Table 17-6.

#Table 17-6. Kawi Independent Vowels with Composite Representations
Vowel | Visually Distinct | Composite
aa | 𑼅 11F05 | 𑼄𑼴 <11F04, 11F34>
ii | 𑼇 11F07 | 𑼆𑼴 <11F06, 11F34>
uu | 𑼉 11F09 | 𑼈𑼴 <11F08, 11F34>
au | | 𑼐𑼴 <11F10, 11F34>
eu | | 𑼄𑽀 <11F04, 11F40>
euu | | 𑼄𑽀𑼴 <11F04, 11F40, 11F34>
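
Because aa, ii, and uu each have two encoded representations, an application that compares Kawi strings loosely may want a common key for both. The Python sketch below folds the visually distinct letters to their composite sequences from Table 17-6. This is an assumption made for illustration: the standard does not declare the two forms equivalent, the name `fold_long_vowels` is hypothetical, and the result is meant as a matching key rather than as a replacement for the original text.

```python
AA_SIGN = "\U00011F34"  # KAWI VOWEL SIGN AA

# Visually distinct letter -> composite sequence, per Table 17-6.
DISTINCT_TO_COMPOSITE = {
    "\U00011F05": "\U00011F04" + AA_SIGN,  # aa
    "\U00011F07": "\U00011F06" + AA_SIGN,  # ii
    "\U00011F09": "\U00011F08" + AA_SIGN,  # uu
}
_FOLD_TABLE = str.maketrans(DISTINCT_TO_COMPOSITE)


def fold_long_vowels(text: str) -> str:
    """Return a matching key in which aa, ii, uu use only the composite form."""
    return text.translate(_FOLD_TABLE)
```

Two strings can then be compared with `fold_long_vowels(a) == fold_long_vowels(b)` while the stored text keeps whichever form the source document used.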

Two vocalic liquid letters have conjunct forms, although the nature of ◌𑽂𑼌 is not entirely clear. They are shown in Table 17-7.

#Table 17-7. Kawi Vocalic Liquids with Conjunct Forms
Vocalic Liquid | Base Form | Conjunct Form
Vocalic r | 𑼊 | ◌𑽂𑼊
Vocalic l | 𑼌 | ◌𑽂𑼌

#Dependent Vowels. The dependent vowels o, au, and euu are visually composites of other letters and dependent vowels, and are encoded as such, as shown in Table 17-8.

#Table 17-8. Kawi Dependent Vowels with Composite Representations
Vowel | Composite
o | ◌𑼾𑼴 <11F3E, 11F34>
au | ◌𑼿𑼴 <11F3F, 11F34>
euu | ◌𑽀𑼴 <11F40, 11F34>

The dependent vowel aa has several glyph variants. The primary form is U+11F34 ◌𑼴 KAWI VOWEL SIGN AA; one alternate form has been encoded as U+11F35 ◌𑼵 KAWI VOWEL SIGN ALTERNATE AA, as its use may be required to avoid confusability. Other variants, such as ◌𑼵, may be supported as stylistic variants.

The dependent vowel ◌𑼴 aa has been repurposed as a consonant reduplicator in some manuscripts, and can in this case be combined with other vowels, for example, 𑼦𑼶𑼴 ppi <11F26, 11F36, 11F34>.

The dependent vowels ◌𑼶 i and ◌𑼸 u are sometimes used together in a single cluster to mark the cluster as canceled and not meant to be read, for example, 𑼭𑼶𑼸 for a misspelled la, li, or lu.

In some inscriptions, the dependent vowels ◌𑼿 ai and ◌𑼿𑼴 au are written with pre-base components that look similar to sequences of two dependent vowels ◌𑼾 e. To transcribe these, use ◌𑼾 twice: ◌𑼾𑼾 for ◌𑼿 and ◌𑼾𑼾𑼴 for ◌𑼿𑼴.

#Other Signs. U+11F00 ◌𑼀 KAWI SIGN CANDRABINDU indicates nasalization in specific words such as 𑼐𑼴𑼀 om. U+11F01 ◌𑼁 KAWI SIGN ANUSVARA represents a final nasal, while U+11F03 ◌𑼃 KAWI SIGN VISARGA represents final -h. U+11F5A ◌𑽚 KAWI SIGN NUKTA is used to modify a few consonants to represent foreign sounds, typically coming from Arabic. For example, combining it with U+11F26 𑼦 KAWI LETTER PA results in 𑼦𑽚 fa.

#Digits. The Kawi script has its own set of decimal digits. The digit U+11F52 𑽒 KAWI DIGIT TWO is used for the syllable ro in some manuscripts; additional marks can be attached to it in this usage.

#Punctuation. Kawi materials use several punctuation characters to divide text into sections.

U+11F48 𑽈 KAWI PUNCTUATION SPACE FILLER is used to justify texts or fill gaps that are too small to fit another letter in the middle or at the end of a line. This character looks like U+11F54 𑽔 KAWI DIGIT FOUR in some inscriptions, but differs in others.

U+11F45 𑽅 KAWI PUNCTUATION SECTION MARKER, U+11F46 𑽆 KAWI PUNCTUATION ALTERNATE SECTION MARKER (which differs from U+11F45 in having some additional flourish), U+11F4E 𑽎 KAWI PUNCTUATION SPIRAL, and U+11F4F 𑽏 KAWI PUNCTUATION CLOSING SPIRAL are similar in function to siddham signs in various other scripts, which are generally used as invocations at the beginning of texts. The Kawi analogues to the siddham signs have several distinct variants, which are often used in combination with other punctuation marks to indicate opening, closing, and major breaks in a text, such as 𑽆𑽊𑽆 or 𑽇𑽎𑽇.

#Encoding Order and Rendering. Information on the encoding order of syllable components and on rendering is available in Unicode Technical Note #48, “Implementing Kawi.”

#Line Breaking. Opportunities for line breaking occur after any full orthographic syllable. Hyphens are not used.

