
CJK characters

From Wikipedia, the free encyclopedia
Logographs in shared East Asian written tradition
For help with CJK character display, see Help:Multilingual support (East Asian).
Translation of "That old man is 72 years old" in Vietnamese, Cantonese, Mandarin (in simplified and traditional characters), Japanese, and Korean.

In internationalization, CJK characters is a collective term for graphemes used in the Chinese, Japanese, and Korean writing systems, each of which includes Chinese characters. The term can be extended to CJKV to include Chữ Nôm, the Chinese-origin logographic script formerly used for the Vietnamese language, or to CJKVZ to also include Sawndip, used to write the Zhuang languages.

Character repertoire


Standard Mandarin Chinese and Standard Cantonese are written almost exclusively in Chinese characters. Over 3,000 characters are required for general literacy, with up to 40,000 characters for reasonably complete coverage. Japanese uses fewer characters: general literacy in Japanese can be expected with 2,136 characters. The use of Chinese characters in Korea is increasingly rare, although idiosyncratic use of Chinese characters in proper names requires knowledge (and therefore availability) of many more characters. As of 2013, some South Korean students were still expected to learn 1,800 characters.[1]

Other scripts used for these languages, such as bopomofo and the Latin-based pinyin for Chinese, hiragana and katakana for Japanese, and hangul for Korean, are not strictly "CJK characters", although CJK character sets almost invariably include them as necessary for full coverage of the target languages.
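The distinction between Han characters and the accompanying phonetic scripts can be illustrated with a minimal sketch. It assumes that the prefix of a character's Unicode name is a good enough hint of its script; the function name rough_script and the prefix list are illustrative choices, not part of any standard API, and the heuristic is not a substitute for the Unicode Script property.

```python
import unicodedata

# Hypothetical helper: guess a script label from the start of the
# character's Unicode name. This is a heuristic, not the Script property.
NAME_PREFIXES = (
    ("CJK UNIFIED IDEOGRAPH", "Han"),
    ("CJK COMPATIBILITY IDEOGRAPH", "Han"),
    ("HIRAGANA", "Hiragana"),
    ("KATAKANA", "Katakana"),
    ("HANGUL", "Hangul"),
    ("BOPOMOFO", "Bopomofo"),
)


def rough_script(ch: str) -> str:
    """Return a coarse script label for a single character."""
    name = unicodedata.name(ch, "")  # "" if the character has no name
    for prefix, label in NAME_PREFIXES:
        if name.startswith(prefix):
            return label
    return "Other"


# One Han character, then hiragana, katakana, hangul, bopomofo, and Latin.
for ch in "\u4e2d\u3042\u30ab\ud55c\u3105a":
    print(f"U+{ord(ch):04X}", rough_script(ch))
```

Only the first sample is a CJK character in the strict sense; the others are the phonetic scripts that CJK character sets usually carry alongside the Han repertoire.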

The sinologist Carl Leban (1971) produced an early survey of CJK encoding systems.

Until the early 20th century, Classical Chinese was the written language of government and scholarship in Vietnam. Popular literature in Vietnamese was written in the chữ Nôm script, consisting of Chinese characters with many characters created locally. Since the 1920s, the script used for recording literature has been the Latin-based Vietnamese alphabet.[2][3]

Encoding


The number of characters required for complete coverage of all these languages' needs cannot fit in the 256-character code space of 8-bit character encodings, requiring at least a 16-bit fixed-width encoding or a multi-byte variable-length encoding. The 16-bit fixed-width encodings, such as those from Unicode up to and including version 2.0, are now deprecated, both because more characters must be encoded than a 16-bit encoding can accommodate (Unicode 5.0 has some 70,000 Han characters) and because the Chinese government requires that software in China support the GB 18030 character set.
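The size constraints described above can be checked with a short sketch using Python's built-in codecs. The sample characters and the helper name encoded_lengths are illustrative assumptions; the sketch only counts bytes and does not model any particular software requirement.

```python
# Two sample Han characters: one in the Basic Multilingual Plane and one
# in plane 2, beyond the reach of a single 16-bit code unit.
samples = {
    "U+4E2D (BMP)": "\u4e2d",
    "U+20000 (plane 2)": "\U00020000",
}


def encoded_lengths(ch: str) -> dict:
    """Return the number of bytes needed for ch in several encodings."""
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-be", "gb18030")}


for label, ch in samples.items():
    print(label, encoded_lengths(ch))

# An 8-bit encoding such as Latin-1 cannot represent either character.
try:
    "\u4e2d".encode("latin-1")
except UnicodeEncodeError:
    print("latin-1 cannot encode U+4E2D")
```

The plane 2 character takes four bytes in each of the three encodings, including UTF-16, where it is stored as a pair of 16-bit code units rather than a single one.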

Although CJK encodings share common character sets, the encodings often used to represent them have been developed separately by different East Asian governments and software companies, and are mutually incompatible. Unicode has attempted, with some controversy, to unify the character sets in a process known as Han unification.

CJK character encodings should consist minimally of Han characters plus language-specific phonetic scripts such as pinyin, bopomofo, hiragana, katakana and hangul.[4]
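As a rough illustration of this incompatibility, the sketch below encodes one shared character with several legacy codecs that ship with Python. The choice of sample character and the helper name decode_as are assumptions made for the example.

```python
# U+4E2D is present in the Chinese, Japanese, and Korean legacy sets,
# yet each encoding assigns it a different byte sequence.
ch = "\u4e2d"
legacy = ("gb2312", "big5", "shift_jis", "euc_kr")

encoded = {enc: ch.encode(enc) for enc in legacy}
for enc, raw in encoded.items():
    print(f"{enc:10} {raw.hex()}")


def decode_as(raw: bytes, enc: str) -> str:
    """Decode raw bytes with the given codec, replacing invalid sequences."""
    return raw.decode(enc, errors="replace")


# Bytes written as GB 2312 but read back as Big5 do not yield the
# original character.
misread = decode_as(encoded["gb2312"], "big5")
print(misread == ch)
```

Reading the bytes back with the codec that produced them recovers the character; reading them with any other codec in the list gives different text or a decoding error, which is why the encoding of legacy data has to be known in advance.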

CJK character encodings include:

The CJK character sets take up the bulk of the assigned Unicode code space. There is much controversy among Japanese experts of Chinese characters about the desirability and technical merit of the Han unification process used to map multiple Chinese and Japanese character sets into a single set of unified characters.[citation needed]

All three languages can be written both left-to-right and top-to-bottom (right-to-left and top-to-bottom in ancient documents), but are usually considered left-to-right scripts when discussing encoding issues.

Legal status


Libraries cooperated on encoding standards for JACKPHY characters in the early 1980s. According to Ken Lunde, the abbreviation "CJK" was a registered trademark of Research Libraries Group[5] (which merged with OCLC in 2006). The trademark, owned by OCLC between 1987 and 2009, has now expired.[6]

See also


References

  1. ^ Lunde, Ken (2009). CJKV information processing (2nd ed.). Beijing, Boston, Farnham, Sebastopol, Tokyo: O'Reilly. ISBN 978-0-596-51447-1.
  2. ^ Coulmas (1991), pp. 113–115.
  3. ^ DeFrancis (1977).
  4. ^ This article is based on material taken from CJK at the Free On-line Dictionary of Computing prior to 1 November 2008 and incorporated under the "relicensing" terms of the GFDL, version 1.3 or later.
  5. ^ Ken Lunde, 1996
  6. ^ Justia listing

Works cited


Sources


External links

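Block name                               Plane  Chart range  Characters  Han unification  Scripts contained in block

CJK Unified Ideographs                   0 BMP  4E00–9FFF    20,992      Unified          Han
CJK Unified Ideographs Extension A       0 BMP  3400–4DBF    6,592       Unified          Han
CJK Unified Ideographs Extension B       2 SIP  20000–2A6DF  42,720      Unified          Han
CJK Unified Ideographs Extension C       2 SIP  2A700–2B73F  4,160       Unified          Han
CJK Unified Ideographs Extension D       2 SIP  2B740–2B81F  222         Unified          Han
CJK Unified Ideographs Extension E       2 SIP  2B820–2CEAF  5,774       Unified          Han
CJK Unified Ideographs Extension F       2 SIP  2CEB0–2EBEF  7,473       Unified          Han
CJK Unified Ideographs Extension G       3 TIP  30000–3134F  4,939       Unified          Han
CJK Unified Ideographs Extension H       3 TIP  31350–323AF  4,192       Unified          Han
CJK Unified Ideographs Extension I       2 SIP  2EBF0–2EE5F  622         Unified          Han
CJK Unified Ideographs Extension J       3 TIP  323B0–3347F  4,298       Unified          Han
CJK Radicals Supplement                  0 BMP  2E80–2EFF    115         Not unified      Han
Kangxi Radicals                          0 BMP  2F00–2FDF    214         Not unified      Han
Ideographic Description Characters       0 BMP  2FF0–2FFF    16          Not unified      Common
CJK Symbols and Punctuation              0 BMP  3000–303F    64          Not unified      Han, Hangul, Common, Inherited
CJK Strokes                              0 BMP  31C0–31EF    39          Not unified      Common
Enclosed CJK Letters and Months          0 BMP  3200–32FF    255         Not unified      Hangul, Katakana, Common
CJK Compatibility                        0 BMP  3300–33FF    256         Not unified      Katakana, Common
CJK Compatibility Ideographs             0 BMP  F900–FAFF    472         12 are unified   Han
CJK Compatibility Forms                  0 BMP  FE30–FE4F    32          Not unified      Common
Enclosed Ideographic Supplement          1 SMP  1F200–1F2FF  64          Not unified      Hiragana, Common
CJK Compatibility Ideographs Supplement  2 SIP  2F800–2FA1F  542         Not unified      Han

Totals                                   22 blocks           104,053

  1. ^ As of version 17.0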
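
The chart ranges listed above as "Unified" can be turned into a simple membership test. The following is a minimal sketch, assuming only those eleven ranges; the names UNIFIED_RANGES and in_unified_block are hypothetical, and the 12 unified characters inside F900–FAFF are deliberately left out because the table does not identify them.

```python
# Inclusive code point ranges of the blocks marked "Unified" in the table.
UNIFIED_RANGES = [
    (0x4E00, 0x9FFF),
    (0x3400, 0x4DBF),
    (0x20000, 0x2A6DF),
    (0x2A700, 0x2B73F),
    (0x2B740, 0x2B81F),
    (0x2B820, 0x2CEAF),
    (0x2CEB0, 0x2EBEF),
    (0x30000, 0x3134F),
    (0x31350, 0x323AF),
    (0x2EBF0, 0x2EE5F),
    (0x323B0, 0x3347F),
]


def in_unified_block(ch: str) -> bool:
    """Return True if ch falls inside one of the unified ideograph blocks."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in UNIFIED_RANGES)


print(in_unified_block("\u4e2d"))      # Han character in 4E00-9FFF
print(in_unified_block("\u3042"))      # hiragana, outside every range
print(in_unified_block("\U00020000"))  # first code point of 20000-2A6DF
```

Block membership is a coarser test than assignment: a code point can lie inside one of these ranges without having a character assigned to it in a given Unicode version.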