| Languages | Chinese, Japanese, Korean |
|---|---|
| Standard | MARC-8, ANSI/NISO Z39.64 (both EACC version) |
| Current status | Used mainly by library systems |
| Classification | TBCS for CJK based on the ISO 2022 structure, JACKPHY component of MARC |
The Chinese Character Code for Information Interchange (Chinese: 中文資訊交換碼) or CCCII is a character set developed by the Chinese Character Analysis Group in Taiwan. It was first published in 1980, and significantly expanded in 1982 and 1987.[1]
It is used mostly by library systems.[2][3] It is one of the earliest established and most sophisticated encodings for traditional Chinese (predating the establishment of Big5 in 1984 and CNS 11643 in 1986).[2] It is distinguished by its unique system for encoding simplified versions and other variants of its main set of hanzi characters.[1]
A variant of an earlier version of CCCII is used by the Library of Congress as part of MARC-8, under the name East Asian Character Code (EACC, ANSI/NISO Z39.64),[4] where it comprises part of MARC 21's JACKPHY support. However, EACC contains fewer characters than the most recent versions of CCCII.[5][1] Work at Apple based on Research Libraries Group's CJK Thesaurus, which was used to maintain EACC, was one of the direct predecessors of Unicode's Unihan set.[6]

CCCII is designed as a 94^n set, as defined by ISO/IEC 2022.[1] Each Chinese character is represented by a 3-byte code in which each byte is a 7-bit value between 0x21 and 0x7E inclusive. Thus, the maximum number of code points available in CCCII is 94×94×94 = 830584. The number of distinct characters that can be encoded is lower than this, because variant characters are encoded in related ISO 2022 planes under CCCII, so most of the code points have to be reserved for variants.
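The byte range and the capacity figure above can be checked with a short sketch. The function name `is_strict_cccii` is hypothetical and only illustrates the 94^n rule described in the text; it is not taken from any CCCII implementation.

```python
LOW, HIGH = 0x21, 0x7E  # the 94 graphic byte values allowed in each position


def is_strict_cccii(code: bytes) -> bool:
    """Return True if `code` is exactly three bytes, each within 0x21..0x7E."""
    return len(code) == 3 and all(LOW <= b <= HIGH for b in code)


positions_per_byte = HIGH - LOW + 1   # number of values per byte
capacity = positions_per_byte ** 3    # upper bound on 3-byte code points

print(positions_per_byte, capacity)
print(is_strict_cccii(bytes.fromhex("213021")))  # all bytes in range
print(is_strict_cccii(bytes.fromhex("212320")))  # third byte 0x20 is out of range
```

A check like this only tests the structural rule; it says nothing about whether a given code point is actually assigned.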
In practice, however, bytes outside of these ranges are sometimes used. The code 0x212320 is used by some implementations as an ideographic space.[8] A CCCII specification used by libraries in Hong Kong uses codes starting with 0x2120 for punctuation and symbols.[9] Some variants use the first byte 0x7F to encode otherwise unavailable Unified Repertoire and Ordering or CJK Unified Ideographs Extension A hanzi (e.g. 0x7F3449 for U+3449 or 0x7F796E for U+796E;[9] the second and third bytes match the UCS-2BE code). These codes may include bytes outside of the 0x21–0x7E or even 0x20–0x7F range, e.g. 0x7F551C for U+551C,[10] 0x7F5AA4 for U+5AA4[10] or 0x7F8EDA for U+8EDA.[9]
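The 0x7F-prefixed extension described above can be illustrated with a minimal sketch. The helper name `decode_7f_extension` is hypothetical, and the sketch assumes that the two bytes after 0x7F are always a big-endian UCS-2 value, as in the examples cited.

```python
from typing import Optional


def decode_7f_extension(code: bytes) -> Optional[str]:
    """Decode a 3-byte code whose first byte is 0x7F; return None otherwise.

    Assumption: the trailing two bytes are the big-endian UCS-2 code unit.
    """
    if len(code) != 3 or code[0] != 0x7F:
        return None
    return chr(int.from_bytes(code[1:], "big"))


for hex_code in ("7F3449", "7F796E", "7F8EDA"):
    char = decode_7f_extension(bytes.fromhex(hex_code))
    print(hex_code, f"U+{ord(char):04X}")
```

Codes that do not start with 0x7F need a real mapping table and are deliberately left unhandled here.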
CCCII/EACC is not registered in the International Registry of Coded Character Sets to be Used with Escape Sequences,[11] and as such, does not have a standard designation escape for use with ISO 2022. MARC-8 assigns EACC the private-use F-byte 0x31 (1) in its implementation of ANSI X3.41 (ISO 2022).[12]
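As a rough illustration of how the private-use F-byte is used, the sketch below builds a designation escape for EACC. Only the F-byte 0x31 comes from the text above. The framing (ESC followed by 0x24 to mark a multiple-byte set designated to G0) is an assumption based on the general ISO 2022 pattern and should be checked against the MARC-8 specification before use. The function name is hypothetical.

```python
ESC = 0x1B         # escape control character
MULTIBYTE = 0x24   # '$', assumed intermediate byte marking a multiple-byte set
EACC_FINAL = 0x31  # '1', the private-use F-byte that MARC-8 assigns to EACC


def designate_eacc_g0() -> bytes:
    """Build an assumed escape sequence designating EACC as the G0 set."""
    return bytes([ESC, MULTIBYTE, EACC_FINAL])


print(designate_eacc_g0().hex(" "))
```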
The 94 ISO 2022 planes are grouped into 16 layers of 6 planes each (except for layer 16, which contains the four planes 91–94).[1] Layer 1 contains both non-hanzi and hanzi characters, with the non-hanzi and most frequently used hanzi being placed in plane 1, and with the remaining five planes consisting of less common hanzi.[1] Layer 2 contains simplified Chinese characters, with their row and cell numbers being the same as their traditional Chinese equivalents in layer 1. Layers 3 through 12 contain further variant forms, at row and cell numbers homologous to the first two layers.[13]
The last four layers are used for other purposes. Specifically, layer 13 contains additional characters for Japanese language support (kana and Japanese kokuji), and layer 14 contains additional characters for Korean language support (hangul).[13] Layer 15 is unused (reserved), while layer 16 is used for other characters.[1]
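The layer arithmetic can be sketched as follows. The function names are hypothetical. Two assumptions are made: the plane number is the first byte minus 0x20, and a variant in another layer sits in the plane at the same offset within its layer, with identical row and cell bytes. The second is inferred from the description of homologous positions rather than stated outright.

```python
def layer_of_plane(plane: int) -> int:
    """Map a plane number (1..94) to its layer (1..16), six planes per layer."""
    if not 1 <= plane <= 94:
        raise ValueError("plane must be in 1..94")
    return (plane - 1) // 6 + 1


def homologous_code(code: bytes, target_layer: int) -> bytes:
    """Return the code at the same row and cell in the matching plane of another layer."""
    plane = code[0] - 0x20              # assumed: plane number = first byte - 0x20
    offset_in_layer = (plane - 1) % 6   # position of the plane within its layer
    new_plane = (target_layer - 1) * 6 + offset_in_layer + 1
    if not 1 <= new_plane <= 94:
        raise ValueError("target plane does not exist")
    return bytes([new_plane + 0x20]) + code[1:]


print(layer_of_plane(73))                                # first plane of the Japanese layer
print(homologous_code(bytes.fromhex("213021"), 2).hex()) # layer 2 position for a layer 1 code
```

Whether a character is actually assigned at the computed position depends on the CCCII edition; the sketch only computes where a variant would be placed.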
This distinctive design has been criticized by Christian Wittern of the International Research Institute for Zen Buddhism at Hanazono University, who asserts that the relationship of character variants "is very complex and can not be expressed in a fixed, one-dimensional, hard-wired codetable".[3] Ken Lunde describes it as "one of the most well thought-out character set standards from Taiwan", describing its structure as "to be truly admired", but concluding that OpenType variant form substitution can provide the same level of functionality.[1]
CCCII defines roughly 53940 code points as of its 1987 edition, although a more recent draft from 1989 extends this to 75684 code points (comprising 44167 unique characters and 31517 variants). EACC, the variant used by the Library of Congress, includes only a smaller set of 15686 characters.[1]
As of 1995, CCCII or EACC was used mostly in libraries in the United States, Hong Kong and Taiwan. Although CCCII promised pan-CJK coverage, its support was limited to specialized hardware; difficulty ascertaining when the root versus variant character should be used, exacerbated by a lack of firmly established reference glyphs, further limited its adoption, resulting in Big5 being more commonly used for Chinese in those territories outside of library use (since Unicode had yet to become widely adopted at the time).[3]
As of 2009, EACC was still in extensive use for specialized bibliographic purposes.[1] It was also an important precursor to Unicode:[1] work at Apple on a CJK character cross-reference database based on Research Libraries Group's CJK Thesaurus, used to maintain EACC, was directly incorporated into the development of Unicode's Unihan set.[6] Unicode hanzi characters are referenced to their corresponding CCCII and EACC codes in the Unihan database, in the keys kCCCII and kEACC;[4] however, since Unicode's character unification criteria (based on those used by the Japanese JIS X 0208 and on those developed by the Association for a Common Chinese Code in China) differ from those used by CCCII, not all variant characters are individually mapped.[6] Mapping tables for hanzi, hangul, kana and punctuation between EACC and Unicode are available from the Library of Congress.[14]
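The kCCCII and kEACC keys can be read from Unihan data files with a few lines of code. The sketch below assumes the usual tab-separated Unihan layout (code point, key, value) and uses a hypothetical helper name, `collect_field`. The sample lines are illustrative stand-ins for demonstration and should not be treated as authoritative mapping data.

```python
from typing import Dict, Iterable


def collect_field(lines: Iterable[str], field: str) -> Dict[str, str]:
    """Collect {character: raw value} for one key from Unihan-style lines."""
    mapping: Dict[str, str] = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # skip blank lines and comments
        parts = line.split("\t")
        if len(parts) != 3:
            continue                      # skip lines that do not match the layout
        codepoint, key, value = parts
        if key == field:
            mapping[chr(int(codepoint[2:], 16))] = value  # drop the "U+" prefix
    return mapping


# Illustrative sample input only; real data should come from the Unihan database.
sample = [
    "# sample lines for demonstration",
    "U+4E00\tkCCCII\t213021",
    "U+4E00\tkEACC\t213021",
]

print(collect_field(sample, "kCCCII"))
```

The value is kept as a raw string, since a key may carry more than one code for a single character.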
Following are charts for punctuation, symbols, kana and Hangul jamo, showing the characters and giving possible Unicode mappings. Where possible, these are referenced against published mapping data.
Unicode mappings for Hangul syllables are omitted below for brevity, but are documented by the Library of Congress.[15] CCCII hanzi number in the tens of thousands[1][3] and are not shown below (except where they are also included in the non-hanzi range, as radicals or numerals), but mappings to Unicode are available from the Unihan database[4] and from elsewhere.[10][9]
Although CCCII is usually a 94^n set,[1] and therefore does not usually use codes starting with 0x2120,[10] the following layout is used by a variant used by libraries in Hong Kong:[9]
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
| 2x | 、 | 。 | ・ | ゙ | ゚ | ´ | ` | ¨ |  ̄ | ヽ | ヾ | ゝ | ゞ | |||
| 3x | 〃 | 〆 | ‖ | … | ‥ | |||||||||||
| 4x | “ | 〔 | 〕 | 「 | 」 | 『 | 』 | 【 | 】 | ± | × | ÷ | ||||
| 5x | ≠ | ≦ | ≧ | ∞ | ∴ | ♂ | ♀ | ° | ℃ | ¢ | £ | § | ☆ | ★ | ○ | ● |
| 6x | ′ | ″ | ◎ | |||||||||||||
| 7x | ◇ | ◆ | □ | ■ | △ | ▲ | ▽ | ▼ | ※ | 〒 | → | ← | ↑ | ↓ |
No characters are assigned in plane 1 row 1, which is reserved for control codes.[1]
This row contains mathematical operators. EACC leaves this row empty.[14] The following table is referenced against sources from Taiwan.[2][10]
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
| 2x | ∞ | + | − | ± | × | ⋅ | ÷ | ∕ | = | ≠ | ≡ | ≈ | ∼ | ∝ | < | |
| 3x | > | ≮ | ≯ | ≤ | ≥ | ≪ | ≫ | ∂ | ∫ | Δ | ∆ | ∇ | ▫ | ∠ | ⊤ | ∥ |
| 4x | ≅ | ≞ | ∴ | ∃ | ∀ | ∪ | ∩ | ⊂ | ⊃ | ⇒ | ⇔ | ∋ | ∈ | ∉ | ∑ | ㏒ |
| 5x | ㏑ | ℯ | π | √ | ︕ | ⎸ | ⎹ | 〈 | 〉 | |||||||
| 6x | ||||||||||||||||
| 7x |
The following table is referenced against CCCII data provided by the Hong Kong Innovative Users Group, a group of libraries in Hong Kong, and hosted by the University of Hong Kong.[17][9] It uses an entirely different layout in this row:
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
| 2x | ∈ | ∋ | ⊆ | ⊇ | ⊂ | ⊃ | ∪ | ∩ | ∧ | ∨ | ¬ | ⇒ | ⇔ | ∀ | ∃ | |
| 3x | ∠ | ⊥ | ⌒ | ∂ | ∇ | ≡ | ≒ | ≪ | ≫ | √ | ∽ | ∝ | ∵ | ∫ | ∬ | |
| 4x | Å | ‰ | ♯ | ♭ | ♪ | † | ‡ | ¶ | ◯ | |||||||
| 5x | ─ | │ | ┌ | ┐ | ┘ | └ | ├ | ┬ | ┤ | ┴ | ┼ | ━ | ┃ | ┏ | ┓ | ┛ |
| 6x | ┗ | ┣ | ┳ | ┫ | ┻ | ╋ | ┠ | ┯ | ┨ | ┷ | ┿ | ┝ | ┰ | ┥ | ┸ | ╂ |
| 7x |
This row includes punctuation, western Arabic numerals and Roman letters.[10] Compare row 3 of Wansung code and row 3 of GB 2312.
Different variants variously encode the ideographic space (U+3000) at 0x212320 (which the MARC specification acknowledges),[8][9] 0x212321 (which is listed in the ANSI standard, and is also acknowledged by MARC),[8][9] or 0x21635F.[10] EACC includes only the hyphen-minus, parentheses and ideographic space in this set.[8]
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
| 2x | IDSP[a] | !/IDSP[b] | " | # | $ | % | & | ' | (/( | )/) | * | + | , | -/- | . | / |
| 3x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
| 4x | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
| 5x | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ↑ | _ |
| 6x | `/' | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
| 7x | p | q | r | s | t | u | v | w | x | y | z | { | | | } | ~ |
In EACC, this row includes several Private Use Area mapped characters used internally to represent character components by the RLIN input method,[18] which is used by the Library of Congress for non-Roman cataloging.[19] These component characters should only be used internally by an IME and, if encountered elsewhere, may be replaced with the geta mark (U+3013),[18] which this row also includes at 0x212A46. This row is unassigned in CCCII,[1] but the geta mark is also listed at that location in some mappings for CCCII.[10]
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
| 2x | � | � | � | � | � | � | � | � | � | � | � | � | � | � | ||
| 3x | � | � | � | � | � | � | � | � | � | � | � | � | � | � | � | |
| 4x | � | � | � | � | � | � | 〓 | |||||||||
| 5x | ||||||||||||||||
| 6x | ||||||||||||||||
| 7x |
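To locate a code such as 0x212A46 in charts like the one above, the third byte can be split into a row label ("2x" to "7x") and a column (0 to F). The sketch below uses a hypothetical helper, `chart_position`, and assumes the ISO 2022 convention that plane and row numbers are the first and second bytes minus 0x20.

```python
from typing import Tuple


def chart_position(code: bytes) -> Tuple[int, int, str, str]:
    """Return (plane, row, chart row label, chart column) for a 3-byte code."""
    if len(code) != 3:
        raise ValueError("expected a 3-byte code")
    plane = code[0] - 0x20            # assumed: plane number = first byte - 0x20
    row = code[1] - 0x20              # assumed: row number = second byte - 0x20
    third = code[2]
    row_label = f"{third >> 4:X}x"    # high nibble selects the chart row
    column = f"{third & 0xF:X}"       # low nibble selects the chart column
    return plane, row, row_label, column


print(chart_position(bytes.fromhex("212A46")))  # position of the geta mark code
```

For the geta mark code, this yields chart row "4x" and column "6", which is where 〓 appears in the table above.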
This row contains various punctuation marks used in Chinese,[1][8] in addition to other symbols. CCCII includes a set of 35 punctuation marks in this row.[1] EACC includes only 13 characters in this row (shown boxed below).[8]
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
| 2x | ︵ | ︶ | ﹁ | ﹂ | 「 | 」 | ︳ | _ | ﹃ | ﹄ | 『 | 』 | ︴ | ﹏ | ︹ | |
| 3x | ︺ | 〔/[ | 〕/] | 。 | ・/. | 、 | ⋮ | ⋯ | , | ; | : | ? | ︱ | ! | ︲ | ︱ |
| 4x | ‘ | ’ | “ | ” | 《 | 》 | 【 | 】 | 〖 | 〗 | ||||||
| 5x | $ | ¢ | ₡ | £ | ¥ | ₨ | d. | s. | / | # | % | ⅌ | @ | ¶ | ® | |
| 6x | © | ℅ | & | § | † | ‡ | * | |||||||||
| 7x | ヽ | ヾ | ゝ | ゞ | α | 〒 |
These rows contain Chinese radicals,[1] Roman numerals,[10] celestial stems and terrestrial branches.[16]
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
| 2x | ⼀ | ⼁ | ⼂ | ⼃ | ⼄ | ⼅ | ⼆ | ⼇ | ⼈ | ⼉ | ⼊ | ⼋ | ⼌ | |||
| 3x | ⼍ | ⼎ | ⼏ | ⼐ | ⼑ | ⼒ | ⼓ | ⼔ | ⼕ | ⼖ | ⼗ | ⼘ | ⼙ | ⼚ | ⼛ | ⼜ |
| 4x | ⼝ | ⼞ | ⼟ | ⼠ | ⼡ | ⼢ | ⼣ | ⼤ | ⼥ | ⼦ | ⼧ | ⼨ | ⼩ | ⼪ | ⼫ | |
| 5x | ⼬ | ⼭ | ⼮ | ⼯ | ⼰ | ⼱ | ⼲ | ⼳ | ⼴ | ⼵ | ⼶ | ⼷ | ⼸ | ⼹ | ⼺ | ⼻ |
| 6x | ⼼ | ⼽ | ⼾ | ⼿ | ⽀ | ⽁ | ⽂ | ⽃ | ⽄ | ⽅ | ⽆ | ⽇ | ⽈ | ⽉ | ⽊ | |
| 7x | ⽋ | ⽌ | ⽍ | ⽎ | ⽏ | ⽐ | ⽑ | ⽒ | ⽓ | ⽔ | ⽕ | ⽖ | ⽗ | ⽘ | ⽙ |
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
| 2x | ⽚ | ⽛ | ⽜ | ⽝ | ⽞ | ⽟ | ⽠ | ⽡ | ⽢ | ⽣ | ⽤ | ⽥ | ⽦ | ⽧ | ||
| 3x | ⽨ | ⽩ | ⽪ | ⽫ | ⽬ | ⽭ | ⽮ | ⽯ | ⽰ | ⽱ | ⽲ | ⽳ | ⽴ | ⽵ | ⽶ | |
| 4x | ⽷ | ⽸ | ⽹ | ⽺ | ⽻ | ⽼ | ⽽ | ⽾ | ⽿ | ⾀ | ⾁ | ⾂ | ⾃ | ⾄ | ⾅ | ⾆ |
| 5x | ⾇ | ⾈ | ⾉ | ⾊ | ⾋ | ⾌ | ⾍ | ⾎ | ⾏ | ⾐ | ⾑ | ⾒ | ⾓ | ⾔/訁 | ⾕ | |
| 6x | ⾖ | ⾗ | ⾘ | ⾙ | ⾚ | ⾛ | ⾜ | ⾝ | ⾞ | ⾟ | ⾠ | ⾡ | ⾢ | ⾣ | ⾤ | ⾥ |
| 7x | ⾦/釒 | ⾧ | ⾨ | ⾩ | ⾪ | ⾫ | ⾬ | ⾭ | ⾮ | ⾯ | ⾰ | ⾱ | ⾲ |
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
| 2x | ⾳ | ⾴ | ⾵ | ⾶ | ⾷/飠 | ⾸ | ⾹ | ⾺ | ⾻ | ⾼ | ⾽ | ⾾ | ⾿ | ⿀ | ||
| 3x | ⿁ | ⿂ | ⿃ | ⿄ | ⿅ | ⿆ | ⿇ | ⿈ | ⿉ | ⿊ | ⿋ | ⿌ | ⿍ | |||
| 4x | ⿎ | ⿏ | ⿐ | ⿑ | ⿒ | ⿓ | ⿔ | ⿕ | ||||||||
| 5x | 甲 | 乙 | 丙 | 丁 | 戊 | 己 | 庚 | 辛 | 壬 | 癸 | ||||||
| 6x | 子 | 丑 | 寅 | 卯 | 辰 | 巳 | 午 | 未 | 申 | 酉 | 戌 | 亥 | ||||
| 7x | Ⅰ | Ⅱ | Ⅲ | Ⅳ | Ⅴ | Ⅵ | Ⅶ | Ⅷ | Ⅸ | Ⅹ | Ⅺ | Ⅻ |
This row includes Chinese numerals and bopomofo characters.[1] EACC includes only the ideographic zero (〇).[8]
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
| 2x | 〡 | 〢 | 〣 | 〤 | 〥 | 〦 | 〧 | 〨 | 〩 | 〸 | 〹 | 〺 | ||||
| 3x | 〇 | 一 | 二 | 三 | 四 | 五 | 六 | 七 | 八 | 九 | 十 | 百 | 千 | 万 | ||
| 4x | 零 | 壹 | 貳 | 參 | 肆 | 伍 | 陸 | 柒 | 捌 | 玖 | 拾 | 佰 | 仟 | 萬 | 億 | |
| 5x | ˊ | ˇ | ˋ | ˙/﹒[c] | ㄅ | ㄆ | ㄇ | ㄈ | ㄉ | ㄊ | ㄋ | ㄌ | ㄍ | ㄎ | ㄏ | ㄐ |
| 6x | ㄑ | ㄒ | ㄓ | ㄔ | ㄕ | ㄖ | ㄗ | ㄘ | ㄙ | ㄚ | ㄛ | ㄜ | ㄝ | ㄞ | ㄟ | ㄠ |
| 7x | ㄡ | ㄢ | ㄣ | ㄤ | ㄥ | ㄦ | ㄧ | ㄨ | ㄩ | ü |
This row contains the reference mark (kome jirushi).[10]
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
| 6x | ※ |
A variant used by libraries in Hong Kong does not include bopomofo characters in plane 1 row 15, but includes them in a different layout in plane 7.[9]
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
| 7x | ㄅ | ㄆ | ㄇ | ㄈ | ㄉ | ㄊ | ㄋ | ㄌ | ㄍ | ㄎ |
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
| 2x | ㄏ | ㄐ | ㄑ | ㄒ | ㄓ | ㄔ | ㄕ | ㄖ | ㄗ | ㄘ | ㄙ | ㄚ | ㄛ | ㄜ | ㄝ | |
| 3x | ㄞ | ㄟ | ㄠ | ㄡ | ㄢ | ㄣ | ㄤ | ㄥ | ㄦ | ㄧ | ㄨ | ㄩ |
This row is in plane 73, the first plane of layer 13, which contains characters included for Japanese language support.[13] It contains punctuation.[8] Compare row 1 of JIS X 0208, whose layout this row tends to follow for the characters it includes.
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
| 2x | ・ | |||||||||||||||
| 3x | 々 | 〆 | ー | |||||||||||||
| 4x | ||||||||||||||||
| 5x | 〈 | 〉 | 《 | 》 | ||||||||||||
| 6x | ||||||||||||||||
| 7x |
This row contains hiragana. Compare row 4 of JIS X 0208.
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
| 2x | ぁ | あ | ぃ | い | ぅ | う | ぇ | え | ぉ | お | か | が | き | ぎ | く | |
| 3x | ぐ | け | げ | こ | ご | さ | ざ | し | じ | す | ず | せ | ぜ | そ | ぞ | た |
| 4x | だ | ち | ぢ | っ | つ | づ | て | で | と | ど | な | に | ぬ | ね | の | は |
| 5x | ば | ぱ | ひ | び | ぴ | ふ | ぶ | ぷ | へ | べ | ぺ | ほ | ぼ | ぽ | ま | み |
| 6x | む | め | も | ゃ | や | ゅ | ゆ | ょ | よ | ら | り | る | れ | ろ | ゎ | わ |
| 7x | ゐ | ゑ | を | ん |
This row contains katakana. Compare row 5 of JIS X 0208, to which this row corresponds apart from the addition of the separate dakuten and handakuten.
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
| 2x | ァ | ア | ィ | イ | ゥ | ウ | ェ | エ | ォ | オ | カ | ガ | キ | ギ | ク | |
| 3x | グ | ケ | ゲ | コ | ゴ | サ | ザ | シ | ジ | ス | ズ | セ | ゼ | ソ | ゾ | タ |
| 4x | ダ | チ | ヂ | ッ | ツ | ヅ | テ | デ | ト | ド | ナ | ニ | ヌ | ネ | ノ | ハ |
| 5x | バ | パ | ヒ | ビ | ピ | フ | ブ | プ | ヘ | ベ | ペ | ホ | ボ | ポ | マ | ミ |
| 6x | ム | メ | モ | ャ | ヤ | ュ | ユ | ョ | ヨ | ラ | リ | ル | レ | ロ | ヮ | ワ |
| 7x | ヰ | ヱ | ヲ | ン | ヴ | ヵ | ヶ | ◌゙/゛ | ◌゚/゜ |
These rows contain Korean jamo.
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
| 5x | ㄱ | ㄴ | ㄷ | ㄹ | ㅁ | ㅂ | ㅅ | ㅇ | ㅈ | |||||||
| 6x | ㅊ | ㅋ | ㅌ | ㅍ | ㅎ | ㄲ | ㄸ | ㅃ | ||||||||
| 7x | ㅆ | ㅉ | ㅏ | ㅐ | ㅑ | ㅓ | ㅔ | ㅕ | ㅗ | ㅘ | ㅛ |
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
| 2x | ㅜ | ㅠ | ㅡ | ㅢ | ㅣ |
This row contains several historic Hangul characters no longer in regular use. Several of these are mapped to the Private Use Area.[18]
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
| 2x | ㆁ | ㆆ | ㅿ | � | ㆍ | |||||||||||
| 3x | ||||||||||||||||
| 4x | � | � | � | � | � | � | � | � | � | � | � | � | � | � | � | � |
| 5x | � | � | � | � | � | � | � | � | ||||||||
| 6x | ||||||||||||||||
| 7x |
This row contains additional katakana used to write foreign phonemes.[10]
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
| 7x | ヷ | ヸ | ヹ |
CCCII: The earliest (and most sophisticated) Traditional Chinese encoding... used mostly in library systems.... Map for "CCCII" is supplied by the Koha Taiwan project.