CJK Unified Ideographs Extension I

From Wikipedia, the free encyclopedia
For a list of all CJK characters encoded in Unicode, see CJK Unified Ideographs.
Unicode character block: CJK Unified Ideographs Extension I
Range: U+2EBF0..U+2EE5F (624 code points)
Plane: SIP
Scripts: Han
Assigned: 622 code points
Unused: 2 reserved code points
Unicode version history: 15.1 (2023): 622 (+622)
Unicode documentation: Code chart ∣ Web page
Note: [1][2]

CJK Unified Ideographs Extension I is a Unicode block comprising CJK Unified Ideographs included in drafts of an amendment to China's GB 18030 standard circulated in 2022 and 2023, which were fast-tracked into Unicode in 2023.

Background

Further information: GB 18030

Unlike most other sets of CJK unified ideographs, Extension I was not prepared and submitted by the Ideographic Research Group (IRG).[3]

GB 18030 is a mandatory national standard of the People's Republic of China (PRC). It defines a Unicode Transformation Format which retains compatibility with existing data in the earlier GBK and EUC-CN character encodings, and specifies particular Unicode characters which devices sold in China must support.[4] Its 2022 edition, GB 18030-2022, changed a number of required characters to map to standard Unicode code points, rather than to private use area code points.
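As an illustration of GB 18030 acting as a transformation format, the sketch below encodes a supplementary-plane code point (such as one in Extension I) as four bytes. It assumes the commonly described linear mapping in which U+10000 corresponds to the byte sequence 90 30 81 30; the function name `gb18030_supplementary_bytes` is hypothetical, and the result is cross-checked against Python's built-in `gb18030` codec rather than against the text of the standard.

```python
def gb18030_supplementary_bytes(code_point: int) -> bytes:
    """Encode a code point from planes 1 to 16 as four GB 18030 bytes.

    Assumption: the four-byte space is filled linearly from U+10000,
    with byte ranges 0x90.., 0x30..0x39, 0x81..0xFE, 0x30..0x39.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("expected a supplementary-plane code point")
    index = code_point - 0x10000        # linear offset from U+10000
    index, b4 = divmod(index, 10)       # fourth byte cycles through 10 values
    index, b3 = divmod(index, 126)      # third byte cycles through 126 values
    b1, b2 = divmod(index, 10)          # second byte cycles through 10 values
    return bytes([0x90 + b1, 0x30 + b2, 0x81 + b3, 0x30 + b4])


# Usage: the first code point of Extension I, checked against the stdlib codec.
first_ext_i = 0x2EBF0
encoded = gb18030_supplementary_bytes(first_ext_i)
assert encoded == chr(first_ext_i).encode("gb18030")
assert encoded.decode("gb18030") == chr(first_ext_i)
```

Because the mapping is purely arithmetic, a codec written this way needs no table update to carry newly assigned supplementary characters; whether fonts and input methods support them is a separate matter.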

In late 2022, the PRC released a draft of a further amendment to GB 18030 for public consultation. This draft would have placed 897 new sinographic characters in Plane 10 (hexadecimal: 0A), an as-yet-untitled astral Unicode plane.[5] This was motivated by a "strong need of citizen real-name certification in China".[6] Since it would impact ISO/IEC 10646 (the Universal Coded Character Set, the ISO standard synchronised with Unicode), the draft was circulated in ISO/IEC JTC 1/SC 2, the ISO subcommittee responsible for ISO 10646. The Chinese national body maintained that "ISO/IEC 10646 do not specify the purpose of the 0A plane", which ISO 10646 denotes as "reserved for future standardization", and that this use was therefore "not inappropriate".[5]

However, the intent of ISO 10646 was for Plane 10 to be reserved for future allocation by ISO 10646 and Unicode via their usual ballot process, not for it to be allocated unilaterally by national standards bodies. Experts and other national bodies therefore criticised the proposed move as one which would "destabilize the synchronization" between GB 18030 and ISO/IEC 10646 (and thus Unicode), and which would make it impossible to conform to both with a single implementation,[5] effectively forking Unicode. At its meeting in March 2023, the IRG emphasised the importance of providing any subsequent GB 18030 amendment drafts to IRG experts in a timely manner, and of not "using the ISO/IEC 10646 standard inappropriately".[7]

As an alternative, the repertoire (eventually reduced to 622 characters after expert review) was fast-tracked into Unicode version 15.1 in September 2023, as the CJK Unified Ideographs Extension I block.[5] The characters constitute the "GIDC23" Unihan source,[8] defined as sourced from the "ID system of the Ministry of Public Security of China, 2023".[9] The CJK Unified Ideographs Extension D block was cited as a precedent, since it comprised a repertoire of urgently needed characters (UNCs) from IRG member bodies, whereas the IRG working-set initially slated to become Extension D would instead become Extension E.[10] For compactness, the block was allocated to the available space in the Supplementary Ideographic Plane after CJK Unified Ideographs Extension F, as opposed to on the Tertiary Ideographic Plane after CJK Unified Ideographs Extension H; this means that the CJK extension blocks are no longer in alphabetical order by extension letter.[11] Following this, the draft GB 18030 amendment was modified to use the Extension I code points.[6]
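The practical consequence for implementers is that a block lookup should be driven by code point ranges and not by extension letter. The following is a minimal sketch under that assumption: the ranges are the ones listed in this article's table of CJK blocks, the extension letters follow the standard block names, and the names `CJK_UNIFIED_BLOCKS`, `cjk_extension` and `plane` are hypothetical.

```python
from bisect import bisect_right

# (first, last, extension letter), sorted by code point.
# None marks the original CJK Unified Ideographs block.
CJK_UNIFIED_BLOCKS = [
    (0x3400, 0x4DBF, "A"),
    (0x4E00, 0x9FFF, None),
    (0x20000, 0x2A6DF, "B"),
    (0x2A700, 0x2B73F, "C"),
    (0x2B740, 0x2B81F, "D"),
    (0x2B820, 0x2CEAF, "E"),
    (0x2CEB0, 0x2EBEF, "F"),
    (0x2EBF0, 0x2EE5F, "I"),   # sits between F and G in code point order
    (0x30000, 0x3134F, "G"),
    (0x31350, 0x323AF, "H"),
    (0x323B0, 0x3347F, "J"),
]
_STARTS = [first for first, _, _ in CJK_UNIFIED_BLOCKS]


def cjk_extension(code_point: int):
    """Return 'URO', an extension letter, or None if outside these blocks."""
    i = bisect_right(_STARTS, code_point) - 1
    if i < 0:
        return None
    _, last, letter = CJK_UNIFIED_BLOCKS[i]
    if code_point > last:
        return None
    return letter or "URO"


def plane(code_point: int) -> int:
    """Return the Unicode plane number (2 is the SIP, 3 is the TIP)."""
    return code_point >> 16


# Usage: U+2ED9D is in Extension I, which lives on Plane 2.
assert cjk_extension(0x2ED9D) == "I"
assert plane(0x2ED9D) == 2
```

Sorting by first code point and using a binary search keeps the lookup correct regardless of where a future extension is placed.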

At its next meeting in October 2023, the IRG expressed concerns about bypassing the IRG for large collections of CJK characters, and noted that two of the characters in Extension I had, for the purposes of other regions' character sources, previously been unified with existing characters under IRG unification rules:[3][12]

  • Allowing for interchangeable forms of the grass radical, U+2ED9D 𮶝 CJK UNIFIED IDEOGRAPH-2ED9D corresponds to the pre-existing T-source (Taiwan) glyph for U+8286 CJK UNIFIED IDEOGRAPH-8286 (referenced from CNS 11643),[13] as well as to a proposed J-source (Japan) glyph for the same.[14] A character corresponding to the other (G-source, i.e. Mainland China) glyph of U+8286 does exist elsewhere in more recent editions of CNS 11643, so the addition of U+2ED9D impacts the existing correspondences between CNS 11643 and Unicode, although, due to neither character being in planes 1 or 2, there are no implications for the Unicode mapping of Big5.[12]
  • U+2EDE0 𮷠 CJK UNIFIED IDEOGRAPH-2EDE0 corresponds to a proposed J-source (Japan) glyph for U+8FF3 CJK UNIFIED IDEOGRAPH-8FF3.[15] It had previously been proposed as a new character twice (once with reference to CNS 11643, and once by Japan), but rejected on the basis that it was unifiable with U+8FF3.[12] The proposed glyph was later moved to the new U+2EDE0 𮷠 CJK UNIFIED IDEOGRAPH-2EDE0 code point, per a request by the Japanese national body.[16]

In response, the IRG recommended that, in future, submitters of proposed CJK characters be required to provide information about the impact on other CJK character sources of any disunifications proposed by the submission, and that the IRG be given time to review all large submissions of CJK characters. The IRG encouraged the Chinese body to propose solutions to the issues caused by the addition of these two characters at the next IRG meeting.[3]

Block

CJK Unified Ideographs Extension I[1][2]
Official Unicode Consortium code chart (PDF)
 0123456789ABCDEF
U+2EBFx𮯰𮯱𮯲𮯳𮯴𮯵𮯶𮯷𮯸𮯹𮯺𮯻𮯼𮯽𮯾𮯿
U+2EC0x𮰀𮰁𮰂𮰃𮰄𮰅𮰆𮰇𮰈𮰉𮰊𮰋𮰌𮰍𮰎𮰏
U+2EC1x𮰐𮰑𮰒𮰓𮰔𮰕𮰖𮰗𮰘𮰙𮰚𮰛𮰜𮰝𮰞𮰟
U+2EC2x𮰠𮰡𮰢𮰣𮰤𮰥𮰦𮰧𮰨𮰩𮰪𮰫𮰬𮰭𮰮𮰯
U+2EC3x𮰰𮰱𮰲𮰳𮰴𮰵𮰶𮰷𮰸𮰹𮰺𮰻𮰼𮰽𮰾𮰿
U+2EC4x𮱀𮱁𮱂𮱃𮱄𮱅𮱆𮱇𮱈𮱉𮱊𮱋𮱌𮱍𮱎𮱏
U+2EC5x𮱐𮱑𮱒𮱓𮱔𮱕𮱖𮱗𮱘𮱙𮱚𮱛𮱜𮱝𮱞𮱟
U+2EC6x𮱠𮱡𮱢𮱣𮱤𮱥𮱦𮱧𮱨𮱩𮱪𮱫𮱬𮱭𮱮𮱯
U+2EC7x𮱰𮱱𮱲𮱳𮱴𮱵𮱶𮱷𮱸𮱹𮱺𮱻𮱼𮱽𮱾𮱿
U+2EC8x𮲀𮲁𮲂𮲃𮲄𮲅𮲆𮲇𮲈𮲉𮲊𮲋𮲌𮲍𮲎𮲏
U+2EC9x𮲐𮲑𮲒𮲓𮲔𮲕𮲖𮲗𮲘𮲙𮲚𮲛𮲜𮲝𮲞𮲟
U+2ECAx𮲠𮲡𮲢𮲣𮲤𮲥𮲦𮲧𮲨𮲩𮲪𮲫𮲬𮲭𮲮𮲯
U+2ECBx𮲰𮲱𮲲𮲳𮲴𮲵𮲶𮲷𮲸𮲹𮲺𮲻𮲼𮲽𮲾𮲿
U+2ECCx𮳀𮳁𮳂𮳃𮳄𮳅𮳆𮳇𮳈𮳉𮳊𮳋𮳌𮳍𮳎𮳏
U+2ECDx𮳐𮳑𮳒𮳓𮳔𮳕𮳖𮳗𮳘𮳙𮳚𮳛𮳜𮳝𮳞𮳟
U+2ECEx𮳠𮳡𮳢𮳣𮳤𮳥𮳦𮳧𮳨𮳩𮳪𮳫𮳬𮳭𮳮𮳯
U+2ECFx𮳰𮳱𮳲𮳳𮳴𮳵𮳶𮳷𮳸𮳹𮳺𮳻𮳼𮳽𮳾𮳿
U+2ED0x𮴀𮴁𮴂𮴃𮴄𮴅𮴆𮴇𮴈𮴉𮴊𮴋𮴌𮴍𮴎𮴏
U+2ED1x𮴐𮴑𮴒𮴓𮴔𮴕𮴖𮴗𮴘𮴙𮴚𮴛𮴜𮴝𮴞𮴟
U+2ED2x𮴠𮴡𮴢𮴣𮴤𮴥𮴦𮴧𮴨𮴩𮴪𮴫𮴬𮴭𮴮𮴯
U+2ED3x𮴰𮴱𮴲𮴳𮴴𮴵𮴶𮴷𮴸𮴹𮴺𮴻𮴼𮴽𮴾𮴿
U+2ED4x𮵀𮵁𮵂𮵃𮵄𮵅𮵆𮵇𮵈𮵉𮵊𮵋𮵌𮵍𮵎𮵏
U+2ED5x𮵐𮵑𮵒𮵓𮵔𮵕𮵖𮵗𮵘𮵙𮵚𮵛𮵜𮵝𮵞𮵟
U+2ED6x𮵠𮵡𮵢𮵣𮵤𮵥𮵦𮵧𮵨𮵩𮵪𮵫𮵬𮵭𮵮𮵯
U+2ED7x𮵰𮵱𮵲𮵳𮵴𮵵𮵶𮵷𮵸𮵹𮵺𮵻𮵼𮵽𮵾𮵿
U+2ED8x𮶀𮶁𮶂𮶃𮶄𮶅𮶆𮶇𮶈𮶉𮶊𮶋𮶌𮶍𮶎𮶏
U+2ED9x𮶐𮶑𮶒𮶓𮶔𮶕𮶖𮶗𮶘𮶙𮶚𮶛𮶜𮶝𮶞𮶟
U+2EDAx𮶠𮶡𮶢𮶣𮶤𮶥𮶦𮶧𮶨𮶩𮶪𮶫𮶬𮶭𮶮𮶯
U+2EDBx𮶰𮶱𮶲𮶳𮶴𮶵𮶶𮶷𮶸𮶹𮶺𮶻𮶼𮶽𮶾𮶿
U+2EDCx𮷀𮷁𮷂𮷃𮷄𮷅𮷆𮷇𮷈𮷉𮷊𮷋𮷌𮷍𮷎𮷏
U+2EDDx𮷐𮷑𮷒𮷓𮷔𮷕𮷖𮷗𮷘𮷙𮷚𮷛𮷜𮷝𮷞𮷟
U+2EDEx𮷠𮷡𮷢𮷣𮷤𮷥𮷦𮷧𮷨𮷩𮷪𮷫𮷬𮷭𮷮𮷯
U+2EDFx𮷰𮷱𮷲𮷳𮷴𮷵𮷶𮷷𮷸𮷹𮷺𮷻𮷼𮷽𮷾𮷿
U+2EE0x𮸀𮸁𮸂𮸃𮸄𮸅𮸆𮸇𮸈𮸉𮸊𮸋𮸌𮸍𮸎𮸏
U+2EE1x𮸐𮸑𮸒𮸓𮸔𮸕𮸖𮸗𮸘𮸙𮸚𮸛𮸜𮸝𮸞𮸟
U+2EE2x𮸠𮸡𮸢𮸣𮸤𮸥𮸦𮸧𮸨𮸩𮸪𮸫𮸬𮸭𮸮𮸯
U+2EE3x𮸰𮸱𮸲𮸳𮸴𮸵𮸶𮸷𮸸𮸹𮸺𮸻𮸼𮸽𮸾𮸿
U+2EE4x𮹀𮹁𮹂𮹃𮹄𮹅𮹆𮹇𮹈𮹉𮹊𮹋𮹌𮹍𮹎𮹏
U+2EE5x𮹐𮹑𮹒𮹓𮹔𮹕𮹖𮹗𮹘𮹙𮹚𮹛𮹜𮹝
Notes
  1. ^ As of Unicode version 17.0
  2. ^ Grey areas indicate non-assigned code points; in this block, these are the last two code points, U+2EE5E and U+2EE5F
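The layout of the chart follows directly from the range arithmetic. The sketch below regenerates the rows from the block boundaries stated in this article (range U+2EBF0..U+2EE5F, with U+2EBF0..U+2EE5D assigned); the constant and function names are hypothetical, and the output will only display properly with a font that covers Extension I.

```python
BLOCK_FIRST = 0x2EBF0    # first code point of the block
BLOCK_LAST = 0x2EE5F     # last code point of the block
ASSIGNED_LAST = 0x2EE5D  # last assigned code point

block_size = BLOCK_LAST - BLOCK_FIRST + 1   # 624 code points
assigned = ASSIGNED_LAST - BLOCK_FIRST + 1  # 622 assigned
reserved = block_size - assigned            # 2 reserved


def chart_rows():
    """Return one string per 16-code-point row, e.g. 'U+2EBFx' plus its characters."""
    rows = []
    for row_start in range(BLOCK_FIRST, BLOCK_LAST + 1, 16):
        row_end = min(row_start + 16, ASSIGNED_LAST + 1)  # skip reserved cells
        cells = "".join(chr(cp) for cp in range(row_start, row_end))
        rows.append(f"U+{row_start >> 4:04X}x{cells}")
    return rows


for row in chart_rows():
    print(row)
```

The final row comes out two cells short because the two reserved code points are skipped.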

The CJK Unified Ideographs Extension I block has two ideographic variation sequences registered in the Unicode Ideographic Variation Database (IVD).[17][18] These sequences specify the desired glyph variant for a given Unicode character.
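Structurally, an ideographic variation sequence is a base ideograph followed by a variation selector from the supplementary range U+E0100..U+E01EF (VS17 to VS256). The sketch below scans text for such pairs whose base lies in Extension I. The function names are hypothetical, and the base and selector used in the usage example were chosen only for illustration: they are not claimed to be one of the two registered sequences, which should be looked up in the IVD itself.

```python
VS17 = 0xE0100   # first supplementary variation selector
VS256 = 0xE01EF  # last supplementary variation selector


def is_extension_i(code_point: int) -> bool:
    """True for the assigned Extension I code points, U+2EBF0..U+2EE5D."""
    return 0x2EBF0 <= code_point <= 0x2EE5D


def find_extension_i_ivs(text: str):
    """Return (base, selector) pairs where an Extension I base precedes VS17..VS256."""
    return [
        (base, selector)
        for base, selector in zip(text, text[1:])
        if is_extension_i(ord(base)) and VS17 <= ord(selector) <= VS256
    ]


# Usage with a hypothetical, possibly unregistered, sequence.
sample = "A" + chr(0x2EBF0) + chr(0xE0100) + "B"
pairs = find_extension_i_ivs(sample)
assert [(hex(ord(b)), hex(ord(s))) for b, s in pairs] == [("0x2ebf0", "0xe0100")]
```

This only checks the shape of a sequence; whether a renderer honours it depends on the sequence being registered and on font support.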

History


The following Unicode-related documents record the purpose and process of defining specific characters in the CJK Unified Ideographs Extension I block:

| Version | Final code points[a] | Count | L2 ID | WG2 ID | IRG ID | Document |
| 15.1 | U+2EBF0..2EE5D | 622 | L2/23-011 | | | Lunde, Ken (2023-01-11), "18) GB 18030-2022 Amendment", CJK & Unihan Group Recommendations for UTC #174 Meeting |
| | | | L2/23-057 | N5201 | N2591 | Draft GB 18030-2022 Amendment Feedback & Recommendations, 2023-02-03 |
| | | | L2/23-100 | | | GB 18030-2022 Amendment, Draft 2 + Disposition of Comments, Draft 1, 2023-04-10 |
| | | | L2/23-082 | | | Lunde, Ken (2023-04-22), "02 and 03", CJK & Unihan Group Recommendations for UTC #175 Meeting |
| | | | L2/23-106 | N5214 | | Lunde, Ken (2023-04-24), "The Alternate Proposal—Unicode Version 15.1", Proposal to provisionally assign or accept 603 urgently-needed ideographs |
| | | | L2/23-076 | | | Constable, Peter (2023-05-01), "E.4.2 Proposal to provisionally assign or accept 603 urgently-needed ideographs", UTC #175 Minutes |
| | | | L2/23-114R | N5214R2 | | Lunde, Ken (2023-07-05), Proposal to encode 622 urgently needed ideographs in UCS |
| | | | L2/23-115 | | | Constable, Peter (2023-05-01), USNB Comments on Draft 2 of GB 18030-2022 Amendment 1 and recommendation for ISO/IEC 10646:2020 Amendment 2 |
| | | | L2/23-154 | N5238 | | Revision of 622 UNCs of China (Feedback on WG2 N5214), 2023-06-30 |
| | | | L2/23-163 | | | Lunde, Ken (2023-07-11), "01", CJK & Unihan Group Recommendations for UTC #176 Meeting |
| | | | L2/23-157 | | | Constable, Peter (2023-07-31), "E.1 Section 1 and E.1 Section 9 [Affects U+2EDE3]", UTC #176 Minutes |
  a. ^ Proposed code points and character names may differ from final code points and names

References

  1. ^ "Unicode character database". The Unicode Standard. Retrieved 2023-09-12.
  2. ^ "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-09-12.
  3. ^ a b c Ideographic Research Group (2023-10-20). "Recommendation IRG M61.12: Issue of Extension I to Other CJK Source Characters (IRGN2635 & Feedback, IRGN2622)" (PDF). IRG Meeting #61 Recommendations and Action Items. ISO/IEC JTC1/SC2 N4885, WG2 N5243, IRG N2620; UTC L2/23-250.
  4. ^ Kaplan, Michael S (2013-03-28). "You call it GB18030, I call it UTF-GBK..." Sorting it all out.
  5. ^ a b c d United States National Body (2023-05-01). "USNB Comments on Draft 2 of GB 18030-2022 Amendment 1 and recommendation for ISO/IEC 10646:2020 Amendment 2" (PDF). ISO/IEC JTC1/SC2 N4852, WG2 N5222; UTC L2/23-115.
  6. ^ a b China National Body (2023-10-13). "IRG #61 Activity Report" (PDF). ISO/IEC JTC1/SC2/WG2/IRG N2623; UTC L2/23-240.
  7. ^ Ideographic Research Group (2023-03-24). "Recommendation IRG M60.7: Draft GB18030-2022 Amendment Feedback (IRGN2591, IRGN2605)" (PDF). IRG Meeting #60 Recommendations and Action Items. ISO/IEC JTC1/SC2 N4840, WG2 N5205, IRG N2600; UTC L2/23-087.
  8. ^ "CJK Unified Ideographs Extension I" (PDF). The Unicode Standard, Version 15.1. Unicode Consortium. 2023.
  9. ^ Lunde, Ken; Cook, Richard, eds. (2023-09-01). "kIRG_GSource". Unicode Han Database (Unihan). Unicode 15.1.0. UAX #38.
  10. ^ Lunde, Ken (2023-04-22). "03) L2/23-100: GB 18030-2022 Amendment, Draft 2 + Disposition of Comments, Draft 1" (PDF). CJK & Unihan Group Recommendations for UTC #175 Meeting. UTC L2/23-082.
  11. ^ "CJK/Unihan Changes". Unicode 15.1.0. Unicode Consortium. 2023-09-12. "To keep the CJK block ranges as compact as possible, Extension I has been added to Plane 2, instead of directly after Extension H on Plane 3. Implementers should also check that their code does not assume that CJK extensions all occur in alphabetic order by the extension letter."
  12. ^ a b c Sim, Cheon-hyeong (2023-05-17). "2. Newly introduced half-duplicated characters" (PDF). Application for Horizontal Extensions of Multiple Sources in CJK-ExtI. pp. 3–5. ISO/IEC JTC1/SC2/WG2/IRG N2635. (Note: the referenced document refers to an earlier draft of Extension I with code points that differ from those in the final version accepted into Unicode. U+2ED90 in the referenced document corresponds to U+2ED9D 𮶝 CJK UNIFIED IDEOGRAPH-2ED9D in the final version, while U+2EDD1 in the referenced document corresponds to U+2EDE0 𮷠 CJK UNIFIED IDEOGRAPH-2EDE0 in the final version.)
  13. ^ "CJK Unified Ideographs" (PDF). The Unicode Standard, Version 15.0. Unicode Consortium. p. 823.
  14. ^ Japan National Body (2023-04-24). "WG2 n5221 data file: Proposed Horizontal Extension" (PDF). Request for Horizontal Extension in the J-column of ISO/IEC 10646 (PDF). p. 414. ISO/IEC JTC1/SC2/WG2 N5221; UTC L2/23-144.
  15. ^ Japan National Body (2023-04-24). "WG2 n5221 data file: Proposed Horizontal Extension" (PDF). Request for Horizontal Extension in the J-column of ISO/IEC 10646 (PDF). p. 458. ISO/IEC JTC1/SC2/WG2 N5221; UTC L2/23-144.
  16. ^ Suignard, Michel, ed. (2024-01-03). "Disposition of comments on CDAM2.3 to ISO/IEC 10646 6th edition" (PDF). ISO/IEC JTC1/SC2/WG2 N5245, UTC L2/24-016.
  17. ^ "Ideographic Variation Database". Unicode Consortium.
  18. ^ "UTS #37, Unicode Ideographic Variation Database". Unicode Consortium.

Further reading

  • Lunde, Ken (2023-07-15). "The First Amendment". This article details how the CJK Unified Ideographs Extension I block became standardized, and its relationship with two drafts of the GB 18030-2022 amendment.
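| Block name | Plane | Chart range | Characters | Han unification | Scripts contained in block |
| CJK Unified Ideographs | 0 BMP | 4E00–9FFF | 20,992 | Unified | Han |
| CJK Unified Ideographs Extension A | 0 BMP | 3400–4DBF | 6,592 | Unified | Han |
| CJK Unified Ideographs Extension B | 2 SIP | 20000–2A6DF | 42,720 | Unified | Han |
| CJK Unified Ideographs Extension C | 2 SIP | 2A700–2B73F | 4,160 | Unified | Han |
| CJK Unified Ideographs Extension D | 2 SIP | 2B740–2B81F | 222 | Unified | Han |
| CJK Unified Ideographs Extension E | 2 SIP | 2B820–2CEAF | 5,774 | Unified | Han |
| CJK Unified Ideographs Extension F | 2 SIP | 2CEB0–2EBEF | 7,473 | Unified | Han |
| CJK Unified Ideographs Extension G | 3 TIP | 30000–3134F | 4,939 | Unified | Han |
| CJK Unified Ideographs Extension H | 3 TIP | 31350–323AF | 4,192 | Unified | Han |
| CJK Unified Ideographs Extension I | 2 SIP | 2EBF0–2EE5F | 622 | Unified | Han |
| CJK Unified Ideographs Extension J | 3 TIP | 323B0–3347F | 4,298 | Unified | Han |
| CJK Radicals Supplement | 0 BMP | 2E80–2EFF | 115 | Not unified | Han |
| Kangxi Radicals | 0 BMP | 2F00–2FDF | 214 | Not unified | Han |
| Ideographic Description Characters | 0 BMP | 2FF0–2FFF | 16 | Not unified | Common |
| CJK Symbols and Punctuation | 0 BMP | 3000–303F | 64 | Not unified | Han, Hangul, Common, Inherited |
| CJK Strokes | 0 BMP | 31C0–31EF | 39 | Not unified | Common |
| Enclosed CJK Letters and Months | 0 BMP | 3200–32FF | 255 | Not unified | Hangul, Katakana, Common |
| CJK Compatibility | 0 BMP | 3300–33FF | 256 | Not unified | Katakana, Common |
| CJK Compatibility Ideographs | 0 BMP | F900–FAFF | 472 | 12 are unified | Han |
| CJK Compatibility Forms | 0 BMP | FE30–FE4F | 32 | Not unified | Common |
| Enclosed Ideographic Supplement | 1 SMP | 1F200–1F2FF | 64 | Not unified | Hiragana, Common |
| CJK Compatibility Ideographs Supplement | 2 SIP | 2F800–2FA1F | 542 | Not unified | Han |
| Totals | 22 blocks | | 104,053 | | |
  1. ^ As of version 17.0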
Retrieved from "https://en.wikipedia.org/w/index.php?title=CJK_Unified_Ideographs_Extension_I&oldid=1310527464"