
Obsoleted by: 2978 INFORMATIONAL
Network Working Group                                           N. Freed
Request for Comments: 2278                                      Innosoft
BCP: 19                                                        J. Postel
Category: Best Current Practice                                      ISI
                                                            January 1998


                  IANA Charset Registration Procedures

Status of this Memo

   This document specifies an Internet Best Current Practices for the
   Internet Community, and requests discussion and suggestions for
   improvements.  Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (1998).  All Rights Reserved.

1.  Abstract

   MIME [RFC-2045, RFC-2046, RFC-2047, RFC-2184] and various other
   modern Internet protocols are capable of using many different
   charsets. This in turn means that the ability to label different
   charsets is essential. This registration procedure exists solely to
   associate a specific name or names with a given charset and to give
   an indication of whether or not a given charset can be used in MIME
   text objects. In particular, the general applicability and
   appropriateness of a given registered charset is a protocol issue,
   not a registration issue, and is not dealt with by this registration
   procedure.

2.  Definitions and Notation

   The following sections define various terms used in this document.

2.1.  Requirements Notation

   This document occasionally uses terms that appear in capital letters.
   When the terms "MUST", "SHOULD", "MUST NOT", "SHOULD NOT", and "MAY"
   appear capitalized, they are being used to indicate particular
   requirements of this specification. A discussion of the meanings of
   these terms appears in [RFC-2119].



Freed & Postel           Best Current Practice                  [Page 1]

RFC 2278                  Charset Registration              January 1998


2.2.  Character

   A member of a set of elements used for the organisation, control, or
   representation of data.

2.3.  Charset

   The term "charset" (see historical note below) is used here to refer
   to a method of converting a sequence of octets into a sequence of
   characters. This conversion may also optionally produce additional
   control information such as directionality indicators.

   Note that unconditional and unambiguous conversion in the other
   direction is not required, in that not all characters may be
   representable by a given charset and a charset may provide more than
   one sequence of octets to represent a particular sequence of
   characters.

   This definition is intended to allow charsets to be defined in a
   variety of different ways, from simple single-table mappings such as
   US-ASCII to complex table switching methods such as those that use
   ISO 2022's techniques.  However, the definition associated with a
   charset name must fully specify the mapping to be performed.  In
   particular, use of external profiling information to determine the
   exact mapping is not permitted.

   HISTORICAL NOTE: The term "character set" was originally used in MIME
   to describe such straightforward schemes as US-ASCII and ISO-8859-1
   which consist of a small set of characters and a simple one-to-one
   mapping from single octets to single characters. Multi-octet
   character encoding schemes and switching techniques make the
   situation much more complex. As such, the definition of this term was
   revised to emphasize the conversion aspect of the process, and the
   term itself has been changed to "charset" to emphasize that it is
   not, after all, just a set of characters. A discussion of these
   issues as well as specification of standard terminology for use in
   the IETF appears in RFC 2130.

2.4.  Coded Character Set

   A Coded Character Set (CCS) is a mapping from a set of abstract
   characters to a set of integers. Examples of coded character sets are
   ISO 10646 [ISO-10646], US-ASCII [US-ASCII], and the ISO-8859 series
   [ISO-8859].
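
   As an informal illustration of sections 2.3 and 2.4 (not part of the
   registration procedure), the following Python sketch treats a charset
   as an octets-to-characters conversion and a coded character set as a
   character-to-integer mapping. It assumes that Python's built-in
   codecs are an acceptable stand-in for the charset definitions; the
   sample octets and the helper name representable() are hypothetical.

```python
# Illustrative only: Python's built-in codecs stand in for charset definitions.

# Section 2.3: a charset converts a sequence of octets into characters.
octets = b"caf\xe9"
text = octets.decode("iso-8859-1")


def representable(chars: str, charset: str) -> bool:
    """Return True if every character in `chars` can be encoded in `charset`.

    Conversion from characters back to octets is not required to succeed:
    a given charset may be unable to represent some characters at all.
    """
    try:
        chars.encode(charset)
    except UnicodeEncodeError:
        return False
    return True


# Section 2.4: a coded character set maps abstract characters to integers.
# Python's ord() returns the ISO 10646 / Unicode code point of a character.
code_points = [ord(ch) for ch in text]

if __name__ == "__main__":
    print(text)
    print(representable(text, "iso-8859-1"))
    print(representable(text, "us-ascii"))
    print(code_points)
```

   The last character of the sample decodes under ISO-8859-1 but cannot
   be encoded in US-ASCII, which is the "not all characters may be
   representable" case described in section 2.3.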

2.5.  Character Encoding Scheme

   A Character Encoding Scheme (CES) is a mapping from a Coded Character
   Set or several coded character sets to a set of octets. A given CES
   is typically associated with a single CCS; for example, UTF-8 applies
   only to ISO 10646.

3.  Registration Requirements

   Registered charsets are expected to conform to a number of
   requirements as described below.

3.1.  Required Characteristics

   Registered charsets MUST conform to the definition of a "charset"
   given above.  In addition, charsets intended for use in MIME content
   types under the "text" top-level type must conform to the
   restrictions on that type described in RFC 2045. All registered
   charsets MUST note whether or not they are suitable for use in MIME.

   All charsets which are constructed as a composition of a CCS and a
   CES MUST either include the CCS and CES they are based on in their
   registration or else cite a definition of their CCS and CES that
   appears elsewhere.

   All registered charsets MUST be specified in a stable, openly
   available specification. Registration of charsets whose
   specifications aren't stable and openly available is forbidden.

3.2.  New Charsets

   This registration mechanism is not intended to be a vehicle for the
   definition of entirely new charsets. This is due to the fact that the
   registration process does NOT contain adequate review mechanisms for
   such undertakings.

   As such, only charsets defined by other processes and standards
   bodies, or specific profiles of such charsets, are eligible for
   registration.

3.3.  Naming Requirements

   One or more names MUST be assigned to all registered charsets.
   Multiple names for the same charset are permitted, but if multiple
   names are assigned a single primary name for the charset MUST be
   identified. All other names are considered to be aliases for the
   primary name and use of the primary name is preferred over use of any
   of the aliases.
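
   The relationship between a primary name and its aliases can be
   pictured as a lookup table. The Python sketch below is only an
   illustration: the class CharsetRegistry and its methods are
   hypothetical, the sample names are used purely as example data, and
   case-insensitive matching is an assumption carried over from MIME's
   treatment of charset parameter values rather than something this
   section states. It also rejects a name that is already assigned, in
   line with the requirement in this section that each assigned name
   identify a single charset.

```python
class CharsetRegistry:
    """Toy registry mapping every charset name to one primary name."""

    def __init__(self) -> None:
        # Lower-cased name -> primary name exactly as registered.
        self._primary_for = {}

    def register(self, primary: str, aliases=()) -> None:
        """Register a primary name together with zero or more aliases."""
        names = [primary, *aliases]
        keys = [name.lower() for name in names]
        if len(set(keys)) != len(keys):
            raise ValueError("duplicate name within one registration")
        for key in keys:
            if key in self._primary_for:
                raise ValueError(f"name already registered: {key}")
        for key in keys:
            self._primary_for[key] = primary

    def primary_name(self, name: str) -> str:
        """Resolve a primary name or alias; raises KeyError if unknown."""
        return self._primary_for[name.lower()]


if __name__ == "__main__":
    registry = CharsetRegistry()
    registry.register("ISO-8859-1", aliases=["latin1", "l1"])
    print(registry.primary_name("LATIN1"))
```

   Resolving every alias to the primary name before emitting a label is
   one way to honor the stated preference for the primary name.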

   Each assigned name MUST uniquely identify a single charset.  All
   charset names MUST be suitable for use as the value of a MIME content
   type charset parameter and hence MUST conform to MIME parameter value
   syntax. This applies even if the specific charset being registered is
   not suitable for use with the "text" media type.

   Finally, charsets being registered for use with the "text" media type
   MUST have a primary name that conforms to the more restrictive syntax
   of the charset field in MIME encoded-words [RFC-2047, RFC-2184] and
   MIME extended parameter values [RFC-2184]. A combined ABNF definition
   for such names is as follows:

   mime-charset = 1*<Any CHAR except SPACE, CTLs, and cspecials>

   cspecials    = "(" / ")" / "<" / ">" / "@" / "," / ";" / ":" / "\" /
                  <"> / "/" / "[" / "]" / "?" / "." / "=" / "*"

   CHAR         =  <any ASCII character>        ; (  0-177,  0.-127.)
   SPACE        =  <ASCII SP, space>            ; (     40,      32.)
   CTL          =  <any ASCII control           ; (  0- 37,  0.- 31.)
                    character and DEL>          ; (    177,     127.)

3.4.  Functionality Requirement

   Charsets must function as actual charsets: Registration of things
   that are better thought of as a transfer encoding, as a media type,
   or as a collection of separate entities of another type, is not
   allowed.  For example, although HTML could theoretically be thought
   of as a charset, it is really better thought of as a media type and
   as such it cannot be registered as a charset.

3.5.  Usage and Implementation Requirements

   Use of a large number of charsets in a given protocol may hamper
   interoperability. However, the use of a large number of undocumented
   and/or unlabelled charsets hampers interoperability even more.
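
   As an aside on the naming syntax, the mime-charset grammar of section
   3.3 lends itself to a mechanical check. The Python sketch below is
   one possible reading of that grammar, not a normative implementation.
   The function name is_mime_charset() is hypothetical, and treating
   both the backslash and the double quote as cspecials is an
   assumption, made by analogy with the tspecials rule of RFC 2045.

```python
# Characters excluded by the cspecials rule. Including both the backslash
# and the double quote is an assumption based on RFC 2045's tspecials.
CSPECIALS = set('()<>@,;:\\"/[]?.=*')


def is_mime_charset(name: str) -> bool:
    """Return True if `name` matches the mime-charset production.

    mime-charset = 1*<Any CHAR except SPACE, CTLs, and cspecials>
    """
    if not name:
        return False  # the production requires at least one character
    for ch in name:
        code = ord(ch)
        if code > 127:
            return False  # not a CHAR (outside ASCII)
        if code <= 32 or code == 127:
            return False  # CTL (0-31, 127) or SPACE (32)
        if ch in CSPECIALS:
            return False
    return True


if __name__ == "__main__":
    for candidate in ("ISO-8859-1", "US-ASCII", "ISO_8859-1:1987", "utf 8"):
        print(candidate, is_mime_charset(candidate))
```

   A name that fails this check could still satisfy the looser MIME
   parameter value syntax, but under section 3.3 it could not serve as
   the primary name of a charset registered for use with the "text"
   media type.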
   A charset should therefore be registered ONLY if it adds significant
   functionality that is valuable to a large community, OR if it
   documents existing practice in a large community. Note that charsets
   registered for the second reason should be explicitly marked as being
   of limited or specialized use and should only be used in Internet
   messages with prior bilateral agreement.

3.6.  Publication Requirements

   Charset registrations can be published in RFCs; however, RFC
   publication is not required to register a new charset.

   The registration of a charset does not imply endorsement, approval,
   or recommendation by the IANA, IESG, or IETF, or even certification
   that the specification is adequate. It is expected that applicability
   statements for particular applications will be published from time to
   time that recommend implementation of, and support for, charsets that
   have proven particularly useful in those contexts.

3.7.  MIBenum Requirements

   Each registered charset MUST also be assigned a unique enumerated
   integer value. These "MIBenum" values are defined by and used in the
   Printer MIB [RFC-1759].

   A MIBenum value for each charset will be assigned by IANA at the time
   of registration.

4.  Registration Procedure

   The following procedure has been implemented by the IANA for review
   and approval of new charsets.  This is not a formal standards
   process, but rather an administrative procedure intended to allow
   community comment and sanity checking without excessive time delay.

4.1.  Present the Charset to the Community

   Send the proposed charset registration to the
   "ietf-charsets@iana.org" mailing list.  This mailing list has been
   established for the sole purpose of reviewing proposed charset
   registrations. Proposed charsets are not formally registered and must
   not be used; the "x-" prefix specified in RFC 2045 can be used until
   registration is complete.

   The intent of the public posting is to solicit comments and feedback
   on the definition of the charset and the name chosen for it over a
   two-week period.

4.2.  Charset Reviewer

   When the two-week period has passed and the registration proposer is
   convinced that consensus has been achieved, the registration
   application should be submitted to IANA and the charset reviewer. The
   charset reviewer, who is appointed by the IETF Applications Area
   Director(s), either approves the request for registration or rejects
   it.  Rejection may occur because of significant objections raised on
   the list or objections raised externally.  If the charset reviewer
   considers the registration sufficiently important and controversial,
   a last call for comments may be issued to the full IETF. The charset
   reviewer may also recommend standards track processing (before or
   after registration) when that appears appropriate and the level of
   specification of the charset is adequate.

   Decisions made by the reviewer must be posted to the ietf-charsets
   mailing list within 14 days. Decisions made by the reviewer may be
   appealed to the IESG.

4.3.  IANA Registration

   Provided that the charset registration has either passed review or
   has been successfully appealed to the IESG, the IANA will register
   the charset, assign a MIBenum value, and make its registration
   available to the community.

5.  Location of Registered Charset List

   Charset registrations will be posted in the anonymous FTP file
   "ftp://ftp.isi.edu/in-notes/iana/assignments/character-sets" and all
   registered charsets will be listed in the periodically issued
   "Assigned Numbers" RFC [currently RFC-1700].  The description of the
   charset may also be published as an Informational RFC by sending it
   to "rfc-editor@isi.edu" (please follow the instructions to RFC
   authors [RFC-2223]).

6.  Registration Template

   To: ietf-charsets@iana.org
   Subject: Registration of new charset

   Charset name(s):

   (All names must be suitable for use as the value of a MIME
   content-type parameter.)

   Published specification(s):

   (A specification for the charset must be openly available that
   accurately describes what is being registered. If a charset is
   defined as a composition of a CCS and a CES then these definitions
   must either be included or referenced.)

   Person & email address to contact for further information:
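
   The template above is simple enough to fill in programmatically. The
   Python sketch below is only an illustration of that: the function
   render_registration(), the way multiple names and specifications are
   joined on one line, and all sample values are hypothetical, and
   nothing in this document prescribes this layout.

```python
def render_registration(names, specifications, contact):
    """Fill in the section 6 template and return it as message text.

    names          -- charset names (at least one is required)
    specifications -- citations for the published specification(s)
    contact        -- person and email address for further information
    """
    if not names:
        raise ValueError("at least one charset name is required")
    if not specifications:
        raise ValueError("a published specification is required")
    lines = [
        "To: ietf-charsets@iana.org",
        "Subject: Registration of new charset",
        "",
        "Charset name(s): " + ", ".join(names),
        "",
        "Published specification(s): " + "; ".join(specifications),
        "",
        "Person & email address to contact for further information: "
        + contact,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Sample values are placeholders, not a real registration.
    print(render_registration(
        names=["x-example"],
        specifications=["Example Charset Specification, version 1"],
        contact="A. Person <a.person@example.org>",
    ))
```

   The sample name carries the "x-" prefix, which section 4.1 allows for
   a proposed charset until registration is complete.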

7.  Security Considerations

   This registration procedure is not known to raise any sort of
   security considerations that are appreciably different from those
   already existing in the protocols that employ registered charsets.

8.  References

   [ISO-2022]
        International Standard -- Information Processing -- Character
        Code Structure and Extension Techniques, ISO/IEC 2022:1994, 4th
        ed.

   [ISO-8859]
        International Standard -- Information Processing -- 8-bit
        Single-Byte Coded Graphic Character Sets
        - Part 1: Latin Alphabet No. 1, ISO 8859-1:1987, 1st ed.
        - Part 2: Latin Alphabet No. 2, ISO 8859-2:1987, 1st ed.
        - Part 3: Latin Alphabet No. 3, ISO 8859-3:1988, 1st ed.
        - Part 4: Latin Alphabet No. 4, ISO 8859-4:1988, 1st ed.
        - Part 5: Latin/Cyrillic Alphabet, ISO 8859-5:1988, 1st ed.
        - Part 6: Latin/Arabic Alphabet, ISO 8859-6:1987, 1st ed.
        - Part 7: Latin/Greek Alphabet, ISO 8859-7:1987, 1st ed.
        - Part 8: Latin/Hebrew Alphabet, ISO 8859-8:1988, 1st ed.
        - Part 9: Latin Alphabet No. 5, ISO/IEC 8859-9:1989, 1st ed.
        International Standard -- Information Technology -- 8-bit
        Single-Byte Coded Graphic Character Sets
        - Part 10: Latin Alphabet No. 6, ISO/IEC 8859-10:1992, 1st ed.

   [ISO-10646]
        ISO/IEC 10646-1:1993(E),  "Information technology --
        Universal Multiple-Octet Coded Character Set (UCS) --
        Part 1: Architecture and Basic Multilingual Plane",
        JTC1/SC2, 1993.

   [RFC-2048]
        Freed, N., Klensin, J., and J. Postel, "Multipurpose Internet
        Mail Extensions (MIME) Part Four: Registration Procedures",
        RFC 2048, November 1996.

   [RFC-1700]
        Reynolds, J., and J. Postel, "Assigned Numbers", STD 2,
        RFC 1700, October 1994.

   [RFC-1759]
        Smith, R., Wright, F., Hastings, T., Zilles, S., and J.
        Gyllenskog, "Printer MIB", RFC 1759, March 1995.

   [RFC-2045]
        Freed, N., and N. Borenstein, "Multipurpose Internet Mail
        Extensions (MIME) Part One: Format of Internet Message Bodies",
        RFC 2045, November 1996.

   [RFC-2046]
        Freed, N., and N. Borenstein, "Multipurpose Internet Mail
        Extensions (MIME) Part Two: Media Types", RFC 2046, November
        1996.

   [RFC-2047]
        Moore, K., "Multipurpose Internet Mail Extensions (MIME) Part
        Three: Representation of Non-Ascii Text in Internet Message
        Headers", RFC 2047, November 1996.

   [RFC-2119]
        Bradner, S., "Key words for use in RFCs to Indicate Requirement
        Levels", BCP 14, RFC 2119, March 1997.

   [RFC-2130]
        Weider, C., Preston, C., Simonsen, K., Alvestrand, H., Atkinson,
        R., Crispin, M., and P. Svanberg, "Report from the IAB Character
        Set Workshop", RFC 2130, April 1997.

   [RFC-2184]
        Freed, N., and K. Moore, "MIME Parameter Value and Encoded Word
        Extensions: Character Sets, Languages, and Continuations",
        RFC 2184, August 1997.

   [US-ASCII]
        Coded Character Set -- 7-Bit American Standard Code for
        Information Interchange, ANSI X3.4-1986.

9.  Authors' Addresses

   Ned Freed
   Innosoft International, Inc.
   1050 Lakes Drive
   West Covina, CA 91790
   USA

   Phone: +1 626 919 3600
   Fax:   +1 626 919 3614
   EMail: ned.freed@innosoft.com


   Jon Postel
   USC/Information Sciences Institute
   4676 Admiralty Way
   Marina del Rey, CA  90292
   USA

   Phone: +1 310 822 1511
   Fax:   +1 310 823 6714
   EMail: Postel@ISI.EDU

Full Copyright Statement

   Copyright (C) The Internet Society (1998).  All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works.  However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
