Windows-1252

From Wikipedia, the free encyclopedia
Windows character set for Latin alphabet
This article is about the character encoding commonly mislabeled as "ANSI". For the actual ANSI character encoding, see ASCII. For the actual "ANSI extended Latin" encoding, see ANSEL.
Windows-1252
MIME / IANA: windows-1252[1]
Alias(es): cp1252 (code page 1252)
Languages: All supported by ISO/IEC 8859-1, plus full support for French[a] and Finnish and ligature forms for English; e.g. Danish (except for a rare exceptional letter), Irish, Italian, Norwegian, Portuguese, Spanish, Swedish, German (missing uppercase ẞ[b]), Icelandic, Faroese, Luxembourgish, Albanian, Estonian, Swahili, Tswana, Catalan, Basque, Occitan, Rotokas, Toki Pona, Lojban, Romansh, Dutch (except the IJ/ij character, substituted by IJ/ij or ÿ), and Slovene (except the č character, substituted by ç). Some languages lack their standard quotation marks (such as German „quotes“).
Created by: Microsoft
Standard: WHATWG Encoding Standard
Classification: extended ASCII, Windows-125x
Extends: ISO 8859-1 (excluding C1 controls)
Transforms / Encodes: ISO 8859-15
Succeeded by: Unicode (UTF-8, UTF-16)

Windows-1252 or CP-1252 (Windows code page 1252) is a legacy single-byte character encoding[2] that is used by default (as the "ANSI code page") in Microsoft Windows throughout the Americas, Western Europe, Oceania, and much of Africa.[3]

Initially the same as ISO 8859-1, it began to diverge starting in Windows 2.0 by adding characters in the 0x80 to 0x9F (hex) range (the ISO standards reserve this range for C1 control codes). Notable additional characters include curly quotation marks and all printable characters from ISO 8859-15.
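The divergence in the 0x80 to 0x9F range can be seen by decoding the same bytes under both encodings. The following is a minimal sketch using Python's built-in `cp1252` and `latin-1` codecs; the sample byte string is made up for illustration.

```python
# Bytes in the 0x80-0x9F range, as a Windows application might write them:
# 0x93 and 0x94 are curly double quotes, 0x80 is the euro sign.
data = b"\x93price\x94: 5\x80"

# Windows-1252 assigns printable characters to these bytes.
as_cp1252 = data.decode("cp1252")

# ISO 8859-1 (as implemented by Python's latin-1 codec) maps the same
# bytes to the C1 control codes U+0093, U+0094 and U+0080.
as_latin1 = data.decode("latin-1")

print(as_cp1252)
print(repr(as_latin1))

# Outside 0x80-0x9F the two encodings agree byte for byte.
shared = bytes(range(0x00, 0x80)) + bytes(range(0xA0, 0x100))
assert shared.decode("cp1252") == shared.decode("latin-1")
```

Only the 32 positions from 0x80 to 0x9F differ, which is why text mislabeled as ISO 8859-1 usually looks correct until it contains curly quotes, dashes, or the euro sign.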

It is the most-used single-byte character encoding in the world. Although almost all websites now use the multi-byte character encoding UTF-8, as of February 2026, 0.9%[4] of websites declared ISO 8859-1, which is treated as Windows-1252 by all modern browsers (as required by the HTML5 standard[5]), and a further 0.3% declared Windows-1252 directly,[4][6] for a total of 1.2%. Some countries and languages show higher usage than the global average: in 2025, usage among Brazilian websites was 2.2%,[7] and among German websites 2.1%[8][9] (these are the sums of ISO-8859-1 and CP-1252 declarations).

Name


It is known to Windows by the code page number 1252, and by the IANA-approved name "windows-1252".

Historically, the phrase "ANSI Code Page" was used in Windows to refer to non-DOS encodings; the intention was that most of these would be ANSI standards such as ISO-8859-1. Even though Windows-1252 was the first and by far most popular code page named so in Microsoft Windows parlance, the code page has never been an ANSI standard. Microsoft explains, "The term ANSI as used to signify Windows code pages is a historical reference, but is nowadays a misnomer that continues to persist in the Windows community."[10]

LaTeX can input Windows-1252 by using inputenc.sty with parameter ansinew (and more recently cp1252).[11][12]

IBM uses code page 1252 (CCSID 1252 and euro sign extended CCSID 5348) for Windows-1252.[13][14][15]

It is called "WE8MSWIN1252" by Oracle Database.[16]

History

  • The first version of the codepage was used in Microsoft Windows 1.0. It matched the ISO-8859-1 standard (including leaving code points 0xD7 and 0xF7 undefined, as they were not in the standard at that time).
  • The second version of the codepage was introduced in Microsoft Windows 2.0. In this version, code points 0xD7, 0xF7, 0x91, and 0x92 are defined.
  • The third version of the codepage was introduced in Microsoft Windows 3.1. It defined all code points used in the final version except the euro sign and the Z with caron character pair.
  • The final version (shown below) was introduced in Microsoft Windows 98.

Starting in the 1990s, many Microsoft products that could produce HTML included Windows-1252-exclusive characters, but marked the encoding as ISO-8859-1, ASCII, or undeclared.[citation needed] Characters exclusive to Windows-1252 would render incorrectly on non-Windows operating systems (often as question marks).[17][18] In particular, typographers' quotes (curly variants of the standard straight apostrophes and quotation marks in US-ASCII) were commonly used in files produced in Windows applications such as Microsoft Word due to the smart quotes feature, which can automatically convert straight apostrophes and quotation marks to the curly variants.[19] To fix this, by 2000 most web browsers and e-mail clients treated the charsets ISO-8859-1 and US-ASCII as Windows-1252;[citation needed] this behavior is now required by the HTML5 specification.[5] Undeclared charsets in HTML are also assumed to be Windows-1252.[20][21]
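The relabeling behavior described above can be sketched as a small lookup that resolves a declared charset to the codec actually used for decoding. This is an illustrative sketch, not a browser implementation: `WINDOWS_1252_LABELS`, `effective_codec`, and `decode_page` are hypothetical names, the label set is only a subset of the labels that the WHATWG Encoding Standard maps to windows-1252, and the fallback for undeclared charsets simply follows the paragraph above.

```python
from typing import Optional

# Subset of the labels that the WHATWG Encoding Standard resolves to
# windows-1252 (the standard's own list is longer).
WINDOWS_1252_LABELS = {
    "ascii", "us-ascii", "iso-8859-1", "iso8859-1", "latin1", "l1",
    "cp1252", "windows-1252", "x-cp1252",
}


def effective_codec(declared_label: Optional[str] = None) -> str:
    """Return the Python codec name to use for a declared charset label."""
    if declared_label is None:
        # Assumption taken from the text: an undeclared charset is
        # treated as Windows-1252.
        return "cp1252"
    label = declared_label.strip().lower()
    if label in WINDOWS_1252_LABELS:
        return "cp1252"
    # Any other label is passed through unchanged.
    return label


def decode_page(raw: bytes, declared_label: Optional[str] = None) -> str:
    """Decode page bytes the way the relabeling rule above suggests."""
    return raw.decode(effective_codec(declared_label), errors="replace")


# A page declared as ISO-8859-1 that contains a Windows-1252 apostrophe.
print(decode_page(b"It\x92s", "ISO-8859-1"))
```

With this rule, the byte 0x92 in a page labeled ISO-8859-1 decodes to a right single quotation mark instead of an invisible C1 control code.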

Although Windows NT supported Unicode and attempted to encourage programs to use it, it only provided the 16-bit code units of UCS-2/UTF-16, despite the existing support for other multibyte character encodings such as Shift-JIS. As many applications preferred to use 8-bit strings, Windows-1252 remained the most popular encoding on Windows.[citation needed] UTF-8 has been supported since Windows 10, so this is gradually changing.[citation needed]

Codepage layout


The following table shows Windows-1252. Differences from ISO-8859-1 have the Unicode code point number below the character, based on the Unicode.org mapping of Windows-1252 with "best fit".

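Windows-1252 (CP1252)[22][23][24][25][26]
     0    1    2    3    4    5    6    7    8    9    A    B    C    D    E    F
0_   NUL  SOH  STX  ETX  EOT  ENQ  ACK  BEL  BS   HT   LF   VT   FF   CR   SO   SI
1_   DLE  DC1  DC2  DC3  DC4  NAK  SYN  ETB  CAN  EM   SUB  ESC  FS   GS   RS   US
2_   SP   !    "    #    $    %    &    '    (    )    *    +    ,    -    .    /
3_   0    1    2    3    4    5    6    7    8    9    :    ;    <    =    >    ?
4_   @    A    B    C    D    E    F    G    H    I    J    K    L    M    N    O
5_   P    Q    R    S    T    U    V    W    X    Y    Z    [    \    ]    ^    _
6_   `    a    b    c    d    e    f    g    h    i    j    k    l    m    n    o
7_   p    q    r    s    t    u    v    w    x    y    z    {    |    }    ~    DEL
8_   €    --   ‚    ƒ    „    …    †    ‡    ˆ    ‰    Š    ‹    Œ    --   Ž    --
     20AC      201A 0192 201E 2026 2020 2021 02C6 2030 0160 2039 0152      017D
9_   --   ‘    ’    “    ”    •    –    —    ˜    ™    š    ›    œ    --   ž    Ÿ
          2018 2019 201C 201D 2022 2013 2014 02DC 2122 0161 203A 0153      017E 0178
A_   NBSP ¡    ¢    £    ¤    ¥    ¦    §    ¨    ©    ª    «    ¬    SHY  ®    ¯
B_   °    ±    ²    ³    ´    µ    ¶    ·    ¸    ¹    º    »    ¼    ½    ¾    ¿
C_   À    Á    Â    Ã    Ä    Å    Æ    Ç    È    É    Ê    Ë    Ì    Í    Î    Ï
D_   Ð    Ñ    Ò    Ó    Ô    Õ    Ö    ×    Ø    Ù    Ú    Û    Ü    Ý    Þ    ß
E_   à    á    â    ã    ä    å    æ    ç    è    é    ê    ë    ì    í    î    ï
F_   ð    ñ    ò    ó    ô    õ    ö    ÷    ø    ù    ú    û    ü    ý    þ    ÿ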

According to the information on Microsoft's and the Unicode Consortium's websites, positions 81, 8D, 8F, 90, and 9D are unused; however, the Windows API MultiByteToWideChar maps these to the corresponding C1 control codes. The "best fit" mapping documents this behavior, too.[22]
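The treatment of the five unused positions can be approximated outside Windows. The sketch below uses Python's `cp1252` codec, which rejects those bytes in strict mode, and falls back to the C1 control code with the same number; `UNUSED_BYTES` and `decode_best_fit` are hypothetical names, and the function only imitates the mapping described above rather than calling MultiByteToWideChar.

```python
# The five positions that the published Windows-1252 mappings leave unused.
UNUSED_BYTES = (0x81, 0x8D, 0x8F, 0x90, 0x9D)


def decode_best_fit(raw: bytes) -> str:
    """Decode Windows-1252, mapping unused bytes to C1 control codes."""
    chars = []
    for byte in raw:
        if byte in UNUSED_BYTES:
            # Same numeric value as a Unicode code point, e.g. 0x81 -> U+0081.
            chars.append(chr(byte))
        else:
            chars.append(bytes([byte]).decode("cp1252"))
    return "".join(chars)


# Python's strict cp1252 codec refuses the unused positions.
try:
    b"\x81".decode("cp1252")
except UnicodeDecodeError as exc:
    print("strict decode failed:", exc.reason)

# The fallback decoder accepts every possible byte value.
print(repr(decode_best_fit(b"\x80\x81")))
```

Because every byte value then has a mapping, arbitrary input can be decoded and re-encoded without loss, at the cost of letting C1 control codes into the resulting text.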

Related encodings


OS/2 extensions


The OS/2 operating system supports an encoding by the name of Code page 1004 (CCSID 1004) or "Windows Extended".[27][28] This mostly matches code page 1252, with the exception of certain C0 control characters being replaced by diacritic characters.

Code page 1004 (differing rows only)[29][30][31][32]
     0    1    2    3    4    5    6    7    8    9    A    B    C    D    E    F
0_   NUL  SOH  STX  ETX  ˉ    ˘    ˙    BEL  ˚    HT   ˝    ˛    ˇ    CR   SO   SI
                         02C9 02D8 02D9      02DA      02DD 02DB 02C7
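Since the table lists only the row that differs, code page 1004 can be sketched as Windows-1252 plus seven overrides in the C0 range. Python ships no codec for code page 1004, so `CP1004_OVERRIDES` and `decode_cp1004` below are hypothetical names, and the mapping is taken directly from the table above rather than from an official conversion table.

```python
# Positions where code page 1004 replaces a C0 control with a diacritic,
# as listed in the table above.
CP1004_OVERRIDES = {
    0x04: "\u02c9",  # modifier letter macron
    0x05: "\u02d8",  # breve
    0x06: "\u02d9",  # dot above
    0x08: "\u02da",  # ring above
    0x0A: "\u02dd",  # double acute accent
    0x0B: "\u02db",  # ogonek
    0x0C: "\u02c7",  # caron
}


def decode_cp1004(raw: bytes) -> str:
    """Decode code page 1004, assuming it otherwise matches Windows-1252."""
    chars = []
    for byte in raw:
        if byte in CP1004_OVERRIDES:
            chars.append(CP1004_OVERRIDES[byte])
        else:
            # Raises UnicodeDecodeError for the bytes Windows-1252 leaves unused.
            chars.append(bytes([byte]).decode("cp1252"))
    return "".join(chars)


print(decode_cp1004(b"a\x05\x80"))
```

Note that 0x0A is one of the overridden positions, so a line feed in ordinary text would decode as a double acute accent under this code page.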

MS-DOS extensions (rare)


There is a rarely used, but useful, graphics extended code page 1252 in which codes 0x00 to 0x1F provide box-drawing characters, as used in applications such as MS-DOS Edit and CodeView. One of the applications to use this code page was an Intel Corporation Install/Recovery disk image utility from mid-to-late 1995. These programs were written for its P6 User Test Program machines (US example[33]). It was used exclusively in Intel's then EMEA region (Europe, Middle East & Africa). In time the programs were changed to use code page 850.

Graphics Extended Code Page 1252[citation needed]
0123456789ABCDEF
0_
1_


Notes

  a. ^ Excluding the narrow non-breaking space, which is preferred to the regular non-breaking space when spacing certain kinds of punctuation.
  b. ^ Uppercase ẞ was not officially adopted until 2017.

References

  1. ^ Character Sets, Internet Assigned Numbers Authority (IANA), 2018-12-12
  2. ^ "Encoding. Living Standard". WHATWG. 13 June 2024. § 9. Legacy single-byte encodings. Retrieved 2024-06-28.
  3. ^ Karl-Bridge-Microsoft (2021-10-26). "Code Pages - Win32 apps". learn.microsoft.com. Retrieved 2024-10-09.
  4. ^ a b "Historical trends in the usage statistics of character encodings for websites, February 2026". w3techs.com. Retrieved 2026-02-05.
  5. ^ a b "Encoding". WHATWG. 27 January 2015. sec. 5.2 Names and labels. Archived from the original on 4 February 2015. Retrieved 4 February 2015.
  6. ^ "Frequently Asked Questions". w3techs.com.
  7. ^ "Distribution of Character Encodings among websites that use Brazil". W3Techs. Retrieved 2026-02-05.
  8. ^ "Distribution of Character Encodings among websites that use .de". W3Techs. Retrieved 2026-02-05.
  9. ^ "Distribution of Character Encodings among websites that use German". W3Techs. Archived from the original on 4 April 2024. Retrieved 2025-04-16.
  10. ^ Wissink, Cathy (5 April 2002). "Unicode and Windows XP" (PDF). Microsoft. p. 1. Archived from the original (PDF) on 4 February 2015. Retrieved 4 February 2015.
  11. ^ "LaTeX News, Issue 28" (PDF; 379 KB). The LaTeX Project. Apr 2018. Retrieved 2024-07-27.
  12. ^ "Inputenc – Accept different input encodings". The LaTeX Project. 2024-02-08. Retrieved 2024-07-27.
  13. ^ "Code page 1252 information document". IBM. 30 September 1997. Archived from the original on 2016-03-03.
  14. ^ "CCSID 1252 information document". IBM. Archived from the original on 2016-03-26.
  15. ^ "CCSID 5348 information document". IBM. Archived from the original on 2014-11-29.
  16. ^ "Database Client Installation Guide". Oracle. Retrieved 2021-02-14.
  17. ^ Texin, Tex. "Comparing Characters in Windows-1252, ISO-8859-1, ISO-8859-15". I18nQA.com.
  18. ^ van Emden, Eva (28 January 2011). "How to make typographers' quotes in HTML". vancouvereditor.com. Retrieved 7 January 2024. "If you use typographers' quotes without specifying the right character encoding for your HTML file, some of your viewers are going to see question marks, boxes, or other crazy symbols instead of the beautiful curly quotes you intended them to see."
  19. ^ "Smart quotes in Word". Microsoft Support. Microsoft. Retrieved 7 January 2024.
  20. ^ "NetWare Web Search: Understanding Character Set Encodings". Novell Documentation. Novell. "If a document does not contain a CHARSET encoding value, the default encoding for HTML documents is ISO-8859-1, also known as Latin1. The default encoding for plain text documents is US-ASCII."
  21. ^ Observed behavior in Chrome; this may be UTF-8 in some browsers.[original research?]
  22. ^ a b "Unicode mappings of Windows-1252 with 'Best Fit'". Unicode. Archived from the original on 4 February 2015. Retrieved 4 February 2015.
  23. ^ Code Page 01252 (PDF), IBM, 1998, archived (PDF) from the original on 27 October 2023
  24. ^ Code Page (CPGID) 01252 (txt), IBM, 1998, archived from the original on 8 April 2023
  25. ^ International Components for Unicode (ICU), ibm-1252_P100-2000.ucm, 2002-12-03
  26. ^ International Components for Unicode (ICU), ibm-5348_P100-1997.ucm, 2002-12-03
  27. ^ "Code page 1004 information document". Archived from the original on 2015-06-25.
  28. ^ "CCSID 1004 information document". Archived from the original on 2016-03-26.
  29. ^ "Code Page 01004" (PDF). IBM. Archived (PDF) from the original on 2015-07-08. (version based on Windows 3.1 version of Windows-1252)
  30. ^ Code Page CPGID 01004 (pdf) (PDF), IBM
  31. ^ Code Page CPGID 01004 (txt), IBM
  32. ^ Borgendale, Ken (2001). "Codepage 1004 - Windows Extended". OS/2 codepages by number. Archived from the original on 2018-05-13. Retrieved 2018-05-13. (version based on current version of Windows-1252)
  33. ^ Storaasli, Olaf (1996). "Performance of the NASA equation solvers on computational mechanics applications" (PDF). Performance of NASA Equation Solvers on Computational Mechanics Applications. NASA. doi:10.2514/6.1996-1505. S2CID 15711051. Archived from the original (PDF) on 2019-05-03.
