| MIME / IANA | windows-1256 |
|---|---|
| Alias(es) | cp1256 (Code page 1256) |
| Languages | Arabic, Persian, Urdu, English, French (except capital letters with diacritics) |
| Created by | Microsoft |
| Standard | WHATWG Encoding Standard |
| Classification | extended ASCII, Windows-125x |
Windows-1256 is a code page used under Microsoft Windows to write Arabic and other languages that use Arabic script, such as Persian and Urdu.
This code page is compatible with neither ISO/IEC 8859-6 nor the MacArabic encoding.
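The incompatibility can be observed by decoding the same byte under each encoding. The following is a minimal sketch that assumes Python's standard codec names `cp1256`, `iso8859_6`, and `mac_arabic`; the sample letter is an arbitrary choice made for illustration.

```python
# Encode one Arabic letter with Windows-1256, then read the resulting
# byte back as Windows-1256, ISO/IEC 8859-6, and MacArabic.
text = "\u0637"  # ARABIC LETTER TAH (arbitrary sample character)
raw = text.encode("cp1256")

decoded = {
    name: raw.decode(name, errors="replace")
    for name in ("cp1256", "iso8859_6", "mac_arabic")
}

for name, result in decoded.items():
    status = "round-trips" if result == text else "differs"
    print(f"{name}: U+{ord(result):04X} ({status})")
```

Only the `cp1256` decoding should return the original letter; the other two codecs interpret the byte as a different character.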
Windows-1256 encodes every abstract single letter of the basic Arabic alphabet, not every concrete visual form (isolated, initial, medial, final, or ligatured shape variants); that is, it encodes characters, not glyphs. The Arabic letters in the 0xC0-0xFF range are in Arabic alphabetic order, but some Latin characters are interspersed among them. These are Windows-1252 Latin characters used for French, a language with historic relevance in former French colonies in North Africa such as Morocco and Algeria. This allows French and Arabic text to be intermixed in Windows-1256 without any code-page switching, although upper-case letters with diacritics are not included.
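The mixing of French and Arabic, and the gap for upper-case accented letters, can be checked with a short sketch. It assumes Python's `cp1256` codec; the helper `fits_cp1256` and the sample string are hypothetical names introduced only for this example.

```python
def fits_cp1256(text: str) -> bool:
    """Return True if every character of text has a Windows-1256 byte."""
    try:
        text.encode("cp1256")
        return True
    except UnicodeEncodeError:
        return False

# "cafe" with an acute accent, a space, then the Arabic word for coffee.
mixed = "caf\u00e9 \u0642\u0647\u0648\u0629"
encoded = mixed.encode("cp1256")

print(len(mixed), len(encoded))   # one byte per character, no switching
print(fits_cp1256("\u00e9"))      # lower-case e with acute accent
print(fits_cp1256("\u00c9"))      # upper-case E with acute accent
```

The lower-case accented letter encodes, while its upper-case counterpart raises `UnicodeEncodeError` inside the helper and is reported as not encodable.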
IBM uses code page 1256 for Windows-1256: CCSID 1256, the euro-sign-extended CCSID 5352, and the further extended CCSID 9448, which adds some letters used in modern Persian and Urdu.[1][2][3][4]
Unicode is preferred over Windows-1256 in modern applications, especially on the Internet, where the dominant UTF-8 encoding is used for most web pages, including Arabic ones (see also Arabic script in Unicode, which offers complete coverage, unlike Windows-1256 or ISO/IEC 8859-6, which do not cover the extended letters). As of October 2022, fewer than 0.03% of all web pages used Windows-1256,[5][6] and although the encoding is used mostly for Arabic and is the second most popular encoding for it, it accounted for only 1.6% of the Arabic text on the web.
Since the original code page left 9 byte values marked as "NOT USED" in the original specification (hexadecimal 0x80, 0x8A, 0x8F, 0x98, 0x9A, 0x9F, 0xAA, 0xC0, and 0xFF),[7] these bytes were later used for the euro sign and for additional letters in the Perso-Arabic script (for the Persian and Urdu languages).[8]
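The later assignments can be listed by decoding each of the nine byte values. This sketch assumes that Python's `cp1256` codec implements the extended version of the code page; the constant name `LATER_ASSIGNED` is introduced here for illustration.

```python
import unicodedata

# Byte values marked "NOT USED" in the original specification.
LATER_ASSIGNED = [0x80, 0x8A, 0x8F, 0x98, 0x9A, 0x9F, 0xAA, 0xC0, 0xFF]

# Map each byte to the character the extended code page assigns to it.
assignments = {b: bytes([b]).decode("cp1256") for b in LATER_ASSIGNED}

for byte, char in assignments.items():
    print(f"0x{byte:02X} -> U+{ord(char):04X} {unicodedata.name(char)}")
```

Under that assumption, 0x80 decodes to the euro sign and the remaining eight values decode to Arabic-script letters.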
The following table shows the extended version of Windows-1256. Each character is shown with its Unicode equivalent and its decimal code.
Here every Arabic letter is shown in isolated form. The actual forms of the letters inside Arabic words are rendered by a combination of software rules and appropriate font support.
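The distinction between letters and their positional shapes can be illustrated with Unicode's compatibility presentation forms, which Windows-1256 has no bytes for. This is a sketch under the assumption that Python's `cp1256` codec and `unicodedata` module are used; the helper `encodable` is a hypothetical name.

```python
import unicodedata

beh = "\u0628"          # ARABIC LETTER BEH, the abstract character
beh_initial = "\ufe91"  # ARABIC LETTER BEH INITIAL FORM, a positional glyph

def encodable(ch: str) -> bool:
    """Return True if ch can be encoded as Windows-1256."""
    try:
        ch.encode("cp1256")
        return True
    except UnicodeEncodeError:
        return False

# NFKC normalization folds the presentation form back to the base letter.
folded = unicodedata.normalize("NFKC", beh_initial)

print(encodable(beh), encodable(beh_initial), folded == beh)
```

Only the abstract letter is encodable; text containing presentation forms would need to be normalized to base letters before conversion, with shaping left to the rendering software and font.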