This help page is a how-to guide. It explains concepts or processes used by the Wikipedia community. It is not one of Wikipedia's policies or guidelines, and may reflect varying levels of consensus.
Many characters not on the standard computer keyboard will be useful—even necessary—for many pages, and for editions of Wikipedia in other languages. This page contains recommendations for which characters are safe to use and how to enter them.
See Help:Entering special characters.
Most current browsers have some level of Unicode support, but some do it better than others. The most commonly encountered problem is that browsers running on Windows XP rely on preconfigured font links in the registry rather than actually searching for a font that can display the character in question. This means that the browser often has to be forced to use particular fonts. On the English Wikipedia, there is a set of templates to do this, for example {{IPA}} for the International Phonetic Alphabet. Characters in Windows Glyph List 4 should be safe to use without such special measures.
On Windows 7, Unicode support can be extended by installing the optional standalone Windows Update package KB2729094,[1] available for both 32-bit and 64-bit versions of Windows 7 SP1 from the Microsoft Download Center. This backport from Windows 8 updates the Segoe UI font, adding browser support for Emoji and other symbols to Windows 7. More Emoji characters can be installed by copying the Segoe UI Emoji font file, seguiemj.ttf, from another computer running Windows 8 or later to the Windows 7 computer. Newer Windows versions provide more emoji characters than older versions.
To display Unicode or special characters on web pages, one or more Unicode fonts must first be present or installed on your computer. For everything to work properly, the setup, configuration or settings of the web browser may also need to be modified.
Special symbols should display properly without further configuration with Konqueror, Opera, Safari, and most other recent browsers. An optional further step, for better (and correct) display of characters with ligature forms and combined characters, is to install rendering engine software.
For displaying individual special characters, HTML decimal or hexadecimal numeric entity codes can be used in place of the character. If a paragraph with many special Unicode characters needs to be displayed, <p>...</p> or <span>...</span> can also be used.
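The decimal and hexadecimal numeric codes mentioned above can be derived from a character's Unicode code point. The following is a minimal Python sketch of that derivation; the function name numeric_references is hypothetical and is not part of MediaWiki or any wiki tool.

```python
import html


def numeric_references(char):
    """Return (decimal, hexadecimal) HTML numeric references for one character."""
    code_point = ord(char)  # Unicode code point as an integer
    return f"&#{code_point};", f"&#x{code_point:X};"


decimal_ref, hex_ref = numeric_references("€")
print(decimal_ref, hex_ref)

# Both references decode back to the same character.
print(html.unescape(decimal_ref) == html.unescape(hex_ref) == "€")
```

Either form can be pasted where the character itself cannot be typed; the browser renders both identically.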
The class="Unicode" attribute is to be used in web pages, HTML or wiki tags where characters from a wide range of Unicode blocks need to be displayed. If the special characters that need to be displayed mostly fall within a few Unicode blocks related to Latin scripts, class="latinx" can be used. For special characters or symbols related to the International Phonetic Alphabet, class="IPA" can be used. For polytonic (Greek) characters or related symbols, class="polytonic" can be used.
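As an illustration of how these class names are applied, the sketch below builds a span element around a piece of text. It is only an assumption-laden example: the helper wrap_with_class is hypothetical, and the sample strings are made up for demonstration.

```python
from html import escape


def wrap_with_class(text, css_class="Unicode"):
    """Wrap text in a <span> carrying one of the class names described above."""
    # Escape both the class name and the text so the result is valid HTML.
    return f'<span class="{escape(css_class, quote=True)}">{escape(text)}</span>'


print(wrap_with_class("ˈwɪki", "IPA"))
print(wrap_with_class("ἄνθρωπος", "polytonic"))
```

Which fonts the browser then uses for each class depends on the site's stylesheet, which this sketch does not cover.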
Some freely available fonts that include many Unicode blocks are TITUS Cyberbit Basic and GNU Unifont. The Unicode font article provides a more general overview through this table. If you already know what specific blocks are needed, this section may be more useful. Most articles on specific scripts include information on the corresponding Unicode block.
Note: Many websites (including Wikimedia sites) default to serif or sans-serif fonts depending upon the page element (e.g. headings may default to serif, and body text to sans-serif), so it may be necessary to use custom CSS styling if you wish to override this and force a certain font.
Google Chrome allows the user to set default fonts for normal, serif, sans-serif and monospace display modes. Any font that is currently installed on the system may be used. To access this setting, click the three-dot options icon on the top right of the browser window and select Settings. Scroll to the Appearance section, and click Customize fonts. Here, you can select any fonts on your system to use as defaults.
In Mozilla Firefox, to change the font, open the Settings window through the Tools menu or the menu button. In the General panel, scroll to Fonts and Colors and choose an appropriate font. Usually, any font installed on your system should be available. You may also click Advanced to disable custom fonts and choose different fonts for proportional, serif, sans-serif and monospace, but this is not always required.
The default font for Latin scripts in older versions of the Internet Explorer (IE) web browser for Windows is Times New Roman. Older editions of the font don't include many Unicode blocks. To choose a different font, follow this path from the IE menu bar: Tools > Internet Options > (General tab >) Fonts > Webpage Font. This leads to a scrolling list of fonts; select a different one, such as Lucida Sans Unicode, and then select OK.
e.g. Phoenician alphabet, Old Italic alphabet, Linear B, etc.
Please download and install one of these freely licensed fonts
If using a Debian-based Linux (e.g. Ubuntu, Linux Mint), these should already be installed by default. If not, please download and install the deb package ttf-ancient-fonts by entering in a terminal:
sudo apt-get install ttf-ancient-fonts
Most IPA symbols are not included in the most widely used form of Times New Roman (though they are included in the version provided with Windows Vista), the default font for Latin scripts in Internet Explorer for Windows. To properly view IPA symbols in that browser, you must set it to use a font which includes the IPA extensions characters. Such fonts include Lucida Sans Unicode, which comes with Windows XP; Gentium, Charis SIL, Doulos SIL, DejaVu Sans, or TITUS Cyberbit, which are freely available; or Arial Unicode MS, which comes with Microsoft Office.
On this page, we have forced Internet Explorer to use such a font by default, so it should appear correctly, but this has not yet been done to all the other pages containing IPA. This also applies to other pages using special symbols. Bear this in mind if you see error symbols such as "" in articles. This also happens with the former Spanish N with a small N above (Nᷠ nᷠ), Yañalif N with descender (Ꞑ ꞑ), and the Volapük second umlaut variants of A, O and U (Ꞛ ꞛ, Ꞝ ꞝ, and Ꞟ ꞟ).
Google Chrome and other Chromium-based browsers on Windows have an issue in the font-fallback system, in which the font list for each script is hard-coded. Chromium assumes these fonts are always available and searches only them. The lists consist mostly of OS-specific system fonts and cannot be configured by the user, apart from changing the default fonts for the standard, serif, sans-serif, and fixed-width styles, which reduces flexibility. As a result, some unrecognized newer characters cannot be fixed just by installing suitable external fonts; users must instead update their operating system to a version that contains the missing characters in one of its system fonts.[2][3]
Special symbols should display properly without further configuration with Mozilla Firefox, Konqueror, Opera, Safari and most other recent browsers.
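When a character shows up as an error symbol, it helps to know its code point so that a font covering the right Unicode block can be found. The sketch below lists the code points in a string; the helper name describe is hypothetical, and the character names it reports depend on the Unicode database bundled with the Python version in use.

```python
import unicodedata


def describe(text):
    """Return (code point, name) pairs for each character in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# "\ua790" and "\ua791" are the N with descender pair mentioned above.
for code_point, name in describe("\ua790\ua791"):
    print(code_point, name)
```

A base letter followed by a combining mark is reported as two separate code points, which is why such sequences can fail even when the base letter displays correctly.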
From MediaWiki 1.5, all projects use Unicode (UTF-8) character encoding. Until the end of June 2005, when this new version came into use on Wikimedia projects, the English, Dutch, Danish, and Swedish Wikipedias used Windows-1252 (they declared themselves to be ISO-8859-1, but in reality browsers treat the two as synonymous, and the MediaWiki software made no attempt to prevent use of characters exclusive to Windows-1252). Pre-upgrade wikitext in their databases remains stored in Windows-1252 and is converted on load (some of it may also have been converted by gradual changes in the way history is stored). Edits made since the upgrade are stored as UTF-8 in the database. This conversion-on-load process is invisible to users. It is also invisible to reusers, as Wikimedia now uses XML dumps rather than database dumps.
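The difference between the two legacy encodings, and the idea of converting on load, can be illustrated with Python's built-in codecs. This is a sketch only and not MediaWiki's actual conversion code; the function name to_utf8 and the sample bytes are assumptions made for the example.

```python
def to_utf8(stored_bytes):
    """Re-encode legacy Windows-1252 bytes as UTF-8."""
    # Note: Python's cp1252 codec raises UnicodeDecodeError for the few
    # byte values that Windows-1252 leaves undefined (such as 0x81).
    return stored_bytes.decode("cp1252").encode("utf-8")


legacy = b"caf\xe9 \x80"  # "café €" as stored in Windows-1252

# Byte 0x80 is the euro sign in Windows-1252 but a control character
# in strict ISO-8859-1, which is why the two are not truly identical.
print(repr(b"\x80".decode("cp1252")))
print(repr(b"\x80".decode("latin-1")))

print(to_utf8(legacy).decode("utf-8"))
```

Bytes in the range 0xA0 to 0xFF, such as 0xE9 for é, decode identically in both encodings; only the 0x80 to 0x9F range differs.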
A character such as € can be entered as the named character reference &euro;, the decimal character reference &#8364;, or the hexadecimal character reference &#x20AC;. The edit box shows the entered code, the webpage the resulting character. Unavailable characters which are copied into the edit box are first displayed as the character, and automatically converted to their decimal codes on Preview or Publish changes.
A character reference for a character such as é, although allowed, is not needed.
Note that Special:Export exports using UTF-8 even if the database is encoded in ISO 8859-1; at least, that was already the case for the English Wikipedia when it used version 1.4.
To find out which character set applies in a project, use the browser's "View Source" feature and look for something like this:
<meta http-equiv="Content-type" content="text/html; charset=iso-8859-1"/>
or
<meta http-equiv="Content-type" content="text/html; charset=utf-8"/>
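The same check can be done programmatically. The following sketch uses Python's standard html.parser module to pull the declared character set out of a page's source; the class and function names (CharsetFinder, find_charset) are hypothetical, and the sketch looks only at meta elements, not at HTTP response headers.

```python
from html.parser import HTMLParser


class CharsetFinder(HTMLParser):
    """Record the first character set declared in a <meta> element."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attr = dict(attrs)
        if attr.get("charset"):
            # Short form: <meta charset="utf-8">
            self.charset = attr["charset"].lower()
        elif (attr.get("http-equiv") or "").lower() == "content-type":
            # Long form: content="text/html; charset=utf-8"
            for part in (attr.get("content") or "").split(";"):
                key, _, value = part.strip().partition("=")
                if key.lower() == "charset" and value:
                    self.charset = value.strip().lower()


def find_charset(page_source):
    """Return the declared charset in lower case, or None if absent."""
    finder = CharsetFinder()
    finder.feed(page_source)
    return finder.charset


sample = '<meta http-equiv="Content-type" content="text/html; charset=utf-8"/>'
print(find_charset(sample))
```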