This chapter assumes you are using the latex or pdflatex engines and need to concern yourself with TeX's various encodings. lualatex and xelatex, on the other hand, accept Unicode input and can usually typeset documents using the correct glyphs without further user intervention. See the Fonts chapter's discussion of encoding for additional information.
In this chapter we will tackle matters related to input encoding, typesetting diacritics and special characters.
In the following document, we will use the term special characters for all symbols other than the lowercase letters a–z, uppercase letters A–Z, figures 0–9, and English punctuation marks.
Some languages usually need a dedicated input system to ease document writing. This is the case for Arabic, Chinese, Japanese, Korean and others. This specific matter will be tackled in Internationalization.
The rules for producing characters with diacritical marks, such as accents, differ somewhat depending whether you are in text mode, math mode, or the tabbing environment.
TeX uses ASCII by default, but its 128 characters are not enough to support non-English languages. TeX has its own way of handling this, with commands for every diacritical mark (see Escaped codes). If we want accents and other special characters to appear directly in the source file, however, we have to tell TeX that we want to use a different encoding.
There are several encodings available to LaTeX:
In the following we will assume that you want to use UTF-8.
There are some important steps to specify the encoding.
\usepackage[utf8]{inputenc}
The inputenc package tells LaTeX which text encoding your .tex files use.
If you check the character encoding (e.g. using the Unix file command), be sure that your file contains at least one special character; otherwise it will be recognized as ASCII (which is logical, since UTF-8 is a superset of ASCII).
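As a minimal sketch of the steps above, the following complete document can be saved as UTF-8 and compiled with pdflatex. The T1 font encoding line is an assumption added here so that accented letters map to single glyphs; inputenc itself does not require it.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}     % assumption: 8-bit font encoding with accented glyphs
\usepackage[utf8]{inputenc}  % the source file is stored as UTF-8
\begin{document}
% Special characters typed directly in the source:
Café, naïve, Straße, ångström.
\end{document}
```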
The inputenc package also allows the user to change the encoding within the document, by means of the command \inputencoding{'encoding name'}.
\usepackage[utf8]{inputenc}
% ...
% In this area
% the UTF-8 encoding is specified.
% ...
\inputencoding{latin1}
% ...
% Here the text encoding is specified as ISO Latin-1.
% ...
\inputencoding{utf8}
% Back to the UTF-8 encoding.
% ...
LaTeX's support of UTF-8 is fairly specific: it includes only a limited range of Unicode input characters, defining only those symbols that are known to be available with the current font encoding. You might therefore encounter a situation where UTF-8 input results in an error:
! Package inputenc Error: Unicode char \u8:ũ not set up for use with LaTeX.
This is because the utf8 definition does not necessarily map every character you are able to enter on your keyboard. Such characters include, for example:
ŷ Ŷ ũ Ũ ẽ Ẽ ĩ Ĩ
In such a case, you may try the utf8x option, which defines more character combinations. utf8x is not officially supported, but can be viable in some cases. However, it might break compatibility with some packages, such as csquotes.
Another possibility is to stick with utf8 and to define the characters yourself. This is easy:
\DeclareUnicodeCharacter{'codepoint'}{'TeX sequence'}
where codepoint is the Unicode code point of the desired character, written in hexadecimal, and TeX sequence is what to print when that character is encountered. Code points are easy to find on the web. Example:
\DeclareUnicodeCharacter{0177}{\^y}
Now inputting ŷ will effectively print ŷ.
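The same approach can be sketched as a complete document covering several of the characters listed earlier. The code points are the standard Unicode values for these letters; the choice of T1 font encoding is an assumption, and recent LaTeX releases may already define some of these characters, in which case the declarations simply override the defaults.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}     % assumption: any text font encoding with accents works
\usepackage[utf8]{inputenc}
% Map each code point (hexadecimal) to an accent command.
\DeclareUnicodeCharacter{0177}{\^y} % ŷ
\DeclareUnicodeCharacter{0176}{\^Y} % Ŷ
\DeclareUnicodeCharacter{0169}{\~u} % ũ
\DeclareUnicodeCharacter{0168}{\~U} % Ũ
\DeclareUnicodeCharacter{1EBD}{\~e} % ẽ
\DeclareUnicodeCharacter{1EBC}{\~E} % Ẽ
\begin{document}
ŷ Ŷ ũ Ũ ẽ Ẽ
\end{document}
```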
In addition to direct UTF-8 input, LaTeX supports the composition of special characters as well. This is convenient if your keyboard lacks some desired accents and other diacritics.
The following accents may be placed on letters. Although the letter "o" is used in most of the examples, the accents may be placed on any letter. Accents may even be placed above a "missing" letter; for example, \~{} produces a tilde over a blank space.
The following commands may be used only in paragraph (default) or LR (left-right) mode.
| LaTeX command | Sample | Description |
|---|---|---|
| \`{o} | ò | grave accent |
| \'{o} | ó | acute accent |
| \^{o} | ô | circumflex |
| \"{o} | ö | umlaut, trema or dieresis |
| \H{o} | ő | long Hungarian umlaut (double acute) |
| \~{o} | õ | tilde |
| \c{c} | ç | cedilla |
| \k{a} | ą | ogonek |
| \l{} | ł | barred l (l with stroke) |
| \={o} | ō | macron accent (a bar over the letter) |
| \b{o} | o̱ | bar under the letter |
| \.{o} | ȯ | dot over the letter |
| \d{u} | ụ | dot under the letter |
| \r{a} | å | ring over the letter (for å there is also the special command \aa) |
| \u{o} | ŏ | breve over the letter |
| \v{s} | š | caron/háček ("v") over the letter |
| \t{oo} | o͡o | "tie" (inverted u) over the two letters |
| \o{} | ø | slashed o (o with stroke) |
| {\i} | ı | dotless i (i without tittle) |
Older versions of LaTeX would not remove the dot on top of the letters i and j when adding a diacritic. To correct this, one had to use the dotless versions of these letters by typing \i and \j. For example:
\^{\i} should be used for i-circumflex î; \"{\i} should be used for i-umlaut ï. However, current versions of LaTeX do not need this anymore (and may, in fact, fail with an error).
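To illustrate the accent commands from the table in running text, here is a small sketch that uses only ASCII input. The sample words are chosen purely for illustration, and the T1 font encoding is an assumption (it improves hyphenation of accented words but is not needed for these particular commands).

```latex
\documentclass{article}
\usepackage[T1]{fontenc} % assumption: optional for the commands used here
\begin{document}
% Each special character is composed from an accent command and a base letter.
Erd\H{o}s, fa\c{c}ade, \L{}\'od\'z, sm\o{}rrebr\o{}d,
\AA{}ngstr\"om, \v{S}koda, d\'ej\`a vu.
\end{document}
```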
If a document is to be written completely in a language that requires particular diacritics several times, then using the right configuration allows those characters to be written directly in the document. For example, to achieve easier coding of umlauts, the babel package can be configured as \usepackage[german]{babel}. This provides the shorthand "o for \"o. This is very useful if one needs to use some text accents in a label, since no backslash will be accepted otherwise.
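A minimal sketch of the babel shorthands in use follows. The sample sentence is illustrative only, and the T1 font encoding is an assumption; besides "o for \"o, the german option also provides "s for the sharp s.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}     % assumption: gives real glyphs for umlauts
\usepackage[german]{babel}   % activates the " shorthands
\begin{document}
% "o, "u and "s expand to \"o, \"u and \ss respectively.
Sch"one Gr"u"se aus K"oln.
\end{document}
```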
More information regarding language configuration can be found in the Internationalization section.
The two symbols '<' and '>' are actually ASCII characters, but you may have noticed that they will print '¡' and '¿' respectively. This is a font encoding issue. If you want them to print their real symbol, you will have to use another font encoding such as T1, loaded with the fontenc package. See Fonts for more details on font encoding.
Alternatively, they can be printed with dedicated commands:
\textless
\textgreater
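Both approaches can be sketched in one short document. With the T1 encoding loaded, the characters typed directly should come out as expected; the dedicated commands are independent of the font encoding.

```latex
\documentclass{article}
\usepackage[T1]{fontenc} % without this line, < and > in text print as ¡ and ¿
\begin{document}
Typed directly: 1 < 2 and 3 > 2.

With commands: 1 \textless{} 2 and 3 \textgreater{} 2.
\end{document}
```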
When writing about money these days, you need the euro sign. The textcomp package features a \texteuro command, which gives you the euro symbol as supplied by your current text font. Depending on your chosen font, this may be quite far from the official symbol.
An official version of the euro symbol is provided by eurosym. Load it in the preamble (optionally with the official option):
\usepackage[official]{eurosym}
Then you can insert it with the \euro{} command. Finally, if you want a euro symbol that matches the current font style (e.g., bold, italics, etc.), you can use a different option:
\usepackage[gen]{eurosym}
Again, you can insert the euro symbol with \euro{}.
Alternatively, you can use the marvosym package, which also provides the official euro symbol.
\usepackage{marvosym}
% ...
\EUR{}
Now that you have succeeded in printing a euro sign, you may want the '€' on your keyboard to actually print the euro sign as above. There is a simple method to do that. You must make sure you are using UTF-8 encoding along with a working \euro{} or \EUR{} command.
\DeclareUnicodeCharacter{20AC}{\euro{}}
% or
\DeclareUnicodeCharacter{20AC}{\EUR{}}
Complete example:
\usepackage[utf8]{inputenc}
\usepackage{marvosym}
\DeclareUnicodeCharacter{20AC}{\EUR{}}
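For comparison, here is a sketch of the same idea as a full document using eurosym instead of marvosym. The sample sentence and the thin spaces are illustrative choices, not part of either package.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[official]{eurosym}
% Make the € key print the eurosym glyph (overrides any default mapping).
\DeclareUnicodeCharacter{20AC}{\euro{}}
\begin{document}
The ticket costs 12\,€, which is the same as writing 12\,\euro{}.
\end{document}
```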
The easiest way to print temperature and angle values is to use the \SI{value}{unit} command from the siunitx package, which works both in text and math mode:
\usepackage{amsmath}
\usepackage{siunitx}
% ...
A $\SI{45}{\degree}$ angle.
It is \SI{17}{\degreeCelsius} outside.
For more information, see the documentation of the siunitx package.
A common mistake is to use the \circ command. It will not print the correct character (though $^\circ$ will). Use the textcomp package instead, which provides a \textdegree command.
\usepackage{textcomp}
% ...
A $45$\textdegree{} angle.
For temperature, you can use the same command or opt for the gensymb package and write
\usepackage{gensymb}
\usepackage{textcomp}
% ...
17\,\celsius % best (with textcomp)
Some keyboard layouts feature the degree symbol; you can use it directly if you are using UTF-8 and textcomp. For better results in terms of font quality, we recommend the use of an appropriate font, like lmodern:
\usepackage[utf8]{inputenc}
\usepackage{lmodern}
\usepackage{textcomp}
% ...
17\,°C
17\,℃ % best
LaTeX has many symbols at its disposal. The majority of them are within the mathematical domain, and later chapters will cover how to get access to them. For the more common text symbols, use the following commands:
| Command | Character |
|---|---|
| \% | % |
| \$ | $ |
| \{ | { |
| \_ | _ |
| \P | ¶ |
| \ddag | ‡ |
| \textbar | vertical bar (ASCII 0x7C) |
| \textgreater | > |
| \textendash | – |
| \texttrademark | ™ |
| \textexclamdown | ¡ |
| \textsuperscript{a} | ᵃ |
| \pounds | £ |
| \# | # |
| \& | & |
| \} | } |
| \S | § |
| \dag | † |
| \textbackslash | \ |
| \textless | < |
| \textemdash | — |
| \textregistered | ® |
| \textquestiondown | ¿ |
| \textcircled{a} | ⓐ |
| \copyright | © |
Not mentioned in the above table, the tilde (~) is used in LaTeX code to produce a non-breakable space. To get a printed tilde sign, write either \~{} or \textasciitilde{}. A visible space ␣ can be created with \textvisiblespace.
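The three uses can be compared side by side in a short sketch. The sample text is illustrative, and the T1 font encoding is an assumption made so that the text symbols are available in the default font.

```latex
\documentclass{article}
\usepackage[T1]{fontenc} % assumption: provides the text symbols used here
\begin{document}
% ~ in the source is a non-breakable space, not a printed tilde.
See Figure~1 on page~2.

% A printed tilde, in two variants.
A tilde accent over nothing: \~{}. A standalone tilde: \textasciitilde{}.

% A visible space between two words.
two\textvisiblespace{}words
\end{document}
```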
For some more interesting symbols, the PostScript ZapfDingbats font is available thanks to the pifont package. Add the declaration to your preamble: \usepackage{pifont}. Next, the command \ding{number} will print the specified symbol. Here is a table of the available symbols:
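A minimal sketch of \ding in use is given here. The numbers 51 and 55 are taken from the standard ZapfDingbats layout, where they correspond to a check mark and a cross; other numbers select other glyphs.

```latex
\documentclass{article}
\usepackage{pifont}
\begin{document}
% \ding{number} prints the ZapfDingbats glyph at that position.
\ding{51} Task finished \qquad \ding{55} Task failed
\end{document}
```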
Several of the above and some similar accents can also be produced in math mode. The following commands may be used only in math mode.
| LaTeX command | Description | Text-mode equivalent |
|---|---|---|
| \hat{o} | circumflex | \^ |
| \widehat{oo} | wide version of \hat over several letters | |
| \check{o} | vee or check | \v |
| \tilde{o} | tilde | \~ |
| \widetilde{oo} | wide version of \tilde over several letters | |
| \acute{o} | acute accent | \' |
| \grave{o} | grave accent | \` |
| \dot{o} | dot over the letter | \. |
| \ddot{o} | two dots over the letter (umlaut in text mode) | \" |
| \breve{o} | breve | \u |
| \bar{o} | macron | \= |
| \vec{o} | vector (arrow) over the letter | |
When applying accents to the letters i and j, you can use \imath and \jmath to keep the dots from interfering with the accents:
| LaTeX command | Description |
|---|---|
| \hat{\imath} | circumflex on letter i without upper dot |
| \vec{\jmath} | vector (arrow) on letter j without upper dot |
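The math-mode accents, including the dotless variants, can be tried out with a short sketch like the following. The variable names are arbitrary and chosen only for illustration.

```latex
\documentclass{article}
\begin{document}
% Math accents only work inside math mode.
\[
  \hat{x},\quad \widehat{xy},\quad \tilde{a},\quad \bar{z},\quad
  \dot{q},\quad \ddot{q},\quad \vec{v}
\]
% Dotless i and j avoid a clash between the dot and the accent.
\[
  \hat{\imath},\quad \vec{\jmath}
\]
\end{document}
```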
Some of the accent marks used in running text have other uses in the tabbing environment. In that case they can be created with the following commands:
\a' for an acute accent
\a` for a grave accent
\a= for a macron accent
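Inside a tabbing environment, where \=, \' and \` control tab stops, the \a variants can be sketched as follows. The names and towns are made-up sample data.

```latex
\documentclass{article}
\begin{document}
\begin{tabbing}
% \= sets a tab stop here, so it cannot be used as the macron accent.
Name \qquad \= Town \\
Ren\a'e     \> Orl\a'eans \\
Ir\a`ene    \> Gen\a`eve \\
T\a=oru     \> Ky\a=oto
\end{tabbing}
\end{document}
```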
Some operating systems provide a keyboard combination to input any Unicode code point, the so-called Unicode compose key.
Many X applications (*BSD and GNU/Linux) support the Ctrl+Shift+u combination. A "u" symbol should appear. Type the code point and press Enter or Space to actually print the character. Example:
<Ctrl+Shift+u> 20AC <space>
will print the euro character.
Desktop environments like GNOME and KDE may feature a customizable compose key for more memorable sequences.
Xorg features advanced keyboard layouts with variants that let you enter many characters easily by combining keys with an appropriate modifier, like Alt Gr. The result depends heavily on the selected layout and variant, so we suggest that you experiment with your keyboard, preceding every key and dead key with the Alt Gr modifier.
In Windows, you can hold Alt and type a <codepoint> on the numeric keypad to get a desired character. For example,
<Alt> + 0252
will print the German letter ü.