Help:Entering special characters

From Wikipedia, the free encyclopedia
For help viewing special characters, see Help:Special characters.
For help with punctuation characters as used in Wikipedia, see Help:Punctuation.
This help page is a how-to guide.
It explains concepts or processes used by the Wikipedia community. It is not one of Wikipedia's policies or guidelines, and may reflect varying levels of consensus.

Many special characters (those not on the standard computer keyboard) are useful, and sometimes necessary, in Wikipedia articles. Even articles that use only English words may use punctuation such as an em dash (—), and symbols such as a section sign (§) or registered mark (®). Articles that are about, or that mention, European persons or places may use many extended Latin characters, and articles about other persons and places may require characters from entirely different alphabets. This page describes several methods for entering such characters.

Entry methods

There are several ways to enter a special character into wikitext.

Special character link

Use a special-character link to enter a Unicode (UTF-8) character. Links are available under Special characters above the edit window, and below the buttons at the bottom of the edit window (for more information on the latter, see Help:CharInsert). Clicking a special-character link enters that character at the current position of the cursor in the edit window, so you need to position the cursor where you want it before clicking the link.

Clicking the arrow to the left of Special characters above the edit window opens a list of groups of images of special characters (see Figure 1 below); clicking again on the arrow (which now points down) closes the list. Click on a group name (e.g., Symbols) to display that group; click on the image of the appropriate character to enter that character at the current cursor position in the edit window. Some of the images of different characters are very similar in appearance, so it is important to use the correct image. For example, the images for the closing single quotation mark (’) and closing double quotation mark (”) are very similar to the images for the single prime (′) and double prime (″) characters.

Figure 1. Special-character links above edit window: Symbol group


Groups for the special-character links below the edit window are displayed one at a time; the default group is Insert, which includes punctuation and some other common symbols (see Figure 2 below), but another group may be shown if you have previously selected it. Click the down-pointing arrow at the right of this box to display other groups; click on the appropriate group to select it. When the cursor is passed over a special-character link, the link is underlined; clicking on the underlined link enters that character at the current cursor position in the edit window.

Figure 2. Special-character links below edit window: default Insert group

Russian letters are in the Cyrillic group; most other European letters are in the Latin group. You may need to look through several groups in both places to find your special character, especially if it is non-alphabetic: mathematical symbols may be under Symbols, Insert, or Math and logic (the latter two appear only in the links below the edit window), or at Wikipedia:Mathematical symbols and its linked articles.

Some character images and links include pairs of opening and closing quotation marks. By default, the character pair is entered at the current cursor position; if a passage of text is selected before the image or link is clicked, the quotation marks are entered at the beginning and end of the selection.

This functionality is provided by MediaWiki's CharInsert extension, which has been installed by Wikipedia administrators.

Keyboard code

Enter a Unicode character using an Alt code (Windows operating system), the Option key (Macintosh computer), or a Unicode key combination (Linux).

Some keyboards have a Compose key that provides similar functionality with some other operating systems.

Lists of Alt codes and Option key combinations are available from external sources.

On the iPhone and iPad (iOS), special characters are entered using the template {{Unicode|&#x any-four-digit-hex-number ;}}. (Any space between the two opening braces should be removed.) In some browsers this displays more accurately than the bare reference &#x any-four-digit-hex-number ;. In this operating system, the menus of characters at the bottom of Wikipedia edit pages are more limited than with Windows.

Windows—Alt code

Under Windows, the Alt key is pressed and held down while a decimal character code is entered on the numeric keypad; the Alt key is then released and the character appears. The numerical code corresponds to the character’s code point in the Windows-1252 code page, with a leading zero; for example, an en dash (–) is entered using Alt+0150. The leading zero is required; if it is omitted, a character corresponding to the code point in the default OEM code page is entered. For example, if the OEM default is code page 437, Alt+150 gives û.

On a computer running the Microsoft Windows operating system, many special characters that have decimal code points below 256 can be typed by holding the Alt key and entering the decimal code on the numeric keypad.

For example, the character é (small e with acute accent, HTML entity code &eacute;) can be obtained by pressing Alt+130. First press the Alt key with your left hand and keep it held down, then press the digit keys 1, 3, 0 in sequence, one by one, on the numeric keypad at the right-hand side of the keyboard, then release the Alt key.

Many special characters, however, cannot be obtained this way. For example, λ (small lambda) cannot be entered from its decimal code 955 (or 0955) with the Alt key inside Notepad or Internet Explorer; the result is a wrong character, "╗" or "»".
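The code-page behaviour described above can be checked on any system. The following is a minimal Python sketch, not a model of the Windows input handler: it decodes a single byte with Python's built-in cp1252 and cp437 codecs to show which character each Alt code would select. The helper name alt_code is hypothetical, and the rule that codes above 255 are reduced modulo 256 is an assumption; it is merely consistent with the "╗" and "»" results mentioned above.

```python
def alt_code(digits: str, oem_code_page: str = "cp437") -> str:
    """Return the character an Alt code is expected to produce.

    A leading zero selects Windows-1252; otherwise the OEM code page
    (code page 437 by default) is used. Reducing the number modulo 256
    is an assumption made for this sketch.
    """
    code_page = "cp1252" if digits.startswith("0") else oem_code_page
    byte_value = int(digits) % 256
    return bytes([byte_value]).decode(code_page)


print(alt_code("0150"))  # en dash (Windows-1252)
print(alt_code("150"))   # û (code page 437)
print(alt_code("130"))   # é (code page 437)
print(alt_code("955"))   # ╗ (955 mod 256 = 187 in code page 437)
print(alt_code("0955"))  # » (187 in Windows-1252)
```

A few byte values are undefined in Windows-1252 (for example 129), so decoding them raises an error in this sketch.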

The WordPad editor accepts decimal code point values above 255, so it can be used to produce such characters, which can then be copied and pasted where they are needed.

Another way to obtain special characters with decimal code points above 255 (not available in Internet Explorer) is to type the character's hexadecimal code point and then press Alt+X. To produce a λ, for example, open WordPad, Notepad, Word, LibreOffice Writer, or a similar editing application, type 3BB (the hexadecimal code point of the character λ), then press Alt+X. The hex code 3BB is converted into the λ character, which can then be copied and pasted where you want to use it. (In Internet Explorer, use its HTML hexadecimal reference &#x3BB; or its HTML decimal reference &#955;.)
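If the hexadecimal code point of a character is not known, it can be derived from the character itself. The following Python sketch shows the decimal and hexadecimal forms and the matching HTML numeric references; the function name code_point_forms is hypothetical.

```python
def code_point_forms(char: str) -> dict:
    """Return the decimal/hex code point of a single character and the
    corresponding HTML numeric character references."""
    cp = ord(char)
    return {
        "decimal": cp,
        "hex": format(cp, "X"),       # the digits to type before Alt+X
        "html_hex": f"&#x{cp:X};",
        "html_decimal": f"&#{cp};",
    }


forms = code_point_forms("λ")
print(forms["decimal"])       # 955
print(forms["hex"])           # 3BB
print(forms["html_hex"])      # &#x3BB;
print(forms["html_decimal"])  # &#955;

# Going the other way: hex digits back to the character.
print(chr(int("3BB", 16)))    # λ
```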

Macintosh—Option key

On a Macintosh computer, the ⌥ Opt key (and sometimes another key) is pressed and held down while another key is pressed; the ⌥ Opt key (and when applicable, the other key) is then released, and the character appears. For example, an en dash is entered using ⌥ Opt+-; an em dash (—) is entered using ⇧ Shift+⌥ Opt+-.

Also on a Macintosh, pressing and holding certain letters (the vowels and a few other letters) brings up a pop-up menu of related special characters, such as accented versions of vowels, which can be clicked on or selected numerically.

Linux—Unicode

On Linux, one of three methods should work:

  • Hold Ctrl+⇧ Shift and type U followed by up to eight hex digits (on main keyboard or numpad). Then release Ctrl+⇧ Shift.
  • Hold Ctrl+⇧ Shift+U and type up to eight hex digits, then release Ctrl+⇧ Shift+U.
  • Type Ctrl+⇧ Shift+U, then release those and type up to eight hex digits, then type ↵ Enter or Space.

In LibreOffice, OpenOffice.org and Inkscape, for example, only the second method works. In GTK only the third method works.

iOS

In the iOS operating system, used on the iPhone and iPad, accented characters used in Western European languages are generated by holding the finger down on the character needing a diacritic, which opens a menu. Some of the most common special characters are also generated this way. Holding the finger on the $ key, for example, accesses ₽ (ruble), ¥ (yen), € (euro), ¢, £, and ₩. The en dash, em dash, and • are accessed by holding the hyphen key down. The § is accessed by holding the & down. In addition, there are 308 alternate keyboards, which are installed via Settings - General - Language and region - Add language. These include Arabic, Russian, Hebrew, Punjabi, and many less widely used ones, such as Yiddish, Thai, and Armenian.

It is not possible to directly install a new operating system font in iOS. Third-party applications offer fonts, mostly sans-serif decorative fonts not suitable for text, in the form of alternative keyboards. These programs resemble a TSR (Terminate and Stay Resident) program under MS-DOS: one runs the program to install the font/keyboard, then exits the program. Installed keyboards are selected with the globe key to the left of the spacebar. Because third parties can under some conditions access the user's typing, these programs can bring security risks. Other third-party applications offer fonts that are only usable within the application.

External application

Windows

Select, copy, and paste the character from the Character Map application.

Macintosh

There are two external options:

  • Enter the character by double-clicking on the character you want in the Special Characters tool, available at the bottom of any Edit menu. You can customize the character sets that are shown, e.g., to add more phonetic alphabet symbols, by following the directions given here.
  • Enable the Input menu (via the 'Input Sources' panel of the 'Keyboard' System Preferences). This gives access to:
    • the Keyboard Viewer, which can be used to view and input characters accessed via the ⌥ Option key
    • the Character Viewer, which can be used to access any Unicode character. It is also available from the Special Characters tool

Linux

Select, copy, and paste the character using the GNOME Character Map. If not already installed along with GNOME, it is usually available as "gucharmap" (which can be installed with "yum install gucharmap" as root on a Red Hat-like Linux distribution, for example).

In KDE, a similar application is named "KCharSelect". In Debian Linux specifically, you can type "sudo apt install kcharselect" to install it.

HTML character reference (not recommended)

Use an HTML character reference. The reference can be either named or numeric; either type begins with an ampersand (&) and ends with a semicolon (;). A named reference is of the form &name;; for example, &agrave; refers to a lower-case Latin a with grave accent (à). Because the names are reasonably mnemonic, they are usually easier to remember than numerical codes, and accordingly are easier for other editors to recognize.

Some Unicode characters, such as Turkish letters, do not have HTML names, so a numerical reference is sometimes the only option using HTML. An HTML numeric character reference is of the form &#D; or &#xH;, where D and H are the character’s Unicode code point in decimal and hexadecimal, respectively. For example, either &#8212; or &#x2014; can be entered to give U+2014, em dash (—). Because a character’s Unicode code point is usually given in hexadecimal with a prefixed "U+", the hexadecimal code is arguably more convenient. Of course, when a name exists, a named reference (e.g., &mdash; for an em dash) is usually more convenient (and more easily recognized) than either numerical code.
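The equivalence of named, decimal, and hexadecimal references can be verified with Python's standard html module. In the sketch below, best_reference is a hypothetical helper; it relies on html.entities.codepoint2name, which covers only the HTML 4.01 entity names, so a character without a name in that table falls back to a hexadecimal reference.

```python
import html
from html.entities import codepoint2name

# All three references decode to the same character, U+2014 (em dash).
for ref in ("&mdash;", "&#8212;", "&#x2014;"):
    print(ref, "->", html.unescape(ref))

print(html.unescape("&agrave;"))  # à


def best_reference(char: str) -> str:
    """Prefer a named reference when the HTML 4.01 table has one,
    otherwise fall back to a hexadecimal numeric reference."""
    cp = ord(char)
    name = codepoint2name.get(cp)
    return f"&{name};" if name else f"&#x{cp:X};"


print(best_reference("—"))  # &mdash;
print(best_reference("ğ"))  # &#x11F; (Turkish g with breve, no HTML 4.01 name)
```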

HTML character names (and the corresponding hexadecimal and decimal codes) are given in List of XML and HTML character entity references.

Problems with HTML references

Because a character reference uses only ASCII characters, it does not require that a Web browser support Unicode, and it is unambiguous when a Web page does not announce its character encoding, when the browser’s encoding is incorrectly manually set, and even when the character does not display properly with some browsers. Accordingly, it is usually the most "Web safe" approach. However, character references are distracting for many editors, and they may cause difficulties with searches in Wikipedia (see below).

Some old browsers incorrectly interpret codes in the range 128–159 as references to the native character set. Because the code points 128 through 159 are not used for displayable glyphs in either ISO-8859-1 or Unicode, character references in that range (such as &#131;) are illegal in HTML and ambiguous, though they are commonly used by many web sites. Almost all browsers treat ISO-8859-1 as Windows-1252, which does have printable characters in that range. Such references often found their way into article titles on English projects, which caused confusion when editors tried to create interwiki links to those pages.
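The ambiguity of the 128–159 range can be seen by interpreting the same number in two ways. The Python sketch below uses the reference &#131; as an example: as a Unicode code point it is a control character, while as a Windows-1252 byte it is the printable ƒ. Python's html.unescape follows the HTML5 parsing rules, which remap such references to their Windows-1252 meaning; this mirrors the browser behaviour described above but is not a statement about any particular browser.

```python
import html
import unicodedata

code = 131  # inside the problematic 128-159 range

as_unicode = chr(code)                      # U+0083, a C1 control character
as_cp1252 = bytes([code]).decode("cp1252")  # the Windows-1252 reading

print(unicodedata.category(as_unicode))  # Cc (control, no displayable glyph)
print(as_cp1252)                         # ƒ
print(html.unescape("&#131;"))           # ƒ (remapped per HTML5 rules)
print(f"&#x{ord(as_cp1252):X};")         # &#x192; (the unambiguous reference)
```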

Generally speaking, Western European languages, such as Spanish, French, and German, pose few problems. For specific details about the language in Turkey, see Help:Turkish characters. (More may be added to this list as contributors in other languages appear, although according to this deletion and this discussion, there may be little need for such lists in the future.)

Editing notes for specific writing systems

Egyptian Hieroglyphs

E.g., <hiero>P2</hiero> gives an image of the hieroglyph P2.

See Help:WikiHiero syntax.

This is not dependent on browser capabilities, because it uses images on the servers.

Hieroglyphs can also be represented in Unicode using the Aegyptus font.

Esperanto


  In edit box    In database and output
  S              S
  Sx             Ŝ
  Sxx            Sx
  Sxxx           Ŝx
  Sxxxx          Sxx
  Sxxxxx         Ŝxx

MediaWiki installations configured for Esperanto use UTF-8 for storage and display. However, when a page is edited, the text is converted to a form that is designed to be easier to edit with a standard keyboard.

The characters to which this applies are Ĉĉ, Ĝĝ, Ĥĥ, Ĵĵ, Ŝŝ, Ŭŭ. You may enter these directly in the edit box if you have the facilities to do so. However, when you edit the page again you will see them encoded in the form Sx. This form is referred to as "x-sistemo" or "x-kodo". In order to preserve round-trip capability when one or more xs follow these characters or their non-accented forms (Cc, Gg, Hh, Jj, Ss, Uu), the number of xs in the edit box is double the number in the actual stored article text.

For example, the interlanguage link [[en:Luxury car]] to en:Luxury car has to be entered in the edit box as [[en:Luxxury car]] on eo:. This has caused problems with interwiki update bots in the past.
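The doubling rule can be expressed as a pair of inverse conversions. The following Python sketch is an illustration derived from the description and table in this section, not MediaWiki's actual code; the function names to_edit_box and to_stored are hypothetical, and only lower-case x is handled. An accented letter followed by n xs becomes the plain letter followed by 2n+1 xs; a plain letter followed by n xs becomes the same letter followed by 2n xs.

```python
import re

# Plain letter -> accented letter, for the six Esperanto letter pairs.
PLAIN_TO_ACCENTED = dict(zip("CcGgHhJjSsUu", "ĈĉĜĝĤĥĴĵŜŝŬŭ"))
ACCENTED_TO_PLAIN = {acc: plain for plain, acc in PLAIN_TO_ACCENTED.items()}

PLAIN = "".join(PLAIN_TO_ACCENTED)
ACCENTED = "".join(ACCENTED_TO_PLAIN)


def to_edit_box(stored: str) -> str:
    """Convert stored text to the x-system form shown in the edit box."""
    def repl(match):
        letter, xs = match.group(1), match.group(2)
        if letter in ACCENTED_TO_PLAIN:
            return ACCENTED_TO_PLAIN[letter] + "x" * (2 * len(xs) + 1)
        return letter + "x" * (2 * len(xs))
    return re.sub(f"([{PLAIN}{ACCENTED}])(x*)", repl, stored)


def to_stored(edit_box: str) -> str:
    """Convert x-system text from the edit box back to stored text."""
    def repl(match):
        letter, xs = match.group(1), match.group(2)
        half, odd = divmod(len(xs), 2)
        if odd:
            return PLAIN_TO_ACCENTED[letter] + "x" * half
        return letter + "x" * half
    return re.sub(f"([{PLAIN}])(x*)", repl, edit_box)


print(to_edit_box("[[en:Luxury car]]"))  # [[en:Luxxury car]]
print(to_stored("[[en:Luxxury car]]"))   # [[en:Luxury car]]
print(to_stored("Sxxx"))                 # Ŝx
```

Because every run of xs after an affected letter is either doubled or doubled plus one, the parity of the run in the edit box is enough to tell whether the stored letter was accented, which is what makes the round trip lossless.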

Browser issues

Some browsers are known to corrupt text in the edit box. Most commonly they convert it to an encoding native to the platform. (Although the NT line of Windows is internally UCS-2LE, a 2-byte subset of UTF-16, it has a complete duplicate set of APIs in the Windows ANSI code page, and many older applications tend to use these, especially for things like edit boxes.) They then let the user edit the text using a standard edit control and convert it back. The result is that any characters that do not exist in the encoding used for editing are replaced with something that does (often a question mark, though at least one browser has been reported to actually transliterate text).

Google Chrome

Google Chrome and Chromium both have a cross-platform bug that prevents the use of font substitution.[1] This means that even if the user has the correct typeface for a given script installed, it may not display correctly or at all.

Console browsers

Lynx, Links (in text mode), and W3M convert to the console character set (Lynx and Links actually use a transliteration engine) for editing and convert back on save. If the console character set is UTF-8, these browsers are Unicode-safe; if not, they are unsafe. With Lynx and Links, a possible detection method would be to add another edit box to the login form, but this would not work for W3M, as it does not convert the text to the console character set until the user actually attempts to edit it.

The workaround


  In database and edit box    In edit box for
  for normal browsers         trouble browsers
  œ                           &#x153;
  &#x153;                     &#x0153;
  &#x0153;                    &#x00153;

After English Wikipedia switched to UTF-8 and interwiki bots started replacing HTML entities in interwikis with literal Unicode text, edits that broke Unicode characters became so common they could no longer be ignored. A workaround was developed to allow the problematic browsers to edit safely, provided that MediaWiki knows they have problems.

Browsers listed in the setting $wgBrowserBlackList (a list of regexps that match against user agent strings) are supplied text for editing in a special form. Existing hexadecimal HTML entities in the page have an extra leading zero added, and non-ASCII characters that are stored in the wikitext are represented as hexadecimal HTML entities with no leading zeros.
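The special form can be illustrated with a short round-trip sketch in Python. It follows only the two rules stated above and the table in this section; it is not MediaWiki's implementation, and the names to_safe_form and from_safe_form are hypothetical. The use of lower-case hexadecimal digits is also an assumption.

```python
import re

# Matches either an existing hexadecimal entity or one non-ASCII character.
TO_SAFE = re.compile(r"&#x([0-9A-Fa-f]+);|[^\x00-\x7f]")
HEX_ENTITY = re.compile(r"&#x([0-9A-Fa-f]+);")


def to_safe_form(text: str) -> str:
    """Encode wikitext as ASCII-only text for an affected browser."""
    def repl(match):
        digits = match.group(1)
        if digits is not None:
            # Existing entity: add one extra leading zero.
            return f"&#x0{digits};"
        # Literal non-ASCII character: entity with no leading zeros.
        return f"&#x{ord(match.group(0)):x};"
    return TO_SAFE.sub(repl, text)


def from_safe_form(text: str) -> str:
    """Reverse to_safe_form when the edited text is saved."""
    def repl(match):
        digits = match.group(1)
        if digits.startswith("0"):
            # Originally an entity: strip the extra zero.
            return f"&#x{digits[1:]};"
        # Originally a literal character.
        return chr(int(digits, 16))
    return HEX_ENTITY.sub(repl, text)


for original in ("œ", "&#x153;", "&#x0153;"):
    safe = to_safe_form(original)
    print(original, "->", safe, "->", from_safe_form(safe))
```

The leading zero is what distinguishes an entity that was already in the page from one generated for a literal character, so the conversion can be undone exactly.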

Currently, the default settings have only IE Mac and a specific version of Netscape 4.x for Linux in the blacklist. Nevertheless, this seems to have stopped most of the problems. The default list may be expanded in future, but that relies on someone with CVS access committing the changes.

Please take into consideration

Linking text with special characters

Many users have settings giving underlined links. When linking a special character, in some cases the result may be mistaken for another character with a different meaning:

Linking + − < > ⊂ ⊃ gives underlined characters, which may look like ± = ≤ ≥ ⊆ ⊇. In such cases it is better to use a separate link.

There is less risk of confusion if more than one character is linked, e.g., x > 3.

Special characters and searches

Wikipedia searches are easier if a special character is entered as Unicode. If an HTML entity is used, a word like Odiliënberg can only be found by searching for Odili, euml, nberg, or a combination thereof. This is actually a bug that should be fixed: the entities should be folded into their raw character equivalents so that all searches on them are equivalent. See also Help:Searching.
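The effect can be reproduced with a simple word tokenizer. The Python sketch below is an illustration only and does not reflect how Wikipedia's search engine actually tokenizes text; the helper name tokens is hypothetical. It shows how the entity splits the word into three searchable pieces, and how folding the entity into its raw character first yields a single word.

```python
import html
import re


def tokens(text: str) -> list:
    """Split text into runs of word characters, as a naive indexer might."""
    return re.findall(r"\w+", text)


raw = "Odili&euml;nberg"

print(tokens(raw))                 # ['Odili', 'euml', 'nberg']
print(tokens(html.unescape(raw)))  # ['Odiliënberg']
```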

References


  1. ^ "Font substitution fails on runic unicode characters". Chromium project. December 24, 2011. Retrieved November 29, 2012.
