Template:Unichar

From Wikipedia, the free encyclopedia

This template produces a formatted description of the Unicode character for a given code point, to be used inline or otherwise with regular text.

  • The character {{unichar|a9}} is about intellectual property.
    The character U+00A9 © COPYRIGHT SIGN is about intellectual property.

(To provide the equivalent output for a given character, use {{unichar2}}: this will also return its code point value.)

Usage

The {{unichar}} template takes the Unicode hexadecimal code point value as input. For example, {{unichar|00A9}} produces U+00A9 © COPYRIGHT SIGN.

The output follows the standard Unicode presentation of a character: the "U+" prefix with the hexadecimal code point, followed by the glyph, then optionally the character name, using Unicode's inline formatting recommendation. In running text such as the Unicode Standard, Wikipedia, or other rich-text environments, the character name is preferably displayed in SMALL-CAPS STYLE. (The all-caps presentation is mainly designed for plain-text environments.)

The hexadecimal value is required (e.g. A9); all other input is optional. The glyph is rendered using a font that contains the character. The font can be set to something more specific, e.g. a language- or IPA-specific font, or the glyph can be overridden with an image. A wikilink to an article on the character or set of characters, and another to the article Unicode, can be created. It is also possible to add, in parentheses (bracketed like this), the calculated decimal value, HTML character codes, and a custom note.

Some special code points, such as control and space characters, are given extra care. These are handled automatically (by the unichar/gc sub-template) without user intervention.
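
The presentation described above (prefix, glyph, name) can be illustrated outside the wiki. The following is a minimal Python sketch, not the template's own implementation: the function name `describe` is hypothetical, and the character name comes from Python's `unicodedata` module rather than from Wikimedia Commons.

```python
import unicodedata


def describe(hex_value: str) -> str:
    """Return 'U+XXXX <glyph> NAME' for a hexadecimal code point string."""
    code_point = int(hex_value, 16)
    char = chr(code_point)
    # Unnamed code points (e.g. controls) have no name; use a placeholder here.
    name = unicodedata.name(char, "<no name>")
    # Pad the hexadecimal value to at least four digits, as Unicode does.
    return f"U+{code_point:04X} {char} {name}"


print(describe("a9"))  # U+00A9 © COPYRIGHT SIGN
```

Small caps and links are presentation details that this sketch leaves out.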

Examples

  • {{unichar|00A9}}
    U+00A9 © COPYRIGHT SIGN
  • {{unichar|00A9|nlink=}}
    U+00A9 © COPYRIGHT SIGN
  • {{unichar|00A9|nlink=|note={{crossref|See also [[Copyleft]] symbol}}}}
    U+00A9 © COPYRIGHT SIGN (See also Copyleft symbol)
  • {{unichar|00A9|nlink=|html=}}
    U+00A9 © COPYRIGHT SIGN (&copy;, &COPY;)
  • {{unichar|030D|cwith=}} or {{unichar|030D|cwith=◌}}
    U+030D ◌̍ COMBINING VERTICAL LINE ABOVE
  • {{unichar|4E95|note={{zh|p=jǐng|labels=no}}}}
    U+4E95 井 CJK UNIFIED IDEOGRAPH-4E95 (jǐng)
  • {{unichar|0}}
    U+0000 <control-0000>
    • control characters are handled specially
  • {{unichar|0|ulink=C0 control characters|note=[[Null character|NULL]]}}
    U+0000 <control-0000> (NULL)
    • control characters don't have a canonical name

Parameters

The blank template, with all parameters, is as follows:

{{unichar|ulink=|image=|cwith=|suffix=|size=|use=|use2=|nlink=|html=|note=|name=|alias=}}

Inline version:

{{unichar|<!--hex value-->|ulink=|image=|cwith=|suffix=|size=|use=|use2=|nlink=|html=|note=}}
  • First unnamed parameter or |1= (Required) The hexadecimal value of the code point, e.g. 00A9. Prefixing with 0x or U+ is allowed.
    Notes: The parameter accepts input like A9, a9 and 00A9 as a hexadecimal value. Warning: decimal values are not detected as being decimal, and will give unexpected results (see also § Possible errors, below).
  • Second unnamed parameter (Deprecated) The Unicode Consortium's canonical name is fetched from Wikimedia Commons, so there is no longer any need to specify it manually. If supplied, it is ignored.
  • nlink=<name> Optional. The name of the Wikipedia page to link to. If used, the Unicode name becomes a wikilink to that article.
    Notes:
    • The form nlink= (without any detail) is the most common way to use this option, to link to the article about the symbol using its canonical name.
    • When used without a name (i.e., |nlink=, blank with no value), the link points to the article about the character itself, except when that causes a problem with technical restrictions on naming, in which case the name of the character is used, or an error is produced if no such name exists (see § Presentation effects).
    • The name of the page is case-sensitive, as with all Wikipedia pages.
    • It is possible to give a Wiktionary page here, using the syntax nlink=wikt:<target article>, which may be appropriate if there is no suitable Wikipedia article.
    • Use of this parameter to link to any article other than the one at the canonical name (even if that is a redirect) is potentially an intuitive linking violation, so such use is exceptional and must have a clear justification. ['Copyright sign' and 'copyright symbol' are used here for illustration only and nlink would not normally be used in this case.]
  • cwith= Optional. The only valid content is ◌ (or its HTML code, &#x25CC;). It may also be used without any content (i.e., |cwith=, blank with no value). This parameter is useful when the Unicode character is combining (such as a combining diacritic). Using |cwith=◌, the character will be combined with the placeholder symbol, U+25CC ◌ DOTTED CIRCLE.
    Without |cwith=:
    {{unichar|0485}} → U+0485 ҅ COMBINING CYRILLIC DASIA PNEUMATA
    |cwith= with dotted circle:
    {{unichar|0485|cwith=◌}} → U+0485 ◌҅ COMBINING CYRILLIC DASIA PNEUMATA or
    {{unichar|0485|cwith=&#x25CC;}} → U+0485 ◌҅ COMBINING CYRILLIC DASIA PNEUMATA
    |cwith= without an argument:
    {{unichar|0485|cwith=}} → U+0485 ◌҅ COMBINING CYRILLIC DASIA PNEUMATA
    • Note that cwith=◌◌ alone does not provide the desired result if the intention is to display a diacritic that spans two characters (such as those in the range U+035C to U+0362): the diacritic will be offset. In such cases, editors must use cwith=◌ together with suffix=◌ to place two dotted circles, one on either side of the code point for the combining diacritic.
      {{unichar|035F|cwith=◌|suffix=◌}} → U+035F ◌͟◌ COMBINING DOUBLE MACRON BELOW
    • Use of any character other than the dotted circle as input to |cwith= is deprecated. This restriction is not currently enforced, but if any other character is used, the output (grapheme and description) is at best misleading.
    • For scripts other than Latin, the parameters use=lang and use2= may additionally be needed for better rendering.
  • suffix= Optional. Its contents directly follow the character. The only supported uses of this parameter are for diacritics that span two characters (see |cwith= above) and for appending variation selectors to certain characters.
    For example, using U+FE0E VARIATION SELECTOR-15, which forces emoji to display as text, one can have:
    {{unichar|01F604|suffix=&#xFE0E;|note=with {{unichar|FE0E}}}} → U+01F604 😄︎ SMILING FACE WITH OPEN MOUTH AND SMILING EYES (with U+FE0E VARIATION SELECTOR-15)
    Since variation selectors change the appearance of the character, their presence should always be mentioned in a |note= or in prose.
  • html= Optional. If a named character reference exists, like "&nbsp;", it is displayed after the character name. Otherwise nothing happens. In particular, numeric character references like &#160; are never displayed. Only the existence of an |html= parameter is checked; everything after the equals sign is ignored, so it can be left blank.
    • {{unichar|160|html=}} → U+0160 Š LATIN CAPITAL LETTER S WITH CARON (&Scaron;)
  • note= Optional. Adds a comment such as a clarification or explanatory note. For example, as the canonical names of ideographs are not generally helpful, the note= option permits an added comment such as U+4E95 井 CJK UNIFIED IDEOGRAPH-4E95 (jǐng)
  • ulink= Optional. Creates a wikilink from the U+ prefix. When used without a name (i.e., |ulink=, blank with no value), the article Unicode is used as the default target: [[Unicode|U+]], producing U+. This only needs to change if there is a reason to link somewhere other than Unicode, e.g. to an article on a subset of Unicode characters.
  • use= Optional. Sets the font-hinting template used to get the glyph, since the character may not be present in a regular browser font. By default no template is used; the available options are {{IPA}}, {{lang}} and {{script}}.
  • use2= Optional. When setting |use=lang or |use=script, |use2= should be used to set the language (e.g. |use2=fr) or the script (e.g. |use2=Cyrs). A glyph may still not show as expected due to browser effects. For a detailed description, see each template's documentation.
    {{unichar|0485|cwith=◌|use=script|use2=Cyrs}} → U+0485 ◌҅ COMBINING CYRILLIC DASIA PNEUMATA
    {{unichar|3099|cwith=◌|use=lang|use2=ja}} → U+3099 ◌゙ COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK (If use and use2 are not used, this is the undesirable effect: U+3099 ◌゙ COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK, where the Japanese diacritic dakuten is not shown properly.)
  • image= Optional. Allows a graphic image file to represent the glyph; overrides the font completely. The filename should include the extension (like .svg or .png), but not the prefix File:.
  • size= Optional. Sets the size of the glyph. The default value is 125%. For the font, all CSS font-size style inputs are accepted: 7px, 150%, 2em, larger.
    • For example, {{unichar|0041|size=2em}} → U+0041 A LATIN CAPITAL LETTER A
    • When using an image (file) instead of a font, this parameter only accepts sizes in px, like 12px. The default for images is 10px.
  • name= Optional; if used, the only permitted content is none. This parameter is provided for the rare cases where only the code point and the corresponding character are wanted.
    • For example, {{unichar|a9|name=none}} produces U+00A9 ©.
  • alias= Optional; if used, the only permitted content is yes. This parameter handles the very rare cases where the Unicode Consortium has identified that a name is seriously defective and misleading, or has a serious typographical error, and has defined a formal alias that applications are encouraged to use in place of the official character name. (See Unicode#Alias for details.)
    • For example, U+A015 YI SYLLABLE WU has the formal alias YI SYLLABLE ITERATION MARK. Thus, rather than {{unichar|A015}} → U+A015 YI SYLLABLE WU, the style {{unichar|A015|alias=yes}} → U+A015 YI SYLLABLE ITERATION MARK is preferred in most contexts.
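
The way |cwith= and |suffix= wrap the character can be sketched in a few lines of Python. This is an illustration only, under the assumption that the glyph string is simply the optional dotted circle, the character, and the optional suffix concatenated in that order; the function name `glyph` and its signature are hypothetical.

```python
DOTTED_CIRCLE = "\u25CC"  # U+25CC DOTTED CIRCLE, the placeholder base character


def glyph(hex_value: str, cwith: bool = False, suffix: str = "") -> str:
    """Build the displayed glyph string for a code point.

    cwith  -- put a dotted circle before the (combining) character
    suffix -- text placed directly after the character, e.g. a second
              dotted circle or a variation selector
    """
    char = chr(int(hex_value, 16))
    prefix = DOTTED_CIRCLE if cwith else ""
    return f"{prefix}{char}{suffix}"


# A double diacritic needs a dotted circle on both sides.
double_macron = glyph("035F", cwith=True, suffix=DOTTED_CIRCLE)
print([f"U+{ord(c):04X}" for c in double_macron])
```

The printed list shows the three code points in order: the dotted circle, the combining mark, and the second dotted circle.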

Presentation effects

Since this template is aimed at presenting a formatted, inline description, some effects are introduced to support that aim.

  • Showing space characters: All space characters (those with General Category Zs) are presented with a light-blue background, to show their actual presence and width: U+00A0   NO-BREAK SPACE.
    Incidentally, the regular space is replaced with &#x00A0; (NBSP) to prevent wiki markup from collapsing it as a repeated space.
  • Removing formatting characters: Formatting characters (those with General Category Cf, Zl and Zp) are removed from the output. By definition, formatting characters have no glyph. Removing them ensures they cannot have a formatting effect.
    Exception: the five Arabic number-marking format characters, U+0600..U+0603 and U+06DD, are shown. While Cf formatting characters usually have no glyph, these five do. The template internally adds "(visible)" to their category so that they are displayed.
  • Removing whitespace: The template removes formatting code and surrounding whitespace from the input. A <Return> in the name input (possibly unintended) would otherwise break the expected inline behaviour.
  • Showing a label like <control-0007>: Unicode states that a code point has no name when it is one of these: a control character, a private-use character, a surrogate, an unassigned (reserved) code point, or a noncharacter. These code points should instead be referred to by a "Code Point Label", such as <private-use> or <private-use-E000>. In this situation, this template replaces the glyph with that label, so that a correct presentation takes precedence over following the Unicode layout to the letter.
    • "Control": general category=Cc: <control> or <control-0007>
    • "Surrogate": general category=Cs: <surrogate> or <surrogate-D800>
    • "Private Use": general category=Co: <private-use> or <private-use-E000>
    • "Not a character" (minus the reserved code points, see below): general category=Cn: <not-a-character>, <non-character> or <not-a-character-FFFE>
    The second parameter (Unicode name) is not presented, since it cannot exist. It is possible to create a link to an article.
    Note: A <reserved> (unassigned) code point cannot be detected yet, and so is not presented with this label. These code points are also given the Cn category.
    (Background on <>-labels: a name can never contain <> brackets. These rules prevent a name from being mistaken for an actual control character, so a bell will not ring when a page containing a name for U+0007 is opened.)
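
The label rules above can be approximated with Python's `unicodedata` module. This is a sketch under stated assumptions, not the logic of {{unichar/gc}}: the function names are hypothetical, the noncharacter test uses the fixed set defined by Unicode (U+FDD0..U+FDEF and the last two code points of every plane), and other unassigned code points are deliberately left unlabelled, as in the template.

```python
import unicodedata

# General categories that get a code point label instead of a name.
LABELS = {"Cc": "control", "Cs": "surrogate", "Co": "private-use"}


def is_noncharacter(code_point: int) -> bool:
    """True for U+FDD0..U+FDEF and for code points ending in FFFE or FFFF."""
    return 0xFDD0 <= code_point <= 0xFDEF or (code_point & 0xFFFE) == 0xFFFE


def code_point_label(code_point: int):
    """Return a label such as '<control-0007>', or None if none applies."""
    if is_noncharacter(code_point):
        return f"<not-a-character-{code_point:04X}>"
    category = unicodedata.category(chr(code_point))
    if category in LABELS:
        return f"<{LABELS[category]}-{code_point:04X}>"
    return None  # named characters, and reserved code points (not detected)


print(code_point_label(0x0007))  # <control-0007>
print(code_point_label(0xE000))  # <private-use-E000>
```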

Possible errors

  • The template produces an error message when |1= (or the first unnamed parameter), the hex value, is missing, empty, or invalid.
  • A non-hexadecimal input like 00G9 produces an error (because G or g is not a hexadecimal digit).
  • The glyph may be overruled and changed into a label like <control-0007>. These characters have no Unicode name, so an nlink must name the target article directly (entered in a form like |nlink=Bell signal). A blank |nlink= cannot work for <label-hhhh> characters (there is no character name to make into a link) and produces an error.
  • A decimal input like |1=98 will be read as the hexadecimal value 0098. The template has no way to detect that decimal 98 (hexadecimal 62) was intended. No warning is issued, and the wrong character, U+0098, will be shown (not U+0062).
  • The alias= parameter cannot be used to create an unofficial alias.
  • If alias=yes is used but the code point does not have an official alias, no name at all will be displayed.
  • The text provided in nlink= should be the normal name of an article. Do not type it in all caps, as a red link will result.
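
The input rules and the decimal pitfall listed above can be reproduced with a short Python sketch. It is an assumption-laden illustration rather than the template's code: the name `parse_code_point` is hypothetical, and the six-digit limit and the range check against U+10FFFF are choices made here, not behaviour documented for the template.

```python
import re

# Optional "0x" or "U+" prefix, then one to six hexadecimal digits.
_HEX_INPUT = re.compile(r"(?:0x|U\+)?([0-9A-F]{1,6})", re.IGNORECASE)


def parse_code_point(value: str) -> int:
    """Parse template-style hex input; raise ValueError if it is not hex."""
    match = _HEX_INPUT.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"not a hexadecimal code point: {value!r}")
    code_point = int(match.group(1), 16)
    if code_point > 0x10FFFF:
        raise ValueError(f"outside the Unicode range: {value!r}")
    return code_point


# "98" is always read as hexadecimal, so it yields U+0098, not U+0062.
print(f"U+{parse_code_point('98'):04X}")  # U+0098
```

An input such as 00G9 raises ValueError here, mirroring the error message the template shows.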

Technical notes

The string "unichar" is used only in English Wikipedia, as a name for this template. It has no meaning outside this context.

The template uses these subtemplates:

  • {{unichar/main}} Accepts all the input from {{unichar}}. Calls several subtemplates to produce the text strings, and then joins them together. Also checks for non-hexadecimal input and reports the error.
  • {{unichar/ulink}} Creates a piped link for the U+ prefix.
  • {{unichar/gc}} Determines the Unicode general category when that category is special (for example, for control characters).
  • {{unichar/glyph}} Renders the glyph by font. Accepts |image=, which overrides the font. Also processes |use=, |use2=, |size=, |cwith=.
  • {{unichar/name}} Produces the formatted name of the character in small caps. Accepts |nlink= to create a piped wikilink to an article. When the general category (gc) is special, the name changes into a <label-hhhh>.
  • {{unichar/notes}} Shows notes in parentheses (round brackets): HTML (from |html=, a named entity like &nbsp; if one exists, using {{#invoke:LoadData|Numcr2namecr}}); and the free-text |note=.
  • The main template serves as an easy-input layer: it performs few calculations (only two hex2dec conversions) and allows default values to be set without going deep into the subtemplates.
  • The value <#salted#> is used internally to pass through an undefined input parameter. It is safe to use in place of a Unicode name, because a name cannot contain the characters <, # or >, and so "salted" is the right word (meaning uninhabitable). For ease of code maintenance, it is used in various places in the code.
  • Named entities for U+22C1 N-ARY LOGICAL OR: {{#invoke:LoadData|Numcr2namecr|0x22C1}} → &bigvee;, &Vee;, &xvee;
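
The named-entity lookup used for |html= can be mimicked with Python's standard `html.entities.html5` table. This is a sketch of the idea only: it does not use the Numcr2namecr data set, and the function name `named_entities` is hypothetical.

```python
from html.entities import html5


def named_entities(code_point: int) -> list:
    """Return all HTML5 named character references for one code point."""
    char = chr(code_point)
    # html5 maps names such as "copy;" to the characters they stand for.
    # Keep only the forms that end with a semicolon.
    names = {
        f"&{name}"
        for name, value in html5.items()
        if value == char and name.endswith(";")
    }
    return sorted(names)


print(named_entities(0x22C1))
print(named_entities(0x4E95))  # [] (no named reference exists)
```

A code point without a named reference yields an empty list, which corresponds to the template showing nothing for |html=.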

Issues

  • Unassigned code points, to be labelled <reserved>, cannot be detected.
  • When using |use=script, |use2= needs lowercase (e.g. 0485, Cyrs or cyrs)[clarification needed]
  • When the template is used for one of the RTL formatting marks, the mark's effect may break out of the template (text following the template goes RTL, too). Preventing this currently requires extra code.

Code charts

Key to the Unicode Code Charts (Ch 24)[1]
Symbol (meaning): examples
  • ※ (character name alias): ※ LATIN SMALL LETTER GHA
  • = (informative alias(es)): = barred o, o bar
  • • (informative note): • lowercase is 0275 ɵ; • Portuguese, Estonian; • this is a spacing character
  • → (cross-reference): → 0283 ʃ latin small letter esh
  • ≡ (canonical decomposition mapping): ≡ 0075 u 031B ◌̛
  • ≈ (compatibility decomposition mapping): ≈ 006E n 006A j
  • ~ (standardized variation sequence): ~ 2205 FE00 zero with long diagonal stroke overlay form

TemplateData

This is the TemplateData for this template, used by TemplateWizard, VisualEditor and other tools. See a monthly parameter usage report for Template:Unichar in articles based on its TemplateData.

TemplateData for Unichar

Formats a Unicode character description inline.

This template prefers inline formatting of parameters.

Template parameters (label, parameter name, description, type, status):

  • Hex value (1): Hexadecimal Unicode code point. Example: 031A. Type: String. Status: required.
  • Character name (2): The canonical name is fetched from Wikimedia Commons, so there is no longer any need to specify it manually. If supplied, it is ignored. Example: COMBINING LEFT ANGLE ABOVE. Type: String. Status: deprecated.
  • ulink (ulink): Add link to the Unicode hex code point. Example: Phonetic symbols in Unicode. Type: Line. Status: optional.
  • nlink (nlink): Add link to the Unicode character name. Type: String. Status: optional.
  • cwith (cwith): (For combining characters only) add the following character before this combining character. Type: String. Status: optional.
  • size (size): Relative size of rendered character. Example: 200%. Type: String. Status: optional.
  • image (image): No description. Type: Unknown. Status: optional.
  • use (use): No description. Type: String. Status: optional.
  • use2 (use2): No description. Type: String. Status: optional.
  • HTML code (html): When present, shows HTML named entity. Example: html= shows "&copy;". Type: String. Status: optional.
  • note (note): No description. Type: Line. Status: optional.
  • alias (alias): No description. Suggested values: yes. Type: Unknown. Status: optional.
  • name (name): Hide the name of the character. Suggested values: none. Type: Unknown. Status: optional.
  • suffix (suffix): No description. Type: Unknown. Status: optional.

See also

U+00A9, 169 (decimal), &copy; (as literal code, not the character)

External research links

Useful links for researching Unicode characters:

  • Unicode.org charts in PDF format, showing the U+ hex values.
  • Fileformat.info search, to search by name (whole or partial), by U+hex value or decimal value, or by the font symbol (copy-paste it). Extra information is provided per character. One character only.
  • branah.com's multi-character Unicode converter.
  • Unicode properties overview, e.g. comma U+002C:[2]
The above documentation is transcluded from Template:Unichar/doc.
Retrieved from "https://en.wikipedia.org/w/index.php?title=Template:Unichar&oldid=1296498010"