Module:Unicode convert

From Wikipedia, the free encyclopedia
Module documentation
This module is rated as ready for general use. It has reached a mature state, is considered relatively stable and bug-free, and may be used wherever appropriate. It can be mentioned on help pages and other Wikipedia resources as an option for new users. To minimise server load and avoid disruptive output, improvements should be developed through sandbox testing rather than repeated trial-and-error editing.

Usage

Converts Unicode code points, always given in hexadecimal, to their UTF-8 or UTF-16 representation in upper-case hexadecimal or decimal. The conversion can also be reversed for UTF-8. The UTF-16 form accepts and passes through unpaired surrogates, e.g. {{#invoke:Unicode convert|getUTF16|D835}} → D835. The reverse function fromUTF8 accepts multiple characters, and both its input and its output can be set to decimal (basein=dec and base=dec respectively).
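The UTF-8 conversion described above can be sketched in plain Lua arithmetic, without any MediaWiki libraries. This is only an illustration of the encoding rule, not the module's own implementation (which relies on mw.ustring); the names utf8Bytes and utf8String are hypothetical and do not exist in the module.

```lua
-- Encode one Unicode code point (a number) as a list of UTF-8 byte values.
-- Hypothetical helper: the real module delegates this to mw.ustring.
local function utf8Bytes(cp)
	if cp < 0x80 then
		return { cp }
	elseif cp < 0x800 then
		return { 0xC0 + math.floor(cp / 0x40), 0x80 + cp % 0x40 }
	elseif cp < 0x10000 then
		return {
			0xE0 + math.floor(cp / 0x1000),
			0x80 + math.floor(cp / 0x40) % 0x40,
			0x80 + cp % 0x40,
		}
	else
		return {
			0xF0 + math.floor(cp / 0x40000),
			0x80 + math.floor(cp / 0x1000) % 0x40,
			0x80 + math.floor(cp / 0x40) % 0x40,
			0x80 + cp % 0x40,
		}
	end
end

-- Format the bytes the way the documentation shows them: upper-case hex
-- by default, decimal when `base` is 'dec'. Input is always hexadecimal.
local function utf8String(hex, base)
	local fmt = (base == 'dec') and '%d' or '%02X'
	local bytes = utf8Bytes(tonumber(hex, 16))
	for i = 1, #bytes do
		bytes[i] = fmt:format(bytes[i])
	end
	return table.concat(bytes, ' ')
end

print(utf8String('1F345'))        -- F0 9F 8D 85
print(utf8String('1F345', 'dec')) -- 240 159 141 133
```

The sketch uses math.floor and the modulo operator instead of a bit library so that it runs unchanged on Lua 5.1 through 5.4.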

When using this module from another module, you may call these functions without a proper frame object, e.g. unicodeConvert.getUTF8{ args = {'1F345'} }.
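The calling convention can be illustrated outside MediaWiki with a stand-in table. In the sketch below, unicodeConvert is a hypothetical local table and toHex is an illustrative function that is not part of the real module; inside Scribunto the table would presumably come from require('Module:Unicode convert') instead. The point is only that a plain table with an args field is enough, because nothing else on the frame is read.

```lua
-- Hypothetical stand-in for the module table. In Scribunto this would be:
--   local unicodeConvert = require('Module:Unicode convert')
local unicodeConvert = {}

-- Illustrative function (NOT in the real module). Like the real functions,
-- it reads nothing from `frame` except `frame.args`.
function unicodeConvert.toHex(frame)
	local codepoint = tonumber(frame.args[1] or '0', 16) or 0
	local fmt = frame.args['base'] == 'dec' and '%d' or '%04X'
	return fmt:format(codepoint)
end

-- A fake frame: positional parameters are array entries of `args`,
-- named parameters such as base=dec are string keys.
print(unicodeConvert.toHex{ args = { '1F345' } })               -- 1F345
print(unicodeConvert.toHex{ args = { '1F345', base = 'dec' } }) -- 127813
```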

To find the character code of a given symbol (in decimal), use e.g. {{#invoke:ustring|codepoint|\🐱}} → 128049.

Code → Output
{{#invoke:Unicode convert|getUTF8|1F345}} → F0 9F 8D 85
{{#invoke:Unicode convert|getUTF8|1F345|base=dec}} → 240 159 141 133
{{#invoke:Unicode convert|fromUTF8|F0 9F 8D 85}} → 1F345
{{#invoke:Unicode convert|fromUTF8|240 159 141 133|base=dec|basein=dec}} → 127813
{{#invoke:Unicode convert|getUTF16|1F345}} → D83C DF45
{{#invoke:Unicode convert|getUTF16|1F345|base=dec}} → 55356 57157
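The getUTF16 results for 1F345 can be reproduced by hand with the standard surrogate-pair arithmetic. The following is a minimal sketch in plain Lua, assuming the same conventions as the documentation (hexadecimal input, optional decimal output); utf16Units and utf16String are hypothetical names, and division and modulo stand in for the bit operations the module uses.

```lua
-- Split a code point into its UTF-16 code units (hypothetical helper).
local function utf16Units(cp)
	if cp > 0x10FFFF then
		return {}        -- nothing is encoded above U+10FFFF
	elseif cp <= 0xFFFF then
		return { cp }    -- single unit; lone surrogates pass through as-is
	end
	local offset = cp - 0x10000
	return {
		0xD800 + math.floor(offset / 0x400), -- high surrogate: top 10 bits
		0xDC00 + offset % 0x400,             -- low surrogate: bottom 10 bits
	}
end

-- Format the units as upper-case hex, or as decimal when `base` is 'dec'.
local function utf16String(hex, base)
	local fmt = (base == 'dec') and '%d' or '%04X'
	local units = utf16Units(tonumber(hex, 16))
	for i = 1, #units do
		units[i] = fmt:format(units[i])
	end
	return table.concat(units, ' ')
end

print(utf16String('1F345'))        -- D83C DF45
print(utf16String('1F345', 'dec')) -- 55356 57157
print(utf16String('D835'))         -- D835
```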


local p = {}

-- NOTE: all these functions use frame solely for its args member.
-- Modules using them may therefore call them with a fake frame table
-- containing only args.

p.getUTF8 = function (frame)
	local ch = mw.ustring.char(tonumber(frame.args[1] or '0', 16) or 0)
	local bytes = {mw.ustring.byte(ch, 1, -1)}
	local format = ({
		['10'] = '%d',
		dec = '%d'
	})[frame.args['base']] or '%02X'
	for i = 1, #bytes do
		bytes[i] = format:format(bytes[i])
	end
	return table.concat(bytes, ' ')
end

p.getUTF16 = function (frame)
	local codepoint = tonumber(frame.args[1] or '0', 16) or 0
	local format = ({ -- TODO reduce the number of options.
		['10'] = '%d',
		dec = '%d'
	})[frame.args['base']] or '%04X'
	if codepoint <= 0xFFFF then -- NB this also returns lone surrogate characters
		return format:format(codepoint)
	elseif codepoint > 0x10FFFF then -- There are no codepoints above this
		return ''
	end
	codepoint = codepoint - 0x10000
	bit32 = require('bit32')
	return (format .. ' ' .. format):format(
		bit32.rshift(codepoint, 10) + 0xD800,
		bit32.band(codepoint, 0x3FF) + 0xDC00)
end

p.fromUTF8 = function (frame)
	local basein = frame.args['basein'] == 'dec' and 10 or 16
	local format = frame.args['base'] == 'dec' and '%d ' or '%02X '
	local bytes = {}
	for byte in mw.text.gsplit(frame.args[1], '%s') do
		table.insert(bytes, tonumber(byte, basein))
	end
	local chars = {mw.ustring.codepoint(string.char(unpack(bytes)), 1, -1)}
	return format:rep(#chars):sub(1, -2):format(unpack(chars))
end

return p
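In the source above, fromUTF8 hands the actual decoding to mw.ustring.codepoint. For readers without a Scribunto environment, the reverse direction can be sketched in plain Lua as follows. This is an illustration under a stated assumption, namely that the input is well-formed UTF-8 (no validation of lead or continuation bytes is attempted); decodeUTF8 and fromUTF8String are hypothetical names, not part of the module.

```lua
-- Decode a list of UTF-8 byte values into a list of code points.
-- Assumes well-formed input; malformed sequences are not detected.
local function decodeUTF8(bytes)
	local codepoints = {}
	local i = 1
	while i <= #bytes do
		local b = bytes[i]
		local cp, extra
		if b < 0x80 then
			cp, extra = b, 0          -- single-byte (ASCII)
		elseif b < 0xE0 then
			cp, extra = b - 0xC0, 1   -- two-byte sequence
		elseif b < 0xF0 then
			cp, extra = b - 0xE0, 2   -- three-byte sequence
		else
			cp, extra = b - 0xF0, 3   -- four-byte sequence
		end
		for j = 1, extra do
			-- each continuation byte contributes its low 6 bits
			cp = cp * 0x40 + (bytes[i + j] - 0x80)
		end
		codepoints[#codepoints + 1] = cp
		i = i + extra + 1
	end
	return codepoints
end

-- Parse whitespace-separated byte values (hex, or decimal when `basein`
-- is 'dec') and return the code points in hex, or decimal when `base` is 'dec'.
local function fromUTF8String(text, basein, base)
	local radix = (basein == 'dec') and 10 or 16
	local fmt = (base == 'dec') and '%d' or '%02X'
	local bytes = {}
	for token in text:gmatch('%S+') do
		bytes[#bytes + 1] = tonumber(token, radix)
	end
	local cps = decodeUTF8(bytes)
	for i = 1, #cps do
		cps[i] = fmt:format(cps[i])
	end
	return table.concat(cps, ' ')
end

print(fromUTF8String('F0 9F 8D 85'))                   -- 1F345
print(fromUTF8String('240 159 141 133', 'dec', 'dec')) -- 127813
```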
Retrieved from "https://en.wikipedia.org/w/index.php?title=Module:Unicode_convert&oldid=1017205304"