User:Octahedron80

From Wiktionary, the free dictionary
Wiktionary:Babel
th: This user's native language is Thai.
en-3: This user speaks English at an advanced level.
zh-1: This user can communicate in basic Chinese.
cmn-1: This user can communicate in basic Mandarin.
ja-1: This user has a beginner-level command of Japanese.
pi-1: This user speaks Pali at a basic level.
{{t}}-3: This user has an advanced knowledge of wiki templates, and can modify code written by others.
JS-3: This user has an advanced knowledge of JavaScript, and can modify code written by others.
Lua-3: This user has an advanced knowledge of Lua, and can modify code written by others.
Python-2: This user has an intermediate command of Python, and can understand most code written by others.
UTC+7: This user's time zone is UTC+7.
  • 🏠 Thai Wiktionary, much more than Thai Wikipedia
  • 🇹🇭󠁔󠁈 Thailand
  • 🎓 B.Sc. (Computer Science) and B.B.A. (Marketing major)
  • 🤖 User:OctraBot (now uses PAWS)
  • 😅 Sorry if I re-edit too much on any page.

Links


Needs cleanup


Notes

  • My source collection
  • The terms one, two, three, etc. in any language must be placed under numeral (not number) categories; otherwise they will not count as lemmas.
  • A loan blend, or hybrid loanword, is a compound of a foreign word and a native word. Both the com and bor templates must be used.
  • Arbitrary PUA characters are not allowed in page titles. For unencoded Han characters, use IDS instead. See Category:Terms containing unencoded characters.
    • The latest IDS has 17 operators: 12 from Unicode 3.0 (⿰ ⿱ ⿲ ⿳ ⿴ ⿵ ⿶ ⿷ ⿸ ⿹ ⿺ ⿻) and 5 from Unicode 15.1 (⿼ ⿽ ⿾ ⿿ ㇯). Additionally, can also be used but it is discouraged. Operators are written in prefix notation, and each one takes 1, 2, or 3 glyphs.
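
As a rough illustration of the prefix notation mentioned above, the following Python sketch checks whether a string is a structurally complete IDS. The arity table is an assumption based on the Unicode 15.1 operator definitions (two unary operators, two ternary ones, the rest binary), and the function name `ids_is_well_formed` is hypothetical. It does not check whether the components make sense, and it ignores the discouraged extra character.

```python
# Arity of each ideographic description operator (assumed from Unicode 15.1).
# U+2FF0..U+2FFF are binary by default; the exceptions are set below.
ARITY = {chr(cp): 2 for cp in range(0x2FF0, 0x3000)}
ARITY.update({
    "\u2FF2": 3,  # left / middle / right
    "\u2FF3": 3,  # above / middle / below
    "\u2FFE": 1,  # horizontal reflection
    "\u2FFF": 1,  # rotation
    "\u31EF": 2,  # subtraction
})


def ids_is_well_formed(ids: str) -> bool:
    """Return True if `ids` is exactly one complete prefix-notation sequence."""
    pending = 1  # number of components still expected
    for ch in ids:
        if pending == 0:
            return False  # extra characters after a complete sequence
        # An operator consumes one slot and opens ARITY[ch] new ones;
        # any other character just fills one slot.
        pending += ARITY.get(ch, 0) - 1
    return pending == 0
```

For instance, `ids_is_well_formed("\u2FF0\u6C34\u9752")` accepts a left-right operator followed by two components, while the same string with the last component removed is rejected.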

Fonts


Technical issues

  • Invisible characters can be detected in URLs or transliteration.
    • ZWSP (U+200B) and other invisible formatting marks are often removed by MS Word or machine learning because they are used as word/line breaks. They have no effect on search results, but they prevent word associations from being found and can lead to duplicate page creation.
    • ZWNJ (U+200C) and ZWJ (U+200D) should be retained because they affect character shaping (Persian, Urdu, Sinhala, emoji, etc.). A title with ZWNJ/ZWJ and the same title without them are different pages. If normal typing does not require ZWNJ/ZWJ, they should be removed completely.
  • Proto-Southwestern Tai & Proto-Tai tone symbols: ᴬ ᴮ ꟲ ᴰ꟱ ᴰᴸ ¹ ² ³ ⁴.
    • Unicode 14.0 has the superscript capital C: ꟲ (U+A7F2). (Supported on Windows 11.)
    • Unicode 17.0 has the superscript capital S: ꟱ (U+A7F1). (Waiting for font support.)
  • OpenType features do not fully function in browsers.
    • In the Tai Tham script, when a front vowel (e.g., ᩮ) follows more than one consonant, such as a double consonant or a cluster (e.g., ᨣᩕ), some fonts/OSes do not move the front vowel to the front. For example, ᨣ+ᩕ+ᩮ should be ᩮ‍ᨣᩕ, not ᨣᩕ‍ᩮ.
    • In the New Tai Lue and Tai Viet scripts, the current standard requires the front vowel to be input before its consonant (visual order), but some fonts/OSes wrongly move a front vowel that follows a consonant in front of it. For example, ᦃ+ᦶ+ᦈ should be ᦃ‌ᦶᦈ, not ᦶᦃᦈ; the result is a misspelled word even though it was entered correctly.
    • A preliminary solution is to insert ZWNJ (between syllables) or ZWJ (within syllables) as a separator in the word (for display purposes only, not for page names).
  • The unfooted ฐ/ญ forms are used specifically for Pali/Sanskrit. Noto Sans Thai Looped (the only such font at the moment) can display them when the text is tagged with the pi/sa language codes. For page titles, just use the normal ฐ/ญ.
    • Early computers used fonts in which the unfooted ฐ/ญ glyphs were separate from the normal letters. When Unicode came around, they were assigned to the PUA range, which makes them unusable on Wiktionary under its policy.
    • Some dialects, such as Pattani Malay, also use the unfooted form, but it cannot yet be displayed by tagging.
  • Some Han characters may have 2 indices. Here is the list (Unicode 15.0.0):
    {"㚇":["夊06","攴06"],"䁀":["目07","日08"],"䍇":["缶05","缶04"],"䧹":["隹05","广10"],"丬":["爿00","丬00"],"亀":["乙10","龜00"],"初":["刀05","衣02"],"勗":["日07","曰07"],"卤":["卜05","卤00"],"卿":["卩08","卩10"],"唾":["口08","口09"],"囊":["口19","衣17"],"垂":["土05","士05"],"捶":["手08","手09"],"斉":["文04","齊00"],"欽":["欠08","金04"],"歯":["止08","齒00"],"氽":["入04","水02"],"渠":["水09","木08"],"牢":["牛03","宀04"],"着":["目07","羊05"],"竜":["立05","龍00"],"耆":["老04","日06"],"辶":["辵00","辶00"],"迸":["辵06","辵08"],"遷":["辵12","辵11"],"靖":["靑05","立08"],"黄":["黃00","黄00"],"齐":["齐00","文02"],"龽":["火11","魚04"],"鿂":["車11","鳥07"],"着":["目07","羊05"],"靖":["靑05","立08"],"𠁣":["丨03","門-04"],"𠁤":["囗02","襾-01"],"𠂙":["丿04","耒-01"],"𠃛":["乙03","門-04"],"𠎤":["人12","龠-03"],"𠔾":["冂02","舟-02"],"𡙶":["大11","龍04"],"𢆡":["乙13","子11"],"𢘅":["心05","矛04"],"𣯡":["毛10","龍04"],"𤄯":["水18","龍11"],"𤏲":["火12","火11"],"𤓰":["爪00","瓜-01"],"𤛿":["牛15","黍07"],"𤮖":["瓦08","瓦10"],"𥉩":["目10","龍05"],"𥪝":["立09","龍04"],"𥪞":["立09","龍04"],"𧢛":["見14","目16"],"𪁠":["鳥07","鳥06"],"𪇺":["鳥15","黍14"],"𪥛":["大10","龍03"],"𪪩":["广15","廾15"],"𪱯":["月17","龍11"],"𪷹":["水15","龍08"],"𪽞":["田10","龍05"],"𪿁":["目14","龍09"],"𫏽":["車10","龍07"],"𫠠":["弋-01","戈-02"],"𫡏":["丿02","攴-01"],"𫣙":["木09","口10"],"𫣚":["木09","口10"],"𬂀":["月07","肉07"],"𬂙":["月17","龍11"],"𬔔":["穴13","龍08"],"𬺱":["一02","木-01"],"𬼉":["丿05","缶-01"],"𭅟":["匸07","艸06"],"𭠍":["手00","戈-01"],"𮄿":["立25","龍20"],"灰":["厂04","火02"],"𱍐":["一04","示00"],"𱑖":["卜07","宀06"],"𱜦":["巾07","卜08"]}
  • 12 code points in the CJK Compatibility Ideographs range (U+FA0E, U+FA0F, U+FA11, U+FA13, U+FA14, U+FA1F, U+FA21, U+FA23, U+FA24, U+FA27, U+FA28, and U+FA29) lack a canonical Decomposition_Mapping value in UnicodeData.txt and so are not true CJK Compatibility Ideographs. These twelve characters should be treated as proper CJK Unified Ideographs.
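
The note on invisible characters at the top of this list can be sketched in Python roughly as follows. This is a minimal illustration, not a tool used on Wiktionary: the helper names `find_invisible` and `strip_zwsp` are hypothetical, and treating every character of general category Cf as "invisible" is an assumption. Deciding whether a given ZWNJ/ZWJ is actually needed depends on the script and cannot be automated this simply, so the sketch only removes ZWSP.

```python
import unicodedata

ZWSP, ZWNJ, ZWJ = "\u200b", "\u200c", "\u200d"


def find_invisible(text: str) -> list[tuple[int, str, str]]:
    """List (index, 'U+XXXX', name) for every format character (category Cf)."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]


def strip_zwsp(text: str) -> str:
    """Remove only ZWSP; ZWNJ and ZWJ are kept because they can affect shaping."""
    return text.replace(ZWSP, "")
```

Running `find_invisible` on a transliteration or a decoded URL reports the position of each hidden character, so it can be reviewed before a page is created.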
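
The claim about the twelve code points can be checked against the Unicode data bundled with Python. The sketch below assumes the standard `unicodedata` module; the names `UNIFIED_IN_COMPAT_BLOCK` and `has_canonical_decomposition` are made up for illustration. A decomposition string that starts with `<` is a compatibility mapping rather than a canonical one, so it is not counted.

```python
import unicodedata

# The twelve code points listed above.
UNIFIED_IN_COMPAT_BLOCK = [
    0xFA0E, 0xFA0F, 0xFA11, 0xFA13, 0xFA14, 0xFA1F,
    0xFA21, 0xFA23, 0xFA24, 0xFA27, 0xFA28, 0xFA29,
]


def has_canonical_decomposition(cp: int) -> bool:
    """True if the code point has a canonical (untagged) decomposition mapping."""
    mapping = unicodedata.decomposition(chr(cp))
    return mapping != "" and not mapping.startswith("<")


# Code points from the list that unexpectedly decompose (expected to be empty).
unexpected = [cp for cp in UNIFIED_IN_COMPAT_BLOCK if has_canonical_decomposition(cp)]
```

By contrast, a neighbouring code point such as U+FA10 does have a canonical mapping and is replaced by its unified counterpart under NFC normalization, which is why the twelve characters above behave differently in page titles.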
Retrieved from "https://en.wiktionary.org/w/index.php?title=User:Octahedron80&oldid=86793895"