User:Octahedron80

From Wiktionary, the free dictionary

th	This user is a native speaker of Thai.

en-3

This user speaks English at an advanced level.

zh-1

This user can communicate in basic Chinese.

cmn-1

This user can communicate in basic Mandarin.

ja-1

This user has a basic command of Japanese.

pi-1

This user speaks Pali at a basic level.

{{t}}-3

This user has an advanced knowledge of wiki templates, and can modify code written by others.

JS-3

This user has an advanced knowledge of JavaScript, and can modify code written by others.

Lua-3

This user has an advanced knowledge of Lua, and can modify code written by others.

Python-2

This user has an intermediate command of Python, and can understand most code written by others.

Search user languages or scripts

UTC+7

This user's time zone is UTC+7.

🏠 Thai Wiktionary, much more than Thai Wikipedia
🇹🇭󠁔󠁈 Thailand
🎓 B.Sc. (Computer Science) and B.B.A. (Marketing major)
🤖 User:OctraBot (now uses PAWS)
😅 Sorry if I re-edit too much on any page.

Notes

My source collection
The terms one, two, three, etc. in any language must be placed under numeral (not number) categories, or they will not be counted as lemmas.
A loan blend (hybrid loanword) is a compound of a foreign word and a native word. Use both the com and bor templates for it.
Arbitrary PUA characters are not allowed in page titles. For unencoded Han characters, use IDS instead. See Category:Terms containing unencoded characters.
- The latest IDS has 17 operators: U3.0 ⿰ ⿱ ⿲ ⿳ ⿴ ⿵ ⿶ ⿷ ⿸ ⿹ ⿺ ⿻, U15.1 ⿼ ⿽ ⿾ ⿿ ㇯. Additionally, 〾 can be used, but it is discouraged. Operators are written in prefix notation, and each one takes 1, 2, or 3 glyphs as operands, depending on the operator.
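To illustrate the prefix notation, here is a minimal Python sketch that checks whether every operator in an IDS receives the number of operands it expects. The names `ARITY`, `VARIATION_INDICATOR`, and `ids_is_well_formed` are hypothetical, introduced only for this example. The arities are an assumption based on the Unicode definition of the operators (⿲ and ⿳ take three operands, ⿾ and ⿿ take one, the rest take two), and 〾 is simply skipped.

```python
# Assumed arity of each IDS operator: U+2FF0..U+2FFF plus U+31EF (17 in total).
ARITY = {chr(cp): 2 for cp in range(0x2FF0, 0x3000)}  # default: binary
ARITY["\u2FF2"] = 3  # left-middle-right
ARITY["\u2FF3"] = 3  # above-middle-below
ARITY["\u2FFE"] = 1  # horizontal reflection
ARITY["\u2FFF"] = 1  # rotation
ARITY["\u31EF"] = 2  # subtraction

VARIATION_INDICATOR = "\u303E"  # discouraged, ignored by this sketch


def ids_is_well_formed(ids: str) -> bool:
    """Return True if the sequence describes exactly one complete glyph."""
    needed = 1  # number of glyphs still expected
    for ch in ids:
        if ch == VARIATION_INDICATOR:
            continue
        if needed == 0:
            return False  # extra characters after a complete description
        # An operator consumes one slot and opens ARITY slots;
        # any other character fills one slot.
        needed += ARITY.get(ch, 0) - 1
    return needed == 0
```

With this sketch, ⿰氵木 and the nested ⿱艹⿰氵木 are accepted, while ⿰氵 (one operand missing) and 氵木 (no operator) are rejected. It does not check whether the components themselves are valid.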

Fonts

Google Fonts (Noto): download the fonts for the Unicode blocks you need.
BabelStone & BabelMap, from Andrew West
- BabelStone Han, supports up to CJK Extension I (some characters may not be present).
- BabelStone Han PUA, unencoded characters waiting to be approved (you can make entries for them with IDS).
- BabelMap
TH-Feon (v4.0.0) supports up to CJK Extension H (some characters may not be present).
- If you find it elsewhere, please check its version.

Technical issues

Invisible characters can be detected in URLs or transliteration.
- ZWSP (U+200B) and other invisible formatting marks are often removed by MS Word or machine learning because they are used as word/line breaks. They have no effect on search results, but they prevent word associations from being found and can lead to duplicate page creation.
- ZWNJ (U+200C) and ZWJ (U+200D) should be retained because they affect character shaping (Persian, Urdu, Sinhala, emoji, etc.). A title with ZWNJ/ZWJ and the same title without them are different pages. Where normal typing does not require ZWNJ/ZWJ, they should be removed completely.
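A minimal Python sketch of how such characters could be located and how ZWSP could be stripped while ZWNJ/ZWJ are kept. The function names `find_invisible` and `strip_zwsp` are hypothetical. The detection relies on the Unicode general category Cf (format characters), which covers ZWSP, ZWNJ, and ZWJ but also other marks, so treat the result as a list of candidates to review, not as a list of errors.

```python
import unicodedata

ZWSP, ZWNJ, ZWJ = "\u200b", "\u200c", "\u200d"


def find_invisible(text):
    """Return (index, 'U+XXXX') for every format character (category Cf)."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]


def strip_zwsp(text):
    """Remove ZWSP only; ZWNJ and ZWJ are left in place."""
    return text.replace(ZWSP, "")
```

For example, `find_invisible("a\u200bb\u200cc")` reports positions 1 and 3, and `strip_zwsp` on the same string removes only the character at position 1.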
Proto-Southwestern Tai & Proto-Tai tone symbols: ᴬ ᴮ ꟲ ᴰ꟱ ᴰᴸ ¹ ² ³ ⁴.
- Unicode 14.0 has the superscript capital C: ꟲ (U+A7F2). (Windows 11 supports it.)
- Unicode 17.0 has the superscript capital S: ꟱ (U+A7F1). (Waiting for a font that supports it.)
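The code points behind these symbols can be listed with a short Python sketch. `TONE_SYMBOLS` and `describe` are hypothetical names. `unicodedata.name` returns the fallback `None` for any code point that the Unicode database bundled with the running Python does not know yet, which is likely for the newer letters on older installations.

```python
import unicodedata

# Tone symbols from the list above, written as escapes to avoid font problems.
TONE_SYMBOLS = [
    "\u1D2C",  # superscript A
    "\u1D2E",  # superscript B
    "\uA7F2",  # superscript C (Unicode 14.0)
    "\u1D30",  # superscript D
    "\uA7F1",  # superscript S (Unicode 17.0)
    "\u1D38",  # superscript L
    "\u00B9", "\u00B2", "\u00B3", "\u2074",  # superscript 1 to 4
]


def describe(ch):
    """Return ('U+XXXX', name) where name is None if this Python lacks it."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch, None)


for symbol in TONE_SYMBOLS:
    print(*describe(symbol))
```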
OpenType features do not fully function in browsers.
- In the Tai Tham script, when more than one consonant, such as a double consonant or a cluster (e.g., ᨣᩕ), is followed by a front vowel (e.g., ᩮ), some fonts/OSes do not move the front vowel to the front. For example, ᨣ+ᩕ+ᩮ should be displayed as ᩮ‍ᨣᩕ, not ᨣᩕ‍ᩮ.
- In the New Tai Lue and Tai Viet scripts, the current standard requires the front vowel to be input before its consonant (visual ordering), but some fonts/OSes wrongly bring forward a front vowel that follows a consonant. For example, ᦃ+ᦶ+ᦈ should be displayed as ᦃ‌ᦶᦈ, not ᦶᦃᦈ. The result is a misspelled word even though it was entered correctly.
- A preliminary solution is to insert ZWNJ (between syllables) or ZWJ (within syllables) as a separator in the word (for display purposes only, not for page names).
The unfooted ฐ/ญ forms are used specially for Pali/Sanskrit. Noto Sans Thai Looped (the only such font at the moment) can display them when the text is tagged with the pi/sa language codes. For page titles, just use the normal ฐ/ญ.
- Early computers used fonts in which the unfooted ฐ/ญ glyphs were separate from the normal letters. When Unicode came around, they were assigned to the PUA range, which made them unusable on Wiktionary due to its policy.
- Some dialects also use the unfooted ญ form, such as Pattani Malay, but it still cannot be displayed by tagging.

Some Han characters may have 2 indices. Here is the list (Unicode 15.0.0):

{"㚇":["夊06","攴06"],"䁀":["目07","日08"],"䍇":["缶05","缶04"],"䧹":["隹05","广10"],"丬":["爿00","丬00"],"亀":["乙10","龜00"],"初":["刀05","衣02"],"勗":["日07","曰07"],"卤":["卜05","卤00"],"卿":["卩08","卩10"],"唾":["口08","口09"],"囊":["口19","衣17"],"垂":["土05","士05"],"捶":["手08","手09"],"斉":["文04","齊00"],"欽":["欠08","金04"],"歯":["止08","齒00"],"氽":["入04","水02"],"渠":["水09","木08"],"牢":["牛03","宀04"],"着":["目07","羊05"],"竜":["立05","龍00"],"耆":["老04","日06"],"辶":["辵00","辶00"],"迸":["辵06","辵08"],"遷":["辵12","辵11"],"靖":["靑05","立08"],"黄":["黃00","黄00"],"齐":["齐00","文02"],"龽":["火11","魚04"],"鿂":["車11","鳥07"],"着":["目07","羊05"],"靖":["靑05","立08"],"𠁣":["丨03","門-04"],"𠁤":["囗02","襾-01"],"𠂙":["丿04","耒-01"],"𠃛":["乙03","門-04"],"𠎤":["人12","龠-03"],"𠔾":["冂02","舟-02"],"𡙶":["大11","龍04"],"𢆡":["乙13","子11"],"𢘅":["心05","矛04"],"𣯡":["毛10","龍04"],"𤄯":["水18","龍11"],"𤏲":["火12","火11"],"𤓰":["爪00","瓜-01"],"𤛿":["牛15","黍07"],"𤮖":["瓦08","瓦10"],"𥉩":["目10","龍05"],"𥪝":["立09","龍04"],"𥪞":["立09","龍04"],"𧢛":["見14","目16"],"𪁠":["鳥07","鳥06"],"𪇺":["鳥15","黍14"],"𪥛":["大10","龍03"],"𪪩":["广15","廾15"],"𪱯":["月17","龍11"],"𪷹":["水15","龍08"],"𪽞":["田10","龍05"],"𪿁":["目14","龍09"],"𫏽":["車10","龍07"],"𫠠":["弋-01","戈-02"],"𫡏":["丿02","攴-01"],"𫣙":["木09","口10"],"𫣚":["木09","口10"],"𬂀":["月07","肉07"],"𬂙":["月17","龍11"],"𬔔":["穴13","龍08"],"𬺱":["一02","木-01"],"𬼉":["丿05","缶-01"],"𭅟":["匸07","艸06"],"𭠍":["手00","戈-01"],"𮄿":["立25","龍20"],"灰":["厂04","火02"],"𱍐":["一04","示00"],"𱑖":["卜07","宀06"],"𱜦":["巾07","卜08"]}

12 code points in CJK Compatibility Ideographs range
﨎, 﨏, 﨑, 﨓, 﨔, 﨟, 﨡, 﨣, 﨤, 﨧, 﨨, 﨩
(U+FA0E, U+FA0F, U+FA11, U+FA13, U+FA14, U+FA1F, U+FA21, U+FA23, U+FA24, U+FA27, U+FA28, and U+FA29)
lack a canonical Decomposition_Mapping value in UnicodeData.txt and so are not true CJK Compatibility Ideographs. These twelve characters should be treated as proper CJK Unified Ideographs.
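The missing decomposition can be observed with Python's `unicodedata` module, which exposes the Decomposition_Mapping field. In this sketch, `TWELVE`, `has_canonical_decomposition`, and `unified_like` are hypothetical names. The scan is limited to U+FA0E..U+FA29 on the assumption that every code point in that span is assigned, because an unassigned code point would also return an empty mapping.

```python
import unicodedata

# The twelve code points listed above.
TWELVE = [0xFA0E, 0xFA0F, 0xFA11, 0xFA13, 0xFA14, 0xFA1F,
          0xFA21, 0xFA23, 0xFA24, 0xFA27, 0xFA28, 0xFA29]


def has_canonical_decomposition(cp):
    """True if the code point has a canonical (untagged) decomposition."""
    mapping = unicodedata.decomposition(chr(cp))
    # Compatibility mappings start with a tag such as "<compat>".
    return bool(mapping) and not mapping.startswith("<")


# Code points in U+FA0E..U+FA29 that normalization leaves unchanged.
unified_like = [
    cp for cp in range(0xFA0E, 0xFA2A)
    if not has_canonical_decomposition(cp)
]
```

A neighbour such as U+FA10 does carry a mapping, so NFC normalization replaces it with U+585A, while the twelve characters pass through unchanged.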

Retrieved from "https://en.wiktionary.org/w/index.php?title=User:Octahedron80&oldid=86793895"
