Bidirectional text

From Wikipedia, the free encyclopedia

Text that contains both LTR and RTL text

Some web browsers may display the Hebrew text in this article in the reverse direction.

A bidirectional text contains two text directionalities, right-to-left (RTL) and left-to-right (LTR). It generally involves text containing different types of alphabets, but may also refer to boustrophedon, in which the text direction changes with each row.

An example is the RTL Hebrew name Sarah: שרה, spelled sin (ש) on the right, resh (ר) in the middle, and heh (ה) on the left. Many computer programs failed to display this correctly, because they were designed to display text in one direction only.
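The stored (logical) order of the letters can be inspected directly. The following is a minimal sketch using Python's standard `unicodedata` module; the helper name `describe` is an assumption made for illustration. It lists each character of the name in logical order together with its Bidi_Class. Unicode names the first letter SHIN, because sin and shin share the same base character.

```python
import unicodedata

# "Sarah" in Hebrew, written with escapes so the logical (storage) order is
# unambiguous however this listing itself is rendered.
SARAH = "\u05e9\u05e8\u05d4"


def describe(text):
    """Return (code point, character name, Bidi_Class) per character, in logical order."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), unicodedata.bidirectional(ch))
        for ch in text
    ]


for row in describe(SARAH):
    print(*row)
```

All three letters report the strong class R, so a renderer that honors directionality places the first stored letter rightmost, while a one-directional renderer would draw the same sequence from the left.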

Some so-called right-to-left scripts, such as the Persian script and Arabic, are mostly, but not exclusively, right-to-left: mathematical expressions, numeric dates, and numbers bearing units are embedded from left to right. The same happens when text from a left-to-right language such as English is embedded in them or, conversely, when Arabic is embedded in a left-to-right script such as English.
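To make the mixing concrete, the sketch below groups a string into maximal runs of characters that share one Bidi_Class. This is a deliberate simplification for illustration and not the run computation of the Unicode Bidirectional Algorithm; the function name `bidi_runs` and the sample string are assumptions.

```python
import unicodedata
from itertools import groupby


def bidi_runs(text):
    """Split text into maximal runs of characters sharing one Bidi_Class."""
    return [
        (bidi_class, "".join(chars))
        for bidi_class, chars in groupby(text, key=unicodedata.bidirectional)
    ]


# Hypothetical sample: an Arabic word followed by a space and a European number.
sample = "\u0627\u0644\u0639\u062f\u062f 123"

for bidi_class, run in bidi_runs(sample):
    # ascii() keeps the output readable on terminals without bidi support.
    print(bidi_class, ascii(run))
```

The Arabic letters form a run of class AL, while the digits form a separate run of class EN, which is why a number inside Arabic text keeps its left-to-right digit order.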

Type^[2]	Description	Strength	Directionality	General scope	Bidi_Control character^[3]
L	Left-to-Right	Strong	L-to-R	Most alphabetic and syllabic characters, Chinese characters, non-European or non-Arabic digits, LRM character, ...	U+200E LEFT-TO-RIGHT MARK (LRM)
R	Right-to-Left	Strong	R-to-L	Adlam, Garay, Hebrew, Mandaic, Mende Kikakui, N'Ko, Samaritan, ancient scripts like Kharoshthi and Nabataean, RLM character, ...	U+200F RIGHT-TO-LEFT MARK (RLM)
AL	Arabic Letter	Strong	R-to-L	Arabic, Hanifi Rohingya, Sogdian, Syriac, and Thaana alphabets, and most punctuation specific to those scripts, ALM character, ...	U+061C ARABIC LETTER MARK (ALM)
EN	European Number	Weak		European digits, Eastern Arabic-Indic digits, Coptic epact numbers, ...
ES	European Separator	Weak		plus sign, minus sign, ...
ET	European Number Terminator	Weak		degree sign, currency symbols, ...
AN	Arabic Number	Weak		Arabic-Indic digits, Arabic decimal and thousands separators, Rumi digits, Hanifi Rohingya digits, ...
CS	Common Number Separator	Weak		colon, comma, full stop, no-break space, ...
NSM	Nonspacing Mark	Weak		Characters in General Categories Mark, nonspacing, and Mark, enclosing (Mn, Me)
BN	Boundary Neutral	Weak		Default ignorables, non-characters, control characters other than those explicitly given other types
B	Paragraph Separator	Neutral		paragraph separator, appropriate Newline Functions, higher-level protocol paragraph determination
S	Segment Separator	Neutral		Tabs
WS	Whitespace	Neutral		space, figure space, line separator, form feed, General Punctuation block spaces (smaller set than the Unicode whitespace list)
ON	Other Neutrals	Neutral		All other characters, including object replacement character
LRE	Left-to-Right Embedding	Explicit	L-to-R	LRE character only	U+202A LEFT-TO-RIGHT EMBEDDING (LRE)
LRO	Left-to-Right Override	Explicit	L-to-R	LRO character only	U+202D LEFT-TO-RIGHT OVERRIDE (LRO)
RLE	Right-to-Left Embedding	Explicit	R-to-L	RLE character only	U+202B RIGHT-TO-LEFT EMBEDDING (RLE)
RLO	Right-to-Left Override	Explicit	R-to-L	RLO character only	U+202E RIGHT-TO-LEFT OVERRIDE (RLO)
PDF	Pop Directional Format	Explicit		PDF character only	U+202C POP DIRECTIONAL FORMATTING (PDF)
LRI	Left-to-Right Isolate	Explicit	L-to-R	LRI character only	U+2066 LEFT-TO-RIGHT ISOLATE (LRI)
RLI	Right-to-Left Isolate	Explicit	R-to-L	RLI character only	U+2067 RIGHT-TO-LEFT ISOLATE (RLI)
FSI	First Strong Isolate	Explicit		FSI character only	U+2068 FIRST STRONG ISOLATE (FSI)
PDI	Pop Directional Isolate	Explicit		PDI character only	U+2069 POP DIRECTIONAL ISOLATE (PDI)
Notes
1. Unicode Bidirectional Algorithm (UAX #9), as of Unicode version 16.0.
2. Possible bidirectional character types for the character property Bidi_Class, or 'type'.
3. Bidi_Control characters: twelve Bidi_Control formatting characters are defined. They are invisible, and have no effect apart from directionality. Nine of them have a unique, overruling BiDi type that is used by the algorithm. Their type is also their acronym (e.g. character 'LRE' has BiDi type 'LRE').
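The table can be turned into a small lookup. The sketch below is one possible arrangement, not a reference implementation: the names `STRENGTH`, `BIDI_CONTROLS`, `strength`, and `find_bidi_controls` are assumptions, and the Bidi_Class values come from Python's `unicodedata` module, whose bundled Unicode version may be older than 16.0.

```python
import unicodedata

# Strength category for each Bidi_Class value, transcribed from the table above.
STRENGTH = {
    **dict.fromkeys(["L", "R", "AL"], "Strong"),
    **dict.fromkeys(["EN", "ES", "ET", "AN", "CS", "NSM", "BN"], "Weak"),
    **dict.fromkeys(["B", "S", "WS", "ON"], "Neutral"),
    **dict.fromkeys(
        ["LRE", "LRO", "RLE", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"], "Explicit"
    ),
}

# The twelve Bidi_Control characters from the last column of the table.
BIDI_CONTROLS = {
    "\u200e": "LRM", "\u200f": "RLM", "\u061c": "ALM",
    "\u202a": "LRE", "\u202b": "RLE", "\u202c": "PDF",
    "\u202d": "LRO", "\u202e": "RLO",
    "\u2066": "LRI", "\u2067": "RLI", "\u2068": "FSI", "\u2069": "PDI",
}


def strength(ch):
    """Return the table's strength category for one character.

    unicodedata.bidirectional() returns "" for unassigned code points,
    which this sketch reports as "Unknown".
    """
    return STRENGTH.get(unicodedata.bidirectional(ch), "Unknown")


def find_bidi_controls(text):
    """Return (index, acronym) for every Bidi_Control character in text."""
    return [(i, BIDI_CONTROLS[ch]) for i, ch in enumerate(text) if ch in BIDI_CONTROLS]


if __name__ == "__main__":
    for ch in ["a", "\u05d0", "7", ",", " ", "\u202e"]:
        print(f"U+{ord(ch):04X}", unicodedata.bidirectional(ch), strength(ch))
    print(find_bidi_controls("ab\u202ecd\u202c"))
```

Because the control characters are invisible, a scan such as `find_bidi_controls` is one way to make their presence in a string explicit. The three marks (LRM, RLM, ALM) carry the strong classes L, R, and AL rather than a class of their own, so `strength` reports them as Strong.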

Bidirectional script support

Unicode bidi support

Strong characters

Weak characters

Neutral characters

Explicit formatting

Marks

Embeddings

Isolates

Overrides

Using Unicode to override

Pops

Runs

Table of possible BiDi character types

Security

Scripts using bidirectional text

Egyptian hieroglyphs

Chinese characters and other CJK scripts

Boustrophedon

Moon type

See also

References

External links