C API: Bidi algorithm.
Namespaces | |
| icu | |
| File coll.h. | |
Macros | |
| #define | UBIDI_DEFAULT_LTR 0xfe |
| Paragraph level setting. | |
| #define | UBIDI_DEFAULT_RTL 0xff |
| Paragraph level setting. | |
| #define | UBIDI_MAX_EXPLICIT_LEVEL 125 |
| Maximum explicit embedding level. | |
| #define | UBIDI_LEVEL_OVERRIDE 0x80 |
| Bit flag for level input. | |
| #define | UBIDI_MAP_NOWHERE (-1) |
| Special value which can be returned by the mapping functions when a logical index has no corresponding visual index or vice-versa. | |
| #define | UBIDI_KEEP_BASE_COMBINING 1 |
| option flags for ubidi_writeReordered() | |
| #define | UBIDI_DO_MIRRORING 2 |
| option bit for ubidi_writeReordered(): replace characters with the "mirrored" property in RTL runs by their mirror-image mappings | |
| #define | UBIDI_INSERT_LRM_FOR_NUMERIC 4 |
| option bit for ubidi_writeReordered(): surround the run with LRMs if necessary; this is part of the approximate "inverse Bidi" algorithm | |
| #define | UBIDI_REMOVE_BIDI_CONTROLS 8 |
| option bit for ubidi_writeReordered(): remove Bidi control characters (this does not affect UBIDI_INSERT_LRM_FOR_NUMERIC) | |
| #define | UBIDI_OUTPUT_REVERSE 16 |
| option bit for ubidi_writeReordered(): write the output in reverse order | |
| #define | U_BIDI_CLASS_DEFAULT U_CHAR_DIRECTION_COUNT |
| Value returned by UBiDiClassCallback callbacks when there is no need to override the standard Bidi class for a given code point. | |
Typedefs | |
| typedef uint8_t | UBiDiLevel |
| UBiDiLevel is the type of the level values in this Bidi implementation. | |
| typedef enum UBiDiDirection | UBiDiDirection |
| typedef struct UBiDi | UBiDi |
| typedef enum UBiDiReorderingMode | UBiDiReorderingMode |
| UBiDiReorderingMode values indicate which variant of the Bidi algorithm to use. | |
| typedef enum UBiDiReorderingOption | UBiDiReorderingOption |
| UBiDiReorderingOption values indicate which options are specified to affect the Bidi algorithm. | |
| typedef UCharDirection | UBiDiClassCallback(const void *context, UChar32 c) |
| Callback type declaration for overriding default Bidi class values with custom ones. | |
Enumerations | |
| enum | UBiDiDirection { UBIDI_LTR, UBIDI_RTL, UBIDI_MIXED, UBIDI_NEUTRAL } |
| UBiDiDirection values indicate the text direction. | |
| enum | UBiDiReorderingMode { UBIDI_REORDER_DEFAULT = 0, UBIDI_REORDER_NUMBERS_SPECIAL, UBIDI_REORDER_GROUP_NUMBERS_WITH_R, UBIDI_REORDER_RUNS_ONLY, UBIDI_REORDER_INVERSE_NUMBERS_AS_L, UBIDI_REORDER_INVERSE_LIKE_DIRECT, UBIDI_REORDER_INVERSE_FOR_NUMBERS_SPECIAL, UBIDI_REORDER_COUNT } |
| UBiDiReorderingMode values indicate which variant of the Bidi algorithm to use. | |
| enum | UBiDiReorderingOption { UBIDI_OPTION_DEFAULT = 0, UBIDI_OPTION_INSERT_MARKS = 1, UBIDI_OPTION_REMOVE_CONTROLS = 2, UBIDI_OPTION_STREAMING = 4 } |
| UBiDiReorderingOption values indicate which options are specified to affect the Bidi algorithm. | |
Functions | |
| U_CAPI UBiDi * | ubidi_open (void) |
| Allocate a UBiDi structure. | |
| U_CAPI UBiDi * | ubidi_openSized (int32_t maxLength, int32_t maxRunCount, UErrorCode *pErrorCode) |
| Allocate a UBiDi structure with preallocated memory for internal structures. | |
| U_CAPI void | ubidi_close (UBiDi *pBiDi) |
| ubidi_close() must be called to free the memory associated with a UBiDi object. | |
| U_CAPI void | ubidi_setInverse (UBiDi *pBiDi, UBool isInverse) |
| Modify the operation of the Bidi algorithm such that it approximates an "inverse Bidi" algorithm. | |
| U_CAPI UBool | ubidi_isInverse (UBiDi *pBiDi) |
| Is this Bidi object set to perform the inverse Bidi algorithm? | |
| U_CAPI void | ubidi_orderParagraphsLTR (UBiDi *pBiDi, UBool orderParagraphsLTR) |
| Specify whether block separators must be allocated level zero, so that successive paragraphs will progress from left to right. | |
| U_CAPI UBool | ubidi_isOrderParagraphsLTR (UBiDi *pBiDi) |
| Is this Bidi object set to allocate level 0 to block separators so that successive paragraphs progress from left to right? | |
| U_CAPI void | ubidi_setReorderingMode (UBiDi *pBiDi, UBiDiReorderingMode reorderingMode) |
| Modify the operation of the Bidi algorithm such that it implements some variant to the basic Bidi algorithm or approximates an "inverse Bidi" algorithm, depending on different values of the "reordering mode". | |
| U_CAPI UBiDiReorderingMode | ubidi_getReorderingMode (UBiDi *pBiDi) |
| What is the requested reordering mode for a given Bidi object? | |
| U_CAPI void | ubidi_setReorderingOptions (UBiDi *pBiDi, uint32_t reorderingOptions) |
| Specify which of the reordering options should be applied during Bidi transformations. | |
| U_CAPI uint32_t | ubidi_getReorderingOptions (UBiDi *pBiDi) |
| What are the reordering options applied to a given Bidi object? | |
| U_CAPI void | ubidi_setContext (UBiDi *pBiDi, const UChar *prologue, int32_t proLength, const UChar *epilogue, int32_t epiLength, UErrorCode *pErrorCode) |
| Set the context before a call to ubidi_setPara(). | |
| U_CAPI void | ubidi_setPara (UBiDi *pBiDi, const UChar *text, int32_t length, UBiDiLevel paraLevel, UBiDiLevel *embeddingLevels, UErrorCode *pErrorCode) |
| Perform the Unicode Bidi algorithm. | |
| U_CAPI void | ubidi_setLine (const UBiDi *pParaBiDi, int32_t start, int32_t limit, UBiDi *pLineBiDi, UErrorCode *pErrorCode) |
| ubidi_setLine() sets a UBiDi to contain the reordering information, especially the resolved levels, for all the characters in a line of text. | |
| U_CAPI UBiDiDirection | ubidi_getDirection (const UBiDi *pBiDi) |
| Get the directionality of the text. | |
| U_CAPI UBiDiDirection | ubidi_getBaseDirection (const UChar *text, int32_t length) |
| Gets the base direction of the text provided according to the Unicode Bidirectional Algorithm. | |
| U_CAPI const UChar * | ubidi_getText (const UBiDi *pBiDi) |
| Get the pointer to the text. | |
| U_CAPI int32_t | ubidi_getLength (const UBiDi *pBiDi) |
| Get the length of the text. | |
| U_CAPI UBiDiLevel | ubidi_getParaLevel (const UBiDi *pBiDi) |
| Get the paragraph level of the text. | |
| U_CAPI int32_t | ubidi_countParagraphs (UBiDi *pBiDi) |
| Get the number of paragraphs. | |
| U_CAPI int32_t | ubidi_getParagraph (const UBiDi *pBiDi, int32_t charIndex, int32_t *pParaStart, int32_t *pParaLimit, UBiDiLevel *pParaLevel, UErrorCode *pErrorCode) |
| Get a paragraph, given a position within the text. | |
| U_CAPI void | ubidi_getParagraphByIndex (const UBiDi *pBiDi, int32_t paraIndex, int32_t *pParaStart, int32_t *pParaLimit, UBiDiLevel *pParaLevel, UErrorCode *pErrorCode) |
| Get a paragraph, given the index of this paragraph. | |
| U_CAPI UBiDiLevel | ubidi_getLevelAt (const UBiDi *pBiDi, int32_t charIndex) |
| Get the level for one character. | |
| U_CAPI const UBiDiLevel * | ubidi_getLevels (UBiDi *pBiDi, UErrorCode *pErrorCode) |
| Get an array of levels for each character. | |
| U_CAPI void | ubidi_getLogicalRun (const UBiDi *pBiDi, int32_t logicalPosition, int32_t *pLogicalLimit, UBiDiLevel *pLevel) |
| Get a logical run. | |
| U_CAPI int32_t | ubidi_countRuns (UBiDi *pBiDi, UErrorCode *pErrorCode) |
| Get the number of runs. | |
| U_CAPI UBiDiDirection | ubidi_getVisualRun (UBiDi *pBiDi, int32_t runIndex, int32_t *pLogicalStart, int32_t *pLength) |
| Get one run's logical start, length, and directionality, which can be 0 for LTR or 1 for RTL. | |
| U_CAPI int32_t | ubidi_getVisualIndex (UBiDi *pBiDi, int32_t logicalIndex, UErrorCode *pErrorCode) |
| Get the visual position from a logical text position. | |
| U_CAPI int32_t | ubidi_getLogicalIndex (UBiDi *pBiDi, int32_t visualIndex, UErrorCode *pErrorCode) |
| Get the logical text position from a visual position. | |
| U_CAPI void | ubidi_getLogicalMap (UBiDi *pBiDi, int32_t *indexMap, UErrorCode *pErrorCode) |
| Get a logical-to-visual index map (array) for the characters in the UBiDi (paragraph or line) object. | |
| U_CAPI void | ubidi_getVisualMap (UBiDi *pBiDi, int32_t *indexMap, UErrorCode *pErrorCode) |
| Get a visual-to-logical index map (array) for the characters in the UBiDi (paragraph or line) object. | |
| U_CAPI void | ubidi_reorderLogical (const UBiDiLevel *levels, int32_t length, int32_t *indexMap) |
| This is a convenience function that does not use a UBiDi object. | |
| U_CAPI void | ubidi_reorderVisual (const UBiDiLevel *levels, int32_t length, int32_t *indexMap) |
| This is a convenience function that does not use a UBiDi object. | |
| U_CAPI void | ubidi_invertMap (const int32_t *srcMap, int32_t *destMap, int32_t length) |
| Invert an index map. | |
| U_CAPI int32_t | ubidi_getProcessedLength (const UBiDi *pBiDi) |
| Get the length of the source text processed by the last call to ubidi_setPara(). | |
| U_CAPI int32_t | ubidi_getResultLength (const UBiDi *pBiDi) |
| Get the length of the reordered text resulting from the last call to ubidi_setPara(). | |
| U_CAPI UCharDirection | ubidi_getCustomizedClass (UBiDi *pBiDi, UChar32 c) |
| Retrieve the Bidi class for a given code point. | |
| U_CAPI void | ubidi_setClassCallback (UBiDi *pBiDi, UBiDiClassCallback *newFn, const void *newContext, UBiDiClassCallback **oldFn, const void **oldContext, UErrorCode *pErrorCode) |
| Set the callback function and callback data used by the UBA implementation for Bidi class determination. | |
| U_CAPI void | ubidi_getClassCallback (UBiDi *pBiDi, UBiDiClassCallback **fn, const void **context) |
| Get the current callback function used for Bidi class determination. | |
| U_CAPI int32_t | ubidi_writeReordered (UBiDi *pBiDi, UChar *dest, int32_t destSize, uint16_t options, UErrorCode *pErrorCode) |
| Take a UBiDi object containing the reordering information for a piece of text (one or more paragraphs) set by ubidi_setPara() or for a line of text set by ubidi_setLine() and write a reordered string to the destination buffer. | |
| U_CAPI int32_t | ubidi_writeReverse (const UChar *src, int32_t srcLength, UChar *dest, int32_t destSize, uint16_t options, UErrorCode *pErrorCode) |
| Reverse a Right-To-Left run of Unicode text. | |
C API: Bidi algorithm.
This is an implementation of the Unicode Bidirectional Algorithm. The algorithm is defined in the Unicode Standard Annex #9.
Note: Libraries that perform a bidirectional algorithm and reorder strings accordingly are sometimes called "Storage Layout Engines". ICU's Bidi and shaping (u_shapeArabic()) APIs can be used at the core of such "Storage Layout Engines".
In functions with an error code parameter, the pErrorCode pointer must be valid and the value that it points to must not indicate a failure before the function call. Otherwise, the function returns immediately. After the function call, the value indicates success or failure.
The "limit" of a sequence of characters is the position just after their last character, i.e., one more than that position.
Some of the API functions provide access to "runs". Such a "run" is defined as a sequence of characters that are at the same embedding level after performing the Bidi algorithm.
This is (hypothetical) sample code that illustrates how the ICU Bidi API could be used to render a paragraph of text. Rendering code depends highly on the graphics system, therefore this sample code must make a lot of assumptions, which may or may not match any existing graphics system's properties.
The basic assumptions are:
- Rendering is done from left to right on a horizontal line.
- A run of single-style, unidirectional text can be rendered at once.
- Such a run of text is passed to the graphics system with characters (code units) in logical order.
- The line-breaking algorithm is very complicated and Locale-dependent - and therefore its implementation omitted from this sample code.

    #include <unicode/ubidi.h>

    typedef enum {
        styleNormal=0, styleSelected=1,
        styleBold=2, styleItalics=4,
        styleSuper=8, styleSub=16
    } Style;

    typedef struct { int32_t limit; Style style; } StyleRun;

    int getTextWidth(const UChar *text, int32_t start, int32_t limit,
                     const StyleRun *styleRuns, int styleRunCount);

    // set *pLimit and *pStyleRunLimit for a line
    // from text[start] and from styleRuns[styleRunStart]
    // using ubidi_getLogicalRun(para, ...)
    void getLineBreak(const UChar *text, int32_t start, int32_t *pLimit,
                      UBiDi *para,
                      const StyleRun *styleRuns, int styleRunStart, int *pStyleRunLimit,
                      int *pLineWidth);

    // render runs on a line sequentially, always from left to right

    // prepare rendering a new line
    void startLine(UBiDiDirection textDirection, int lineWidth);

    // render a run of text and advance to the right by the run width
    // the text[start..limit-1] is always in logical order
    void renderRun(const UChar *text, int32_t start, int32_t limit,
                   UBiDiDirection textDirection, Style style);

    // We could compute a cross-product
    // from the style runs with the directional runs
    // and then reorder it.
    // Instead, here we iterate over each run type
    // and render the intersections -
    // with shortcuts in simple (and common) cases.
    // renderParagraph() is the main function.

    // render a directional run with
    // (possibly) multiple style runs intersecting with it
    void renderDirectionalRun(const UChar *text,
                              int32_t start, int32_t limit,
                              UBiDiDirection direction,
                              const StyleRun *styleRuns, int styleRunCount) {
        int i;

        // iterate over style runs
        if(direction==UBIDI_LTR) {
            int styleLimit;

            for(i=0; i<styleRunCount; ++i) {
                styleLimit=styleRuns[i].limit;
                if(start<styleLimit) {
                    if(styleLimit>limit) { styleLimit=limit; }
                    renderRun(text, start, styleLimit,
                              direction, styleRuns[i].style);
                    if(styleLimit==limit) { break; }
                    start=styleLimit;
                }
            }
        } else {
            int styleStart;

            for(i=styleRunCount-1; i>=0; --i) {
                if(i>0) {
                    styleStart=styleRuns[i-1].limit;
                } else {
                    styleStart=0;
                }
                if(limit>=styleStart) {
                    if(styleStart<start) { styleStart=start; }
                    renderRun(text, styleStart, limit,
                              direction, styleRuns[i].style);
                    if(styleStart==start) { break; }
                    limit=styleStart;
                }
            }
        }
    }

    // the line object represents text[start..limit-1]
    void renderLine(UBiDi *line, const UChar *text,
                    int32_t start, int32_t limit,
                    const StyleRun *styleRuns, int styleRunCount,
                    UErrorCode *pErrorCode) {
        UBiDiDirection direction=ubidi_getDirection(line);
        if(direction!=UBIDI_MIXED) {
            // unidirectional
            if(styleRunCount<=1) {
                renderRun(text, start, limit, direction, styleRuns[0].style);
            } else {
                renderDirectionalRun(text, start, limit,
                                     direction, styleRuns, styleRunCount);
            }
        } else {
            // mixed-directional
            int32_t count, i, length;

            count=ubidi_countRuns(line, pErrorCode);
            if(U_SUCCESS(*pErrorCode)) {
                if(styleRunCount<=1) {
                    Style style=styleRuns[0].style;

                    // iterate over directional runs
                    for(i=0; i<count; ++i) {
                        direction=ubidi_getVisualRun(line, i, &start, &length);
                        renderRun(text, start, start+length, direction, style);
                    }
                } else {
                    // iterate over both directional and style runs
                    for(i=0; i<count; ++i) {
                        direction=ubidi_getVisualRun(line, i, &start, &length);
                        renderDirectionalRun(text, start, start+length,
                                             direction, styleRuns, styleRunCount);
                    }
                }
            }
        }
    }

    void renderParagraph(const UChar *text, int32_t length,
                         UBiDiDirection textDirection,
                         const StyleRun *styleRuns, int styleRunCount,
                         int lineWidth,
                         UErrorCode *pErrorCode) {
        UBiDi *para;

        if(pErrorCode==NULL || U_FAILURE(*pErrorCode) || length<=0) {
            return;
        }

        para=ubidi_openSized(length, 0, pErrorCode);
        if(para==NULL) { return; }

        ubidi_setPara(para, text, length,
                      textDirection ? UBIDI_DEFAULT_RTL : UBIDI_DEFAULT_LTR,
                      NULL, pErrorCode);
        if(U_SUCCESS(*pErrorCode)) {
            UBiDiLevel paraLevel=1&ubidi_getParaLevel(para);
            StyleRun styleRun={ length, styleNormal };
            int width;

            if(styleRuns==NULL || styleRunCount<=0) {
                styleRunCount=1;
                styleRuns=&styleRun;
            }

            // assume styleRuns[styleRunCount-1].limit>=length

            width=getTextWidth(text, 0, length, styleRuns, styleRunCount);
            if(width<=lineWidth) {
                // everything fits onto one line

                // prepare rendering a new line from either left or right
                startLine(paraLevel, width);

                renderLine(para, text, 0, length,
                           styleRuns, styleRunCount, pErrorCode);
            } else {
                UBiDi *line;

                // we need to render several lines
                line=ubidi_openSized(length, 0, pErrorCode);
                if(line!=NULL) {
                    int32_t start=0, limit;
                    int styleRunStart=0, styleRunLimit;

                    for(;;) {
                        limit=length;
                        styleRunLimit=styleRunCount;
                        getLineBreak(text, start, &limit, para,
                                     styleRuns, styleRunStart, &styleRunLimit,
                                     &width);
                        ubidi_setLine(para, start, limit, line, pErrorCode);
                        if(U_SUCCESS(*pErrorCode)) {
                            // prepare rendering a new line
                            // from either left or right
                            startLine(paraLevel, width);

                            renderLine(line, text, start, limit,
                                       styleRuns+styleRunStart,
                                       styleRunLimit-styleRunStart, pErrorCode);
                        }
                        if(limit==length) { break; }
                        start=limit;
                        styleRunStart=styleRunLimit-1;
                        if(start>=styleRuns[styleRunStart].limit) {
                            ++styleRunStart;
                        }
                    }

                    ubidi_close(line);
                }
            }
        }

        ubidi_close(para);
    }
Definition in file ubidi.h.
| #define U_BIDI_CLASS_DEFAULT U_CHAR_DIRECTION_COUNT |
Value returned by UBiDiClassCallback callbacks when there is no need to override the standard Bidi class for a given code point.
This constant is deprecated; use u_getIntPropertyMaxValue(UCHAR_BIDI_CLASS)+1 instead.
| #define UBIDI_DEFAULT_LTR 0xfe |
Paragraph level setting.
Constant indicating that the base direction depends on the first strong directional character in the text according to the Unicode Bidirectional Algorithm. If no strong directional character is present, then set the paragraph level to 0 (left-to-right).
If this value is used in conjunction with reordering modes UBIDI_REORDER_INVERSE_LIKE_DIRECT or UBIDI_REORDER_INVERSE_FOR_NUMBERS_SPECIAL, the text to reorder is assumed to be visual LTR, and the text after reordering is required to be the corresponding logical string with appropriate contextual direction. The direction of the result string will be RTL if either the rightmost or leftmost strong character of the source text is RTL or Arabic Letter; the direction will be LTR otherwise.
If reordering option UBIDI_OPTION_INSERT_MARKS is set, an RLM may be added at the beginning of the result string to ensure round trip (that the result string, when reordered back to visual, will produce the original source text).
| #define UBIDI_DEFAULT_RTL 0xff |
Paragraph level setting.
Constant indicating that the base direction depends on the first strong directional character in the text according to the Unicode Bidirectional Algorithm. If no strong directional character is present, then set the paragraph level to 1 (right-to-left).
If this value is used in conjunction with reordering modes UBIDI_REORDER_INVERSE_LIKE_DIRECT or UBIDI_REORDER_INVERSE_FOR_NUMBERS_SPECIAL, the text to reorder is assumed to be visual LTR, and the text after reordering is required to be the corresponding logical string with appropriate contextual direction. The direction of the result string will be RTL if either the rightmost or leftmost strong character of the source text is RTL or Arabic Letter, or if the text contains no strong character; the direction will be LTR otherwise.
If reordering option UBIDI_OPTION_INSERT_MARKS is set, an RLM may be added at the beginning of the result string to ensure round trip (that the result string, when reordered back to visual, will produce the original source text).
| #define UBIDI_DO_MIRRORING 2 |
option bit for ubidi_writeReordered(): replace characters with the "mirrored" property in RTL runs by their mirror-image mappings
| #define UBIDI_INSERT_LRM_FOR_NUMERIC 4 |
option bit for ubidi_writeReordered(): surround the run with LRMs if necessary; this is part of the approximate "inverse Bidi" algorithm
This option does not imply corresponding adjustment of the index mappings.
| #define UBIDI_KEEP_BASE_COMBINING 1 |
option flags for ubidi_writeReordered()
option bit for ubidi_writeReordered(): keep combining characters after their base characters in RTL runs
| #define UBIDI_LEVEL_OVERRIDE 0x80 |
| #define UBIDI_MAP_NOWHERE (-1) |
Special value which can be returned by the mapping functions when a logical index has no corresponding visual index or vice-versa.
This may happen for the logical-to-visual mapping of a Bidi control when option UBIDI_OPTION_REMOVE_CONTROLS is specified. This can also happen for the visual-to-logical mapping of a Bidi mark (LRM or RLM) inserted by option UBIDI_OPTION_INSERT_MARKS.
| #define UBIDI_MAX_EXPLICIT_LEVEL 125 |
Maximum explicit embedding level.
Same as the max_depth value in the Unicode Bidirectional Algorithm. (The maximum resolved level can be up to UBIDI_MAX_EXPLICIT_LEVEL+1).
| #define UBIDI_OUTPUT_REVERSE 16 |
option bit forubidi_writeReordered(): write the output in reverse order
This has the same effect as calling ubidi_writeReordered() first without this option, and then calling ubidi_writeReverse() without mirroring. Doing this in the same step is faster and avoids a temporary buffer. An example for using this option is output to a character terminal that is designed for RTL scripts and stores text in reverse order.
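The reversal step mentioned above can be pictured with a small stand-alone sketch. It does not call ICU: `reverse_utf16()` and its helpers are hypothetical names, and the sketch assumes only that a reversal of UTF-16 text should keep each surrogate pair in lead-then-trail order, which a plain unit-by-unit reversal would not do. Mirroring and combining-character handling are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers, not part of ICU. */
static int is_lead(uint16_t u)  { return (u & 0xFC00) == 0xD800; }
static int is_trail(uint16_t u) { return (u & 0xFC00) == 0xDC00; }

/* Reverse the code points of src into dest, keeping each surrogate pair
 * in lead-trail order. dest must hold at least length units and must not
 * overlap src. Returns the number of units written. */
static int32_t reverse_utf16(const uint16_t *src, int32_t length, uint16_t *dest)
{
    int32_t i = length;   /* read position, walks backwards over src */
    int32_t j = 0;        /* write position in dest */

    assert(src != NULL && dest != NULL && length >= 0);
    while (i > 0) {
        uint16_t unit = src[--i];
        if (is_trail(unit) && i > 0 && is_lead(src[i - 1])) {
            dest[j++] = src[--i];  /* lead surrogate first */
            dest[j++] = unit;      /* then its trail surrogate */
        } else {
            dest[j++] = unit;
        }
    }
    return j;
}
```

Producing the reversed output in one pass, as UBIDI_OUTPUT_REVERSE does, avoids the intermediate buffer that a separate step like this one would need.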
| #define UBIDI_REMOVE_BIDI_CONTROLS 8 |
option bit for ubidi_writeReordered(): remove Bidi control characters (this does not affect UBIDI_INSERT_LRM_FOR_NUMERIC)
This option does not imply corresponding adjustment of the index mappings.
| typedef UCharDirection UBiDiClassCallback(const void *context, UChar32 c) |
Callback type declaration for overriding default Bidi class values with custom ones.
Usually, the function pointer will be propagated to a UBiDi object by calling the ubidi_setClassCallback() function; then the callback will be invoked by the UBA implementation any time the class of a character is to be determined.
| context | is a pointer to the callback private data. |
| c | is the code point to get a Bidi class for. |
Returns: the Bidi class for c if the default class has been overridden, or u_getIntPropertyMaxValue(UCHAR_BIDI_CLASS)+1 if the standard Bidi class value for c is to be used.
| typedef enum UBiDiDirection UBiDiDirection |
| typedef uint8_t UBiDiLevel |
UBiDiLevel is the type of the level values in this Bidi implementation.
It holds an embedding level and indicates the visual direction by its bit 0 (even/odd value).
It can also hold non-level values for the paraLevel and embeddingLevels arguments of ubidi_setPara(); there:
- Bit 7 of an embeddingLevels[] value indicates whether the using application is specifying the level of a character to override whatever the Bidi implementation would resolve it to.
- paraLevel can be set to the pseudo-level values UBIDI_DEFAULT_LTR and UBIDI_DEFAULT_RTL.

The related constants are not real, valid level values. UBIDI_DEFAULT_XXX can be used to specify a default for the paragraph level for when the ubidi_setPara() function shall determine it but there is no strongly typed character in the input.
Note that the value for UBIDI_DEFAULT_LTR is even and the one for UBIDI_DEFAULT_RTL is odd, just like with normal LTR and RTL level values - these special values are designed that way. Also, the implementation assumes that UBIDI_MAX_EXPLICIT_LEVEL is odd.
Note: The numeric values of the related constants will not change: They are tied to the use of 7-bit byte values (plus the override bit) and of the UBiDiLevel=uint8_t data type in this API.
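The bit layout described for UBiDiLevel can be sketched without linking ICU. The block below copies the numeric values listed in this file into locally named macros (the `SK_` prefix, the `Level` typedef and all function names are assumptions of this sketch; real code would include `<unicode/ubidi.h>` and use the UBIDI_ constants directly).

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the values listed in this file (sketch only). */
typedef uint8_t Level;                 /* stands in for UBiDiLevel */
#define SK_DEFAULT_LTR        0xfe
#define SK_DEFAULT_RTL        0xff
#define SK_MAX_EXPLICIT_LEVEL 125
#define SK_LEVEL_OVERRIDE     0x80

/* Bit 0 gives the direction: 0 for an even (LTR) level, 1 for an odd (RTL) one. */
static int level_is_rtl(Level level) { return level & 1; }

/* Whether an embeddingLevels[] entry carries the override flag (bit 7). */
static int level_has_override(Level level) { return (level & SK_LEVEL_OVERRIDE) != 0; }

/* The plain level value of an embeddingLevels[] entry, flag removed. */
static Level level_value(Level level) { return (Level)(level & ~SK_LEVEL_OVERRIDE); }

/* Build an embeddingLevels[] entry that overrides the resolved level.
 * This sketch rejects values above the maximum explicit level. */
static Level make_override(Level level)
{
    assert(level <= SK_MAX_EXPLICIT_LEVEL);
    return (Level)(level | SK_LEVEL_OVERRIDE);
}

/* Whether a paraLevel argument is one of the two pseudo-levels. */
static int is_default_para_level(Level paraLevel) { return paraLevel >= SK_DEFAULT_LTR; }
```

Because the two pseudo-levels share their parity with real LTR and RTL levels, `level_is_rtl()` gives the fallback direction for them as well.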
| typedef enum UBiDiReorderingMode UBiDiReorderingMode |
UBiDiReorderingMode values indicate which variant of the Bidi algorithm to use.
| typedef enum UBiDiReorderingOption UBiDiReorderingOption |
UBiDiReorderingOption values indicate which options are specified to affect the Bidi algorithm.
| enum UBiDiDirection |
UBiDiDirection values indicate the text direction.
| Enumerator | |
|---|---|
| UBIDI_LTR | Left-to-right text. This is a 0 value. |
| UBIDI_RTL | Right-to-left text. This is a 1 value. |
| UBIDI_MIXED | Mixed-directional text. As return value for ubidi_getDirection(), it indicates that the text represented by the object is mixed-directional. ubidi_getBaseDirection() does not return this value. |
| UBIDI_NEUTRAL | No strongly directional text. As return value for ubidi_getBaseDirection(), it indicates that the text contains no character of type L, R, or AL. ubidi_getDirection() never returns this value. |
| enum UBiDiReorderingMode |
UBiDiReorderingMode values indicate which variant of the Bidi algorithm to use.
| Enumerator | |
|---|---|
| UBIDI_REORDER_DEFAULT | Regular Logical to Visual Bidi algorithm according to Unicode. This is a 0 value. |
| UBIDI_REORDER_NUMBERS_SPECIAL | Logical to Visual algorithm which handles numbers in a way which mimics the behavior of Windows XP. |
| UBIDI_REORDER_GROUP_NUMBERS_WITH_R | Logical to Visual algorithm grouping numbers with adjacent R characters (reversible algorithm). |
| UBIDI_REORDER_RUNS_ONLY | Reorder runs only to transform a Logical LTR string to the Logical RTL string with the same display, or vice-versa. |
| UBIDI_REORDER_INVERSE_NUMBERS_AS_L | Visual to Logical algorithm which handles numbers like L (same algorithm as selected by ubidi_setInverse(true)). |
| UBIDI_REORDER_INVERSE_LIKE_DIRECT | Visual to Logical algorithm equivalent to the regular Logical to Visual algorithm. |
| UBIDI_REORDER_INVERSE_FOR_NUMBERS_SPECIAL | Inverse Bidi (Visual to Logical) algorithm for the UBIDI_REORDER_NUMBERS_SPECIAL Bidi algorithm. |
| UBIDI_REORDER_COUNT | Number of values for reordering mode. |
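| enum UBiDiReorderingOption |
UBiDiReorderingOption values indicate which options are specified to affect the Bidi algorithm.
| Enumerator | |
|---|---|
| UBIDI_OPTION_DEFAULT | option value for ubidi_setReorderingOptions(): no option is set. This is a 0 value. |
| UBIDI_OPTION_INSERT_MARKS | option bit for ubidi_setReorderingOptions(): insert Bidi marks (LRM or RLM) when needed to ensure a correct result of a reordering to a Logical order. This option must be set or reset before calling ubidi_setPara(). This option is significant only with reordering modes which generate a result with Logical order, specifically: UBIDI_REORDER_RUNS_ONLY, UBIDI_REORDER_INVERSE_NUMBERS_AS_L, UBIDI_REORDER_INVERSE_LIKE_DIRECT, and UBIDI_REORDER_INVERSE_FOR_NUMBERS_SPECIAL. If this option is set in conjunction with reordering mode UBIDI_REORDER_INVERSE_NUMBERS_AS_L, it implies option UBIDI_INSERT_LRM_FOR_NUMERIC in calls to ubidi_writeReordered(). For other reordering modes, a minimum number of LRM or RLM characters will be added to the source text after reordering it so as to ensure round trip, i.e. when applying the inverse reordering mode on the resulting logical text with removal of Bidi marks (option UBIDI_OPTION_REMOVE_CONTROLS), the result will be identical to the source text in the first transformation. This option will be ignored if specified together with option UBIDI_OPTION_REMOVE_CONTROLS. |
| UBIDI_OPTION_REMOVE_CONTROLS | option bit for ubidi_setReorderingOptions(): remove Bidi control characters. This option must be set or reset before calling ubidi_setPara(). This option nullifies option UBIDI_OPTION_INSERT_MARKS. |
| UBIDI_OPTION_STREAMING | option bit for ubidi_setReorderingOptions(): process the output as part of a stream to be continued. This option must be set or reset before calling ubidi_setPara(). This option specifies that the caller is interested in processing a large text object in parts. The results of the successive calls are expected to be concatenated by the caller. Only the call for the last part will have this option bit off. When this option bit is on, ubidi_setPara() may process less than the full source text in order to truncate the text at a meaningful boundary; ubidi_getProcessedLength() reports how much of the source text has been processed. In all cases, this option should be turned off before processing the last part of the text. When the UBIDI_OPTION_STREAMING option is used, it is recommended to call ubidi_orderParagraphsLTR() with argument orderParagraphsLTR set to true before calling ubidi_setPara(), so that later paragraphs may be concatenated to previous paragraphs on the right. |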
| U_CAPI void ubidi_close | ( | UBiDi * | pBiDi | ) |
ubidi_close() must be called to free the memory associated with a UBiDi object.
Important: A parent UBiDi object must not be destroyed or reused if it still has children. If a UBiDi object has become the child of another one (its parent) by calling ubidi_setLine(), then the child object must be destroyed (closed) or reused (by calling ubidi_setPara() or ubidi_setLine()) before the parent object.
| pBiDi | is a UBiDi object. |
| U_CAPI int32_t ubidi_countParagraphs | ( | UBiDi * | pBiDi | ) |
Get the number of paragraphs.
| pBiDi | is the paragraph or line UBiDi object. |
| U_CAPI int32_t ubidi_countRuns | ( | UBiDi * | pBiDi, |
| UErrorCode * | pErrorCode | ||
| ) |
Get the number of runs.
This function may invoke the actual reordering on the UBiDi object, after ubidi_setPara() may have resolved only the levels of the text. Therefore, ubidi_countRuns() may have to allocate memory, and may fail doing so.
| pBiDi | is the paragraph or line UBiDi object. |
| pErrorCode | must be a valid pointer to an error code value. |
| U_CAPI UBiDiDirection ubidi_getBaseDirection | ( | const UChar * | text, |
| int32_t | length | ||
| ) |
Gets the base direction of the text provided according to the Unicode Bidirectional Algorithm.
The base direction is derived from the first character in the string with bidirectional character type L, R, or AL. If the first such character has type L, UBIDI_LTR is returned. If the first such character has type R or AL, UBIDI_RTL is returned. If the string does not contain any character of these types, then UBIDI_NEUTRAL is returned.
This is a lightweight function for use when only the base direction is needed and no further bidi processing of the text is needed.
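The first-strong-character rule can be sketched in plain C. This is not ICU's implementation: `toy_class()` recognizes only ASCII letters (L), Hebrew letters U+05D0..U+05EA (R) and Arabic letters U+0620..U+064A (AL), treats everything else as not strong, and ignores supplementary characters; all names in the block are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { DIR_LTR, DIR_RTL, DIR_NEUTRAL } BaseDir;   /* stand-in for UBiDiDirection */
typedef enum { CLS_L, CLS_R, CLS_AL, CLS_OTHER } ToyClass;

/* Toy classifier covering a few ranges only; real code would consult
 * the Unicode Bidi_Class property. */
static ToyClass toy_class(uint16_t c)
{
    if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')) return CLS_L;
    if (c >= 0x05D0 && c <= 0x05EA) return CLS_R;    /* Hebrew letters */
    if (c >= 0x0620 && c <= 0x064A) return CLS_AL;   /* Arabic letters */
    return CLS_OTHER;
}

/* Return the direction of the first strong character, or DIR_NEUTRAL if
 * there is none. length == -1 means the text is zero-terminated. */
static BaseDir toy_base_direction(const uint16_t *text, int32_t length)
{
    int32_t i;

    assert(text != NULL && length >= -1);
    for (i = 0; length == -1 ? text[i] != 0 : i < length; ++i) {
        switch (toy_class(text[i])) {
        case CLS_L:  return DIR_LTR;
        case CLS_R:
        case CLS_AL: return DIR_RTL;
        default:     break;   /* not strong: keep scanning */
        }
    }
    return DIR_NEUTRAL;
}
```

Digits, spaces and punctuation are skipped, so a string such as "12 " followed by a Hebrew letter resolves to RTL in this sketch, while a string of digits alone is neutral.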
| text | is a pointer to the text whose base direction is needed. Note: the text must be (at least) length long. |
| length | is the length of the text; if length==-1 then the text must be zero-terminated. |
Returns: UBIDI_LTR, UBIDI_RTL, UBIDI_NEUTRAL
| U_CAPI void ubidi_getClassCallback | ( | UBiDi * | pBiDi, |
| UBiDiClassCallback ** | fn, | ||
| const void ** | context | ||
| ) |
Get the current callback function used for Bidi class determination.
| pBiDi | is the paragraph UBiDi object. |
| fn | fillin: Returns the callback function pointer. |
| context | fillin: Returns the callback's private context. |
| U_CAPI UCharDirection ubidi_getCustomizedClass | ( | UBiDi * | pBiDi, |
| UChar32 | c | ||
| ) |
Retrieve the Bidi class for a given code point.
If a UBiDiClassCallback callback is defined and returns a value other than u_getIntPropertyMaxValue(UCHAR_BIDI_CLASS)+1, that value is used; otherwise the default class determination mechanism is invoked.
| pBiDi | is the paragraph UBiDi object. |
| c | is the code point whose Bidi class must be retrieved. |
Returns: the Bidi class for character c based on the given pBiDi instance.
| U_CAPI UBiDiDirection ubidi_getDirection | ( | const UBiDi * | pBiDi | ) |
Get the directionality of the text.
| pBiDi | is the paragraph or line UBiDi object. |
Returns: UBIDI_LTR, UBIDI_RTL or UBIDI_MIXED, indicating whether the entire text represented by this object is unidirectional, and in which direction, or whether it is mixed-directional. Note: the value UBIDI_NEUTRAL is never returned from this method.
Referenced by icu::ParagraphLayout::getTextDirection().
| U_CAPI int32_t ubidi_getLength | ( | const UBiDi * | pBiDi | ) |
Get the length of the text.
| pBiDi | is the paragraph or line UBiDi object. |
| U_CAPI UBiDiLevel ubidi_getLevelAt | ( | const UBiDi * | pBiDi, |
| int32_t | charIndex | ||
| ) |
Get the level for one character.
| pBiDi | is the paragraph or line UBiDi object. |
| charIndex | the index of a character. It must be in the range [0..ubidi_getProcessedLength(pBiDi)]. |
| U_CAPI const UBiDiLevel * ubidi_getLevels | ( | UBiDi * | pBiDi, |
| UErrorCode * | pErrorCode | ||
| ) |
Get an array of levels for each character.
Note that this function may allocate memory under some circumstances, unlike ubidi_getLevelAt().
| pBiDi | is the paragraph or line UBiDi object, whose text length must be strictly positive. |
| pErrorCode | must be a valid pointer to an error code value. |
Returns: the levels array for the text, or NULL if an error occurs.
| U_CAPI int32_t ubidi_getLogicalIndex | ( | UBiDi * | pBiDi, |
| int32_t | visualIndex, | ||
| UErrorCode * | pErrorCode | ||
| ) |
Get the logical text position from a visual position.
If such a mapping is used many times on the same UBiDi object, then calling ubidi_getVisualMap() is more efficient.
The value returned may be UBIDI_MAP_NOWHERE if there is no logical position because the corresponding text character is a Bidi mark inserted in the output by option UBIDI_OPTION_INSERT_MARKS.
This is the inverse function to ubidi_getVisualIndex().
When the visual output is altered by using options of ubidi_writeReordered() such as UBIDI_INSERT_LRM_FOR_NUMERIC, UBIDI_KEEP_BASE_COMBINING, UBIDI_OUTPUT_REVERSE, UBIDI_REMOVE_BIDI_CONTROLS, the logical position returned may not be correct. It is advised to use, when possible, reordering options such as UBIDI_OPTION_INSERT_MARKS and UBIDI_OPTION_REMOVE_CONTROLS.
| pBiDi | is the paragraph or line UBiDi object. |
| visualIndex | is the visual position of a character. |
| pErrorCode | must be a valid pointer to an error code value. |
| U_CAPI void ubidi_getLogicalMap | ( | UBiDi * | pBiDi, |
| int32_t * | indexMap, | ||
| UErrorCode * | pErrorCode | ||
| ) |
Get a logical-to-visual index map (array) for the characters in the UBiDi (paragraph or line) object.
Some values in the map may be UBIDI_MAP_NOWHERE if the corresponding text characters are Bidi controls removed from the visual output by the option UBIDI_OPTION_REMOVE_CONTROLS.
When the visual output is altered by using options of ubidi_writeReordered() such as UBIDI_INSERT_LRM_FOR_NUMERIC, UBIDI_KEEP_BASE_COMBINING, UBIDI_OUTPUT_REVERSE, UBIDI_REMOVE_BIDI_CONTROLS, the visual positions returned may not be correct. It is advised to use, when possible, reordering options such as UBIDI_OPTION_INSERT_MARKS and UBIDI_OPTION_REMOVE_CONTROLS.
Note that in right-to-left runs, this mapping places second surrogates before first ones (which is generally a bad idea) and combining characters before base characters. Use of ubidi_writeReordered(), optionally with the UBIDI_KEEP_BASE_COMBINING option, can be considered instead of using the mapping, in order to avoid these issues.
| pBiDi | is the paragraph or line UBiDi object. |
| indexMap | is a pointer to an array of ubidi_getProcessedLength() indexes which will reflect the reordering of the characters. If option UBIDI_OPTION_INSERT_MARKS is set, the number of elements allocated in indexMap must be no less than ubidi_getResultLength(). The array does not need to be initialized. The index map will result in indexMap[logicalIndex]==visualIndex. |
| pErrorCode | must be a valid pointer to an error code value. |
| U_CAPI void ubidi_getLogicalRun | ( | const UBiDi * | pBiDi, |
| int32_t | logicalPosition, | ||
| int32_t * | pLogicalLimit, | ||
| UBiDiLevel * | pLevel | ||
| ) |
Get a logical run.
This function returns information about a run and is used to retrieve runs in logical order.
This is especially useful for line-breaking on a paragraph.
| pBiDi | is the paragraph or line UBiDi object. |
| logicalPosition | is a logical position within the source text. |
| pLogicalLimit | will receive the limit of the corresponding run. The l-value that you point to here may be the same expression (variable) as the one for logicalPosition. This pointer can be NULL if this value is not necessary. |
| pLevel | will receive the level of the corresponding run. This pointer can be NULL if this value is not necessary. |
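The run-by-run walk that ubidi_getLogicalRun() supports can be pictured with a minimal sketch that works directly on an array of resolved levels. The helper names below are hypothetical and do not call ICU; the sketch assumes that a logical run is simply a maximal stretch of characters sharing one level.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of ICU: return the limit (one past the end)
   of the same-level run that contains pos, and optionally store its level. */
static int32_t sketch_logical_run_limit(const uint8_t *levels, int32_t length,
                                        int32_t pos, uint8_t *pLevel) {
    assert(levels != NULL && 0 <= pos && pos < length);
    uint8_t level = levels[pos];
    int32_t limit = pos + 1;
    while (limit < length && levels[limit] == level) {
        ++limit;
    }
    if (pLevel != NULL) {
        *pLevel = level;
    }
    return limit;
}

/* Count the runs by walking from run to run: the limit of one run is the
   start of the next, so one variable can serve as position and limit. */
static int32_t sketch_count_logical_runs(const uint8_t *levels, int32_t length) {
    int32_t count = 0;
    for (int32_t pos = 0; pos < length; ) {
        pos = sketch_logical_run_limit(levels, length, pos, NULL);
        ++count;
    }
    return count;
}
```

Reusing one variable for both the position and the received limit mirrors the remark above that pLogicalLimit may point at the same variable as logicalPosition.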
| U_CAPI int32_t ubidi_getParagraph | ( | const UBiDi * | pBiDi, |
| int32_t | charIndex, | ||
| int32_t * | pParaStart, | ||
| int32_t * | pParaLimit, | ||
| UBiDiLevel * | pParaLevel, | ||
| UErrorCode * | pErrorCode | ||
| ) |
Get a paragraph, given a position within the text.
This function returns information about a paragraph.
Note: if the paragraph index is known, it is more efficient to retrieve the paragraph information using ubidi_getParagraphByIndex().
| pBiDi | is the paragraph or line UBiDi object. |
| charIndex | is the index of a character within the text, in the range [0..ubidi_getProcessedLength(pBiDi)-1]. |
| pParaStart | will receive the index of the first character of the paragraph in the text. This pointer can be NULL if this value is not necessary. |
| pParaLimit | will receive the limit of the paragraph. The l-value that you point to here may be the same expression (variable) as the one for charIndex. This pointer can be NULL if this value is not necessary. |
| pParaLevel | will receive the level of the paragraph. This pointer can be NULL if this value is not necessary. |
| pErrorCode | must be a valid pointer to an error code value. |
| U_CAPI void ubidi_getParagraphByIndex | ( | const UBiDi * | pBiDi, |
| int32_t | paraIndex, | ||
| int32_t * | pParaStart, | ||
| int32_t * | pParaLimit, | ||
| UBiDiLevel * | pParaLevel, | ||
| UErrorCode * | pErrorCode | ||
| ) |
Get a paragraph, given the index of this paragraph.
This function returns information about a paragraph.
| pBiDi | is the paragraph UBiDi object. |
| paraIndex | is the number of the paragraph, in the range [0..ubidi_countParagraphs(pBiDi)-1]. |
| pParaStart | will receive the index of the first character of the paragraph in the text. This pointer can be NULL if this value is not necessary. |
| pParaLimit | will receive the limit of the paragraph. This pointer can be NULL if this value is not necessary. |
| pParaLevel | will receive the level of the paragraph. This pointer can be NULL if this value is not necessary. |
| pErrorCode | must be a valid pointer to an error code value. |
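The start and limit values reported for a paragraph can be illustrated with a small sketch. It relies on the rule given in the description of ubidi_setPara(): a block separator ends a paragraph, and a CR followed by an LF ends it as a pair. The helper names are hypothetical and do not call ICU. Two simplifying assumptions are made: only CR, LF and U+2029 are treated as block separators, and a separator is counted as part of the paragraph it ends.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplifying assumption: only CR, LF and U+2029 count as block separators. */
static int sketch_is_block_separator(uint16_t c) {
    return c == 0x000A || c == 0x000D || c == 0x2029;
}

/* Hypothetical helper, not part of ICU: return the limit of the paragraph
   containing charIndex and optionally store the paragraph start.
   Assumption: a separator belongs to the paragraph that it terminates. */
static int32_t sketch_paragraph_limit(const uint16_t *text, int32_t length,
                                      int32_t charIndex, int32_t *pParaStart) {
    assert(text != NULL && 0 <= charIndex && charIndex < length);
    int32_t start = 0;
    for (int32_t i = 0; i < length; ++i) {
        if (!sketch_is_block_separator(text[i])) {
            continue;
        }
        /* CR immediately followed by LF terminates the paragraph as a pair. */
        if (text[i] == 0x000D && i + 1 < length && text[i + 1] == 0x000A) {
            ++i;
        }
        if (charIndex <= i) {
            if (pParaStart != NULL) {
                *pParaStart = start;
            }
            return i + 1;
        }
        start = i + 1;
    }
    if (pParaStart != NULL) {
        *pParaStart = start;
    }
    return length;
}
```

Passing the returned limit back in as the next charIndex steps through the text paragraph by paragraph, in the same way that pParaLimit may share a variable with charIndex.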
| U_CAPI UBiDiLevel ubidi_getParaLevel | ( | const UBiDi * | pBiDi | ) |
Get the paragraph level of the text.
| pBiDi | is the paragraph or line UBiDi object. |
Referenced by icu::ParagraphLayout::getParagraphLevel().
Get the length of the source text processed by the last call to ubidi_setPara().
This length may be different from the length of the source text if option UBIDI_OPTION_STREAMING has been set.
Note that whenever the length of the text affects the execution or the result of a function, it is the processed length which must be considered, except for ubidi_setPara (which receives unprocessed source text) and ubidi_getLength (which returns the original length of the source text).
In particular, the processed length is the one to consider in the following cases:
- limit argument of ubidi_setLine
- charIndex argument of ubidi_getParagraph
- charIndex argument of ubidi_getLevelAt
- ubidi_getLevels
- logicalStart argument of ubidi_getLogicalRun
- logicalIndex argument of ubidi_getVisualIndex
- *indexMap argument of ubidi_getLogicalMap
- ubidi_writeReordered
| pBiDi | is the paragraph UBiDi object. |
ubidi_setPara.
| U_CAPI UBiDiReorderingMode ubidi_getReorderingMode | ( | UBiDi * | pBiDi | ) |
What is the requested reordering mode for a given Bidi object?
| pBiDi | is a UBiDi object. |
What are the reordering options applied to a given Bidi object?
| pBiDi | is a UBiDi object. |
Get the length of the reordered text resulting from the last call to ubidi_setPara().
This length may be different from the length of the source text if option UBIDI_OPTION_INSERT_MARKS or option UBIDI_OPTION_REMOVE_CONTROLS has been set.
This resulting length is the one to consider in the following cases:
- visualIndex argument of ubidi_getLogicalIndex
- *indexMap argument of ubidi_getVisualMap
Note that this length stays identical to the source text length if Bidi marks are inserted or removed using option bits of ubidi_writeReordered, or if option UBIDI_REORDER_INVERSE_NUMBERS_AS_L has been set.
| pBiDi | is the paragraph UBiDi object. |
ubidi_setPara.
Get the pointer to the text.
| pBiDi | is the paragraph or line UBiDi object. |
| U_CAPI int32_t ubidi_getVisualIndex | ( | UBiDi * | pBiDi, |
| int32_t | logicalIndex, | ||
| UErrorCode * | pErrorCode | ||
| ) |
Get the visual position from a logical text position.
If such a mapping is used many times on the same UBiDi object, then calling ubidi_getLogicalMap() is more efficient.
The value returned may be UBIDI_MAP_NOWHERE if there is no visual position because the corresponding text character is a Bidi control removed from output by the option UBIDI_OPTION_REMOVE_CONTROLS.
When the visual output is altered by using options of ubidi_writeReordered() such as UBIDI_INSERT_LRM_FOR_NUMERIC, UBIDI_KEEP_BASE_COMBINING, UBIDI_OUTPUT_REVERSE, UBIDI_REMOVE_BIDI_CONTROLS, the visual position returned may not be correct. It is advised to use, when possible, reordering options such as UBIDI_OPTION_INSERT_MARKS and UBIDI_OPTION_REMOVE_CONTROLS.
Note that in right-to-left runs, this mapping places second surrogates before first ones (which is generally a bad idea) and combining characters before base characters. Consider using ubidi_writeReordered(), optionally with the UBIDI_KEEP_BASE_COMBINING option, instead of the mapping to avoid these issues.
| pBiDi | is the paragraph or line UBiDi object. |
| logicalIndex | is the index of a character in the text. |
| pErrorCode | must be a valid pointer to an error code value. |
| U_CAPI void ubidi_getVisualMap | ( | UBiDi * | pBiDi, |
| int32_t * | indexMap, | ||
| UErrorCode * | pErrorCode | ||
| ) |
Get a visual-to-logical index map (array) for the characters in the UBiDi (paragraph or line) object.
Some values in the map may be UBIDI_MAP_NOWHERE if the corresponding text characters are Bidi marks inserted in the visual output by the option UBIDI_OPTION_INSERT_MARKS.
When the visual output is altered by using options of ubidi_writeReordered() such as UBIDI_INSERT_LRM_FOR_NUMERIC, UBIDI_KEEP_BASE_COMBINING, UBIDI_OUTPUT_REVERSE, UBIDI_REMOVE_BIDI_CONTROLS, the logical positions returned may not be correct. It is advised to use, when possible, reordering options such as UBIDI_OPTION_INSERT_MARKS and UBIDI_OPTION_REMOVE_CONTROLS.
| pBiDi | is the paragraph or line UBiDi object. |
| indexMap | is a pointer to an array of ubidi_getResultLength() indexes which will reflect the reordering of the characters. If option UBIDI_OPTION_REMOVE_CONTROLS is set, the number of elements allocated in indexMap must be no less than ubidi_getProcessedLength(). The array does not need to be initialized. The index map will result in indexMap[visualIndex]==logicalIndex. |
| pErrorCode | must be a valid pointer to an error code value. |
| U_CAPI UBiDiDirection ubidi_getVisualRun | ( | UBiDi * | pBiDi, |
| int32_t | runIndex, | ||
| int32_t * | pLogicalStart, | ||
| int32_t * | pLength | ||
| ) |
Get one run's logical start, length, and directionality, which can be 0 for LTR or 1 for RTL.
In an RTL run, the character at the logical start is visually on the right of the displayed run. The length is the number of characters in the run.
ubidi_countRuns() should be called before the runs are retrieved.
| pBiDi | is the paragraph or line UBiDi object. |
| runIndex | is the number of the run in visual order, in the range [0..ubidi_countRuns(pBiDi)-1]. |
| pLogicalStart | is the first logical character index in the text. The pointer may be NULL if this index is not needed. |
| pLength | is the number of characters (at least one) in the run. The pointer may be NULL if this is not needed. |
Returns UBIDI_LTR==0 or UBIDI_RTL==1, never UBIDI_MIXED, never UBIDI_NEUTRAL.
Example:
    UErrorCode errorCode=U_ZERO_ERROR;
    int32_t i, count=ubidi_countRuns(pBiDi, &errorCode),
            logicalStart, visualIndex=0, length;
    for(i=0; i<count; ++i) {
        if(UBIDI_LTR==ubidi_getVisualRun(pBiDi, i, &logicalStart, &length)) {
            do { // LTR
                show_char(text[logicalStart++], visualIndex++);
            } while(--length>0);
        } else {
            logicalStart+=length;  // logicalLimit
            do { // RTL
                show_char(text[--logicalStart], visualIndex++);
            } while(--length>0);
        }
    }
Note that in right-to-left runs, code like this places second surrogates before first ones (which is generally a bad idea) and combining characters before base characters.
Consider using ubidi_writeReordered(), optionally with the UBIDI_KEEP_BASE_COMBINING option, to avoid these issues.
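To make the traversal order concrete without depending on a UBiDi object, here is a minimal sketch that applies the same loop to a hand-written list of runs. The SketchRun type and sketch_write_visual() are hypothetical names, not ICU API; each entry stands in for the logical start, length and direction that ubidi_getVisualRun() would report for one run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the values reported for one visual run. */
typedef struct {
    int32_t logicalStart;
    int32_t length;
    int isRTL; /* 0 for LTR, 1 for RTL */
} SketchRun;

/* Copy text into out in visual order. The runs must already be listed in
   visual order; out must hold the sum of all run lengths plus one NUL. */
static void sketch_write_visual(const char *text, const SketchRun *runs,
                                int32_t runCount, char *out) {
    assert(text != NULL && runs != NULL && out != NULL);
    int32_t visualIndex = 0;
    for (int32_t i = 0; i < runCount; ++i) {
        int32_t start = runs[i].logicalStart;
        int32_t limit = start + runs[i].length;
        if (runs[i].isRTL) {
            /* RTL: the character at the logical start ends up on the right. */
            while (limit > start) {
                out[visualIndex++] = text[--limit];
            }
        } else {
            while (start < limit) {
                out[visualIndex++] = text[start++];
            }
        }
    }
    out[visualIndex] = '\0';
}
```

With the text "abc DEF xyz" (upper case standing for RTL characters) and the runs {0, 4, LTR}, {4, 3, RTL}, {7, 4, LTR}, the output buffer receives "abc FED xyz". Like the example above, this per-character copy would split surrogate pairs and misplace combining marks in RTL runs.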
| U_CAPI void ubidi_invertMap | ( | const int32_t * | srcMap, |
| int32_t * | destMap, | ||
| int32_t | length | ||
| ) |
Invert an index map.
The index mapping of the first map is inverted and written to the second one.
| srcMap | is an array with length elements which defines the original mapping from a source array containing length elements to a destination array. Some elements of the source array may have no mapping in the destination array. In that case, their value will be the special value UBIDI_MAP_NOWHERE. All elements must be >=0 or equal to UBIDI_MAP_NOWHERE. Some elements may have a value >= length, if the destination array has more elements than the source array. There must be no duplicate indexes (two or more elements with the same value except UBIDI_MAP_NOWHERE). |
| destMap | is an array with a number of elements equal to 1 + the highest value in srcMap. destMap will be filled with the inverse mapping. If element with index i in srcMap has a value k different from UBIDI_MAP_NOWHERE, this means that element i of the source array maps to element k in the destination array. The inverse map will have value i in its k-th element. For all elements of the destination array which do not map to an element in the source array, the corresponding element in the inverse map will have a value equal to UBIDI_MAP_NOWHERE. |
| length | is the length of each array. |
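The inversion rule described for srcMap and destMap can be written out as a short sketch. This is a hypothetical re-implementation for illustration, not the ICU function: unlike ubidi_invertMap(), it takes the destination size as an explicit parameter (an assumption made to keep the bounds visible), and it uses a local constant with the same value as UBIDI_MAP_NOWHERE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAP_NOWHERE (-1) /* same value as UBIDI_MAP_NOWHERE */

/* Hypothetical sketch: fill destMap with the inverse of srcMap.
   destLength is expected to be 1 + the highest value found in srcMap. */
static void sketch_invert_map(const int32_t *srcMap, int32_t srcLength,
                              int32_t *destMap, int32_t destLength) {
    assert(srcMap != NULL && destMap != NULL);
    /* Destination elements with no source element stay "nowhere". */
    for (int32_t k = 0; k < destLength; ++k) {
        destMap[k] = SKETCH_MAP_NOWHERE;
    }
    /* Source element i maps to k, so the inverse holds i at position k. */
    for (int32_t i = 0; i < srcLength; ++i) {
        int32_t k = srcMap[i];
        if (k != SKETCH_MAP_NOWHERE) {
            assert(0 <= k && k < destLength);
            destMap[k] = i;
        }
    }
}
```

Inverting a logical-to-visual map in this way yields the corresponding visual-to-logical map, and vice versa.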
Is this Bidi object set to perform the inverse Bidi algorithm?
Note: calling this function after setting the reordering mode with ubidi_setReorderingMode will return true if the reordering mode was set to UBIDI_REORDER_INVERSE_NUMBERS_AS_L, false for all other values.
| pBiDi | is a UBiDi object. |
Is this Bidi object set to allocate level 0 to block separators so that successive paragraphs progress from left to right?
| pBiDi | is a UBiDi object. |
Allocate a UBiDi structure.
Such an object is initially empty. It is assigned the Bidi properties of a piece of text containing one or more paragraphs by ubidi_setPara() or the Bidi properties of a line within a paragraph by ubidi_setLine().
This object can be reused for as long as it is not deallocated by calling ubidi_close().
ubidi_setPara() and ubidi_setLine() will allocate additional memory for internal structures as necessary.
UBiDi object.
| U_CAPI UBiDi* ubidi_openSized | ( | int32_t | maxLength, |
| int32_t | maxRunCount, | ||
| UErrorCode * | pErrorCode | ||
| ) |
Allocate a UBiDi structure with preallocated memory for internal structures.
This function provides a UBiDi object like ubidi_open() with no arguments, but it also preallocates memory for internal structures according to the sizings supplied by the caller.
Subsequent functions will not allocate any more memory, and are thus guaranteed not to fail because of lack of memory.
The preallocation can be limited to some of the internal memory by setting some values to 0 here. That means that if, e.g., maxRunCount cannot be reasonably predetermined and should not be set to maxLength (the only failproof value) to avoid wasting memory, then maxRunCount could be set to 0 here and the internal structures that are associated with it will be allocated on demand, just like with ubidi_open().
| maxLength | is the maximum text or line length that internal memory will be preallocated for. An attempt to associate this object with a longer text will fail, unless this value is 0, which leaves the allocation up to the implementation. |
| maxRunCount | is the maximum anticipated number of same-level runs that internal memory will be preallocated for. An attempt to access visual runs on an object that was not preallocated for as many runs as the text was actually resolved to will fail, unless this value is 0, which leaves the allocation up to the implementation. The number of runs depends on the actual text and may be anywhere between 1 and maxLength. It is typically small. |
| pErrorCode | must be a valid pointer to an error code value. |
UBiDi object with preallocated memory.
Specify whether block separators must be allocated level zero, so that successive paragraphs will progress from left to right.
This function must be called before ubidi_setPara(). Paragraph separators (B) may appear in the text. Setting them to level zero means that all paragraph separators (including one possibly appearing in the last text position) are kept in the reordered text after the text that they follow in the source text. When this feature is not enabled, a paragraph separator at the last position of the text before reordering will go to the first position of the reordered text when the paragraph level is odd.
| pBiDi | is a UBiDi object. |
| orderParagraphsLTR | specifies whether paragraph separators (B) must receive level 0, so that successive paragraphs progress from left to right. |
| U_CAPI void ubidi_reorderLogical | ( | const UBiDiLevel * | levels, |
| int32_t | length, | ||
| int32_t * | indexMap | ||
| ) |
This is a convenience function that does not use a UBiDi object.
It is intended for cases where an application has determined the levels of objects (character sequences) and just needs to have them reordered (L2). This is equivalent to using ubidi_getLogicalMap() on a UBiDi object.
| levels | is an array with length levels that have been determined by the application. |
| length | is the number of levels in the array, or, semantically, the number of objects to be reordered. It must be length>0. |
| indexMap | is a pointer to an array of length indexes which will reflect the reordering of the characters. The array does not need to be initialized. |
The index map will result in indexMap[logicalIndex]==visualIndex.
| U_CAPI void ubidi_reorderVisual | ( | const UBiDiLevel * | levels, |
| int32_t | length, | ||
| int32_t * | indexMap | ||
| ) |
This is a convenience function that does not use a UBiDi object.
It is intended for cases where an application has determined the levels of objects (character sequences) and just needs to have them reordered (L2). This is equivalent to using ubidi_getVisualMap() on a UBiDi object.
| levels | is an array with length levels that have been determined by the application. |
| length | is the number of levels in the array, or, semantically, the number of objects to be reordered. It must be length>0. |
| indexMap | is a pointer to an array of length indexes which will reflect the reordering of the characters. The array does not need to be initialized. |
The index map will result in indexMap[visualIndex]==logicalIndex.
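What "reordered (L2)" means for a plain array of levels can be shown with a minimal sketch of the textbook form of rule L2: from the highest level down to the lowest odd level, reverse every maximal sequence of entries at that level or higher. The function name is hypothetical and the code is not taken from ICU, whose implementation may differ in detail. The sketch assumes that the levels carry no override bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch of rule L2: fill indexMap so that
   indexMap[visualIndex]==logicalIndex. Levels must not carry override bits. */
static void sketch_reorder_visual(const uint8_t *levels, int32_t length,
                                  int32_t *indexMap) {
    assert(levels != NULL && indexMap != NULL && length > 0);
    int maxLevel = 0;
    int minOddLevel = 256; /* stays above any level if no odd level exists */
    for (int32_t i = 0; i < length; ++i) {
        indexMap[i] = i; /* start from the identity (logical) order */
        if (levels[i] > maxLevel) {
            maxLevel = levels[i];
        }
        if ((levels[i] & 1) != 0 && levels[i] < minOddLevel) {
            minOddLevel = levels[i];
        }
    }
    for (int level = maxLevel; level >= minOddLevel; --level) {
        for (int32_t start = 0; start < length; ) {
            if (levels[indexMap[start]] < level) {
                ++start;
                continue;
            }
            /* Find the end of this sequence of entries at level or higher. */
            int32_t limit = start + 1;
            while (limit < length && levels[indexMap[limit]] >= level) {
                ++limit;
            }
            /* Reverse the sequence in place. */
            for (int32_t a = start, b = limit - 1; a < b; ++a, --b) {
                int32_t tmp = indexMap[a];
                indexMap[a] = indexMap[b];
                indexMap[b] = tmp;
            }
            start = limit;
        }
    }
}
```

For the levels {1, 1, 2, 2, 1} the sketch produces the map {4, 2, 3, 1, 0}: the whole level-1 sequence is reversed, while the embedded level-2 pair keeps its left-to-right order.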
| U_CAPI void ubidi_setClassCallback | ( | UBiDi * | pBiDi, |
| UBiDiClassCallback * | newFn, | ||
| const void * | newContext, | ||
| UBiDiClassCallback ** | oldFn, | ||
| const void ** | oldContext, | ||
| UErrorCode * | pErrorCode | ||
| ) |
Set the callback function and callback data used by the UBA implementation for Bidi class determination.
This may be useful for assigning Bidi classes to PUA characters, or for special application needs. For instance, an application may want to handle all spaces like L or R characters (according to the base direction) when creating the visual ordering of logical lines which are part of a report organized in columns: there should not be interaction between adjacent cells.
| pBiDi | is the paragraph UBiDi object. |
| newFn | is the new callback function pointer. |
| newContext | is the new callback context pointer. This can be NULL. |
| oldFn | fillin: Returns the old callback function pointer. This can be NULL. |
| oldContext | fillin: Returns the old callback's context. This can be NULL. |
| pErrorCode | must be a valid pointer to an error code value. |
| U_CAPI void ubidi_setContext | ( | UBiDi * | pBiDi, |
| const UChar * | prologue, | ||
| int32_t | proLength, | ||
| const UChar * | epilogue, | ||
| int32_t | epiLength, | ||
| UErrorCode * | pErrorCode | ||
| ) |
Set the context before a call to ubidi_setPara().
ubidi_setPara() computes the left-right directionality for a given piece of text which is supplied as one of its arguments. Sometimes this piece of text (the "main text") should be considered in context, because text appearing before ("prologue") and/or after ("epilogue") the main text may affect the result of this computation.
This function specifies the prologue and/or the epilogue for the next call to ubidi_setPara(). The characters specified as prologue and epilogue should not be modified by the calling program until the call to ubidi_setPara() has returned. If successive calls to ubidi_setPara() all need specification of a context, ubidi_setContext() must be called before each call to ubidi_setPara(). In other words, a context is not "remembered" after the following successful call to ubidi_setPara().
If a call to ubidi_setPara() specifies UBIDI_DEFAULT_LTR or UBIDI_DEFAULT_RTL as paraLevel and is preceded by a call to ubidi_setContext() which specifies a prologue, the paragraph level will be computed taking into consideration the text in the prologue.
When ubidi_setPara() is called without a previous call to ubidi_setContext, the main text is handled as if preceded and followed by strong directional characters at the current paragraph level. Calling ubidi_setContext() with specification of a prologue will change this behavior by handling the main text as if preceded by the last strong character appearing in the prologue, if any. Calling ubidi_setContext() with specification of an epilogue will change the behavior of ubidi_setPara() by handling the main text as if followed by the first strong character or digit appearing in the epilogue, if any.
Note 1: if ubidi_setContext is called repeatedly without calling ubidi_setPara, the earlier calls have no effect, only the last call will be remembered for the next call to ubidi_setPara.
Note 2: calling ubidi_setContext(pBiDi, NULL, 0, NULL, 0, &errorCode) cancels any previous setting of non-empty prologue or epilogue. The next call to ubidi_setPara() will process no prologue or epilogue.
Note 3: users must be aware that even after setting the context before a call to ubidi_setPara() to perform e.g. a logical to visual transformation, the resulting string may not be identical to what it would have been if all the text, including prologue and epilogue, had been processed together.
Example (upper case letters represent RTL characters):
prologue = "abc DE"
epilogue = none
main text = "FGH xyz"
paraLevel = UBIDI_LTR
display without prologue = "HGF xyz" ("HGF" is adjacent to "xyz")
display with prologue = "abc HGFED xyz" ("HGF" is not adjacent to "xyz")
| pBiDi | is a paragraph UBiDi object. |
| prologue | is a pointer to the text which precedes the text that will be specified in a coming call to ubidi_setPara(). If there is no prologue to consider, then proLength must be zero and this pointer can be NULL. |
| proLength | is the length of the prologue; if proLength==-1 then the prologue must be zero-terminated. Otherwise proLength must be >= 0. If proLength==0, it means that there is no prologue to consider. |
| epilogue | is a pointer to the text which follows the text that will be specified in a coming call to ubidi_setPara(). If there is no epilogue to consider, then epiLength must be zero and this pointer can be NULL. |
| epiLength | is the length of the epilogue; if epiLength==-1 then the epilogue must be zero-terminated. Otherwise epiLength must be >= 0. If epiLength==0, it means that there is no epilogue to consider. |
| pErrorCode | must be a valid pointer to an error code value. |
Modify the operation of the Bidi algorithm such that it approximates an "inverse Bidi" algorithm.
This function must be called before ubidi_setPara().
The normal operation of the Bidi algorithm as described in the Unicode Technical Report is to take text stored in logical (keyboard, typing) order and to determine the reordering of it for visual rendering. Some legacy systems store text in visual order, and for operations with standard, Unicode-based algorithms, the text needs to be transformed to logical order. This is effectively the inverse algorithm of the described Bidi algorithm. Note that there is no standard algorithm for this "inverse Bidi" and that the current implementation provides only an approximation of "inverse Bidi".
With isInverse set to true, this function changes the behavior of some of the subsequent functions in a way that they can be used for the inverse Bidi algorithm. Specifically, runs of text with numeric characters will be treated in a special way and may need to be surrounded with LRM characters when they are written in reordered sequence.
Output runs should be retrieved using ubidi_getVisualRun(). Since the actual input for "inverse Bidi" is visually ordered text and ubidi_getVisualRun() gets the reordered runs, these are actually the runs of the logically ordered output.
Calling this function with argument isInverse set to true is equivalent to calling ubidi_setReorderingMode with argument reorderingMode set to UBIDI_REORDER_INVERSE_NUMBERS_AS_L.
Calling this function with argument isInverse set to false is equivalent to calling ubidi_setReorderingMode with argument reorderingMode set to UBIDI_REORDER_DEFAULT.
| pBiDi | is a UBiDi object. |
| isInverse | specifies "forward" or "inverse" Bidi operation. |
| U_CAPI void ubidi_setLine | ( | const UBiDi * | pParaBiDi, |
| int32_t | start, | ||
| int32_t | limit, | ||
| UBiDi * | pLineBiDi, | ||
| UErrorCode * | pErrorCode | ||
| ) |
ubidi_setLine() sets a UBiDi to contain the reordering information, especially the resolved levels, for all the characters in a line of text.
This line of text is specified by referring to a UBiDi object representing this information for a piece of text containing one or more paragraphs, and by specifying a range of indexes in this text.
In the new line object, the indexes will range from 0 to limit-start-1.
This is used after calling ubidi_setPara() for a piece of text, and after line-breaking on that text. It is not necessary if each paragraph is treated as a single line.
After line-breaking, rules (L1) and (L2) for the treatment of trailing WS and for reordering are performed on a UBiDi object that represents a line.
Important: pLineBiDi shares data with pParaBiDi. You must destroy or reuse pLineBiDi before pParaBiDi. In other words, you must destroy or reuse the UBiDi object for a line before the object for its parent paragraph.
The text pointer that was stored in pParaBiDi is also copied, and start is added to it so that it points to the beginning of the line for this object.
| pParaBiDi | is the parent paragraph object. It must have been set by a successful call to ubidi_setPara. |
| start | is the line's first index into the text. |
| limit | is just behind the line's last index into the text (its last index +1). It must be 0<=start<limit<=containing paragraph limit. If the specified line crosses a paragraph boundary, the function will terminate with error code U_ILLEGAL_ARGUMENT_ERROR. |
| pLineBiDi | is the object that will now represent a line of the text. |
| pErrorCode | must be a valid pointer to an error code value. |
| U_CAPI void ubidi_setPara | ( | UBiDi * | pBiDi, |
| const UChar * | text, | ||
| int32_t | length, | ||
| UBiDiLevel | paraLevel, | ||
| UBiDiLevel * | embeddingLevels, | ||
| UErrorCode * | pErrorCode | ||
| ) |
Perform the Unicode Bidi algorithm.
It is defined in the Unicode Standard Annex #9, version 13, also described in The Unicode Standard, Version 4.0.
This function takes a piece of plain text containing one or more paragraphs, with or without externally specified embedding levels from styled text and computes the left-right-directionality of each character.
If the entire text is all of the same directionality, then the function may not perform all the steps described by the algorithm, i.e., some levels may not be the same as if all steps were performed. This is not relevant for unidirectional text.
For example, in pure LTR text with numbers the numbers would get a resolved level of 2 higher than the surrounding text according to the algorithm. This implementation may set all resolved levels to the same value in such a case.
The text can be composed of multiple paragraphs. Occurrence of a block separator in the text terminates a paragraph, and whatever comes next starts a new paragraph. The exception to this rule is when a Carriage Return (CR) is followed by a Line Feed (LF). Both CR and LF are block separators, but in that case, the pair of characters is considered as terminating the preceding paragraph, and a new paragraph will be started by a character coming after the LF.
| pBiDi | A UBiDi object allocated with ubidi_open() which will be set to contain the reordering information, especially the resolved levels for all the characters in text. |
| text | is a pointer to the text that the Bidi algorithm will be performed on. This pointer is stored in the UBiDi object and can be retrieved with ubidi_getText(). Note: the text must be (at least) length long. |
| length | is the length of the text; if length==-1 then the text must be zero-terminated. |
| paraLevel | specifies the default level for the text; it is typically 0 (LTR) or 1 (RTL). If the function shall determine the paragraph level from the text, then paraLevel can be set to either UBIDI_DEFAULT_LTR or UBIDI_DEFAULT_RTL; if the text contains multiple paragraphs, the paragraph level shall be determined separately for each paragraph; if a paragraph does not include any strongly typed character, then the desired default is used (0 for LTR or 1 for RTL). Any other value between 0 and UBIDI_MAX_EXPLICIT_LEVEL is also valid, with odd levels indicating RTL. |
| embeddingLevels | (in) may be used to preset the embedding and override levels, ignoring characters like LRE and PDF in the text. A level overrides the directional property of its corresponding (same index) character if the level has the UBIDI_LEVEL_OVERRIDE bit set. Aside from that bit, it must be paraLevel<=embeddingLevels[]<=UBIDI_MAX_EXPLICIT_LEVEL, except that level 0 is always allowed. Level 0 for a paragraph separator prevents reordering of paragraphs; this only works reliably if UBIDI_LEVEL_OVERRIDE is also set for paragraph separators. Level 0 for other characters is treated as a wildcard and is lifted up to the resolved level of the surrounding paragraph. Caution: A copy of this pointer, not of the levels, will be stored in the UBiDi object; the embeddingLevels array must not be deallocated before the UBiDi structure is destroyed or reused, and the embeddingLevels should not be modified to avoid unexpected results on subsequent Bidi operations. However, the ubidi_setPara() and ubidi_setLine() functions may modify some or all of the levels. After the UBiDi object is reused or destroyed, the caller must take care of the deallocation of the embeddingLevels array. Note: the embeddingLevels array must be at least length long. This pointer can be NULL if this value is not necessary. |
| pErrorCode | must be a valid pointer to an error code value. |
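The constraints on embeddingLevels can be checked before the array is handed to ubidi_setPara(). The following is a minimal sketch with hypothetical helper names. It does not call ICU and uses local constants carrying the documented values of UBIDI_LEVEL_OVERRIDE (0x80) and UBIDI_MAX_EXPLICIT_LEVEL (125). It assumes that paraLevel is a concrete level rather than UBIDI_DEFAULT_LTR or UBIDI_DEFAULT_RTL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_LEVEL_OVERRIDE 0x80    /* value of UBIDI_LEVEL_OVERRIDE */
#define SKETCH_MAX_EXPLICIT_LEVEL 125 /* value of UBIDI_MAX_EXPLICIT_LEVEL */

/* The level proper, with the override flag masked out. */
static uint8_t sketch_level_value(uint8_t level) {
    return (uint8_t)(level & 0x7F);
}

/* Nonzero if the entry overrides the character's own directional property. */
static int sketch_is_override(uint8_t level) {
    return (level & SKETCH_LEVEL_OVERRIDE) != 0;
}

/* Odd levels indicate RTL, even levels LTR. */
static int sketch_is_rtl(uint8_t level) {
    return (sketch_level_value(level) & 1) != 0;
}

/* Hypothetical check of the documented range: aside from the override bit,
   each entry must be 0 or lie in [paraLevel, SKETCH_MAX_EXPLICIT_LEVEL].
   Assumption: paraLevel is a concrete level, not a DEFAULT_* setting. */
static int sketch_levels_are_valid(const uint8_t *embeddingLevels,
                                   int32_t length, uint8_t paraLevel) {
    assert(embeddingLevels != NULL && length >= 0);
    for (int32_t i = 0; i < length; ++i) {
        uint8_t value = sketch_level_value(embeddingLevels[i]);
        if (value != 0 &&
            (value < paraLevel || value > SKETCH_MAX_EXPLICIT_LEVEL)) {
            return 0;
        }
    }
    return 1;
}
```

A caller could run such a check once on its own copy of the levels, since ubidi_setPara() and ubidi_setLine() may modify the array afterwards.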
| U_CAPI void ubidi_setReorderingMode | ( | UBiDi * | pBiDi, |
| UBiDiReorderingMode | reorderingMode | ||
| ) |
Modify the operation of the Bidi algorithm such that it implements some variant to the basic Bidi algorithm or approximates an "inverse Bidi" algorithm, depending on different values of the "reordering mode".
This function must be called before ubidi_setPara(), and stays in effect until called again with a different argument.
The normal operation of the Bidi algorithm as described in the Unicode Standard Annex #9 is to take text stored in logical (keyboard, typing) order and to determine how to reorder it for visual rendering.
With the reordering mode set to a value other than UBIDI_REORDER_DEFAULT, this function changes the behavior of some of the subsequent functions in a way such that they implement an inverse Bidi algorithm or some other algorithm variants.
Some legacy systems store text in visual order, and for operations with standard, Unicode-based algorithms, the text needs to be transformed into logical order. This is effectively the inverse algorithm of the described Bidi algorithm. Note that there is no standard algorithm for this "inverse Bidi", so a number of variants are implemented here.
In other cases, it may be desirable to emulate some variant of the Logical to Visual algorithm (e.g. one used in MS Windows), or perform a Logical to Logical transformation.
When the reordering mode is set to UBIDI_REORDER_DEFAULT, the standard Bidi Logical to Visual algorithm is applied.
When the reordering mode is set to UBIDI_REORDER_NUMBERS_SPECIAL, the algorithm used to perform Bidi transformations when calling ubidi_setPara should approximate the algorithm used in Microsoft Windows XP rather than strictly conform to the Unicode Bidi algorithm.
The differences between the basic algorithm and the algorithm addressed by this option are as follows:
When the reordering mode is set to UBIDI_REORDER_GROUP_NUMBERS_WITH_R, numbers located between LTR text and RTL text are associated with the RTL text. For instance, an LTR paragraph with content "abc 123 DEF" (where upper case letters represent RTL characters) will be transformed to "abc FED 123" (and not "abc 123 FED"), "DEF 123 abc" will be transformed to "123 FED abc" and "123 FED abc" will be transformed to "DEF 123 abc". This makes the algorithm reversible and makes it useful when round trip (from visual to logical and back to visual) must be achieved without adding LRM characters. However, this is a variation from the standard Unicode Bidi algorithm.
The source text should not contain Bidi control characters other than LRM or RLM.
When the reordering mode is set toUBIDI_REORDER_RUNS_ONLY, a "Logical to Logical" transformation must be performed:
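If the default level of the source text (argument paraLevel in ubidi_setPara) is even, the source text will be handled as LTR logical text and will be transformed to the RTL logical text which has the same LTR visual display.
If the default level of the source text is odd, the source text will be handled as RTL logical text and will be transformed to the LTR logical text which has the same LTR visual display.
This mode may be needed when logical text which is basically Arabic or Hebrew, with possible included numbers or phrases in English, has to be displayed as if it had an even embedding level (this can happen if the displaying application treats all text as if it was basically LTR).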
This mode may also be needed in the reverse case, when logical text which is basically English, with possible included phrases in Arabic or Hebrew, has to be displayed as if it had an odd embedding level.
Both cases could be handled by adding LRE or RLE at the head of the text, if the display subsystem supports these formatting controls. If it does not, the problem may be handled by transforming the source text in this mode before displaying it, so that it will be displayed properly.
The source text should not contain Bidi control characters other than LRM or RLM.
When the reordering mode is set to UBIDI_REORDER_INVERSE_NUMBERS_AS_L, an "inverse Bidi" algorithm is applied. Runs of text with numeric characters will be treated like LTR letters and may need to be surrounded with LRM characters when they are written in reordered sequence (the option UBIDI_INSERT_LRM_FOR_NUMERIC can be used with function ubidi_writeReordered to this end). This mode is equivalent to calling ubidi_setInverse() with argument isInverse set to true.
When the reordering mode is set to UBIDI_REORDER_INVERSE_LIKE_DIRECT, the "direct" Logical to Visual Bidi algorithm is used as an approximation of an "inverse Bidi" algorithm. This mode is similar to mode UBIDI_REORDER_INVERSE_NUMBERS_AS_L but is closer to the regular Bidi algorithm.
For example, an LTR paragraph with the content "FED 123 456 CBA" (where upper case represents RTL characters) will be transformed to "ABC 456 123 DEF", as opposed to "DEF 123 456 ABC" with mode UBIDI_REORDER_INVERSE_NUMBERS_AS_L.
When used in conjunction with option UBIDI_OPTION_INSERT_MARKS, this mode generally adds Bidi marks to the output significantly more sparingly than mode UBIDI_REORDER_INVERSE_NUMBERS_AS_L with option UBIDI_INSERT_LRM_FOR_NUMERIC in calls to ubidi_writeReordered.
When the reordering mode is set to UBIDI_REORDER_INVERSE_FOR_NUMBERS_SPECIAL, the Logical to Visual Bidi algorithm used in Windows XP is used as an approximation of an "inverse Bidi" algorithm.
In all the reordering modes specifying an "inverse Bidi" algorithm (i.e. those with a name starting with UBIDI_REORDER_INVERSE), output runs should be retrieved using ubidi_getVisualRun(), and the output text with ubidi_writeReordered(). The caller should keep in mind that in "inverse Bidi" modes the input is actually visually ordered text, and the reordered output returned by ubidi_getVisualRun() or ubidi_writeReordered() consists of runs or a character string of logically ordered output.
For all the "inverse Bidi" modes, the source text should not contain Bidi control characters other than LRM or RLM.
Note that option UBIDI_OUTPUT_REVERSE of ubidi_writeReordered has no useful meaning and should not be used in conjunction with any value of the reordering mode specifying "inverse Bidi" or with value UBIDI_REORDER_RUNS_ONLY.
| pBiDi | is a UBiDi object. |
| reorderingMode | specifies the required variant of the Bidi algorithm. |
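The constraints stated above for the different reordering modes can be collected into a few predicates. The following is only a sketch that does not call ICU: the `Demo`/`DEMO_` names are hypothetical stand-ins for the UBiDiReorderingMode values, their numeric values are arbitrary rather than taken from the header, and `DEMO_OUTPUT_REVERSE` copies the value 16 listed for UBIDI_OUTPUT_REVERSE in the macro table.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the reordering modes named in the text.
   The numeric values are arbitrary, not those of <unicode/ubidi.h>. */
typedef enum {
    DEMO_REORDER_DEFAULT,
    DEMO_REORDER_NUMBERS_SPECIAL,
    DEMO_REORDER_GROUP_NUMBERS_WITH_R,
    DEMO_REORDER_RUNS_ONLY,
    DEMO_REORDER_INVERSE_NUMBERS_AS_L,
    DEMO_REORDER_INVERSE_LIKE_DIRECT,
    DEMO_REORDER_INVERSE_FOR_NUMBERS_SPECIAL
} DemoReorderingMode;

/* Same value as UBIDI_OUTPUT_REVERSE in the macro table. */
#define DEMO_OUTPUT_REVERSE 16

/* "Inverse Bidi" modes: input is visual order, output is logical order. */
int demo_is_inverse_mode(DemoReorderingMode mode)
{
    return mode == DEMO_REORDER_INVERSE_NUMBERS_AS_L ||
           mode == DEMO_REORDER_INVERSE_LIKE_DIRECT ||
           mode == DEMO_REORDER_INVERSE_FOR_NUMBERS_SPECIAL;
}

/* Modes for which the source text should not contain Bidi control
   characters other than LRM or RLM. */
int demo_restricts_source_controls(DemoReorderingMode mode)
{
    return mode == DEMO_REORDER_GROUP_NUMBERS_WITH_R ||
           mode == DEMO_REORDER_RUNS_ONLY ||
           demo_is_inverse_mode(mode);
}

/* Returns 0 when the write options ask for reversed output in a mode
   where that has no useful meaning, 1 otherwise. */
int demo_options_ok(DemoReorderingMode mode, uint16_t options)
{
    assert((int)mode >= 0 &&
           (int)mode <= (int)DEMO_REORDER_INVERSE_FOR_NUMBERS_SPECIAL);
    if ((options & DEMO_OUTPUT_REVERSE) == 0) {
        return 1;
    }
    return !(demo_is_inverse_mode(mode) || mode == DEMO_REORDER_RUNS_ONLY);
}
```

A caller could run such a check before passing a mode to ubidi_setReorderingMode and an option set to ubidi_writeReordered; real code would use the constants from the ICU header instead of these stand-ins.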
Specify which of the reordering options should be applied during Bidi transformations.
| pBiDi | is a UBiDi object. |
| reorderingOptions | is a combination of zero or more of the following options: UBIDI_OPTION_DEFAULT, UBIDI_OPTION_INSERT_MARKS, UBIDI_OPTION_REMOVE_CONTROLS, UBIDI_OPTION_STREAMING. |
U_CAPI int32_t ubidi_writeReordered(UBiDi *pBiDi, UChar *dest, int32_t destSize, uint16_t options, UErrorCode *pErrorCode)
Take a UBiDi object containing the reordering information for a piece of text (one or more paragraphs) set by ubidi_setPara() or for a line of text set by ubidi_setLine() and write a reordered string to the destination buffer.
This function preserves the integrity of characters with multiple code units and (optionally) combining characters. Characters in RTL runs can be replaced by mirror-image characters in the destination buffer. Note that "real" mirroring has to be done in a rendering engine by glyph selection and that for many "mirrored" characters there are no Unicode characters as mirror-image equivalents. There are also options to insert or remove Bidi control characters; see the description of the destSize and options parameters and of the option bit flags.
| pBiDi | A pointer to a UBiDi object that is set by ubidi_setPara() or ubidi_setLine() and contains the reordering information for the text that it was defined for, as well as a pointer to that text. The text was aliased (only the pointer was stored without copying the contents) and must not have been modified since the ubidi_setPara() call. |
| dest | A pointer to where the reordered text is to be copied. The source text and dest[destSize] must not overlap. |
| destSize | The size of the dest buffer, in number of UChars. If the UBIDI_INSERT_LRM_FOR_NUMERIC option is set, then the destination length could be as large as ubidi_getLength(pBiDi)+2*ubidi_countRuns(pBiDi). If the UBIDI_REMOVE_BIDI_CONTROLS option is set, then the destination length may be less than ubidi_getLength(pBiDi). If none of these options is set, then the destination length will be exactly ubidi_getProcessedLength(pBiDi). |
| options | A bit set of options for the reordering that control how the reordered text is written. The options include mirroring the characters on a code point basis and inserting LRM characters, which is used especially for transforming visually stored text to logically stored text (although this is still an imperfect implementation of an "inverse Bidi" algorithm because it uses the "forward Bidi" algorithm at its core). The available options are: UBIDI_DO_MIRRORING, UBIDI_INSERT_LRM_FOR_NUMERIC, UBIDI_KEEP_BASE_COMBINING, UBIDI_OUTPUT_REVERSE, UBIDI_REMOVE_BIDI_CONTROLS. |
| pErrorCode | must be a valid pointer to an error code value. |
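The sizing rules in the destSize description can be written down as simple arithmetic. The helper below is a hypothetical sketch that does not call ICU: it takes as plain integers the values that ubidi_getLength(), ubidi_getProcessedLength() and ubidi_countRuns() would return, and the two `DEMO_` macros copy the values 4 and 8 listed in the macro table. It follows only the wording of the destSize description and ignores any effect of options set through ubidi_setReorderingOptions().

```c
#include <assert.h>
#include <stdint.h>

/* Same values as UBIDI_INSERT_LRM_FOR_NUMERIC and
   UBIDI_REMOVE_BIDI_CONTROLS in the macro table. */
#define DEMO_INSERT_LRM_FOR_NUMERIC 4
#define DEMO_REMOVE_BIDI_CONTROLS   8

/* Hypothetical helper: a destination capacity (in UChars) that is
   sufficient according to the destSize description.
   length    : what ubidi_getLength(pBiDi) would return
   processed : what ubidi_getProcessedLength(pBiDi) would return
   runCount  : what ubidi_countRuns(pBiDi) would return */
int32_t demo_reordered_capacity(int32_t length, int32_t processed,
                                int32_t runCount, uint16_t options)
{
    assert(length >= 0 && processed >= 0 && runCount >= 0);
    if (options & DEMO_INSERT_LRM_FOR_NUMERIC) {
        /* Each run may be surrounded by two LRM characters. */
        return length + 2 * runCount;
    }
    if (options & DEMO_REMOVE_BIDI_CONTROLS) {
        /* Output may be shorter; the source length is a safe bound. */
        return length;
    }
    /* Neither option set: output length equals the processed length. */
    return processed;
}
```

For a 10-unit text with 3 runs, this yields a capacity of 16 when LRM insertion is requested, which is the worst case named in the description rather than the length actually written.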
U_CAPI int32_t ubidi_writeReverse(const UChar *src, int32_t srcLength, UChar *dest, int32_t destSize, uint16_t options, UErrorCode *pErrorCode)
Reverse a Right-To-Left run of Unicode text.
This function preserves the integrity of characters with multiple code units and (optionally) combining characters. Characters can be replaced by mirror-image characters in the destination buffer. Note that "real" mirroring has to be done in a rendering engine by glyph selection and that for many "mirrored" characters there are no Unicode characters as mirror-image equivalents. There is also an option to remove Bidi control characters.
This function is the implementation for reversing RTL runs as part of ubidi_writeReordered(). For detailed descriptions of the parameters, see there. Since no Bidi controls are inserted here, the output string length will never exceed srcLength.
| src | A pointer to the RTL run text. |
| srcLength | The length of the RTL run. |
| dest | A pointer to where the reordered text is to be copied. src[srcLength] and dest[destSize] must not overlap. |
| destSize | The size of the dest buffer, in number of UChars. If the UBIDI_REMOVE_BIDI_CONTROLS option is set, then the destination length may be less than srcLength. If this option is not set, then the destination length will be exactly srcLength. |
| options | A bit set of options for the reordering that control how the reordered text is written. See the options parameter in ubidi_writeReordered(). |
| pErrorCode | must be a valid pointer to an error code value. |
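To illustrate what "preserves the integrity of characters with multiple code units" means, here is a simplified, standalone sketch of reversing a UTF-16 run while keeping surrogate pairs in lead-trail order. It is not ICU's implementation: `demo_reverse_utf16` and its helpers are hypothetical names, `uint16_t` stands in for UChar, no options (mirroring, combining characters, control removal) are handled, and a too-small buffer is reported by returning -1 instead of through a UErrorCode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* True for a UTF-16 lead (high) surrogate, 0xD800..0xDBFF. */
int demo_is_lead(uint16_t c)  { return (c & 0xFC00) == 0xD800; }

/* True for a UTF-16 trail (low) surrogate, 0xDC00..0xDFFF. */
int demo_is_trail(uint16_t c) { return (c & 0xFC00) == 0xDC00; }

/* Writes src[0..srcLength) to dest in reverse code point order.
   Returns the number of code units written, or -1 if dest is too small.
   src and dest must not overlap. */
int32_t demo_reverse_utf16(const uint16_t *src, int32_t srcLength,
                           uint16_t *dest, int32_t destSize)
{
    assert(src != NULL && dest != NULL && srcLength >= 0);
    if (destSize < srcLength) {
        return -1;
    }
    int32_t out = 0;
    int32_t i = srcLength;
    while (i > 0) {
        uint16_t c = src[--i];
        if (demo_is_trail(c) && i > 0 && demo_is_lead(src[i - 1])) {
            /* Copy the pair as a unit so the lead still comes first. */
            dest[out++] = src[--i];
            dest[out++] = c;
        } else {
            dest[out++] = c;
        }
    }
    return out;
}
```

Because nothing is inserted or removed in this sketch, the output length always equals srcLength, matching the statement above for the case where UBIDI_REMOVE_BIDI_CONTROLS is not set.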