Old Unicode normalization API.

#include <normlzr.h>

Inheritance: icu::Normalizer derives from icu::UObject.

Public Types
enum	{ DONE = 0xffff }
	If DONE is returned from an iteration function that returns a code point, then there are no more normalization results available.

Public Member Functions
	Normalizer (const UnicodeString &str, UNormalizationMode mode)
	Creates a new `Normalizer` object for iterating over the normalized form of a given string.

	Normalizer (ConstChar16Ptr str, int32_t length, UNormalizationMode mode)
	Creates a new `Normalizer` object for iterating over the normalized form of a given string.

	Normalizer (const CharacterIterator &iter, UNormalizationMode mode)
	Creates a new `Normalizer` object for iterating over the normalized form of the given text.

	Normalizer (const Normalizer &copy)
	Copy constructor.

virtual	~Normalizer ()
	Destructor.

UChar32	current ()
	Return the current character in the normalized text.

UChar32	first ()
	Return the first character in the normalized text.

UChar32	last ()
	Return the last character in the normalized text.

UChar32	next ()
	Return the next character in the normalized text.

UChar32	previous ()
	Return the previous character in the normalized text and decrement.

void	setIndexOnly (int32_t index)
	Set the iteration position in the input text that is being normalized, without any immediate normalization.

void	reset ()
	Reset the index to the beginning of the text.

int32_t	getIndex () const
	Retrieve the current iteration position in the input text that is being normalized.

int32_t	startIndex () const
	Retrieve the index of the start of the input text.

int32_t	endIndex () const
	Retrieve the index of the end of the input text.

bool	operator== (const Normalizer &that) const
	Returns true when both iterators refer to the same character in the same input text.

bool	operator!= (const Normalizer &that) const
	Returns false when both iterators refer to the same character in the same input text.

Normalizer *	clone () const
	Returns a pointer to a new Normalizer that is a clone of this one.

int32_t	hashCode () const
	Generates a hash code for this iterator.

void	setMode (UNormalizationMode newMode)
	Set the normalization mode for this object.

UNormalizationMode	getUMode () const
	Return the normalization mode for this object.

void	setOption (int32_t option, UBool value)
	Set options that affect this `Normalizer`'s operation.

UBool	getOption (int32_t option) const
	Determine whether an option is turned on or off.

void	setText (const UnicodeString &newText, UErrorCode &status)
	Set the input text over which this `Normalizer` will iterate.

void	setText (const CharacterIterator &newText, UErrorCode &status)
	Set the input text over which this `Normalizer` will iterate.

void	setText (ConstChar16Ptr newText, int32_t length, UErrorCode &status)
	Set the input text over which this `Normalizer` will iterate.

void	getText (UnicodeString &result)
	Copies the input text into the UnicodeString argument.

virtual UClassID	getDynamicClassID () const override
	ICU "poor man's RTTI", returns a UClassID for the actual class.

Public Member Functions inherited from icu::UObject
virtual	~UObject ()
	Destructor.

Static Public Member Functions
static void	normalize (const UnicodeString &source, UNormalizationMode mode, int32_t options, UnicodeString &result, UErrorCode &status)
	Normalizes a `UnicodeString` according to the specified normalization mode.

static void	compose (const UnicodeString &source, UBool compat, int32_t options, UnicodeString &result, UErrorCode &status)
	Compose a `UnicodeString`.

static void	decompose (const UnicodeString &source, UBool compat, int32_t options, UnicodeString &result, UErrorCode &status)
	Static method to decompose a `UnicodeString`.

static UNormalizationCheckResult	quickCheck (const UnicodeString &source, UNormalizationMode mode, UErrorCode &status)
	Performs a quick check on a string to determine whether it is in a particular normalization form.

static UNormalizationCheckResult	quickCheck (const UnicodeString &source, UNormalizationMode mode, int32_t options, UErrorCode &status)
	Performs a quick check on a string; same as the other version of quickCheck but takes an extra options parameter like most normalization functions.

static UBool	isNormalized (const UnicodeString &src, UNormalizationMode mode, UErrorCode &errorCode)
	Test if a string is in a given normalization form.

static UBool	isNormalized (const UnicodeString &src, UNormalizationMode mode, int32_t options, UErrorCode &errorCode)
	Test if a string is in a given normalization form; same as the other version of isNormalized but takes an extra options parameter like most normalization functions.

static UnicodeString &	concatenate (const UnicodeString &left, const UnicodeString &right, UnicodeString &result, UNormalizationMode mode, int32_t options, UErrorCode &errorCode)
	Concatenate normalized strings, making sure that the result is normalized as well.

static int32_t	compare (const UnicodeString &s1, const UnicodeString &s2, uint32_t options, UErrorCode &errorCode)
	Compare two strings for canonical equivalence.

static UClassID	getStaticClassID ()
	ICU "poor man's RTTI", returns a UClassID for this class.

Detailed Description

Old Unicode normalization API.

This API has been replaced by the Normalizer2 class and is only available for backward compatibility. This class simply delegates to the Normalizer2 class. There is one exception: The new API does not provide a replacement for Normalizer::compare().

The Normalizer class supports the standard normalization forms described in Unicode Standard Annex #15: Unicode Normalization Forms.

The Normalizer class consists of two parts:

static functions that normalize strings or test if strings are normalized
a Normalizer object is an iterator that takes any kind of text and provides iteration over its normalized form

The Normalizer class is not suitable for subclassing.

For basic information about normalization forms and details about the C API please see the documentation in unorm.h.

The iterator API, that is the Normalizer constructors and the non-static functions, uses a CharacterIterator as input. It is possible to pass a string, which is then internally wrapped in a CharacterIterator. The input text is not normalized all at once, but incrementally where needed (providing efficient random access). This makes it possible to pass in a large text but spend only a small amount of time normalizing a small part of that text. However, if the entire text is normalized, then the iterator will be slower than normalizing the entire text at once and iterating over the result. Another possible use of the Normalizer iterator is to report an index into the original text that is close to where the normalized characters come from.

Important: The iterator API was cleaned up significantly for ICU 2.0. The earlier implementation reported the getIndex() inconsistently, and previous() could not be used after setIndex(), next(), first(), and current().

Normalizer allows normalization to start from anywhere in the input text by calling setIndexOnly(), first(), or last(). Without calling any of these, the iterator will start at the beginning of the text.

At any time, next() returns the next normalized code point (UChar32), with post-increment semantics (like CharacterIterator::next32PostInc()). previous() returns the previous normalized code point (UChar32), with pre-decrement semantics (like CharacterIterator::previous32()).

current() returns the current code point (or the one at the newly set index) without moving the getIndex(). Note that if the text at the current position needs to be normalized, then these functions will do that. (This is why current() is not const.) It is more efficient to call setIndexOnly() instead, which does not normalize.

getIndex() always refers to the position in the input text where the normalized code points are returned from. It does not always change with each returned code point. The code point that is returned from any of the functions corresponds to text at or after getIndex(), according to the function's iteration semantics (post-increment or pre-decrement).

next() returns a code point from at or after the getIndex() from before the next() call. After the next() call, the getIndex() might have moved to where the next code point will be returned from (from a next() or current() call). This is semantically equivalent to array access with array[index++] (post-increment semantics).

previous() returns a code point from at or after the getIndex() from after the previous() call. This is semantically equivalent to array access with array[--index] (pre-decrement semantics).
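
The array analogy can be made concrete with a small sketch. The `ToyCursor` class and `kDone` constant below are hypothetical names invented for illustration; this is a plain cursor over an already-normalized sequence that uses only the C++ standard library, not the ICU implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Mirrors the DONE sentinel (0xffff) described in this document.
constexpr char32_t kDone = 0xFFFF;

// Hypothetical toy cursor; it does no normalization at all.
class ToyCursor {
public:
    explicit ToyCursor(std::vector<char32_t> text) : text_(std::move(text)) {}

    // Post-increment: behaves like text[index++], or kDone at the end.
    char32_t next() {
        return index_ < text_.size() ? text_[index_++] : kDone;
    }

    // Pre-decrement: behaves like text[--index], or kDone at the beginning.
    char32_t previous() {
        return index_ > 0 ? text_[--index_] : kDone;
    }

    std::size_t index() const { return index_; }

private:
    std::vector<char32_t> text_;
    std::size_t index_ = 0;
};
```

With this model, a next() immediately followed by previous() returns the same code point twice and leaves the index where it started.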

Internally, the Normalizer iterator normalizes a small piece of text starting at the getIndex() and ending at a following "safe" index. The normalized result is stored in an internal string buffer, and the code points are iterated from there. With multiple iteration calls, this is repeated until the next piece of text needs to be normalized, and the getIndex() needs to be moved.

The following "safe" index, the internal buffer, and the secondary iteration index into that buffer are not exposed on the API. This also means that it is currently not practical to return to a particular, arbitrary position in the text because one would need to know, and be able to set, in addition to the getIndex(), at least also the current index into the internal buffer. It is currently only possible to observe when getIndex() changes (with careful consideration of the iteration semantics), at which time the internal index will be 0. For example, if getIndex() is different after next() than before it, then the internal index is 0 and one can return to this getIndex() later with setIndexOnly().
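
The buffering described above can be modeled roughly as follows. This is a toy sketch under stated assumptions, not ICU code: `ToyChunkIterator`, `toyDecompose`, and `kDone` are hypothetical names, the decomposition table has only two entries, and every input code point is treated as ending at a "safe" index.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

constexpr char32_t kDone = 0xFFFF;  // mirrors the DONE sentinel

// Toy decomposition table (assumption: only these two mappings exist).
std::u32string toyDecompose(char32_t c) {
    switch (c) {
        case U'\u00E9': return U"e\u0301";  // e with acute -> e + combining acute
        case U'\u00C5': return U"A\u030A";  // A with ring  -> A + combining ring
        default:        return std::u32string(1, c);
    }
}

class ToyChunkIterator {
public:
    explicit ToyChunkIterator(std::u32string input) : input_(std::move(input)) {}

    // Input position the next result comes from: the start of the current
    // chunk while its buffer is still being read, otherwise the next chunk.
    std::size_t getIndex() const {
        return bufferPos_ < buffer_.size() ? currentIndex_ : nextIndex_;
    }

    // Post-increment iteration over the normalized output.
    char32_t next() {
        if (bufferPos_ >= buffer_.size() && !fillBuffer()) return kDone;
        return buffer_[bufferPos_++];
    }

private:
    // Normalize the chunk starting at nextIndex_ into the internal buffer.
    bool fillBuffer() {
        buffer_.clear();
        bufferPos_ = 0;
        currentIndex_ = nextIndex_;
        if (nextIndex_ >= input_.size()) return false;
        buffer_ = toyDecompose(input_[nextIndex_]);
        ++nextIndex_;  // toy rule: each input code point ends at a safe index
        return true;
    }

    std::u32string input_;
    std::u32string buffer_;         // normalized form of the current chunk
    std::size_t bufferPos_ = 0;     // secondary index into buffer_
    std::size_t currentIndex_ = 0;  // input index where the chunk starts
    std::size_t nextIndex_ = 0;     // following "safe" index
};
```

In this model the first input code point yields two output code points, and getIndex() stays at 0 until both have been returned. The change in getIndex() marks the moment the buffer is exhausted, which is the only point one can return to later by input index alone.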

Note: While the setIndex() and getIndex() refer to indices in the underlying Unicode input text, the next() and previous() methods iterate through characters in the normalized output. This means that there is not necessarily a one-to-one correspondence between characters returned by next() and previous() and the indices passed to and returned from setIndex() and getIndex(). It is for this reason that Normalizer does not implement the CharacterIterator interface.

Author: Laura Werner, Mark Davis, Markus Scherer

Stable: ICU 2.0

Definition at line 136 of file normlzr.h.

Member Enumeration Documentation

◆ anonymous enum

anonymous enum

If DONE is returned from an iteration function that returns a code point, then there are no more normalization results available.

Deprecated: ICU 56. Use Normalizer2 instead.

Definition at line 144 of file normlzr.h.

Constructor & Destructor Documentation

◆ Normalizer() [1/4]

icu::Normalizer::Normalizer(const UnicodeString &str, UNormalizationMode mode)

Creates a new Normalizer object for iterating over the normalized form of a given string.

Parameters

str	The string to be normalized. The normalization will start at the beginning of the string.
mode	The normalization mode.

Deprecated: ICU 56. Use Normalizer2 instead.

◆ Normalizer() [2/4]

icu::Normalizer::Normalizer(ConstChar16Ptr str, int32_t length, UNormalizationMode mode)

Creates a new Normalizer object for iterating over the normalized form of a given string.

Parameters

str	The string to be normalized. The normalization will start at the beginning of the string.
length	Length of the string, or -1 if NUL-terminated.
mode	The normalization mode.

Deprecated: ICU 56. Use Normalizer2 instead.

◆ Normalizer() [3/4]

icu::Normalizer::Normalizer(const CharacterIterator &iter, UNormalizationMode mode)

Creates a new Normalizer object for iterating over the normalized form of the given text.

Parameters

iter	The input text to be normalized. The normalization will start at the beginning of the string.
mode	The normalization mode.

Deprecated: ICU 56. Use Normalizer2 instead.

◆ Normalizer() [4/4]

icu::Normalizer::Normalizer(const Normalizer &copy)

Copy constructor.

Parameters

copy	The object to be copied.

Deprecated: ICU 56. Use Normalizer2 instead.

◆ ~Normalizer()

virtual icu::Normalizer::~Normalizer() [virtual]

Destructor.

Deprecated: ICU 56. Use Normalizer2 instead.

Member Function Documentation

◆ clone()

Normalizer* icu::Normalizer::clone() const

Returns a pointer to a new Normalizer that is a clone of this one.

The caller is responsible for deleting the new clone.

Returns: a pointer to a new Normalizer

Deprecated: ICU 56. Use Normalizer2 instead.

◆ compare()

int32_t icu::Normalizer::compare(const UnicodeString &s1, const UnicodeString &s2, uint32_t options, UErrorCode &errorCode) [inline, static]

Compare two strings for canonical equivalence.

Further options include case-insensitive comparison and code point order (as opposed to code unit order).

Canonical equivalence between two strings is defined as their normalized forms (NFD or NFC) being identical. This function compares strings incrementally instead of normalizing (and optionally case-folding) both strings entirely, improving performance significantly.

Bulk normalization is only necessary if the strings do not fulfill the FCD conditions. Only in this case, and only if the strings are relatively long, is memory allocated temporarily. For FCD strings and short non-FCD strings there is no memory allocation.

Semantically, this is equivalent to strcmp[CodePointOrder](NFD(foldCase(s1)), NFD(foldCase(s2))) where code point order and foldCase are all optional.

UAX 21 2.5 Caseless Matching specifies that for a canonical caseless match the case folding must be performed first, then the normalization.

Parameters

s1	First source string.
s2	Second source string.
options	A bit set of options:
	U_FOLD_CASE_DEFAULT or 0: used for default options. Case-sensitive comparison in code unit order, and the input strings are quick-checked for FCD.
	UNORM_INPUT_IS_FCD: set if the caller knows that both s1 and s2 fulfill the FCD conditions. If not set, the function will quickCheck for FCD and normalize if necessary.
	U_COMPARE_CODE_POINT_ORDER: set to choose code point order instead of code unit order (see u_strCompare for details).
	U_COMPARE_IGNORE_CASE: set to compare strings case-insensitively using case folding, instead of case-sensitively. If set, then the following case folding options are used.
	Options as used with case-insensitive comparisons, currently: U_FOLD_CASE_EXCLUDE_SPECIAL_I (see u_strCaseCompare for details).
	Regular normalization options shifted left by UNORM_COMPARE_NORM_OPTIONS_SHIFT.
errorCode	ICU error code in/out parameter. Must fulfill U_SUCCESS before the function call.

Returns: <0 or 0 or >0 as usual for string comparisons
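
The difference between code unit order and code point order (the U_COMPARE_CODE_POINT_ORDER option) only shows up for UTF-16 text that contains supplementary characters. The sketch below illustrates that distinction with the C++ standard library alone. It performs no normalization or case folding, it is not how compare() is implemented, and the helper names `decodeAt`, `compareCodePointOrder`, and `compareCodeUnitOrder` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Decode one code point from UTF-16 at position i and advance i.
// Unpaired surrogates are returned as-is (assumption for this sketch).
char32_t decodeAt(const std::u16string& s, std::size_t& i) {
    char16_t lead = s[i++];
    if (lead >= 0xD800 && lead <= 0xDBFF && i < s.size()) {
        char16_t trail = s[i];
        if (trail >= 0xDC00 && trail <= 0xDFFF) {
            ++i;
            return 0x10000 + ((char32_t(lead) - 0xD800) << 10)
                           + (char32_t(trail) - 0xDC00);
        }
    }
    return lead;
}

// Returns <0, 0, or >0, comparing by code point values.
int compareCodePointOrder(const std::u16string& a, const std::u16string& b) {
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        char32_t ca = decodeAt(a, i);
        char32_t cb = decodeAt(b, j);
        if (ca != cb) return ca < cb ? -1 : 1;
    }
    if (i < a.size()) return 1;
    if (j < b.size()) return -1;
    return 0;
}

// Returns <0, 0, or >0, comparing by raw 16-bit code units.
int compareCodeUnitOrder(const std::u16string& a, const std::u16string& b) {
    return a.compare(b);
}
```

For example, U+FF5E is a single code unit 0xFF5E, while U+10000 is the surrogate pair 0xD800 0xDC00. Code unit order therefore puts U+10000 first, and code point order puts U+FF5E first.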

See also: unorm_compare; normalize; UNORM_FCD; u_strCompare; u_strCaseCompare

Stable: ICU 2.2

Definition at line 800 of file normlzr.h.

References icu::UnicodeString::getBuffer(), icu::UnicodeString::length(), and unorm_compare().

◆ compose()

static void icu::Normalizer::compose(const UnicodeString &source, UBool compat, int32_t options, UnicodeString &result, UErrorCode &status) [static]

Compose a UnicodeString.

This is equivalent to normalize() with mode UNORM_NFC or UNORM_NFKC. This is a wrapper for unorm_normalize(), using UnicodeStrings.

The options parameter specifies which optional Normalizer features are to be enabled for this operation.

Parameters

source	the string to be composed.
compat	Perform compatibility decomposition before composition. If this argument is `false`, only canonical decomposition will be performed.
options	the optional features to be enabled (0 for no options)
result	The composed string (on output).
status	The error code.

Deprecated: ICU 56. Use Normalizer2 instead.

◆ concatenate()

static UnicodeString& icu::Normalizer::concatenate(const UnicodeString &left, const UnicodeString &right, UnicodeString &result, UNormalizationMode mode, int32_t options, UErrorCode &errorCode) [static]

Concatenate normalized strings, making sure that the result is normalized as well.

If both the left and the right strings are in the normalization form according to "mode/options", then the result will be

dest=normalize(left+right, mode, options)

For details see unorm_concatenate in unorm.h.

Parameters

left	Left source string.
right	Right source string.
result	The output string.
mode	The normalization mode.
options	A bit set of normalization options.
errorCode	ICU error code in/out parameter. Must fulfill U_SUCCESS before the function call.

Returns: result

See also: unorm_concatenate; normalize; unorm_next; unorm_previous

Deprecated: ICU 56. Use Normalizer2 instead.

◆ current()

UChar32 icu::Normalizer::current()

Return the current character in the normalized text.

current() may need to normalize some text at getIndex(). The getIndex() is not changed.

Returns: the current normalized code point

Deprecated: ICU 56. Use Normalizer2 instead.

◆ decompose()

static void icu::Normalizer::decompose(const UnicodeString &source, UBool compat, int32_t options, UnicodeString &result, UErrorCode &status) [static]

Static method to decompose a UnicodeString.

This is equivalent to normalize() with mode UNORM_NFD or UNORM_NFKD. This is a wrapper for unorm_normalize(), using UnicodeStrings.

The options parameter specifies which optional Normalizer features are to be enabled for this operation.

Parameters

source	the string to be decomposed.
compat	Perform compatibility decomposition. If this argument is `false`, only canonical decomposition will be performed.
options	the optional features to be enabled (0 for no options)
result	The decomposed string (on output).
status	The error code.

Deprecated: ICU 56. Use Normalizer2 instead.

◆ endIndex()

int32_t icu::Normalizer::endIndex() const

Retrieve the index of the end of the input text.

This is the end index of the CharacterIterator or the length of the string over which this Normalizer is iterating. This end index is exclusive, i.e., the Normalizer operates only on characters before this index.

Returns: the first index in the input text where the Normalizer does not operate

Deprecated: ICU 56. Use Normalizer2 instead.

◆ first()

UChar32 icu::Normalizer::first()

Return the first character in the normalized text.

This is equivalent to setIndexOnly(startIndex()) followed by next(). (Post-increment semantics.)

Returns: the first normalized code point

Deprecated: ICU 56. Use Normalizer2 instead.

◆ getDynamicClassID()

virtual UClassID icu::Normalizer::getDynamicClassID() const [override, virtual]

ICU "poor man's RTTI", returns a UClassID for the actual class.

Returns: a UClassID for the actual class.

Deprecated: ICU 56. Use Normalizer2 instead.

Reimplemented from icu::UObject.

◆ getIndex()

int32_t icu::Normalizer::getIndex() const

Retrieve the current iteration position in the input text that is being normalized.

A following call to next() will return a normalized code point from the input text at or after this index.

After a call to previous(), getIndex() will point at or before the position in the input text where the normalized code point was returned from with previous().

Returns: the current index in the input text

Deprecated: ICU 56. Use Normalizer2 instead.

◆ getOption()

UBool icu::Normalizer::getOption(int32_t option) const

Determine whether an option is turned on or off.

If multiple options are specified, then the result is true if any of them are set.

Parameters

option	the option(s) that are to be checked

Returns: true if any of the option(s) are set

See also: setOption

Deprecated: ICU 56. Use Normalizer2 instead.

◆ getStaticClassID()

static UClassID icu::Normalizer::getStaticClassID() [static]

ICU "poor man's RTTI", returns a UClassID for this class.

Returns: a UClassID for this class.

Deprecated: ICU 56. Use Normalizer2 instead.

◆ getText()

void icu::Normalizer::getText(UnicodeString &result)

Copies the input text into the UnicodeString argument.

Parameters

result	Receives a copy of the text under iteration.

Deprecated: ICU 56. Use Normalizer2 instead.

◆ getUMode()

UNormalizationMode icu::Normalizer::getUMode() const

Return the normalization mode for this object.

This is an unusual name because there used to be a getMode() that returned a different type.

Returns: the mode for this Normalizer

See also: setMode

Deprecated: ICU 56. Use Normalizer2 instead.

◆ hashCode()

int32_t icu::Normalizer::hashCode() const

Generates a hash code for this iterator.

Returns: the hash code

Deprecated: ICU 56. Use Normalizer2 instead.

◆ isNormalized() [1/2]

static UBool icu::Normalizer::isNormalized(const UnicodeString &src, UNormalizationMode mode, int32_t options, UErrorCode &errorCode) [static]

Test if a string is in a given normalization form; same as the other version of isNormalized but takes an extra options parameter like most normalization functions.

Parameters

src	String that is to be tested if it is in a normalization format.
mode	Which normalization form to test for.
options	the optional features to be enabled (0 for no options)
errorCode	ICU error code in/out parameter. Must fulfill U_SUCCESS before the function call.

Returns: Boolean value indicating whether the source string is in the "mode" normalization form.

See also: quickCheck

Deprecated: ICU 56. Use Normalizer2 instead.

◆ isNormalized() [2/2]

UBool icu::Normalizer::isNormalized(const UnicodeString &src, UNormalizationMode mode, UErrorCode &errorCode) [inline, static]

Test if a string is in a given normalization form.

This is semantically equivalent to source.equals(normalize(source, mode)).

Unlike unorm_quickCheck(), this function returns a definitive result, never a "maybe". For NFD, NFKD, and FCD, both functions work exactly the same. For NFC and NFKC where quickCheck may return "maybe", this function will perform further tests to arrive at a true/false result.

Parameters

src	String that is to be tested if it is in a normalization format.
mode	Which normalization form to test for.
errorCode	ICU error code in/out parameter. Must fulfill U_SUCCESS before the function call.

Returns: Boolean value indicating whether the source string is in the "mode" normalization form.

See also: quickCheck

Deprecated: ICU 56. Use Normalizer2 instead.

Definition at line 792 of file normlzr.h.

◆ last()

UChar32 icu::Normalizer::last()

Return the last character in the normalized text.

This is equivalent to setIndexOnly(endIndex()) followed by previous(). (Pre-decrement semantics.)

Returns: the last normalized code point

Deprecated: ICU 56. Use Normalizer2 instead.

◆ next()

UChar32 icu::Normalizer::next()

Return the next character in the normalized text.

(Post-increment semantics.) If the end of the text has already been reached, DONE is returned. The DONE value could be confused with a U+FFFF non-character code point in the text. If this is possible, you can test getIndex()<endIndex() before calling next(), or (getIndex()<endIndex() || last()!=DONE) after calling next(). (Calling last() will change the iterator state!)

The C API unorm_next() is more efficient and does not have this ambiguity.

Returns: the next normalized code point
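
The DONE ambiguity and the position check recommended above can be sketched with a toy stand-in. `ToyIter`, `collectAll`, `collectUntilDone`, and `kDone` are hypothetical names; the toy uses identity "normalization" and only mimics the getIndex()/endIndex()/next() shape of the real class.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

constexpr char32_t kDone = 0xFFFF;  // mirrors the DONE sentinel

// Hypothetical stand-in with identity "normalization"; not ICU.
class ToyIter {
public:
    explicit ToyIter(std::u32string text) : text_(std::move(text)) {}
    std::size_t getIndex() const { return index_; }
    std::size_t endIndex() const { return text_.size(); }
    char32_t next() { return index_ < text_.size() ? text_[index_++] : kDone; }

private:
    std::u32string text_;
    std::size_t index_ = 0;
};

// Checks the position before each next() call, so a literal U+FFFF in the
// text is collected like any other code point.
std::vector<char32_t> collectAll(ToyIter& it) {
    std::vector<char32_t> out;
    while (it.getIndex() < it.endIndex()) {
        out.push_back(it.next());
    }
    return out;
}

// Naive loop that treats the first value equal to kDone as the end.
std::vector<char32_t> collectUntilDone(ToyIter& it) {
    std::vector<char32_t> out;
    for (char32_t c = it.next(); c != kDone; c = it.next()) {
        out.push_back(c);
    }
    return out;
}
```

On the three-code-point text 'a', U+FFFF, 'b', the position-checking loop collects all three values, while the naive loop stops after 'a'.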

Deprecated: ICU 56. Use Normalizer2 instead.

◆ normalize()

static void icu::Normalizer::normalize(const UnicodeString &source, UNormalizationMode mode, int32_t options, UnicodeString &result, UErrorCode &status) [static]

Normalizes a UnicodeString according to the specified normalization mode.

This is a wrapper for unorm_normalize(), using UnicodeStrings.

The options parameter specifies which optional Normalizer features are to be enabled for this operation.

Parameters

source	the input string to be normalized.
mode	the normalization mode
options	the optional features to be enabled (0 for no options)
result	The normalized string (on output).
status	The error code.

Deprecated: ICU 56. Use Normalizer2 instead.

◆ operator!=()

bool icu::Normalizer::operator!=(const Normalizer &that) const [inline]

Returns false when both iterators refer to the same character in the same input text.

Parameters

that	a Normalizer object to compare this one to

Returns: comparison result

Deprecated: ICU 56. Use Normalizer2 instead.

Definition at line 781 of file normlzr.h.

References icu::operator==().

◆ operator==()

bool icu::Normalizer::operator==(const Normalizer &that) const

Returns true when both iterators refer to the same character in the same input text.

Parameters

that	a Normalizer object to compare this one to

Returns: comparison result

Deprecated: ICU 56. Use Normalizer2 instead.

◆ previous()

UChar32 icu::Normalizer::previous()

Return the previous character in the normalized text and decrement.

(Pre-decrement semantics.) If the beginning of the text has already been reached, DONE is returned. The DONE value could be confused with a U+FFFF non-character code point in the text. If this is possible, you can test (getIndex()>startIndex() || first()!=DONE). (Calling first() will change the iterator state!)

The C API unorm_previous() is more efficient and does not have this ambiguity.

Returns: the previous normalized code point

Deprecated: ICU 56. Use Normalizer2 instead.

◆ quickCheck() [1/2]

static UNormalizationCheckResult icu::Normalizer::quickCheck(const UnicodeString &source, UNormalizationMode mode, int32_t options, UErrorCode &status) [static]

Performs a quick check on a string; same as the other version of quickCheck but takes an extra options parameter like most normalization functions.

Parameters

source	string for determining if it is in a normalized format
mode	normalization format
options	the optional features to be enabled (0 for no options)
status	A reference to a UErrorCode to receive any errors

Returns: UNORM_YES, UNORM_NO or UNORM_MAYBE

See also: isNormalized

Deprecated: ICU 56. Use Normalizer2 instead.

◆ quickCheck() [2/2]

UNormalizationCheckResult icu::Normalizer::quickCheck(const UnicodeString &source, UNormalizationMode mode, UErrorCode &status) [inline, static]

Performs a quick check on a string to determine whether the string is in a particular normalization form.

This is a wrapper for unorm_quickCheck(), using a UnicodeString.

Three results are possible: UNORM_YES, UNORM_NO, or UNORM_MAYBE. UNORM_YES indicates that the argument string is in the desired normalization form; UNORM_NO indicates that it is not. UNORM_MAYBE indicates that a more thorough check is required; the user may have to put the string in its normalized form and compare the results.

Parameters

source	string for determining if it is in a normalized format
mode	normalization format
status	A reference to a UErrorCode to receive any errors

Returns: UNORM_YES, UNORM_NO or UNORM_MAYBE
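
The "maybe" result and the normalize-and-compare fallback described above can be sketched with a toy model. Everything below is hypothetical and heavily simplified: `ToyCheck`, `toyCompose`, `toyQuickCheck`, and `toyIsNormalized` are invented names, and the only composition rule assumed is "e" followed by U+0301 becoming U+00E9. It is not the ICU algorithm.

```cpp
#include <cassert>
#include <string>

enum class ToyCheck { No, Yes, Maybe };

// Toy composition: only "e" + U+0301 -> U+00E9 (assumption for this sketch).
std::u32string toyCompose(const std::u32string& s) {
    std::u32string out;
    for (char32_t c : s) {
        if (c == U'\u0301' && !out.empty() && out.back() == U'e') {
            out.back() = U'\u00E9';
        } else {
            out.push_back(c);
        }
    }
    return out;
}

// Toy quick check: a combining acute may or may not combine with what
// precedes it, so it yields Maybe. Everything else is treated as Yes.
// (This toy never returns No.)
ToyCheck toyQuickCheck(const std::u32string& s) {
    for (char32_t c : s) {
        if (c == U'\u0301') return ToyCheck::Maybe;
    }
    return ToyCheck::Yes;
}

// Definitive answer: resolve Maybe by normalizing and comparing.
bool toyIsNormalized(const std::u32string& s) {
    switch (toyQuickCheck(s)) {
        case ToyCheck::Yes:   return true;
        case ToyCheck::No:    return false;
        case ToyCheck::Maybe: break;
    }
    return toyCompose(s) == s;
}
```

Here "e" + U+0301 and "x" + U+0301 both quick-check as Maybe, but only the second survives composition unchanged, so only the second counts as normalized in the toy model.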

See also: isNormalized

Deprecated: ICU 56. Use Normalizer2 instead.

Definition at line 785 of file normlzr.h.

◆ reset()

void icu::Normalizer::reset()

Reset the index to the beginning of the text.

This is equivalent to setIndexOnly(startIndex()).

Deprecated: ICU 56. Use Normalizer2 instead.

◆ setIndexOnly()

void icu::Normalizer::setIndexOnly(int32_t index)

Set the iteration position in the input text that is being normalized, without any immediate normalization.

After setIndexOnly(), getIndex() will return the same index that is specified here.

Parameters

index	the desired index in the input text.

Deprecated: ICU 56. Use Normalizer2 instead.

◆ setMode()

void icu::Normalizer::setMode(UNormalizationMode newMode)

Set the normalization mode for this object.

Note: If the normalization mode is changed while iterating over a string, calls to next() and previous() may return previously buffered characters in the old normalization mode until the iteration is able to re-sync at the next base character. It is safest to call setIndexOnly, reset(), setText, first(), last(), etc. after calling setMode.

Parameters

newMode	the new mode for this Normalizer.

See also: getUMode

Deprecated: ICU 56. Use Normalizer2 instead.

◆ setOption()

void icu::Normalizer::setOption(int32_t option, UBool value)

Set options that affect this Normalizer's operation.

Options do not change the basic composition or decomposition operation that is being performed, but they control whether certain optional portions of the operation are done. Currently the only available option is obsolete.

It is possible to specify multiple options that are all turned on or off.

Parameters

option	the option(s) whose value is/are to be set.
value	the new setting for the option. Use `true` to turn the option(s) on and `false` to turn it/them off.

See also: getOption
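
The documented behavior of setOption() and getOption() (several options set or cleared together, and a query that is true if any requested option is set) matches ordinary bit-mask handling. The sketch below is an assumed model of that behavior, not ICU source; `ToyOptions`, `kOptionA`, and `kOptionB` are hypothetical names and values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical option bits for illustration; not ICU constants.
constexpr std::int32_t kOptionA = 0x01;
constexpr std::int32_t kOptionB = 0x02;

class ToyOptions {
public:
    // Turn all given option bits on or off together.
    void setOption(std::int32_t option, bool value) {
        if (value) {
            bits_ |= option;
        } else {
            bits_ &= ~option;
        }
    }

    // True if any of the given option bits is set.
    bool getOption(std::int32_t option) const {
        return (bits_ & option) != 0;
    }

private:
    std::int32_t bits_ = 0;
};
```

Querying `kOptionA | kOptionB` after setting only `kOptionA` returns true in this model, which is the "any of them" rule stated for getOption().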

Deprecated: ICU 56. Use Normalizer2 instead.

◆ setText() [1/3]

void icu::Normalizer::setText(const CharacterIterator &newText, UErrorCode &status)

Set the input text over which this Normalizer will iterate.

The iteration position is set to the beginning.

Parameters

newText	a CharacterIterator object that replaces the current input text
status	a UErrorCode

Deprecated: ICU 56. Use Normalizer2 instead.

◆ setText() [2/3]

void icu::Normalizer::setText(const UnicodeString &newText, UErrorCode &status)

Set the input text over which this Normalizer will iterate.

The iteration position is set to the beginning.

Parameters

newText	a string that replaces the current input text
status	a UErrorCode

Deprecated: ICU 56. Use Normalizer2 instead.

◆ setText() [3/3]

void icu::Normalizer::setText(ConstChar16Ptr newText, int32_t length, UErrorCode &status)

Set the input text over which this Normalizer will iterate.

The iteration position is set to the beginning.

Parameters

newText	a string that replaces the current input text
length	the length of the string, or -1 if NUL-terminated
status	a UErrorCode

Deprecated: ICU 56. Use Normalizer2 instead.

◆ startIndex()

int32_t icu::Normalizer::startIndex() const

Retrieve the index of the start of the input text.

This is the begin index of the CharacterIterator or the start (i.e. index 0) of the string over which this Normalizer is iterating.

Returns: the smallest index in the input text where the Normalizer operates

Deprecated: ICU 56. Use Normalizer2 instead.

The documentation for this class was generated from the following file:

common/unicode/normlzr.h
