Commit 16c302f

Simplify code for getting a unicode codepoint's canonical class.

Three places of unicode_norm.c use a similar logic for getting the combining class from a codepoint. Commit 2991ac5 has added the function get_canonical_class() for this purpose, but it was only called by the backend. This commit refactors the code to use this function in all the places where the combining class is retrieved from a given codepoint.

Author: John Naylor
Discussion: https://postgr.es/m/CAFBsxsHUV7s7YrOm6hFz-Jq8Sc7K_yxTkfNZxsDV-DuM-k-gwg@mail.gmail.com
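As a rough illustration of the pattern this commit consolidates, the sketch below wraps a table lookup that may return NULL in a helper that falls back to combining class 0. It is not the PostgreSQL code: `demo_entry`, `demo_table`, `demo_get_entry()` and `demo_canonical_class()` are hypothetical names, and the three-row table and linear search are assumptions made for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a decomposition table entry. */
typedef struct
{
    uint32_t codepoint;
    uint8_t  comb_class;
} demo_entry;

/* Tiny illustrative table; the real table is generated and far larger. */
static const demo_entry demo_table[] = {
    {0x0301, 230},              /* COMBINING ACUTE ACCENT */
    {0x0323, 220},              /* COMBINING DOT BELOW */
    {0x0327, 202},              /* COMBINING CEDILLA */
};

/* Return the table entry for a codepoint, or NULL if it has none. */
static const demo_entry *
demo_get_entry(uint32_t code)
{
    size_t n = sizeof(demo_table) / sizeof(demo_table[0]);

    for (size_t i = 0; i < n; i++)
    {
        if (demo_table[i].codepoint == code)
            return &demo_table[i];
    }
    return NULL;
}

/* Single place that maps "no entry" to combining class 0. */
static uint8_t
demo_canonical_class(uint32_t code)
{
    const demo_entry *entry = demo_get_entry(code);

    return entry ? entry->comb_class : 0;
}
```

With a helper like this, callers compare classes directly and no longer need their own NULL checks on the entry pointer.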

1 parent df99ddc, commit 16c302f

`src/common/unicode_norm.c`

Lines changed: 22 additions & 25 deletions

Original file line number	Diff line number	Diff line change
`@@ -105,6 +105,23 @@ get_code_entry(pg_wchar code)`
`105`	`105`	`#endif`
`106`	`106`	`}`
`107`	`107`
	`108`	`+/*`
	`109`	`+ * Get the combining class of the given codepoint.`
	`110`	`+ */`
	`111`	`+static uint8`
	`112`	`+get_canonical_class(pg_wchar code)`
	`113`	`+{`
	`114`	`+const pg_unicode_decomposition *entry = get_code_entry(code);`
	`115`	`+`
	`116`	`+/*`
	`117`	`+ * If no entries are found, the character used is either an Hangul`
	`118`	`+ * character or a character with a class of 0 and no decompositions.`
	`119`	`+ */`
	`120`	`+if (!entry)`
	`121`	`+return 0;`
	`122`	`+else`
	`123`	`+return entry->comb_class;`
	`124`	`+}`
`108`	`125`
`109`	`126`	`/*`
`110`	`127`	`* Given a decomposition entry looked up earlier, get the decomposed`
`@@ -430,16 +447,8 @@ unicode_normalize(UnicodeNormalizationForm form, const pg_wchar *input)`
`430`	`447`	`pg_wchar prev = decomp_chars[count - 1];`
`431`	`448`	`pg_wchar next = decomp_chars[count];`
`432`	`449`	`pg_wchar tmp;`
`433`		`-const pg_unicode_decomposition *prevEntry = get_code_entry(prev);`
`434`		`-const pg_unicode_decomposition *nextEntry = get_code_entry(next);`
`435`		`-`
`436`		`-/*`
`437`		`- * If no entries are found, the character used is either an Hangul`
`438`		`- * character or a character with a class of 0 and no decompositions,`
`439`		`- * so move to next result.`
`440`		`- */`
`441`		`-if (prevEntry == NULL || nextEntry == NULL)`
`442`		`-continue;`
	`450`	`+const uint8 prevClass = get_canonical_class(prev);`
	`451`	`+const uint8 nextClass = get_canonical_class(next);`
`443`	`452`
`444`	`453`	`/*`
`445`	`454`	`* Per Unicode (https://www.unicode.org/reports/tr15/tr15-18.html)`
`@@ -449,10 +458,10 @@ unicode_normalize(UnicodeNormalizationForm form, const pg_wchar *input)`
`449`	`458`	`* combining class for the second, and the second is not a starter. A`
`450`	`459`	`* character is a starter if its combining class is 0.`
`451`	`460`	`*/`
`452`		`-if (nextEntry->comb_class == 0x0 || prevEntry->comb_class == 0x0)`
	`461`	`+if (prevClass == 0 || nextClass == 0)`
`453`	`462`	`continue;`
`454`	`463`
`455`		`-if (prevEntry->comb_class <= nextEntry->comb_class)`
	`464`	`+if (prevClass <= nextClass)`
`456`	`465`	`continue;`
`457`	`466`
`458`	`467`	`/* exchange can happen */`
`@@ -489,8 +498,7 @@ unicode_normalize(UnicodeNormalizationForm form, const pg_wchar *input)`
`489`	`498`	`for (count = 1; count < decomp_size; count++)`
`490`	`499`	`{`
`491`	`500`	`pg_wchar ch = decomp_chars[count];`
`492`		`-const pg_unicode_decomposition *ch_entry = get_code_entry(ch);`
`493`		`-int ch_class = (ch_entry == NULL) ? 0 : ch_entry->comb_class;`
	`501`	`+int ch_class = get_canonical_class(ch);`
`494`	`502`	`pg_wchar composite;`
`495`	`503`
`496`	`504`	`if (last_class < ch_class &&`
`@@ -527,17 +535,6 @@ unicode_normalize(UnicodeNormalizationForm form, const pg_wchar *input)`
`527`	`535`	`/* We only need this in the backend. */`
`528`	`536`	`#ifndef FRONTEND`
`529`	`537`
`530`		`-static uint8`
`531`		`-get_canonical_class(pg_wchar ch)`
`532`		`-{`
`533`		`-const pg_unicode_decomposition *entry = get_code_entry(ch);`
`534`		`-`
`535`		`-if (!entry)`
`536`		`-return 0;`
`537`		`-else`
`538`		`-return entry->comb_class;`
`539`		`-}`
`540`		`-`
`541`	`538`	`static const pg_unicode_normprops *`
`542`	`539`	`qc_hash_lookup(pg_wchar ch, const pg_unicode_norminfo *norminfo)`
`543`	`540`	`{`
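The second hunk is behavior-preserving: a codepoint with no entry now yields class 0, and the existing `prevClass == 0 || nextClass == 0` test skips it just as the removed NULL check did. The sketch below shows a canonical reordering pass of that shape on a plain array. It is an assumption-laden illustration rather than the code in unicode_norm.c: `demo_class()` and `demo_canonical_reorder()` are hypothetical names, the class lookup covers only three combining marks, and the backtracking step is modeled on the usual swap-and-recheck approach.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical class lookup: three combining marks, everything else is a starter. */
static uint8_t
demo_class(uint32_t code)
{
    switch (code)
    {
        case 0x0301: return 230;    /* COMBINING ACUTE ACCENT */
        case 0x0323: return 220;    /* COMBINING DOT BELOW */
        case 0x0327: return 202;    /* COMBINING CEDILLA */
        default:     return 0;      /* starter, or not in this toy table */
    }
}

/*
 * Swap adjacent codepoints when the first has a higher combining class
 * than the second and neither one is a starter (class 0).
 */
static void
demo_canonical_reorder(uint32_t *chars, int size)
{
    for (int count = 1; count < size; count++)
    {
        const uint8_t prevClass = demo_class(chars[count - 1]);
        const uint8_t nextClass = demo_class(chars[count]);
        uint32_t      tmp;

        /* Starters are never moved, so no exchange across them. */
        if (prevClass == 0 || nextClass == 0)
            continue;

        /* Already in non-decreasing class order. */
        if (prevClass <= nextClass)
            continue;

        /* Exchange the pair. */
        tmp = chars[count - 1];
        chars[count - 1] = chars[count];
        chars[count] = tmp;

        /* Step back so the moved codepoint is compared with its new neighbor. */
        if (count > 1)
            count -= 2;
    }
}
```

For example, the sequence U+0061, U+0301, U+0323 is reordered to U+0061, U+0323, U+0301, because 220 sorts before 230 and the leading starter stays in place.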
