Commit 2991ac5

Add SQL functions for Unicode normalization

This adds SQL expressions NORMALIZE() and IS NORMALIZED to convert and
check Unicode normal forms, per SQL standard.

To support fast IS NORMALIZED tests, we pull in a new data file
DerivedNormalizationProps.txt from Unicode and build a lookup table
from that, using techniques similar to ones already used for other
Unicode data. make update-unicode will keep it up to date. We only
build and use these tables for the NFC and NFKC forms, because they
are too big for NFD and NFKD and the improvement is not significant
enough there.

Reviewed-by: Daniel Verite <daniel@manitou-mail.org>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://www.postgresql.org/message-id/flat/c1909f27-c269-2ed9-12f8-3ab72c8caf7a@2ndquadrant.com

1 parent 070c3d3, commit 2991ac5

File tree

20 files changed, +6764 -7 lines changed

doc/src/sgml
- charset.sgml
- func.sgml
src
- backend
  - catalog
    - sql_features.txt
    - system_views.sql
  - parser
    - gram.y
  - utils/adt
    - varlena.c
- common
  - unicode_norm.c
  - unicode
- include
  - catalog
    - catversion.h
    - pg_proc.dat
  - common
    - unicode_norm.h
    - unicode_normprops_table.h
  - parser
    - kwlist.h
- test/regress
  - expected
    - unicode.out
    - unicode_1.out
  - parallel_schedule
  - serial_schedule
  - sql
    - unicode.sql

`doc/src/sgml/charset.sgml`

Lines changed: 10 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -934,6 +934,16 @@ CREATE COLLATION ignore_accents (provider = icu, locale = 'und-u-ks-level1-kc-tr`
`934`	`934`	`such as pattern matching operations. Therefore, they should be used`
`935`	`935`	`only in cases where they are specifically wanted.`
`936`	`936`	`</para>`
	`937`	`+`
	`938`	`+ <tip>`
	`939`	`+ <para>`
	`940`	`+ To deal with text in different Unicode normalization forms, it is also`
	`941`	`+ an option to use the functions/expressions`
	`942`	`+ <function>normalize</function> and <literal>is normalized</literal> to`
	`943`	`+ preprocess or check the strings, instead of using nondeterministic`
	`944`	`+ collations. There are different trade-offs for each approach.`
	`945`	`+ </para>`
	`946`	`+ </tip>`
`937`	`947`	`</sect3>`
`938`	`948`	`</sect2>`
`939`	`949`	`</sect1>`

`doc/src/sgml/func.sgml`

Lines changed: 48 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -1560,6 +1560,30 @@`
`1560`	`1560`	`<entry><literal>Value: 42</literal></entry>`
`1561`	`1561`	`</row>`
`1562`	`1562`
	`1563`	`+ <row>`
	`1564`	`+ <entry>`
	`1565`	`+ <indexterm>`
	`1566`	`+ <primary>normalized</primary>`
	`1567`	`+ </indexterm>`
	`1568`	`+ <indexterm>`
	`1569`	`+ <primary>Unicode normalization</primary>`
	`1570`	`+ </indexterm>`
	`1571`	`+ <literal><parameter>string</parameter> is <optional>not</optional> <optional><parameter>form</parameter></optional> normalized</literal>`
	`1572`	`+ </entry>`
	`1573`	`+ <entry><type>boolean</type></entry>`
	`1574`	`+ <entry>`
	`1575`	`+ Checks whether the string is in the specified Unicode normalization`
	`1576`	`+ form. The optional parameter specifies the form:`
	`1577`	`+ <literal>NFC</literal> (default), <literal>NFD</literal>,`
	`1578`	`+ <literal>NFKC</literal>, <literal>NFKD</literal>. This expression can`
	`1579`	`+ only be used if the server encoding is <literal>UTF8</literal>. Note`
	`1580`	`+ that checking for normalization using this expression is often faster`
	`1581`	`+ than normalizing possibly already normalized strings.`
	`1582`	`+ </entry>`
	`1583`	`+ <entry><literal>U&'\0061\0308bc' IS NFD NORMALIZED</literal></entry>`
	`1584`	`+ <entry><literal>true</literal></entry>`
	`1585`	`+ </row>`
	`1586`	`+`
`1563`	`1587`	`<row>`
`1564`	`1588`	`<entry>`
`1565`	`1589`	`<indexterm>`
`@@ -1610,6 +1634,30 @@`
`1610`	`1634`	`<entry><literal>tom</literal></entry>`
`1611`	`1635`	`</row>`
`1612`	`1636`
	`1637`	`+ <row>`
	`1638`	`+ <entry>`
	`1639`	`+ <indexterm>`
	`1640`	`+ <primary>normalize</primary>`
	`1641`	`+ </indexterm>`
	`1642`	`+ <indexterm>`
	`1643`	`+ <primary>Unicode normalization</primary>`
	`1644`	`+ </indexterm>`
	`1645`	`+ <literal><function>normalize(<parameter>string</parameter> <type>text</type>`
	`1646`	`+ <optional>, <parameter>form</parameter> </optional>)</function></literal>`
	`1647`	`+ </entry>`
	`1648`	`+ <entry><type>text</type></entry>`
	`1649`	`+ <entry>`
	`1650`	`+ Converts the string in the first argument to the specified Unicode`
	`1651`	`+ normalization form. The optional second argument specifies the form`
	`1652`	`+ as an identifier: <literal>NFC</literal> (default),`
	`1653`	`+ <literal>NFD</literal>, <literal>NFKC</literal>,`
	`1654`	`+ <literal>NFKD</literal>. This function can only be used if the server`
	`1655`	`+ encoding is <literal>UTF8</literal>.`
	`1656`	`+ </entry>`
	`1657`	`+ <entry><literal>normalize(U&'\0061\0308bc', NFC)</literal></entry>`
	`1658`	`+ <entry><literal>U&'\00E4bc'</literal></entry>`
	`1659`	`+ </row>`
	`1660`	`+`
`1613`	`1661`	`<row>`
`1614`	`1662`	`<entry>`
`1615`	`1663`	`<indexterm>`
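
The documentation example above converts `U&'\0061\0308bc'` to `U&'\00E4bc'`. The two spellings render identically but are different byte strings in UTF-8, which is why a normalization step matters before comparing them. The following is a minimal sketch of that difference, not code from this commit: `encode_utf8` and `encode_string` are hypothetical helper names, and the encoder is a plain textbook UTF-8 encoder rather than PostgreSQL's `unicode_to_utf8`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: encode one code point as UTF-8.
 * Returns the number of bytes written, or 0 for an invalid code point.
 */
static size_t
encode_utf8(uint32_t cp, unsigned char *out)
{
    if (cp <= 0x7F)
    {
        out[0] = (unsigned char) cp;
        return 1;
    }
    if (cp <= 0x7FF)
    {
        out[0] = (unsigned char) (0xC0 | (cp >> 6));
        out[1] = (unsigned char) (0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp <= 0xFFFF)
    {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;           /* surrogates are not valid scalar values */
        out[0] = (unsigned char) (0xE0 | (cp >> 12));
        out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char) (0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF)
    {
        out[0] = (unsigned char) (0xF0 | (cp >> 18));
        out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char) (0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/*
 * Hypothetical helper: encode a zero-terminated code point array.
 * The caller must provide at least 4 bytes of buffer per code point.
 * Returns the total number of bytes written.
 */
static size_t
encode_string(const uint32_t *cps, unsigned char *buf)
{
    size_t      len = 0;

    assert(cps != NULL && buf != NULL);
    for (const uint32_t *p = cps; *p; p++)
        len += encode_utf8(*p, buf + len);
    return len;
}
```

Encoding the decomposed sequence `{0x61, 0x308, 0x62, 0x63}` yields five bytes starting `61 CC 88`, while the composed sequence `{0xE4, 0x62, 0x63}` yields four bytes starting `C3 A4`, so a bytewise comparison treats them as different strings.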

`src/backend/catalog/sql_features.txt`

Lines changed: 1 addition & 1 deletion

Original file line number	Diff line number	Diff line change
`@@ -257,7 +257,7 @@ F386 Set identity column generation clause YES`
`257`	`257`	`F391 Long identifiers YES`
`258`	`258`	`F392 Unicode escapes in identifiers YES`
`259`	`259`	`F393 Unicode escapes in literals YES`
`260`		`-F394 Optional normal form specification NO`
	`260`	`+F394 Optional normal form specification YES`
`261`	`261`	`F401 Extended joined table YES`
`262`	`262`	`F401 Extended joined table 01 NATURAL JOIN YES`
`263`	`263`	`F401 Extended joined table 02 FULL OUTER JOIN YES`

`src/backend/catalog/system_views.sql`

Lines changed: 15 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -1400,6 +1400,21 @@ LANGUAGE INTERNAL`
`1400`	`1400`	`STRICT STABLE PARALLEL SAFE`
`1401`	`1401`	`AS 'jsonb_path_query_first_tz';`
`1402`	`1402`
	`1403`	`+-- default normalization form is NFC, per SQL standard`
	`1404`	`+CREATE OR REPLACE FUNCTION`
	`1405`	`+ "normalize"(text, text DEFAULT 'NFC')`
	`1406`	`+RETURNS text`
	`1407`	`+LANGUAGE internal`
	`1408`	`+STRICT IMMUTABLE PARALLEL SAFE`
	`1409`	`+AS 'unicode_normalize_func';`
	`1410`	`+`
	`1411`	`+CREATE OR REPLACE FUNCTION`
	`1412`	`+ is_normalized(text, text DEFAULT 'NFC')`
	`1413`	`+RETURNS boolean`
	`1414`	`+LANGUAGE internal`
	`1415`	`+STRICT IMMUTABLE PARALLEL SAFE`
	`1416`	`+AS 'unicode_is_normalized';`
	`1417`	`+`
`1403`	`1418`	`--`
`1404`	`1419`	`-- The default permissions for functions mean that anyone can execute them.`
`1405`	`1420`	`-- A number of functions shouldn't be executable by just anyone, but rather`

`src/backend/parser/gram.y`

Lines changed: 40 additions & 1 deletion

Original file line number	Diff line number	Diff line change
`@@ -444,6 +444,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);`
`444`	`444`	`%type <list> substr_list trim_list`
`445`	`445`	`%type <list> opt_interval interval_second`
`446`	`446`	`%type <node> overlay_placing substr_from substr_for`
	`447`	`+%type <str> unicode_normal_form`
`447`	`448`
`448`	`449`	`%type <boolean> opt_instead`
`449`	`450`	`%type <boolean> opt_unique opt_concurrently opt_verbose opt_full`
`@@ -664,7 +665,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);`
`664`	`665`
`665`	`666`	`MAPPING MATCH MATERIALIZED MAXVALUE METHOD MINUTE_P MINVALUE MODE MONTH_P MOVE`
`666`	`667`
`667`		`-NAME_P NAMES NATIONAL NATURAL NCHAR NEW NEXT NO NONE`
	`668`	`+NAME_P NAMES NATIONAL NATURAL NCHAR NEW NEXT NFC NFD NFKC NFKD NO NONE`
	`669`	`+NORMALIZE NORMALIZED`
`668`	`670`	`NOT NOTHING NOTIFY NOTNULL NOWAIT NULL_P NULLIF`
`669`	`671`	`NULLS_P NUMERIC`
`670`	`672`
`@@ -13491,6 +13493,22 @@ a_expr: c_expr { $$ = $1; }`
`13491`	`13493`	`list_make1($1), @2),`
`13492`	`13494`	`@2);`
`13493`	`13495`	`}`
	`13496`	`+| a_expr IS NORMALIZED %prec IS`
	`13497`	`+ {`
	`13498`	`+ $$ = (Node *) makeFuncCall(SystemFuncName("is_normalized"), list_make1($1), @2);`
	`13499`	`+ }`
	`13500`	`+| a_expr IS unicode_normal_form NORMALIZED %prec IS`
	`13501`	`+ {`
	`13502`	`+ $$ = (Node *) makeFuncCall(SystemFuncName("is_normalized"), list_make2($1, makeStringConst($3, @3)), @2);`
	`13503`	`+ }`
	`13504`	`+| a_expr IS NOT NORMALIZED %prec IS`
	`13505`	`+ {`
	`13506`	`+ $$ = makeNotExpr((Node *) makeFuncCall(SystemFuncName("is_normalized"), list_make1($1), @2), @2);`
	`13507`	`+ }`
	`13508`	`+| a_expr IS NOT unicode_normal_form NORMALIZED %prec IS`
	`13509`	`+ {`
	`13510`	`+ $$ = makeNotExpr((Node *) makeFuncCall(SystemFuncName("is_normalized"), list_make2($1, makeStringConst($4, @4)), @2), @2);`
	`13511`	`+ }`
`13494`	`13512`	`| DEFAULT`
`13495`	`13513`	`{`
`13496`	`13514`	`/*`
`@@ -13934,6 +13952,14 @@ func_expr_common_subexpr:`
`13934`	`13952`	`{`
`13935`	`13953`	`$$ = (Node *) makeFuncCall(SystemFuncName("date_part"), $3, @1);`
`13936`	`13954`	`}`
	`13955`	`+| NORMALIZE '(' a_expr ')'`
	`13956`	`+ {`
	`13957`	`+ $$ = (Node *) makeFuncCall(SystemFuncName("normalize"), list_make1($3), @1);`
	`13958`	`+ }`
	`13959`	`+| NORMALIZE '(' a_expr ',' unicode_normal_form ')'`
	`13960`	`+ {`
	`13961`	`+ $$ = (Node *) makeFuncCall(SystemFuncName("normalize"), list_make2($3, makeStringConst($5, @5)), @1);`
	`13962`	`+ }`
`13937`	`13963`	`| OVERLAY '(' overlay_list ')'`
`13938`	`13964`	`{`
`13939`	`13965`	`/* overlay(A PLACING B FROM C FOR D) is converted to`
`@@ -14569,6 +14595,13 @@ extract_arg:`
`14569`	`14595`	`| Sconst { $$ = $1; }`
`14570`	`14596`	`;`
`14571`	`14597`
	`14598`	`+unicode_normal_form:`
	`14599`	`+ NFC { $$ = "nfc"; }`
	`14600`	`+ | NFD { $$ = "nfd"; }`
	`14601`	`+ | NFKC { $$ = "nfkc"; }`
	`14602`	`+ | NFKD { $$ = "nfkd"; }`
	`14603`	`+;`
	`14604`	`+`
`14572`	`14605`	`/* OVERLAY() arguments`
`14573`	`14606`	`* SQL99 defines the OVERLAY() function:`
`14574`	`14607`	`* o overlay(text placing text from int for int)`
`@@ -15315,7 +15348,12 @@ unreserved_keyword:`
`15315`	`15348`	`| NAMES`
`15316`	`15349`	`| NEW`
`15317`	`15350`	`| NEXT`
	`15351`	`+| NFC`
	`15352`	`+| NFD`
	`15353`	`+| NFKC`
	`15354`	`+| NFKD`
`15318`	`15355`	`| NO`
	`15356`	`+| NORMALIZED`
`15319`	`15357`	`| NOTHING`
`15320`	`15358`	`| NOTIFY`
`15321`	`15359`	`| NOWAIT`
`@@ -15494,6 +15532,7 @@ col_name_keyword:`
`15494`	`15532`	`| NATIONAL`
`15495`	`15533`	`| NCHAR`
`15496`	`15534`	`| NONE`
	`15535`	`+| NORMALIZE`
`15497`	`15536`	`| NULLIF`
`15498`	`15537`	`| NUMERIC`
`15499`	`15538`	`| OUT_P`
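
The grammar above turns the keywords `NFC`, `NFD`, `NFKC`, and `NFKD` into lowercase string constants, and the C side (`unicode_norm_form_from_string` in `varlena.c`) maps the string back to an enum value with a case-insensitive comparison. The following is a standalone sketch of that mapping under some assumptions: `NormForm`, `ascii_case_equal`, and `norm_form_from_string` are hypothetical names, a hand-written ASCII comparison stands in for `pg_strcasecmp`, and an invalid name returns `FORM_INVALID` instead of raising an error through `ereport`.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for UnicodeNormalizationForm. */
typedef enum
{
    FORM_INVALID = -1,
    FORM_NFC,
    FORM_NFD,
    FORM_NFKC,
    FORM_NFKD
} NormForm;

/* Case-insensitive equality for ASCII strings; returns 1 if equal. */
static int
ascii_case_equal(const char *a, const char *b)
{
    while (*a && *b)
    {
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;            /* equal only if both strings ended */
}

/* Map a form name such as "nfc" or "NFKD" to its enum value. */
static NormForm
norm_form_from_string(const char *formstr)
{
    static const struct
    {
        const char *name;
        NormForm    form;
    }           forms[] = {
        {"NFC", FORM_NFC},
        {"NFD", FORM_NFD},
        {"NFKC", FORM_NFKC},
        {"NFKD", FORM_NFKD},
    };

    assert(formstr != NULL);
    for (size_t i = 0; i < sizeof(forms) / sizeof(forms[0]); i++)
    {
        if (ascii_case_equal(formstr, forms[i].name))
            return forms[i].form;
    }
    return FORM_INVALID;
}
```

Because the comparison ignores case, the lowercase constants produced by the grammar and an uppercase name passed directly to the SQL function resolve to the same form.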

`src/backend/utils/adt/varlena.c`

Lines changed: 150 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -22,6 +22,7 @@`
`22`	`22`	`#include "catalog/pg_type.h"`
`23`	`23`	`#include "common/hashfn.h"`
`24`	`24`	`#include "common/int.h"`
	`25`	`+#include "common/unicode_norm.h"`
`25`	`26`	`#include "lib/hyperloglog.h"`
`26`	`27`	`#include "libpq/pqformat.h"`
`27`	`28`	`#include "miscadmin.h"`
`@@ -5976,3 +5977,152 @@ rest_of_char_same(const char *s1, const char *s2, int len)`
`5976`	`5977`	`#include "levenshtein.c"`
`5977`	`5978`	`#define LEVENSHTEIN_LESS_EQUAL`
`5978`	`5979`	`#include "levenshtein.c"`
	`5980`	`+`
	`5981`	`+`
	`5982`	`+/*`
	`5983`	`+ * Unicode support`
	`5984`	`+ */`
	`5985`	`+`
	`5986`	`+static UnicodeNormalizationForm`
	`5987`	`+unicode_norm_form_from_string(const char *formstr)`
	`5988`	`+{`
	`5989`	`+    UnicodeNormalizationForm form = -1;`
	`5990`	`+`
	`5991`	`+    /*`
	`5992`	`+     * Might as well check this while we're here.`
	`5993`	`+     */`
	`5994`	`+    if (GetDatabaseEncoding() != PG_UTF8)`
	`5995`	`+        ereport(ERROR,`
	`5996`	`+                (errcode(ERRCODE_SYNTAX_ERROR),`
	`5997`	`+                 errmsg("Unicode normalization can only be performed if server encoding is UTF8")));`
	`5998`	`+`
	`5999`	`+    if (pg_strcasecmp(formstr, "NFC") == 0)`
	`6000`	`+        form = UNICODE_NFC;`
	`6001`	`+    else if (pg_strcasecmp(formstr, "NFD") == 0)`
	`6002`	`+        form = UNICODE_NFD;`
	`6003`	`+    else if (pg_strcasecmp(formstr, "NFKC") == 0)`
	`6004`	`+        form = UNICODE_NFKC;`
	`6005`	`+    else if (pg_strcasecmp(formstr, "NFKD") == 0)`
	`6006`	`+        form = UNICODE_NFKD;`
	`6007`	`+    else`
	`6008`	`+        ereport(ERROR,`
	`6009`	`+                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),`
	`6010`	`+                 errmsg("invalid normalization form: %s", formstr)));`
	`6011`	`+`
	`6012`	`+    return form;`
	`6013`	`+}`
	`6014`	`+`
	`6015`	`+Datum`
	`6016`	`+unicode_normalize_func(PG_FUNCTION_ARGS)`
	`6017`	`+{`
	`6018`	`+    text *input = PG_GETARG_TEXT_PP(0);`
	`6019`	`+    char *formstr = text_to_cstring(PG_GETARG_TEXT_PP(1));`
	`6020`	`+    UnicodeNormalizationForm form;`
	`6021`	`+    int size;`
	`6022`	`+    pg_wchar *input_chars;`
	`6023`	`+    pg_wchar *output_chars;`
	`6024`	`+    unsigned char *p;`
	`6025`	`+    text *result;`
	`6026`	`+    int i;`
	`6027`	`+`
	`6028`	`+    form = unicode_norm_form_from_string(formstr);`
	`6029`	`+`
	`6030`	`+    /* convert to pg_wchar */`
	`6031`	`+    size = pg_mbstrlen_with_len(VARDATA_ANY(input), VARSIZE_ANY_EXHDR(input));`
	`6032`	`+    input_chars = palloc((size + 1) * sizeof(pg_wchar));`
	`6033`	`+    p = (unsigned char *) VARDATA_ANY(input);`
	`6034`	`+    for (i = 0; i < size; i++)`
	`6035`	`+    {`
	`6036`	`+        input_chars[i] = utf8_to_unicode(p);`
	`6037`	`+        p += pg_utf_mblen(p);`
	`6038`	`+    }`
	`6039`	`+    input_chars[i] = (pg_wchar) '\0';`
	`6040`	`+    Assert((char *) p == VARDATA_ANY(input) + VARSIZE_ANY_EXHDR(input));`
	`6041`	`+`
	`6042`	`+    /* action */`
	`6043`	`+    output_chars = unicode_normalize(form, input_chars);`
	`6044`	`+`
	`6045`	`+    /* convert back to UTF-8 string */`
	`6046`	`+    size = 0;`
	`6047`	`+    for (pg_wchar *wp = output_chars; *wp; wp++)`
	`6048`	`+    {`
	`6049`	`+        unsigned char buf[4];`
	`6050`	`+`
	`6051`	`+        unicode_to_utf8(*wp, buf);`
	`6052`	`+        size += pg_utf_mblen(buf);`
	`6053`	`+    }`
	`6054`	`+`
	`6055`	`+    result = palloc(size + VARHDRSZ);`
	`6056`	`+    SET_VARSIZE(result, size + VARHDRSZ);`
	`6057`	`+`
	`6058`	`+    p = (unsigned char *) VARDATA_ANY(result);`
	`6059`	`+    for (pg_wchar *wp = output_chars; *wp; wp++)`
	`6060`	`+    {`
	`6061`	`+        unicode_to_utf8(*wp, p);`
	`6062`	`+        p += pg_utf_mblen(p);`
	`6063`	`+    }`
	`6064`	`+    Assert((char *) p == (char *) result + size + VARHDRSZ);`
	`6065`	`+`
	`6066`	`+    PG_RETURN_TEXT_P(result);`
	`6067`	`+}`
	`6068`	`+`
	`6069`	`+/*`
	`6070`	`+ * Check whether the string is in the specified Unicode normalization form.`
	`6071`	`+ *`
	`6072`	`+ * This is done by converting the string to the specified normal form and then`
	`6073`	`+ * comparing that to the original string. To speed that up, we also apply the`
	`6074`	`+ * "quick check" algorithm specified in UAX #15, which can give a yes or no`
	`6075`	`+ * answer for many strings by just scanning the string once.`
	`6076`	`+ *`
	`6077`	`+ * This function should generally be optimized for the case where the string`
	`6078`	`+ * is in fact normalized. In that case, we'll end up looking at the entire`
	`6079`	`+ * string, so it's probably not worth doing any incremental conversion etc.`
	`6080`	`+ */`
	`6081`	`+Datum`
	`6082`	`+unicode_is_normalized(PG_FUNCTION_ARGS)`
	`6083`	`+{`
	`6084`	`+    text *input = PG_GETARG_TEXT_PP(0);`
	`6085`	`+    char *formstr = text_to_cstring(PG_GETARG_TEXT_PP(1));`
	`6086`	`+    UnicodeNormalizationForm form;`
	`6087`	`+    int size;`
	`6088`	`+    pg_wchar *input_chars;`
	`6089`	`+    pg_wchar *output_chars;`
	`6090`	`+    unsigned char *p;`
	`6091`	`+    int i;`
	`6092`	`+    UnicodeNormalizationQC quickcheck;`
	`6093`	`+    int output_size;`
	`6094`	`+    bool result;`
	`6095`	`+`
	`6096`	`+    form = unicode_norm_form_from_string(formstr);`
	`6097`	`+`
	`6098`	`+    /* convert to pg_wchar */`
	`6099`	`+    size = pg_mbstrlen_with_len(VARDATA_ANY(input), VARSIZE_ANY_EXHDR(input));`
	`6100`	`+    input_chars = palloc((size + 1) * sizeof(pg_wchar));`
	`6101`	`+    p = (unsigned char *) VARDATA_ANY(input);`
	`6102`	`+    for (i = 0; i < size; i++)`
	`6103`	`+    {`
	`6104`	`+        input_chars[i] = utf8_to_unicode(p);`
	`6105`	`+        p += pg_utf_mblen(p);`
	`6106`	`+    }`
	`6107`	`+    input_chars[i] = (pg_wchar) '\0';`
	`6108`	`+    Assert((char *) p == VARDATA_ANY(input) + VARSIZE_ANY_EXHDR(input));`
	`6109`	`+`
	`6110`	`+    /* quick check (see UAX #15) */`
	`6111`	`+    quickcheck = unicode_is_normalized_quickcheck(form, input_chars);`
	`6112`	`+    if (quickcheck == UNICODE_NORM_QC_YES)`
	`6113`	`+        PG_RETURN_BOOL(true);`
	`6114`	`+    else if (quickcheck == UNICODE_NORM_QC_NO)`
	`6115`	`+        PG_RETURN_BOOL(false);`
	`6116`	`+`
	`6117`	`+    /* normalize and compare with original */`
	`6118`	`+    output_chars = unicode_normalize(form, input_chars);`
	`6119`	`+`
	`6120`	`+    output_size = 0;`
	`6121`	`+    for (pg_wchar *wp = output_chars; *wp; wp++)`
	`6122`	`+        output_size++;`
	`6123`	`+`
	`6124`	`+    result = (size == output_size) &&`
	`6125`	`+        (memcmp(input_chars, output_chars, size * sizeof(pg_wchar)) == 0);`
	`6126`	`+`
	`6127`	`+    PG_RETURN_BOOL(result);`
	`6128`	`+}`
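
The comment on `unicode_is_normalized` refers to the "quick check" algorithm from UAX #15, which scans the string once and answers yes, no, or maybe. The body of `unicode_is_normalized_quickcheck` is not part of the diff shown here, so the following is only a sketch of the general technique as UAX #15 describes it, not PostgreSQL's implementation. The names `QuickCheck`, `CharProps`, `props`, `lookup_props`, and `nfc_quickcheck` are hypothetical, and the four-entry table is a tiny illustrative excerpt; a real table is generated from `UnicodeData.txt` and `DerivedNormalizationProps.txt`. Treating every unlisted code point as "yes" with combining class 0 is a simplification that only holds for this demonstration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef enum
{
    QC_YES,
    QC_NO,
    QC_MAYBE
} QuickCheck;

typedef struct
{
    uint32_t    codepoint;
    uint8_t     ccc;            /* canonical combining class */
    QuickCheck  nfc_qc;         /* NFC_Quick_Check property */
} CharProps;

/* Illustrative excerpt only, sorted by code point for bsearch(). */
static const CharProps props[] = {
    {0x0301, 230, QC_MAYBE},    /* COMBINING ACUTE ACCENT */
    {0x0308, 230, QC_MAYBE},    /* COMBINING DIAERESIS */
    {0x0323, 220, QC_MAYBE},    /* COMBINING DOT BELOW */
    {0x0340, 230, QC_NO},       /* COMBINING GRAVE TONE MARK */
};

static int
cmp_props(const void *key, const void *elem)
{
    uint32_t    cp = *(const uint32_t *) key;
    const CharProps *p = elem;

    return (cp > p->codepoint) - (cp < p->codepoint);
}

/* Binary search in the sorted table; NULL if the code point is unlisted. */
static const CharProps *
lookup_props(uint32_t cp)
{
    return bsearch(&cp, props, sizeof(props) / sizeof(props[0]),
                   sizeof(props[0]), cmp_props);
}

/* One pass over a zero-terminated code point array. */
static QuickCheck
nfc_quickcheck(const uint32_t *input)
{
    uint8_t     last_ccc = 0;
    QuickCheck  result = QC_YES;

    assert(input != NULL);
    for (const uint32_t *p = input; *p; p++)
    {
        const CharProps *cp = lookup_props(*p);
        uint8_t     ccc = cp ? cp->ccc : 0;
        QuickCheck  qc = cp ? cp->nfc_qc : QC_YES;

        /* Combining marks out of canonical order: definitely not normalized. */
        if (ccc != 0 && last_ccc > ccc)
            return QC_NO;
        if (qc == QC_NO)
            return QC_NO;
        if (qc == QC_MAYBE)
            result = QC_MAYBE;
        last_ccc = ccc;
    }
    return result;
}
```

Only a "maybe" result requires the slower path of normalizing the whole string and comparing it with the input, which is the fallback `unicode_is_normalized` takes after its quick check.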

`src/common/unicode/.gitignore`

Lines changed: 1 addition & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -3,5 +3,6 @@`
`3`	`3`
`4`	`4`	`# Downloaded files`
`5`	`5`	`/CompositionExclusions.txt`
	`6`	`+/DerivedNormalizationProps.txt`
`6`	`7`	`/NormalizationTest.txt`
`7`	`8`	`/UnicodeData.txt`

`src/common/unicode/Makefile`

Lines changed: 6 additions & 3 deletions

Original file line number	Diff line number	Diff line change
`@@ -18,14 +18,14 @@ LIBS += $(PTHREAD_LIBS)`
`18`	`18`	`# By default, do nothing.`
`19`	`19`	`all:`
`20`	`20`
`21`		`-update-unicode: unicode_norm_table.h unicode_combining_table.h`
	`21`	`+update-unicode: unicode_norm_table.h unicode_combining_table.h unicode_normprops_table.h`
`22`	`22`	`$(MAKE) normalization-check`
`23`		`-mv unicode_norm_table.h unicode_combining_table.h ../../../src/include/common/`
	`23`	`+mv $^ ../../../src/include/common/`
`24`	`24`
`25`	`25`	`# These files are part of the Unicode Character Database. Download`
`26`	`26`	`# them on demand. The dependency on Makefile.global is for`
`27`	`27`	`# UNICODE_VERSION.`
`28`		`-UnicodeData.txt CompositionExclusions.txt NormalizationTest.txt: $(top_builddir)/src/Makefile.global`
	`28`	`+UnicodeData.txt DerivedNormalizationProps.txt CompositionExclusions.txt NormalizationTest.txt: $(top_builddir)/src/Makefile.global`
`29`	`29`	`$(DOWNLOAD) https://www.unicode.org/Public/$(UNICODE_VERSION)/ucd/$(@F)`
`30`	`30`
`31`	`31`	`# Generation of conversion tables used for string normalization with`
`@@ -36,6 +36,9 @@ unicode_norm_table.h: generate-unicode_norm_table.pl UnicodeData.txt Composition`
`36`	`36`	`unicode_combining_table.h: generate-unicode_combining_table.pl UnicodeData.txt`
`37`	`37`	`$(PERL) $^ >$@`
`38`	`38`
	`39`	`+unicode_normprops_table.h: generate-unicode_normprops_table.pl DerivedNormalizationProps.txt`
	`40`	`+$(PERL) $^ >$@`
	`41`	`+`
`39`	`42`	`# Test suite`
`40`	`43`	`normalization-check: norm_test`
`41`	`44`	`./norm_test`
