Commit bb904e0

authored

closes gh-124016: update Unicode to 16.0.0 (#124017)

1 parent a9594a3 commit bb904e0

File tree

12 files changed

+22581

-20691

lines changed

Doc
- library
  - stdtypes.rst
  - unicodedata.rst
- reference
  - lexical_analysis.rst
- whatsnew
  - 3.14.rst
Lib/test
Misc/NEWS.d/next/Library
- 2024-09-12-10-55-19.gh-issue-124016.ncs0hd.rst
Modules
- unicodedata_db.h
- unicodename_db.h
Objects
- unicodetype_db.h
Tools/unicode
- makeunicodedata.py

`‎Doc/library/stdtypes.rst‎`

Lines changed: 4 additions & 4 deletions

Original file line number	Diff line number	Diff line change
@@ -1679,7 +1679,7 @@ expression support in the :mod:`re` module).
`1679`	`1679`
`1680`	`1680`	`The casefolding algorithm is`
`1681`	`1681`	`described in section 3.13 'Default Case Folding' of the Unicode Standard
`1682`		-<https://www.unicode.org/versions/Unicode15.1.0/ch03.pdf>`__.
	`1682`	+<https://www.unicode.org/versions/Unicode16.0.0/core-spec/chapter-3/#G33992>`__.
`1683`	`1683`
`1684`	`1684`	`.. versionadded:: 3.3`
`1685`	`1685`
@@ -1843,7 +1843,7 @@ expression support in the :mod:`re` module).
`1843`	`1843`	`property being one of "Lm", "Lt", "Lu", "Ll", or "Lo". Note that this is different`
`1844`	`1844`	from the `Alphabetic property defined in the section 4.10 'Letters, Alphabetic, and
`1845`	`1845`	`Ideographic' of the Unicode Standard`
`1846`		-<https://www.unicode.org/versions/Unicode15.1.0/ch04.pdf>`_.
	`1846`	+<https://www.unicode.org/versions/Unicode16.0.0/core-spec/chapter-4/#G91002>`_.
`1847`	`1847`
`1848`	`1848`
`1849`	`1849`	`.. method:: str.isascii()`
@@ -1979,7 +1979,7 @@ expression support in the :mod:`re` module).
`1979`	`1979`
`1980`	`1980`	`The lowercasing algorithm used is`
`1981`	`1981`	`described in section 3.13 'Default Case Folding' of the Unicode Standard
`1982`		-<https://www.unicode.org/versions/Unicode15.1.0/ch03.pdf>`__.
	`1982`	+<https://www.unicode.org/versions/Unicode16.0.0/core-spec/chapter-3/#G33992>`__.
`1983`	`1983`
`1984`	`1984`
`1985`	`1985`	`.. method:: str.lstrip([chars])`
@@ -2331,7 +2331,7 @@ expression support in the :mod:`re` module).
`2331`	`2331`
`2332`	`2332`	`The uppercasing algorithm used is`
`2333`	`2333`	`described in section 3.13 'Default Case Folding' of the Unicode Standard
`2334`		-<https://www.unicode.org/versions/Unicode15.1.0/ch03.pdf>`__.
	`2334`	+<https://www.unicode.org/versions/Unicode16.0.0/core-spec/chapter-3/#G33992>`__.
`2335`	`2335`
`2336`	`2336`
`2337`	`2337`	`.. method:: str.zfill(width)`
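
The three documentation links updated in this file belong to `str.casefold()`, `str.lower()`, and `str.upper()`. As a rough illustration of how those operations differ, the sketch below applies each to a string containing U+00DF. The sample string is an assumption chosen for illustration and is not part of the commit.

```python
# Illustrative only: the sample text is not taken from the commit.
sample = "Stra\u00dfe"  # "Strasse" spelled with U+00DF LATIN SMALL LETTER SHARP S

folded = sample.casefold()   # full case folding maps U+00DF to "ss"
lowered = sample.lower()     # lowercasing leaves U+00DF unchanged
uppered = sample.upper()     # full uppercasing expands U+00DF to "SS"

print(folded, lowered, uppered)
```

Case folding is the operation intended for caseless comparison, which is why it is more aggressive than lowercasing here.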

`‎Doc/library/unicodedata.rst‎`

Lines changed: 4 additions & 4 deletions

Original file line number	Diff line number	Diff line change
`@@ -17,8 +17,8 @@`
`17`	`17`
`18`	`18`	`This module provides access to the Unicode Character Database (UCD) which`
`19`	`19`	`defines character properties for all Unicode characters. The data contained in`
`20`		-this database is compiled from the `UCD version 15.1.0
`21`		-<https://www.unicode.org/Public/15.1.0/ucd>`_.
	`20`	+this database is compiled from the `UCD version 16.0.0
	`21`	+<https://www.unicode.org/Public/16.0.0/ucd>`_.
`22`	`22`
`23`	`23`	`The module uses the same names and symbols as defined by Unicode`
`24`	`24`	Standard Annex #44, `"Unicode Character Database"
`@@ -175,6 +175,6 @@ Examples:`
`175`	`175`
`176`	`176`	`.. rubric:: Footnotes`
`177`	`177`
`178`		`-.. [#] https://www.unicode.org/Public/15.1.0/ucd/NameAliases.txt`
	`178`	`+.. [#] https://www.unicode.org/Public/16.0.0/ucd/NameAliases.txt`
`179`	`179`
`180`		`-.. [#] https://www.unicode.org/Public/15.1.0/ucd/NamedSequences.txt`
	`180`	`+.. [#] https://www.unicode.org/Public/16.0.0/ucd/NamedSequences.txt`
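
To see which UCD version a given interpreter was actually built with, one option is to read `unicodedata.unidata_version`. The helper name `ucd_version_tuple` below is hypothetical and used only for this sketch.

```python
import unicodedata

def ucd_version_tuple():
    """Return the UCD version of the running interpreter as a tuple of ints.

    Hypothetical helper for illustration; it only parses
    unicodedata.unidata_version, a dotted string such as "16.0.0".
    """
    return tuple(int(part) for part in unicodedata.unidata_version.split("."))

version = ucd_version_tuple()
print(unicodedata.unidata_version)
print("has the 16.0.0 data:", version >= (16, 0, 0))
```

An interpreter built from a tree that includes this commit should report 16.0.0 or later; older builds report whatever database they shipped with.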

`‎Doc/reference/lexical_analysis.rst‎`

Lines changed: 4 additions & 4 deletions

Original file line number	Diff line number	Diff line change
`@@ -314,16 +314,16 @@ The Unicode category codes mentioned above stand for:`
`314`	`314`	`* Nd - decimal numbers`
`315`	`315`	`* Pc - connector punctuations`
`316`	`316`	* Other_ID_Start - explicit list of characters in `PropList.txt
`317`		-<https://www.unicode.org/Public/15.1.0/ucd/PropList.txt>`_ to support backwards
	`317`	+<https://www.unicode.org/Public/16.0.0/ucd/PropList.txt>`_ to support backwards
`318`	`318`	`compatibility`
`319`	`319`	`* Other_ID_Continue - likewise`
`320`	`320`
`321`	`321`	`All identifiers are converted into the normal form NFKC while parsing; comparison`
`322`	`322`	`of identifiers is based on NFKC.`
`323`	`323`
`324`	`324`	`A non-normative HTML file listing all valid identifier characters for Unicode`
`325`		`-15.1.0 can be found at`
`326`		`-https://www.unicode.org/Public/15.1.0/ucd/DerivedCoreProperties.txt`
	`325`	`+16.0.0 can be found at`
	`326`	`+https://www.unicode.org/Public/16.0.0/ucd/DerivedCoreProperties.txt`
`327`	`327`
`328`	`328`
`329`	`329`	`.. _keywords:`
`@@ -1044,4 +1044,4 @@ occurrence outside string literals and comments is an unconditional error:`
`1044`	`1044`
`1045`	`1045`	`.. rubric:: Footnotes`
`1046`	`1046`
`1047`		`-.. [#] https://www.unicode.org/Public/15.1.0/ucd/NameAliases.txt`
	`1047`	`+.. [#] https://www.unicode.org/Public/16.0.0/ucd/NameAliases.txt`
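
The context lines in the first hunk note that identifiers are converted to NFKC while parsing. A minimal sketch of that normalization, using `unicodedata.normalize` directly rather than the parser, might look like this. The helper name and the sample identifier are assumptions for illustration.

```python
import unicodedata

def normalize_identifier(name: str) -> str:
    """Apply NFKC, the normal form the reference says identifiers are converted to.

    Hypothetical helper: it mirrors the documented normalization step only,
    not the parser's full identifier validation.
    """
    return unicodedata.normalize("NFKC", name)

ligature_name = "\ufb01le"  # begins with U+FB01 LATIN SMALL LIGATURE FI
normalized = normalize_identifier(ligature_name)

print(len(ligature_name), len(normalized))
print(normalized)
```

Because comparison is based on the NFKC form, two spellings that normalize to the same string refer to the same name.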

`‎Doc/whatsnew/3.14.rst‎`

Lines changed: 5 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -253,6 +253,11 @@ symtable`
`253`	`253`
`254`	`254`	(Contributed by Bénédikt Tran in :gh:`120029`.)
`255`	`255`
	`256`	`+unicodedata`
	`257`	`+-----------`
	`258`	`+`
	`259`	`+* The Unicode database has been updated to Unicode 16.0.0.`
	`260`	`+`
`256`	`261`	`.. Add improved modules above alphabetically, not here at the end.`
`257`	`262`
`258`	`263`	`Optimizations`

`‎Lib/test/string_tests.py‎`

Lines changed: 2 additions & 2 deletions

Original file line number	Diff line number	Diff line change
`@@ -1132,8 +1132,8 @@ def test_capitalize_nonascii(self):`
`1132`	`1132`	`self.checkequal('\u2160\u2171\u2172',`
`1133`	`1133`	`'\u2170\u2171\u2172', 'capitalize')`
`1134`	`1134`	`# check with Ll chars with no upper - nothing changes here`
`1135`		`-self.checkequal('\u019b\u1d00\u1d86\u0221\u1fb7',`
`1136`		`-'\u019b\u1d00\u1d86\u0221\u1fb7', 'capitalize')`
	`1135`	`+self.checkequal('\u1d00\u1d86\u0221\u1fb7',`
	`1136`	`+'\u1d00\u1d86\u0221\u1fb7', 'capitalize')`
`1137`	`1137`
`1138`	`1138`	`def test_startswith(self):`
`1139`	`1139`	`self.checkequal(True, 'hello', 'startswith', 'he')`
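
This hunk drops U+019B from the string that `capitalize` is expected to leave unchanged. The commit does not say why; a plausible reading is that the character's case mapping differs in the updated database. The sketch below checks both strings on whatever interpreter runs it. The helper name `capitalize_is_noop` is hypothetical.

```python
import unicodedata

def capitalize_is_noop(text: str) -> bool:
    """Return True if str.capitalize() leaves text unchanged."""
    return text.capitalize() == text

old_sample = "\u019b\u1d00\u1d86\u0221\u1fb7"  # string asserted before this commit
new_sample = "\u1d00\u1d86\u0221\u1fb7"        # string asserted after this commit

print("UCD version:", unicodedata.unidata_version)
# The result for old_sample depends on the UCD version of the interpreter.
print("old sample unchanged:", capitalize_is_noop(old_sample))
print("new sample unchanged:", capitalize_is_noop(new_sample))
```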

`‎Lib/test/test_str.py‎`

Lines changed: 4 additions & 2 deletions

Original file line number	Diff line number	Diff line change
`@@ -2430,8 +2430,10 @@ def __repr__(self):`
`2430`	`2430`	`self.assertEqual(repr(s1()), '\\n')`
`2431`	`2431`
`2432`	`2432`	`def test_printable_repr(self):`
`2433`		`-self.assertEqual(repr('\U00010000'), "'%c'" % (0x10000,))  # printable`
`2434`		`-self.assertEqual(repr('\U00014000'), "'\\U00014000'")  # nonprintable`
	`2433`	`+# printable`
	`2434`	`+self.assertEqual(repr('\U00010000'), "'%c'" % (0x10000,))`
	`2435`	`+# nonprintable (private use area)`
	`2436`	`+self.assertEqual(repr('\U00100001'), "'\\U00100001'")`
`2435`	`2437`
`2436`	`2438`	`# This test only affects 32-bit platforms because expandtabs can only take`
`2437`	`2439`	`# an int as the max value, not a 64-bit C long. If expandtabs is changed`
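
The updated test uses a private-use code point as its nonprintable case, which should stay nonprintable across database updates. A small sketch of the relationship between general category, `str.isprintable()`, and `repr()` follows. The helper name `describe` is hypothetical.

```python
import unicodedata

def describe(ch: str) -> tuple[str, bool, str]:
    """Return (general category, isprintable() result, repr()) for one character."""
    return unicodedata.category(ch), ch.isprintable(), repr(ch)

printable_case = describe("\U00010000")     # U+10000, the printable case in the test
nonprintable_case = describe("\U00100001")  # U+100001, in a supplementary private use area

print(printable_case)
print(nonprintable_case)
```

`repr()` escapes characters for which `isprintable()` is false, so the private-use character comes back as a `\U` escape while U+10000 is emitted literally.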

`‎Lib/test/test_unicodedata.py‎`

Lines changed: 2 additions & 2 deletions

Original file line number	Diff line number	Diff line change
`@@ -18,7 +18,7 @@`
`18`	`18`	`class UnicodeMethodsTest(unittest.TestCase):`
`19`	`19`
`20`	`20`	`# update this, if the database changes`
`21`		`-expectedchecksum = '63aa77dcb36b0e1df082ee2a6071caeda7f0955e'`
	`21`	`+expectedchecksum = '9e43ee3929471739680c0e705482b4ae1c4122e4'`
`22`	`22`
`23`	`23`	`@requires_resource('cpu')`
`24`	`24`	`def test_method_checksum(self):`
`@@ -71,7 +71,7 @@ class UnicodeFunctionsTest(UnicodeDatabaseTest):`
`71`	`71`
`72`	`72`	`# Update this if the database changes. Make sure to do a full rebuild`
`73`	`73`	`# (e.g. 'make distclean && make') to get the correct checksum.`
`74`		`-expectedchecksum = '232affd2a50ec4bd69d2482aa0291385cbdefaba'`
	`74`	`+expectedchecksum = '23ab09ed4abdf93db23b97359108ed630dd8311d'`
`75`	`75`
`76`	`76`	`@requires_resource('cpu')`
`77`	`77`	`def test_function_checksum(self):`
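
The `expectedchecksum` values change because they are digests over database-derived results, so any data update alters them. The sketch below shows the general idea on a reduced scale. It is not the test's actual implementation: the function name, the choice of predicates, and the code point range are assumptions for illustration.

```python
import hashlib

def method_checksum(start: int = 0, stop: int = 0x3000) -> str:
    """SHA-1 over a few str predicate results for code points in [start, stop).

    Hypothetical, reduced-scale illustration of a database checksum.
    """
    h = hashlib.sha1()
    for cp in range(start, stop):
        ch = chr(cp)
        # Encode each boolean result as "0" or "1" and feed it to the hash.
        flags = "".join("01"[pred] for pred in (
            ch.isalnum(), ch.isalpha(), ch.isdecimal(), ch.isdigit(),
            ch.islower(), ch.isnumeric(), ch.isspace(), ch.isupper(),
        ))
        h.update(flags.encode("ascii"))
    return h.hexdigest()

print(method_checksum())
```

A digest like this is stable for a fixed database and changes as soon as any covered property changes, which is why the expected values must be regenerated after a full rebuild.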

`‎Misc/NEWS.d/next/Library/2024-09-12-10-55-19.gh-issue-124016.ncs0hd.rst‎`

Lines changed: 1 addition & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -0,0 +1 @@`
	`1`	+Update :mod:`unicodedata` database to Unicode 16.0.0.
