gh-133968: Add PyUnicodeWriter_WriteASCII() function #133973

Open

vstinner wants to merge 6 commits into python:main from vstinner:write_ascii

Conversation

@vstinner (Member) commented May 13, 2025 (edited by github-actions bot)
Replace most PyUnicodeWriter_WriteUTF8() calls with PyUnicodeWriter_WriteASCII().


📚 Documentation preview 📚: https://cpython-previews--133973.org.readthedocs.build/

Replace most PyUnicodeWriter_WriteUTF8() calls with PyUnicodeWriter_WriteASCII().
@vstinner (Member, Author) commented:

JSON benchmark: #133832 (comment)

| Benchmark | ref | change |
|---|---|---|
| encode 100 booleans | 7.15 us | 6.54 us: 1.09x faster |
| encode 100 integers | 11.6 us | 11.7 us: 1.01x slower |
| encode 100 "ascii" strings | 13.4 us | 13.2 us: 1.02x faster |
| encode escaped string len=128 | 1.11 us | 1.10 us: 1.01x faster |
| encode 1000 booleans | 39.3 us | 32.9 us: 1.19x faster |
| encode Unicode string len=1000 | 4.93 us | 4.94 us: 1.00x slower |
| encode 10000 booleans | 343 us | 286 us: 1.20x faster |
| encode ascii string len=10000 | 28.5 us | 28.8 us: 1.01x slower |
| encode escaped string len=9984 | 38.7 us | 38.9 us: 1.00x slower |
| encode Unicode string len=10000 | 42.6 us | 42.4 us: 1.00x faster |
| Geometric mean | (ref) | 1.02x faster |

Benchmark hidden because not significant (11): encode 100 floats, encode ascii string len=100, encode Unicode string len=100, encode 1000 integers, encode 1000 floats, encode 1000 "ascii" strings, encode ascii string len=1000, encode escaped string len=896, encode 10000 integers, encode 10000 floats, encode 10000 "ascii" strings

Up to 1.20x faster to encode booleans is interesting knowing that these strings are very short: "true" (4 characters) and "false" (5 characters).
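The 1.02x overall figure can be reproduced from the visible rows (a rough sketch, assuming pyperf's geometric mean also counts the 11 hidden, not-significant benchmarks as ratios of about 1.0):

```python
import math

# ref/change time ratios from the visible table rows (> 1.0 means faster)
speedups = [7.15/6.54, 11.6/11.7, 13.4/13.2, 1.11/1.10, 39.3/32.9,
            4.93/4.94, 343/286, 28.5/28.8, 38.7/38.9, 42.6/42.4]
# Assumption: the 11 benchmarks hidden as "not significant" contribute
# ratios of roughly 1.0, which leave the product unchanged.
speedups += [1.0] * 11

geomean = math.exp(sum(map(math.log, speedups)) / len(speedups))
print(f"{geomean:.2f}x faster")  # ≈ 1.02x, matching the table
```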

@vstinner (Member, Author) commented:

The PyUnicodeWriter_WriteASCII() function is faster than PyUnicodeWriter_WriteUTF8(), but has undefined behavior if the input string contains non-ASCII characters.
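Since that contract puts the burden of proof on the caller, a wrapper can make the pattern explicit. A minimal Python sketch of the same contract (write_fast and the parts list are illustrative helpers, not part of the C API):

```python
def write_fast(parts: list, data: bytes) -> None:
    """Pick the fast path only when the input is provably ASCII,
    mirroring the PyUnicodeWriter_WriteASCII() contract: the caller
    must never hand non-ASCII bytes to the ASCII path."""
    if data.isascii():
        # ASCII bytes map 1:1 to code points, so a plain copy
        # (memcpy in the C implementation) is enough.
        parts.append(data.decode("ascii"))
    else:
        # Fall back to the validating UTF-8 decoder.
        parts.append(data.decode("utf-8"))

parts = []
write_fast(parts, b"true")                  # fast path
write_fast(parts, "café".encode("utf-8"))   # falls back to UTF-8
print("".join(parts))  # truecafé
```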

@serhiy-storchaka: What do you think of this function?

@vstinner (Member, Author) commented:

cc @ZeroIntensity

@ZeroIntensity (Member) left a comment:


Some nits

@serhiy-storchaka (Member) commented:

Well, we had _PyUnicodeWriter_WriteASCIIString for reasons.

But unicode_decode_utf8_writer is already optimized for ASCII. Can it be optimized even more? In theory, it can be made almost as fast as _PyUnicodeWriter_WriteASCIIString.

We can add a private _PyUnicodeWriter_WriteASCII for now, to avoid a regression in JSON encoding, and then try to squeeze nanoseconds from PyUnicodeWriter_WriteUTF8. If we fail, we can add a public PyUnicodeWriter_WriteASCII.

Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
@vstinner (Member, Author) commented:

But unicode_decode_utf8_writer is already optimized for ASCII. Can it be optimized even more?

I don't think that it can become as fast as, or faster than, a function which takes an ASCII string as argument. If we know that the input string is ASCII, there is no need to scan the string for non-ASCII characters, and we can take the fast path.
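The cost being discussed is the validation scan. A pure-Python sketch of what the UTF-8 path must do before it can copy (find_first_nonascii here is an illustrative stand-in for CPython's internal helper, which works word-at-a-time in C):

```python
def find_first_nonascii(data: bytes) -> int:
    """Return the index of the first byte with the high bit set,
    or len(data) if the input is pure ASCII. The UTF-8 path has to
    run a scan like this; the ASCII path can skip straight to the copy."""
    for i, byte in enumerate(data):
        if byte >= 0x80:
            return i
    return len(data)

print(find_first_nonascii(b"true"))         # 4: all ASCII
print(find_first_nonascii(b"caf\xc3\xa9"))  # 3: 0xc3 is non-ASCII
```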

You're right that the UTF-8 decoder is already highly optimized.

@vstinner (Member, Author) commented:

In short:

  • PyUnicodeWriter_WriteUTF8() calls ascii_decode(), which is an efficient ASCII decoder.
  • PyUnicodeWriter_WriteASCII() calls memcpy().

It's hard to beat memcpy() performance!

@serhiy-storchaka (Member) commented:

Yes, although it was close, at least for moderately large strings. Could it be optimized even more? I don't know.

But the decision about PyUnicodeWriter_WriteASCII should be made by the C API Workgroup. I'm not sure of my opinion yet. This API is unsafe.

@vstinner (Member, Author) commented:

I created the capi-workgroup/decisions#65 issue.

@vstinner (Member, Author) commented:

Benchmark:

write_utf8 size=10: Mean +- std dev: 153 ns +- 1 ns
write_utf8 size=100: Mean +- std dev: 174 ns +- 1 ns
write_utf8 size=1,000: Mean +- std dev: 279 ns +- 0 ns
write_utf8 size=10,000: Mean +- std dev: 1.36 us +- 0.00 us
write_ascii size=10: Mean +- std dev: 141 ns +- 0 ns
write_ascii size=100: Mean +- std dev: 149 ns +- 0 ns
write_ascii size=1,000: Mean +- std dev: 176 ns +- 3 ns
write_ascii size=10,000: Mean +- std dev: 690 ns +- 8 ns

On long strings (10,000 bytes), PyUnicodeWriter_WriteASCII() is up to 2x faster (1.36 us => 690 ns) than PyUnicodeWriter_WriteUTF8().

from _testcapi import PyUnicodeWriter
import pyperf

range_100 = range(100)

def bench_write_utf8(text, size):
    writer = PyUnicodeWriter(0)
    for _ in range_100:
        writer.write_utf8(text, size)
        writer.write_utf8(text, size)
        writer.write_utf8(text, size)
        writer.write_utf8(text, size)
        writer.write_utf8(text, size)
        writer.write_utf8(text, size)
        writer.write_utf8(text, size)
        writer.write_utf8(text, size)
        writer.write_utf8(text, size)
        writer.write_utf8(text, size)

def bench_write_ascii(text, size):
    writer = PyUnicodeWriter(0)
    for _ in range_100:
        writer.write_ascii(text, size)
        writer.write_ascii(text, size)
        writer.write_ascii(text, size)
        writer.write_ascii(text, size)
        writer.write_ascii(text, size)
        writer.write_ascii(text, size)
        writer.write_ascii(text, size)
        writer.write_ascii(text, size)
        writer.write_ascii(text, size)
        writer.write_ascii(text, size)

runner = pyperf.Runner()
sizes = (10, 100, 1_000, 10_000)
for size in sizes:
    text = b'x' * size
    runner.bench_func(f'write_utf8 size={size:,}', bench_write_utf8,
                      text, size, inner_loops=1_000)
for size in sizes:
    text = b'x' * size
    runner.bench_func(f'write_ascii size={size:,}', bench_write_ascii,
                      text, size, inner_loops=1_000)

@encukou (Member) commented:

Do we know where the bottleneck is for long strings?
Would it make sense to have a version of find_first_nonascii that checks and copies in the same loop?

@vstinner (Member, Author) commented:

Do we know where the bottleneck is for long strings?

WriteUTF8() has to check for non-ASCII characters: this check has a cost. That's the bottleneck.

Would it make sense to have a version of find_first_nonascii that checks and copies in the same loop?

Maybe, I don't know if it would be faster.

@vstinner (Member, Author) commented:

Would it make sense to have a version of find_first_nonascii that checks and copies in the same loop?

I tried, but failed to modify the code to copy while reading (checking whether the string is ASCII-encoded). The code is quite complicated.
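The "check and copy in the same loop" idea can at least be sketched at the byte level (a Python stand-in with an illustrative copy_while_ascii helper; the difficulty described above is fitting this into CPython's optimized word-at-a-time C code):

```python
def copy_while_ascii(dst: bytearray, src: bytes) -> int:
    """Fused check-and-copy: append bytes to dst until the first
    non-ASCII byte, returning how many bytes were copied. A caller
    would hand the remainder to the full UTF-8 decoder."""
    copied = 0
    for byte in src:
        if byte >= 0x80:
            break
        dst.append(byte)
        copied += 1
    return copied

buf = bytearray()
n = copy_while_ascii(buf, b"abc\xc3\xa9")
print(n, bytes(buf))  # 3 b'abc'
```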

vstinner and others added 3 commits May 15, 2025 21:41
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
@picnixz (Member) left a comment:


I'm happy to have this function public. I always preferred using the faster versions of the writer API when I hardcoded strings, but they were private.

@ZeroIntensity (Member) left a comment:


Sorry for the late review, LGTM as well.

Reviewers

@picnixz approved these changes
@ZeroIntensity approved these changes
Awaiting requested review from code owners: @isidentical, @JelleZijlstra, @Eclips4, @gpshead, @markshannon, @1st1

5 participants: @vstinner, @serhiy-storchaka, @encukou, @picnixz, @ZeroIntensity
