gh-133968: Add PyUnicodeWriter_WriteASCII() function #133973

Open

vstinner wants to merge 6 commits into python:main from vstinner:write_ascii

Conversation

@vstinner (Member) commented May 13, 2025 (edited by github-actions bot)
Replace most PyUnicodeWriter_WriteUTF8() calls with PyUnicodeWriter_WriteASCII().


📚 Documentation preview 📚: https://cpython-previews--133973.org.readthedocs.build/

Replace most PyUnicodeWriter_WriteUTF8() calls with PyUnicodeWriter_WriteASCII().
@vstinner (Member, Author) commented:

JSON benchmark: #133832 (comment)

| Benchmark | ref | change |
|---|---|---|
| encode 100 booleans | 7.15 us | 6.54 us: 1.09x faster |
| encode 100 integers | 11.6 us | 11.7 us: 1.01x slower |
| encode 100 "ascii" strings | 13.4 us | 13.2 us: 1.02x faster |
| encode escaped string len=128 | 1.11 us | 1.10 us: 1.01x faster |
| encode 1000 booleans | 39.3 us | 32.9 us: 1.19x faster |
| encode Unicode string len=1000 | 4.93 us | 4.94 us: 1.00x slower |
| encode 10000 booleans | 343 us | 286 us: 1.20x faster |
| encode ascii string len=10000 | 28.5 us | 28.8 us: 1.01x slower |
| encode escaped string len=9984 | 38.7 us | 38.9 us: 1.00x slower |
| encode Unicode string len=10000 | 42.6 us | 42.4 us: 1.00x faster |
| Geometric mean | (ref) | 1.02x faster |

Benchmark hidden because not significant (11): encode 100 floats, encode ascii string len=100, encode Unicode string len=100, encode 1000 integers, encode 1000 floats, encode 1000 "ascii" strings, encode ascii string len=1000, encode escaped string len=896, encode 10000 integers, encode 10000 floats, encode 10000 "ascii" strings

Up to 1.20x faster to encode booleans is interesting knowing that these strings are very short: "true" (4 characters) and "false" (5 characters).
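The 1.02x overall figure can be reproduced from the visible rows (a rough sketch, assuming pyperf's geometric mean also counts the 11 hidden, not-significant benchmarks as ratios of about 1.0):

```python
import math

# ref/change time ratios from the visible table rows (> 1.0 means faster)
speedups = [7.15/6.54, 11.6/11.7, 13.4/13.2, 1.11/1.10, 39.3/32.9,
            4.93/4.94, 343/286, 28.5/28.8, 38.7/38.9, 42.6/42.4]
# Assumption: the 11 benchmarks hidden as "not significant" contribute
# ratios of roughly 1.0, which leave the product unchanged.
speedups += [1.0] * 11

geomean = math.exp(sum(map(math.log, speedups)) / len(speedups))
print(f"{geomean:.2f}x faster")  # ≈ 1.02x, matching the table
```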

@vstinner (Member, Author) commented:

The PyUnicodeWriter_WriteASCII() function is faster than PyUnicodeWriter_WriteUTF8(), but has undefined behavior if the input string contains non-ASCII characters.
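Since that contract puts the burden of proof on the caller, a wrapper can make the pattern explicit. A minimal Python sketch of the same contract (write_fast and the parts list are illustrative helpers, not part of the C API):

```python
def write_fast(parts: list, data: bytes) -> None:
    """Pick the fast path only when the input is provably ASCII,
    mirroring the PyUnicodeWriter_WriteASCII() contract: the caller
    must never hand non-ASCII bytes to the ASCII path."""
    if data.isascii():
        # ASCII bytes map 1:1 to code points, so a plain copy
        # (memcpy in the C implementation) is enough.
        parts.append(data.decode("ascii"))
    else:
        # Fall back to the validating UTF-8 decoder.
        parts.append(data.decode("utf-8"))

parts = []
write_fast(parts, b"true")                  # fast path
write_fast(parts, "café".encode("utf-8"))   # falls back to UTF-8
print("".join(parts))  # truecafé
```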

@serhiy-storchaka: What do you think of this function?

@vstinner (Member, Author) commented:

cc @ZeroIntensity

@ZeroIntensity (Member) left a comment:


Some nits

@serhiy-storchaka (Member) commented:

Well, we had _PyUnicodeWriter_WriteASCIIString for reasons.

But unicode_decode_utf8_writer is already optimized for ASCII. Can it be optimized even more? In theory, it can be made almost as fast as _PyUnicodeWriter_WriteASCIIString.

We can add a private _PyUnicodeWriter_WriteASCII for now, to avoid a regression in JSON encoding, and then try to squeeze nanoseconds from PyUnicodeWriter_WriteUTF8. If we fail, we can add a public PyUnicodeWriter_WriteASCII.

Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
@vstinner (Member, Author) commented:

But unicode_decode_utf8_writer is already optimized for ASCII. Can it be optimized even more?

I don't think that it can become as fast as, or faster than, a function which takes an ASCII string as argument. If we know that the input string is ASCII, there is no need to scan the string for non-ASCII characters, and we can take the fast path.
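The cost being discussed is the validation scan. A pure-Python sketch of what the UTF-8 path must do before it can copy (find_first_nonascii here is an illustrative stand-in for CPython's internal helper, which works word-at-a-time in C):

```python
def find_first_nonascii(data: bytes) -> int:
    """Return the index of the first byte with the high bit set,
    or len(data) if the input is pure ASCII. The UTF-8 path has to
    run a scan like this; the ASCII path can skip straight to the copy."""
    for i, byte in enumerate(data):
        if byte >= 0x80:
            return i
    return len(data)

print(find_first_nonascii(b"true"))         # 4: all ASCII
print(find_first_nonascii(b"caf\xc3\xa9"))  # 3: 0xc3 is non-ASCII
```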

You're right that the UTF-8 decoder is already highly optimized.

@vstinner (Member, Author) commented:

In short:

  • PyUnicodeWriter_WriteUTF8() calls ascii_decode(), which is an efficient ASCII decoder.
  • PyUnicodeWriter_WriteASCII() calls memcpy().

It's hard to beat memcpy() performance!

@serhiy-storchaka (Member) commented:

Yes, although it was close, at least for moderately large strings. Could it be optimized even more? I don't know.

But the decision about PyUnicodeWriter_WriteASCII should be made by the C API Workgroup. I'm not sure of my opinion yet. This API is unsafe.

@vstinner (Member, Author) commented:

I created the capi-workgroup/decisions#65 issue.

@vstinner (Member, Author) commented:

Benchmark:

write_utf8 size=10: Mean +- std dev: 153 ns +- 1 ns
write_utf8 size=100: Mean +- std dev: 174 ns +- 1 ns
write_utf8 size=1,000: Mean +- std dev: 279 ns +- 0 ns
write_utf8 size=10,000: Mean +- std dev: 1.36 us +- 0.00 us
write_ascii size=10: Mean +- std dev: 141 ns +- 0 ns
write_ascii size=100: Mean +- std dev: 149 ns +- 0 ns
write_ascii size=1,000: Mean +- std dev: 176 ns +- 3 ns
write_ascii size=10,000: Mean +- std dev: 690 ns +- 8 ns

On long strings (10,000 bytes), PyUnicodeWriter_WriteASCII() is up to 2x faster (1.36 us => 690 ns) than PyUnicodeWriter_WriteUTF8().

from _testcapi import PyUnicodeWriter
import pyperf

range_100 = range(100)

def bench_write_utf8(text, size):
    writer = PyUnicodeWriter(0)
    for _ in range_100:
        writer.write_utf8(text, size)
        writer.write_utf8(text, size)
        writer.write_utf8(text, size)
        writer.write_utf8(text, size)
        writer.write_utf8(text, size)
        writer.write_utf8(text, size)
        writer.write_utf8(text, size)
        writer.write_utf8(text, size)
        writer.write_utf8(text, size)
        writer.write_utf8(text, size)

def bench_write_ascii(text, size):
    writer = PyUnicodeWriter(0)
    for _ in range_100:
        writer.write_ascii(text, size)
        writer.write_ascii(text, size)
        writer.write_ascii(text, size)
        writer.write_ascii(text, size)
        writer.write_ascii(text, size)
        writer.write_ascii(text, size)
        writer.write_ascii(text, size)
        writer.write_ascii(text, size)
        writer.write_ascii(text, size)
        writer.write_ascii(text, size)

runner = pyperf.Runner()
sizes = (10, 100, 1_000, 10_000)
for size in sizes:
    text = b'x' * size
    runner.bench_func(f'write_utf8 size={size:,}', bench_write_utf8,
                      text, size, inner_loops=1_000)
for size in sizes:
    text = b'x' * size
    runner.bench_func(f'write_ascii size={size:,}', bench_write_ascii,
                      text, size, inner_loops=1_000)

@encukou (Member) commented:

Do we know where the bottleneck is for long strings?
Would it make sense to have a version of find_first_nonascii that checks and copies in the same loop?

@vstinner (Member, Author) commented:

Do we know where the bottleneck is for long strings?

WriteUTF8() has to check for non-ASCII characters: this check has a cost. That's the bottleneck.

Would it make sense to have a version of find_first_nonascii that checks and copies in the same loop?

Maybe, I don't know if it would be faster.

@vstinner (Member, Author) commented:

Would it make sense to have a version of find_first_nonascii that checks and copies in the same loop?

I tried, but failed to modify the code to copy while reading (checking whether the string is ASCII-encoded). The code is quite complicated.
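The "check and copy in the same loop" idea can at least be sketched at the byte level (a Python stand-in with an illustrative copy_while_ascii helper; the difficulty described above is fitting this into CPython's optimized word-at-a-time C code):

```python
def copy_while_ascii(dst: bytearray, src: bytes) -> int:
    """Fused check-and-copy: append bytes to dst until the first
    non-ASCII byte, returning how many bytes were copied. A caller
    would hand the remainder to the full UTF-8 decoder."""
    copied = 0
    for byte in src:
        if byte >= 0x80:
            break
        dst.append(byte)
        copied += 1
    return copied

buf = bytearray()
n = copy_while_ascii(buf, b"abc\xc3\xa9")
print(n, bytes(buf))  # 3 b'abc'
```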

vstinner and others added 3 commits May 15, 2025 21:41
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
@picnixz (Member) left a comment:


I'm happy to have this function public. I always preferred using the faster versions of the writer API when I hardcoded strings, but they were private.

@ZeroIntensity (Member) left a comment:


Sorry for the late review, LGTM as well.

Reviewers

@picnixz approved these changes
@ZeroIntensity approved these changes
Awaiting requested review from code owners: @isidentical, @JelleZijlstra, @Eclips4, @gpshead, @markshannon, @1st1

5 participants: @vstinner, @serhiy-storchaka, @encukou, @picnixz, @ZeroIntensity
