Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-119182: Add PyUnicodeWriter C API#119184

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
vstinner merged 15 commits intopython:mainfromvstinner:WIP_unicode_writer
Jun 17, 2024

Conversation

vstinner
Copy link
Member

@vstinnervstinner commentedMay 19, 2024
edited
Loading

@vstinnervstinnerforce-pushed theWIP_unicode_writer branch 2 times, most recently from49a46ac to14e739bCompareMay 24, 2024 07:19
@vstinnervstinner marked this pull request as ready for reviewJune 7, 2024 19:29
@vstinner
Copy link
MemberAuthor

@erlend-aasland@serhiy-storchaka@encukou: Do you want to review this change?

@serhiy-storchakaserhiy-storchaka self-requested a reviewJune 7, 2024 20:45
Comment on lines +1568 to +1570
*str* must be Python :class:`str` object. *start* must be greater than or
equal to 0, and less than or equal to *end*. *end* must be less than or
equal to *str* length.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Nit; I prefer to use SemBr for paragraphs like this.

Suggested change
*str* must be Python:class:`str` object. *start* must be greater than or
equal to 0, and less than or equal to *end*. *end* must be less than or
equal to *str* length.
*str* must be Python:class:`str` object.
*start* must be greater than or equal to 0,
and less than or equal to *end*.
*end* must be less than or equal to *str* length.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

TIL that this is called SemBr!

Breaking on comma may be too much, but I prefer to break at the sentence boundary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Yeah. You don'tneed to break at comma, but I often do to minimise future diffs.

Alternative suggestion:

Suggested change
*str* must be Python:class:`str` object. *start* must be greater than or
equal to 0, and less than or equal to *end*. *end* must be less than or
equal to *str* length.
*str* must be Python:class:`str` object.
*start* must be greater than orequal to 0, and less than or equal to *end*.
*end* must be less than orequal to *str* length.

vstinnerand others added2 commitsJune 10, 2024 10:02
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
vstinnerand others added3 commitsJune 10, 2024 10:10
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
@vstinner
Copy link
MemberAuthor

@erlend-aasland: I addressed your review. Would you mind to review the updated PR?

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Copy link
Member

@serhiy-storchakaserhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Except using TypeError and ValueError instead of SystemError, LGTM.

But I do not insist if other core developers prefer TypeError and ValueError. I just think that SystemError is more appropriate for programming errors in using the C API.

serhiy-storchaka
serhiy-storchaka previously approved these changesJun 10, 2024
Copy link
Member

@malemburgmalemburg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

A more general question in the light of the free threading patches: Are these writer APIs thread-safe ?

@vstinner
Copy link
MemberAuthor

A more general question in the light of the free threading patches: Are these writer APIs thread-safe ?

I did nothing to make them safe, so I would say that they are not thread safe. You must not share a writer between two threads. Use one writer per thread.

@vstinner
Copy link
MemberAuthor

@malemburg@serhiy-storchaka: I modified the implementation to make write operations atomic. Either the whole string is written, or "nothing is written".

In the doc, I wrote "On error, leave the writer unchanged". In practice, it's more subtle, the internal buffer can be enlarged, or its kind (UCS1, UCS2 or UCS4) can change. But from the API consumer point of view, it's as if nothing was written.

I added two unit tests to show that it's still possible to use a writer after an error.

The hack is just to save/restorewriter->pos internally in the 2 functions which are not atomic: WriteUTF8() and WriteFormat().

@serhiy-storchakaserhiy-storchaka self-requested a reviewJune 10, 2024 16:14
@serhiy-storchakaserhiy-storchaka dismissed theirstale reviewJune 10, 2024 16:15

I withdraw my approve because writing cannot be atomic.

Copy link
Contributor

@erlend-aaslanderlend-aasland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Nice; thanks!

@@ -13408,6 +13546,17 @@ _PyUnicodeWriter_Finish(_PyUnicodeWriter *writer)
return unicode_result(str);
}


PyObject*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Nit: inconsistent with the rest of the PR.

Suggested change
PyObject*
PyObject*

Copy link
Member

@picnixzpicnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I also took the liberty of exhaustively point where the 2-blank lines rule is missing (the file is huge so committing directly might help).

@vstinner
Copy link
MemberAuthor

@picnixz: I appreciate your remarks about coding style, but IMO you're going too far. I marked your "two empty lines" suggestions as resolved. This PR is already approved, twice. It's a complex API being discussed for 1 month and I would prefer to not waste my time on coding style. However, I fixed the "an Unicode" typos, thanks :-)

@vstinnervstinner merged commit5c4235c intopython:mainJun 17, 2024
36 checks passed
@vstinnervstinner deleted the WIP_unicode_writer branchJune 17, 2024 15:10
@vstinner
Copy link
MemberAuthor

The C API Working Group agreed to start with this minimum PyUnicodeWriter API. I merged my PR.

mrahtz pushed a commit to mrahtz/cpython that referenced this pull requestJun 30, 2024
noahbkim pushed a commit to hudson-trading/cpython that referenced this pull requestJul 11, 2024
estyxx pushed a commit to estyxx/cpython that referenced this pull requestJul 17, 2024
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@serhiy-storchakaserhiy-storchakaserhiy-storchaka approved these changes

@malemburgmalemburgmalemburg left review comments

@picnixzpicnixzpicnixz left review comments

@erlend-aaslanderlend-aaslanderlend-aasland approved these changes

Assignees
No one assigned
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

5 participants
@vstinner@serhiy-storchaka@malemburg@picnixz@erlend-aasland

[8]ページ先頭

©2009-2025 Movatter.jp