Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-83461: Don't allow datetime parsing to accept non-ASCII digits#131008

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Open
StanFromIreland wants to merge23 commits intopython:main
base:main
Choose a base branch
Loading
fromStanFromIreland:ascii-strptime

Conversation

StanFromIreland
Copy link
Contributor

@StanFromIrelandStanFromIreland commentedMar 9, 2025
edited by bedevere-appbot
Loading

@picnixzpicnixz added the type-securityA security issue labelMar 9, 2025
@picnixz
Copy link
Member

The original issue was marked with a security bug but I haven't looked at the entire thread so we might consider it a simple bug fix

@StanFromIreland
Copy link
ContributorAuthor

@picnixz

No it needs to be relabeled:

-versions -type-security +type-bug

picnixz reacted with thumbs up emoji

Copy link
Member

@picnixzpicnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Is there a need to update the C implementation?

@picnixzpicnixz removed the type-securityA security issue labelMar 9, 2025
@StanFromIreland
Copy link
ContributorAuthor

C calls the Python implementation.

picnixz reacted with thumbs up emoji

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Copy link
Member

@picnixzpicnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I actually wonder but what about languages for which their input has non-ASCII strings such as Japanese?

Copy link
Member

@picnixzpicnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I think the ASCII flag restriction is too broad here as it applies to all formats, not just the digit part.

picnixz
picnixz previously requested changesMar 9, 2025
Copy link
Member

@picnixzpicnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Please carefully address the following suggestions and please make sure that typos lines are correctly formatted.

@bedevere-app
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phraseI have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@picnixz
Copy link
Member

Note: I'll review more this PR once I'm back because it will be easier when I'm on a laptop and not on mobile (so on Wednesday/Thursday)

StanFromIreland reacted with thumbs up emoji

@bedevere-app
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phraseI have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

And if you don't make the requested changes, you will be put in the comfy chair!

@StanFromIreland
Copy link
ContributorAuthor

I have made the requested changes; please review again

@bedevere-app
Copy link

Thanks for making the requested changes!

@pganssle,@picnixz: please review the changes made to this pull request.

@pganssle
Copy link
Member

pganssle commentedMar 26, 2025
edited
Loading

I think my comments on how to organize the documentation were not clear enough, so rather than another round of back-and-forth, I went ahead and pushed some changes to the documentation directly.

Also, I realized that the summary of "reject non-ASCII digits" isn't quite right, because there are actually locales where codes like%c and%X will generate strings with non-ASCII digits, e.g.:

>>>fromdatetimeimportdatetime>>>importlocale;locale.setlocale(locale.LC_ALL,'fa_IR.utf8')>>>print(datetime.now().strftime('%x'))۲۵/۰۳/۲۶>>>print(datetime.now().strftime("c"))چهارشنبه ۲۶مارس ۲۵، ۱۰:۳۷:۵۱

And in fact in 3.14 we fixedstrptime to properly parse these, see#125406.

TBH, I'm a little apprehensive about so widely advertising this change, because part of the rationale for doing it is that support for this kind of thing was patchy in the first place, so it's better for it to eitherall work ornot work at all. Fixing a bug like that doesn't seem like the kind of thing that needs a What's New entry and a permanent log of the change in thedatetime documentation on the same level as API changes like changes to the function prototype.

Given that we're trying to improve locale support in other ways, I could imagine us adding support for these things more explicitly in the future, so I wonder if we want to set the expectation that this willnever work.

@StanFromIreland@picnixz Would one or both of you mind taking a look at my changes and seeing if you agree?

@serhiy-storchaka As the person who has been working on the locale improvements, any thoughts on this subject?

@serhiy-storchaka
Copy link
Member

I think that formats like%Y should only accept ASCII digits, but%OY and%EY should accept non-ASCII digits (the set of accepted digits can depend on the locale). TheO andE prefixes are not yet officially supported in Python.strftime() may support them if the underlying C function support them (they are a part of the C and Posix standards now, but this depends on the version and platform).strptime() is implemented in Python and doesn't officially support them, but I added unofficial support for some directives).

Currently, a heuristic is used to determine the format for%c,%x and%X in terms of simpler directives. I added (unofficially) support of theO prefix to support non-ASCII digits and plan to disallow support of non-ASCII digits in directives without theO orE prefix, but this is a change which cannot be backported. I have not yet finished with changes which can be backported.

First, we need to add official support of theO andE prefixes (even if it is platform dependent). We need to usenl_langinfo() for this. Then we can disallow non-ASCII digits for directives without these prefixes.

Actually, I started to work onstrptime() andnl_langinfo() issues after seen this issue. Most of the work is already done, but please wait a little longer.

pganssle reacted with thumbs up emoji

@picnixz
Copy link
Member

Also, I realized that the summary of "reject non-ASCII digits" isn't quite right, because there are actually locales where codes like %c and %X will generate strings with non-ASCII digits, e.g.:

Oh I wasn't aware of this so thank you for the corrections.

Copy link
Member

@picnixzpicnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Maybe inLib/_striptime.py, you can add a small comment saying that theO* formats are locale-specific? (just above `for d in 'dmyHIMS'?

@@ -563,6 +563,9 @@ Functions
When used with the :func:`strptime` function, ``%U`` and ``%W`` are only used in
calculations when the day of the week and the year are specified.

(5)
The :func:`strptime` function does not accept non-ASCII digits for numeric values.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Should we mention the "non-locale-specific numeric format codes" here or, since it's not officially supported, we can be a bit lazy?

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

@pganssle Do you want to do this too? Or should I?

@StanFromIreland
Copy link
ContributorAuthor

@pganssle friendly ping :-)

Copy link
Member

@pgansslepganssle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

OK, this looks good to me now. Since@serhiy-storchaka asked us to wait on it, though, I'll leave it up to him as to when we merge it or if other changes are necessary.

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@picnixzpicnixzpicnixz approved these changes

@pgansslepgansslepganssle approved these changes

@abalkinabalkinAwaiting requested review from abalkinabalkin is a code owner

@serhiy-storchakaserhiy-storchakaAwaiting requested review from serhiy-storchaka

Assignees
No one assigned
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

4 participants
@StanFromIreland@picnixz@pganssle@serhiy-storchaka

[8]ページ先頭

©2009-2025 Movatter.jp