gh-139516: Fix lambda colon start format spec in f-string in tokenizer #139657

Merged

pablogsal merged 4 commits into python:main from tom-pytel:fix-issue-139516

Oct 7, 2025

Conversation

Contributor

tom-pytel commented Oct 6, 2025 (edited by bedevere-app bot)

A `=` followed by a `:` in an f-string expression could cause the tokenizer to erroneously think it was starting a format spec, leading to incorrect internal state and possible decode errors if this results in split Unicode characters on copy. This PR fixes this by disallowing `=` from setting the `in_debug` state unless it is encountered at the top level of an f-string expression.

This problem goes back to Python 3.13, and this PR can probably be backported easily enough.

Issue: Parser gives UnicodeDecodeError on what should be good code #139516
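To illustrate the idea behind the fix, here is a toy Python model, not the tokenizer's actual C code. It assumes a much simplified scanner that only tracks bracket depth and quotes; the function name `top_level_equals` is hypothetical. It shows why the `=` in `f(a=lambda: ...)` should not count as a debug `=`: it sits inside parentheses, not at the top level of the expression.

```python
def top_level_equals(expr: str) -> list[int]:
    """Return the indices of '=' characters found at bracket depth 0.

    Simplified on purpose: no escape handling inside strings and no
    special-casing of operators such as '==' or '<='.
    """
    depth = 0
    quote = None  # active quote character while inside a string literal
    hits = []
    for i, ch in enumerate(expr):
        if quote is not None:
            if ch == quote:  # closing quote ends the string literal
                quote = None
            continue
        if ch in "'\"":
            quote = ch
        elif ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        elif ch == "=" and depth == 0:
            hits.append(i)
    return hits


if __name__ == "__main__":
    # '=' is nested inside the call parentheses, so nothing is reported.
    print(top_level_equals("f(a=lambda: 'à'\n)"))
    # '=' at the top level, as in the debug form f"{x=}".
    print(top_level_equals("x="))
```

In this model only a `=` at depth 0 would be allowed to switch on debug handling, so a later `:` inside a nested lambda is never mistaken for the start of a format spec.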

gh-139516: fix lambda colon start format spec in f-string

7cdd725

tom-pytel requested review from lysnikolaou and pablogsal as code owners

October 6, 2025 13:13

bedevere-app (bot) mentioned this pull request

Oct 6, 2025

Parser gives UnicodeDecodeError on what should be good code #139516

Closed

bedevere-app (bot) added the awaiting review label

Oct 6, 2025

📜🤖 Added by blurb_it.

f6fbb7e

Contributor, Author

tom-pytel commented Oct 6, 2025

Ping @pablogsal. I added the test to `test_tokenize` instead of `test_fstring` as it seems to fit there better.

Member

pablogsal commented Oct 6, 2025 (edited)

Please add a test to the f-string test file as well, as this will be a semantic test that needs to hold true even if we change the tokenizer or some other implementation doesn't have the same tokenizer.

add test to test_fstring

a13aaea

pablogsal reviewed

Oct 6, 2025

Lib/test/test_fstring.py Outdated

		# gh-139516
		# The '\n' is explicit to ensure no trailing whitespace which would invalidate the test.
		# Must use tokenize instead of compile so that source is parsed by line which exposes the bug.
		list(tokenize.tokenize(BytesIO('''f"{f(a=lambda: 'à'\n)}"'''.encode()).readline))

Member

pablogsal, Oct 6, 2025

I am confused. Isn't it possible to trigger this in an `exec` or `eval` call? Or perhaps a file with an encoding?

Contributor, Author

tom-pytel, Oct 6, 2025

See below VVV

Contributor, Author

tom-pytel commented Oct 6, 2025

> Please add a test to the f-string test file as well, as this will be a semantic test that needs to hold true even if we change the tokenizer or some other implementation doesn't have the same tokenizer.

Done. But I had to use `tokenize()` because of an interesting quirk. The bug shows up with `tokenize()`, when executing a Python script directly with the bad source, or when typing it into the REPL. It does not show up with `compile()`, `ast.parse()`, `eval()`, `exec()` or `import ...`. The difference seems to be whether the source is read line by line or not: if the full string is available on parse, then the tail end of the string past the NL is present to offset from on copy, and the bug doesn't present.

Let me know if this test is good enough or if you want something else.
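A minimal sketch of the comparison described above, using the source string from the review snippet. The helper names `try_tokenize` and `try_compile` are hypothetical. The outcome depends on the interpreter build: the report is that affected builds fail only on the line-by-line `tokenize()` path, so the helpers just return what they observe instead of asserting a particular result.

```python
import io
import tokenize

# Source from the review snippet: the f-string expression contains
# `a=lambda: 'à'` followed by a real newline before the closing `)`.
SOURCE = "f\"{f(a=lambda: 'à'\n)}\""


def try_tokenize(source: str) -> str:
    """Tokenize line by line; return 'ok' or the exception class name."""
    readline = io.BytesIO(source.encode("utf-8")).readline
    try:
        list(tokenize.tokenize(readline))
    except Exception as exc:  # e.g. a decode error on an affected build
        return type(exc).__name__
    return "ok"


def try_compile(source: str) -> str:
    """Compile the whole string at once; return 'ok' or the exception name."""
    try:
        compile(source, "<demo>", "eval")
    except Exception as exc:
        return type(exc).__name__
    return "ok"


if __name__ == "__main__":
    print("tokenize:", try_tokenize(SOURCE))
    print("compile: ", try_compile(SOURCE))
```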

Member

pablogsal commented Oct 6, 2025 (edited)

> Let me know if this test is good enough or if you want something else.

Yes, going via the tokenizer makes no sense here. The purpose of what I asked is that alternative implementations will still run these test files to check if they are compliant, and we need to provide a way to run a file or exec some code and say "this is what we expect". You are triggering the bug via a specific aspect of CPython, but I would prefer if we could trigger it end-to-end via a file. There are more tests executing Python over files; check in `test_syntax`, `test_grammar` or `test_compile`.

test_fstring test using script

e6d23e7

Contributor, Author

tom-pytel commented Oct 6, 2025

Running error as script.
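As a rough illustration of an end-to-end check via a file, the sketch below writes the problematic source to a temporary script and runs it with the current interpreter. This is not the test that was committed. The helper name `run_as_script` and the script contents are assumptions made for illustration, and the exit code depends on the interpreter build.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical script body: defines f() and then evaluates the f-string
# from the bug report (real newline before the closing parenthesis).
SCRIPT = "def f(a):\n    return a()\n\nx = f\"{f(a=lambda: 'à'\n)}\"\n"


def run_as_script(source: str) -> int:
    """Write source to a temporary .py file, run it, return the exit code."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "script.py")
        with open(path, "w", encoding="utf-8", newline="\n") as fh:
            fh.write(source)
        proc = subprocess.run([sys.executable, path], capture_output=True)
        return proc.returncode


if __name__ == "__main__":
    # 0 means the interpreter read and ran the file without error.
    print("exit code:", run_as_script(SCRIPT))
```

Running a file exercises the interpreter the way a user would, so the same check applies to an implementation with a different tokenizer.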

pablogsal approved these changes

Oct 7, 2025

bedevere-app (bot) added the awaiting merge label and removed the awaiting review label

Oct 7, 2025

Member

pablogsal commented Oct 7, 2025

LGTM

Thank you very much @tom-pytel!

pablogsal added the needs backport to 3.13 (bugs and security fixes) and needs backport to 3.14 (bugs and security fixes) labels

Oct 7, 2025

pablogsal merged commit 539461d into python:main

Oct 7, 2025

53 checks passed

miss-islington-app (bot) commented Oct 7, 2025

Thanks @tom-pytel for the PR, and @pablogsal for merging it 🌮🎉. I'm working now to backport this PR to: 3.13, 3.14.
🐍🍒⛏🤖

bedevere-app (bot) removed the awaiting merge label

Oct 7, 2025

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request

Oct 7, 2025

gh-139516: Fix lambda colon start format spec in f-string in to…

a7ea1ba

…kenizer (GH-139657) (cherry picked from commit 539461d) Co-authored-by: Tomasz Pytel <tompytel@gmail.com>

miss-islington-app (bot) commented Oct 7, 2025

Sorry, @tom-pytel and @pablogsal, I could not cleanly backport this to 3.13 due to a conflict.
Please backport using cherry_picker on command line.

cherry_picker 539461d9ec8e5322ead638f7be733fd196aa6c79 3.13

miss-islington-app (bot) assigned pablogsal

Oct 7, 2025

bedevere-app (bot) commented Oct 7, 2025

GH-139701 is a backport of this pull request to the 3.14 branch.

bedevere-app (bot) removed the needs backport to 3.14 (bugs and security fixes) label

Oct 7, 2025

pablogsal pushed a commit that referenced this pull request

Oct 7, 2025

[3.14] gh-139516: Fix lambda colon start format spec in f-string in t…

de84d09

…okenizer (GH-139657) (#139701) gh-139516: Fix lambda colon start format spec in f-string in tokenizer (GH-139657) (cherry picked from commit 539461d) Co-authored-by: Tomasz Pytel <tompytel@gmail.com>

Member

pablogsal commented Oct 7, 2025

> Sorry, @tom-pytel and @pablogsal, I could not cleanly backport this to 3.13 due to a conflict. Please backport using cherry_picker on command line.
> cherry_picker 539461d9ec8e5322ead638f7be733fd196aa6c79 3.13

@tom-pytel can you make the backport following the instructions?

Contributor, Author

tom-pytel commented Oct 7, 2025

> > Sorry, @tom-pytel and @pablogsal, I could not cleanly backport this to 3.13 due to a conflict. Please backport using cherry_picker on command line.
> > cherry_picker 539461d9ec8e5322ead638f7be733fd196aa6c79 3.13
>
> @tom-pytel can you make the backport following the instructions?

Sure, in a bit.

tom-pytel mentioned this pull request

Oct 7, 2025

[3.13] gh-139516: Fix lambda colon start format spec in f-string in t… #139726

Merged

bedevere-app (bot) commented Oct 7, 2025

GH-139726 is a backport of this pull request to the 3.13 branch.

bedevere-app (bot) removed the needs backport to 3.13 (bugs and security fixes) label

Oct 7, 2025

pablogsal pushed a commit that referenced this pull request

Oct 7, 2025

[3.13] gh-139516: Fix lambda colon start format spec in f-string in t… (

b7bc977

#139726) [3.13] gh-139516: Fix lambda colon start format spec in f-string in tokenizer (GH-139657) (cherry picked from commit 539461d)
