Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Improve performance of name normalization#533

Merged
jaraco merged 7 commits intomaint/8.xfrom
backport-cpython-143658
Mar 20, 2026
Merged

Improve performance of name normalization#533
jaraco merged 7 commits intomaint/8.xfrom
backport-cpython-143658

Conversation

@jaraco
Copy link
Member

Backport of changes frompython/cpython#143658

hugovkand others added5 commitsMarch 20, 2026 03:04
…ance of `importlib.metadata.Prepared.normalized` (#143660)Co-authored-by: Henry Schreiner <henryschreineriii@gmail.com>Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>Co-authored-by: Bartosz Sławecki <bartosz@ilikepython.com>
…ce of `importlib.metadata.Prepared.normalized` (#144083)Co-authored-by: Henry Schreiner <henryschreineriii@gmail.com>
@jaraco
Copy link
MemberAuthor

I don't understand why the performance test is not showing any effect:

exercises.py:cached distribution: 0:00:00.000169 (+-1 day, 23:59:59.999974, -13%)exercises.py:discovery: 0:00:00.000173 (+0:00:00, 0%)exercises.py:entry_points(): 0:00:00.002320 (+0:00:00.000060, 3%)exercises.py:entrypoint_regexp_perf: 0:00:00.000069 (+-1 day, 23:59:59.999998, -3%)exercises.py:normalize_perf: 0:00:00 (+0:00:00, 0%)exercises.py:uncached distribution: 0:00:00.000273 (+-1 day, 23:59:59.999978, -7%)

It's seeming to indicate that thenormalize('sample') runs in 0 time.

If I add a sleep, it does reveal some performance degradation:

diff --git a/exercises.py b/exercises.pyindex b346cc05f8..a48c57fe6e 100644--- a/exercises.py+++ b/exercises.py@@ -49,6 +49,8 @@ def entrypoint_regexp_perf():  def normalize_perf():     # python/cpython#143658+    import time     import importlib_metadata  # end warmup+    time.sleep(0.001)     importlib_metadata.Prepared.normalize('sample')
exercises.py:normalize_perf: 0:00:00.001270 (+0:00:00, 0%)

So maybe the operation is too small for timeit to measure?

@jaraco
Copy link
MemberAuthor

Running timeit directly reveals some run time:

 🐚 .tox/py/bin/python -m timeit --setup 'import importlib_metadata' -- 'importlib_metadata.Prepared("sample")'1000000 loops, best of 5: 231 nsec per loop

Aah - so 231 nsec is smaller than the precision of Python's timespan, which only handles microseconds.

 importlib_metadata backport-cpython-143658 🐚 pip-run temporaPython 3.14.3 (main, Feb  3 2026, 15:32:20) [Clang 17.0.0 (clang-1700.6.3.2)] on darwinType "help", "copyright", "credits" or "license" for more information.>>> import tempora>>> tempora.parse_timedelta('231 nsec')datetime.timedelta(0)

@jaraco
Copy link
MemberAuthor

jaraco commentedMar 20, 2026
edited
Loading

The latest benchmark shows a 74% reduction in execution time.

exercises.py:normalize_perf: 0:00:00.000100 (+-1 day, 23:59:59.999722, -74%)

I get different numbers when I run timeit manually (~230ns vs ~510ns, 55% reduction).

Was it worth it to save a few hundred nanoseconds?

@hugovk
Copy link
Member

hugovk commentedMar 20, 2026
edited
Loading

In many cases, the speedup won't make any noticeable effect. The change is similar to an improvement frompackaging, which was does have real benefits for pip when doing large resolutions (https://iscinumpy.dev/post/packaging-faster/).

Let's also ask if@henryiii is aware of any use cases.

Personally, I would make this change, I don't think it makes the code harder to read or maintain. But it's fine if you'd prefer to reject this and/or revert in CPython.


btw I'd already opened#529 againstmain, and this is againstmaint/8.x. I couldn't find a contrib guide, what's the normal workflow? Can this be documented somewhere, or did I miss it?

This is the better PR -- same prod code change, but parametrised tests and a performance benchmark added.

@jaraco
Copy link
MemberAuthor

btw I'd already opened#529 againstmain, and this is againstmaint/8.x. I couldn't find a contrib guide, what's the normal workflow? Can this be documented somewhere, or did I miss it?

Thanks for that PR; I'd just not gotten to it as my attention/bandwidth are limited.

There's some documentation inhttps://github.com/python/importlib_metadata/wiki/Development-Methodology. And the contribution to main was the correct thing at the time. But because I'm now getting this change in place (along with some others) and there's a 9.0 release that I'd like to exclude for the time being, that's why I'm targeting 8.x.

hugovk reacted with thumbs up emoji

…termediate implementations. Reference the rationale.
@jaraco
Copy link
MemberAuthor

Personally, I would make this change, I don't think it makes the code harder to read or maintain. But it's fine if you'd prefer to reject this and/or revert in CPython.

I would argue it is less readable and has other concerns.

It introduces a newvalue variable whose state set in multiple locations. I try to avoid mutated variables and follow a functional paradigm wherever possible. In fact, I'm tempted to wrap thewhile loop in another functionreplace_all in order to encapsulate these imperative behaviors.

It also repeats itself, using the.replace operation an arbitrary number of times both statically and dynamically.

I'd also argue that it's less sophisticated, using lower-level operations instead of a more succinct and reusable approach. Although regex comes with its own pitfalls, it elevates the conversation by re-using well-known (and documented) approaches.

Additionally, although the spec is "PEP 503 normalization plus dashes as underscores", it doesn't actually perform PEP 503 normalization, but compiles in the aggregate operation to produce an equivalent output. Compare that with the original implementation, where the code reflected precisely the specified behavior, even though it could have simply subbed to_ and eliminated the.replace, making the code more self-documenting.


These concerns are mostly just me being pedantic, but I wanted to articulate why I would prefer the prior implementation absent any performance concerns, and why I'd prefer we get a strong rationale for the benefits. Still, I missed my opportunity to get my review in before merged with CPython, so I'm planning to proceed with this change to get the code bases aligned and we can consider reverting later (while keeping the performance and unit tests).

hugovk reacted with thumbs up emoji

@jaracojaraco merged commit8c5d91b intomaint/8.xMar 20, 2026
15 of 30 checks passed
@jaracojaraco deleted the backport-cpython-143658 branchMarch 20, 2026 16:06
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

No reviews

Assignees

No one assigned

Labels

None yet

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

2 participants

@jaraco@hugovk

[8]ページ先頭

©2009-2026 Movatter.jp