python/importlib_metadataPublic

NotificationsYou must be signed in to change notification settings
Fork93
Star140

Improve performance of name normalization#533

Merged

jaraco merged 7 commits intomaint/8.xfrom

backport-cpython-143658

Mar 20, 2026

Merged

Improve performance of name normalization#533
jaraco merged 7 commits intomaint/8.xfrom
backport-cpython-143658

Conversation

Copy link

Member

jaraco commentedMar 20, 2026

Backport of changes frompython/cpython#143658

hugovkand others added5 commits

March 20, 2026 03:04

gh-143658: importlib.metadata: Usestr.translate to improve perform…

3e7f3ac

…ance of `importlib.metadata.Prepared.normalized` (#143660)Co-authored-by: Henry Schreiner <henryschreineriii@gmail.com>Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>Co-authored-by: Bartosz Sławecki <bartosz@ilikepython.com>

gh-143658: Usestr.lower andreplace to further improve performan…

001db0d

…ce of `importlib.metadata.Prepared.normalized` (#144083)Co-authored-by: Henry Schreiner <henryschreineriii@gmail.com>

Remove CPython news fragments.

852e44f

Use parameterize fixture for parameterized tests.

1b0be12

Add performance test for Prepared.normalize.

a77d0d1

Copy link

MemberAuthor

jaraco commentedMar 20, 2026

I don't understand why the performance test is not showing any effect:

exercises.py:cached distribution: 0:00:00.000169 (+-1 day, 23:59:59.999974, -13%)exercises.py:discovery: 0:00:00.000173 (+0:00:00, 0%)exercises.py:entry_points(): 0:00:00.002320 (+0:00:00.000060, 3%)exercises.py:entrypoint_regexp_perf: 0:00:00.000069 (+-1 day, 23:59:59.999998, -3%)exercises.py:normalize_perf: 0:00:00 (+0:00:00, 0%)exercises.py:uncached distribution: 0:00:00.000273 (+-1 day, 23:59:59.999978, -7%)

It's seeming to indicate that thenormalize('sample') runs in 0 time.

If I add a sleep, it does reveal some performance degradation:

diff --git a/exercises.py b/exercises.pyindex b346cc05f8..a48c57fe6e 100644--- a/exercises.py+++ b/exercises.py@@ -49,6 +49,8 @@ def entrypoint_regexp_perf():  def normalize_perf():     # python/cpython#143658+    import time     import importlib_metadata  # end warmup+    time.sleep(0.001)     importlib_metadata.Prepared.normalize('sample')

exercises.py:normalize_perf: 0:00:00.001270 (+0:00:00, 0%)

So maybe the operation is too small for timeit to measure?

Copy link

MemberAuthor

jaraco commentedMar 20, 2026

Running timeit directly reveals some run time:

 🐚 .tox/py/bin/python -m timeit --setup 'import importlib_metadata' -- 'importlib_metadata.Prepared("sample")'1000000 loops, best of 5: 231 nsec per loop

Aah - so 231 nsec is smaller than the precision of Python's timespan, which only handles microseconds.

 importlib_metadata backport-cpython-143658 🐚 pip-run temporaPython 3.14.3 (main, Feb  3 2026, 15:32:20) [Clang 17.0.0 (clang-1700.6.3.2)] on darwinType "help", "copyright", "credits" or "license" for more information.>>> import tempora>>> tempora.parse_timedelta('231 nsec')datetime.timedelta(0)

jaraco mentioned this pull request

Mar 20, 2026

Better support for sub-microsecond operationsjaraco/pytest-perf#18

Open

Repeat the operation to get performance visibility.

cbadafc

Copy link

MemberAuthor

jaraco commentedMar 20, 2026•
edited
Loading

The latest benchmark shows a 74% reduction in execution time.

exercises.py:normalize_perf: 0:00:00.000100 (+-1 day, 23:59:59.999722, -74%)

I get different numbers when I run timeit manually (~230ns vs ~510ns, 55% reduction).

Was it worth it to save a few hundred nanoseconds?

jaraco mentioned this pull request

Mar 20, 2026

importlib.metadata: Usestr.translate to improve performance ofimportlib.metadata.Prepared.normalizedpython/cpython#143658

Closed

Copy link

Member

hugovk commentedMar 20, 2026•
edited
Loading

In many cases, the speedup won't make any noticeable effect. The change is similar to an improvement frompackaging, which was does have real benefits for pip when doing large resolutions (https://iscinumpy.dev/post/packaging-faster/).

Let's also ask if@henryiii is aware of any use cases.

Personally, I would make this change, I don't think it makes the code harder to read or maintain. But it's fine if you'd prefer to reject this and/or revert in CPython.

btw I'd already opened#529 againstmain, and this is againstmaint/8.x. I couldn't find a contrib guide, what's the normal workflow? Can this be documented somewhere, or did I miss it?

This is the better PR -- same prod code change, but parametrised tests and a performance benchmark added.

Copy link

MemberAuthor

jaraco commentedMar 20, 2026

btw I'd already opened#529 againstmain, and this is againstmaint/8.x. I couldn't find a contrib guide, what's the normal workflow? Can this be documented somewhere, or did I miss it?

Thanks for that PR; I'd just not gotten to it as my attention/bandwidth are limited.

There's some documentation inhttps://github.com/python/importlib_metadata/wiki/Development-Methodology. And the contribution to main was the correct thing at the time. But because I'm now getting this change in place (along with some others) and there's a 9.0 release that I'd like to exclude for the time being, that's why I'm targeting 8.x.

Move behavior description into the docstring. Remove references to in…

27169dc

…termediate implementations. Reference the rationale.

Copy link

MemberAuthor

jaraco commentedMar 20, 2026

Personally, I would make this change, I don't think it makes the code harder to read or maintain. But it's fine if you'd prefer to reject this and/or revert in CPython.

I would argue it is less readable and has other concerns.

It introduces a newvalue variable whose state set in multiple locations. I try to avoid mutated variables and follow a functional paradigm wherever possible. In fact, I'm tempted to wrap thewhile loop in another functionreplace_all in order to encapsulate these imperative behaviors.

It also repeats itself, using the.replace operation an arbitrary number of times both statically and dynamically.

I'd also argue that it's less sophisticated, using lower-level operations instead of a more succinct and reusable approach. Although regex comes with its own pitfalls, it elevates the conversation by re-using well-known (and documented) approaches.

Additionally, although the spec is "PEP 503 normalization plus dashes as underscores", it doesn't actually perform PEP 503 normalization, but compiles in the aggregate operation to produce an equivalent output. Compare that with the original implementation, where the code reflected precisely the specified behavior, even though it could have simply subbed to_ and eliminated the.replace, making the code more self-documenting.

These concerns are mostly just me being pedantic, but I wanted to articulate why I would prefer the prior implementation absent any performance concerns, and why I'd prefer we get a strong rationale for the benefits. Still, I missed my opportunity to get my review in before merged with CPython, so I'm planning to proceed with this change to get the code bases aligned and we can consider reverting later (while keeping the performance and unit tests).

jaraco merged commit8c5d91b intomaint/8.x

Mar 20, 2026

15 of 30 checks passed

jaraco deleted the backport-cpython-143658 branch

March 20, 2026 16:06

This was referencedMar 20, 2026

Usestr.translate to improve performance ofnormalize#529

Closed

gh-146228: Better fork support in cached FastPathpython/cpython#146231

Merged

Labels

None yet

Movatterモバイル変換

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Improve performance of name normalization#533

Improve performance of name normalization#533
jaraco merged 7 commits intomaint/8.xfrom
backport-cpython-143658

Conversation

jaraco commentedMar 20, 2026

Uh oh!

jaraco commentedMar 20, 2026

Uh oh!

jaraco commentedMar 20, 2026

Uh oh!

jaraco commentedMar 20, 2026•
edited
Loading

Uh oh!

Uh oh!

hugovk commentedMar 20, 2026•
edited
Loading

Uh oh!

Uh oh!

jaraco commentedMar 20, 2026

Uh oh!

jaraco commentedMar 20, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Movatterモバイル変換

Uh oh!

Conversation

jaraco commentedMar 20, 2026

Uh oh!

jaraco commentedMar 20, 2026

Uh oh!

jaraco commentedMar 20, 2026

Uh oh!

jaraco commentedMar 20, 2026• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

hugovk commentedMar 20, 2026• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

jaraco commentedMar 20, 2026

Uh oh!

jaraco commentedMar 20, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

jaraco commentedMar 20, 2026•
edited
Loading

hugovk commentedMar 20, 2026•
edited
Loading