Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-103193: cache calls toinspect._shadowed_dict ininspect.getattr_static#104267

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
AlexWaygood merged 4 commits intopython:mainfromAlexWaygood:simple-lru-cache
May 7, 2023

Conversation

AlexWaygood
Copy link
Member

@AlexWaygoodAlexWaygood commentedMay 7, 2023
edited
Loading

EDIT: These performance numbers are out of date with the current PR; read further down the thread for the up-to-date numbers.

This dramatically speeds up calls toinspect.getattr_static.

Here are benchmark results onmain, using@sobolevn's benchmark script from#103193 (comment):

type[Foo]                :   88 ±  0 nsFoo                      :  158 ±  0 nstype[Bar]                :   89 ±  0 nsBar                      :  158 ±  0 nsWithParentClassX         :  224 ±  0 nsBaz                      :  205 ±  0 nsWithParentX              :  271 ±  1 nstype[Missing]            :  252 ±  0 nsMissing                  :  207 ±  0 nsSlotted                  :  200 ±  1 nsMethod                   :  160 ±  1 nsStMethod                 :  158 ±  0 nsClsMethod                :  158 ±  0 ns

And here are results with this PR:

type[Foo]                :   54 ±  0 nsFoo                      :   85 ±  0 nstype[Bar]                :   53 ±  0 nsBar                      :   85 ±  0 nsWithParentClassX         :  102 ±  0 nsBaz                      :   96 ±  0 nsWithParentX              :  113 ±  0 nstype[Missing]            :  110 ±  0 nsMissing                  :   98 ±  0 nsSlotted                  :  137 ±  0 nsMethod                   :   85 ±  0 nsStMethod                 :   85 ±  0 nsClsMethod                :   85 ±  0 ns

With this PR,inspect.getattr_static is fast enough that evenisinstance() calls like this, with "pathological" runtime-checkable protocols, are faster than they were on Python 3.11:

Pathological protocol
fromtypingimportProtocol,runtime_checkable@runtime_checkableclassFoo(Protocol):a:intb:intc:intd:inte:intf:intg:inth:inti:intj:intk:intl:intm:intn:into:intp:intq:intr:ints:intt:intu:intv:intw:intx:inty:intz:intclassBar:def__init__(self):forattrnamein'abcdefghijklmnopqrstuvwxyz':setattr(self,attrname,42)isinstance(Bar(),Foo)

This approach makes me a little nervous. There are two plausible reasons I thought of why adding a cache here might not be a good idea:

  1. It could cause references to theklass argument to be held by the cache even after they've been deleted elsewhere, which could be unexpected behaviour for a low-level function likegetattr_static
  2. Perhaps it's possible that whether or not a__dict__ attribute is shadowed could change at some point during the lifetime of a class.

However, Ithink we're okay on both points.

For objection (1):_shadowed_dict is only ever called on type objects. The vast majority of classes are defined once in the global namespace and never deleted, so it shouldn't be an issue that the cache is holding strong references to the type objects. (If wedo think this is an issue, I also experimented with a version of this PR that uses aweakref.WeakKeyDictionary as a cache. It also sped things up, but not by nearly as much.)

For objection (2): It doesn't seem to be possible, once a class has been created, to change the class's__dict__ attribute, at least from Python code. So for any given classklass,_shadowed_dict(klass) shouldalways return the same result.

@carljm, what do you think?

@AlexWaygoodAlexWaygood added the performancePerformance or resource usage labelMay 7, 2023
@AlexWaygoodAlexWaygood requested a review fromcarljmMay 7, 2023 14:59
@carljm
Copy link
Member

So for any given classklass,_shadowed_dict(klass) shouldalways return the same result.

The__dict__ can't change, but the__mro__ can :/

>>> class A: pass...>>> class B: pass...>>> class C(A): pass...>>> C.__mro__(<class '__main__.C'>, <class '__main__.A'>, <class 'object'>)>>> C.__bases__ = (B,)>>> C.__mro__(<class '__main__.C'>, <class '__main__.B'>, <class 'object'>)

So I don't think this is safe.

AlexWaygood reacted with thumbs up emoji

@AlexWaygood
Copy link
MemberAuthor

AlexWaygood commentedMay 7, 2023
edited
Loading

The__dict__ can't change, but the__mro__ can :/

Well, good that we're discovering holes ingetattr_static test coverage :)

What about something like this, where_shadowed_dict is only cached for the lifetime of a singlegetattr_static call? It's slower than this patch currently, as it turns out that the call to.cache_clear() is pretty expensive. But it's still noticeably faster thanmain:

# add @lru_cache() to _shadowed_dict, like in this PR currentlydefgetattr_static(obj,attr,default=_sentinel):try:# existing getattr_static implementation herefinally:_shadowed_dict.cache_clear()

@carljm
Copy link
Member

How does the performance comparison look if you make_shadowed_dict take the mro tuple rather than the class? I.e. call sites would turn into_shadowed_dict(_static_getmro(klass)).

I think if you did that, then it would be safe to put an LRU cache on_shadowed_dict.

I think it should be a bounded cache, though, not unbounded.

The pathological scenario would be someone creating lots of transient classes inside a function, and callinggetattr_static on instances of them. It seems prettu unlikely that someone is doing all of that, and expecting the transient classes to be GCed. But I think we at least should ensure the memory use in such a case is bounded.

@AlexWaygood
Copy link
MemberAuthor

AlexWaygood commentedMay 7, 2023
edited
Loading

I think it should be a bounded cache, though, not unbounded.

@functools.lru_cache() creates a cache with size 128 by default, so my PR already proposes a bounded cache ;)

>>> functools.lru_cache(lambda:1).cache_info()CacheInfo(hits=0, misses=0, maxsize=128, currsize=0)
carljm reacted with thumbs up emoji

@AlexWaygood
Copy link
MemberAuthor

How does the performance comparison look if you make_shadowed_dict take the mro tuple rather than the class? I.e. call sites would turn into_shadowed_dict(_static_getmro(klass)).

It's not as fast as this PR, but it's alot faster thanmain. I'll switch to that approach.

Comment on lines +1809 to +1810
def _shadowed_dict(klass):
return _shadowed_dict_from_mro_tuple(_static_getmro(klass))
Copy link
MemberAuthor

@AlexWaygoodAlexWaygoodMay 7, 2023
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I considered inlining the calls to_static_getmro, but I thought this made it more readable. It also preservesgit blame better, and having to pass through this function doesn't seem to have much impact on performance

carljm reacted with thumbs up emoji
@AlexWaygoodAlexWaygood marked this pull request as ready for reviewMay 7, 2023 16:22
@AlexWaygood
Copy link
MemberAuthor

AlexWaygood commentedMay 7, 2023
edited
Loading

Here's the new performance numbers using thegetattr_static benchmark:

type[Foo]                :   61 ±  0 nsFoo                      :  100 ±  0 nstype[Bar]                :   61 ±  0 nsBar                      :  100 ±  0 nsWithParentClassX         :  129 ±  0 nsBaz                      :  118 ±  0 nsWithParentX              :  146 ±  0 nstype[Missing]            :  148 ±  0 nsMissing                  :  126 ±  1 nsSlotted                  :  162 ±  2 nsMethod                   :  105 ±  0 nsStMethod                 :  106 ±  0 nsClsMethod                :  100 ±  0 ns

The same benchmark onmain:

type[Foo]                :   90 ±  0 nsFoo                      :  163 ±  0 nstype[Bar]                :   90 ±  0 nsBar                      :  162 ±  0 nsWithParentClassX         :  229 ±  0 nsBaz                      :  209 ±  0 nsWithParentX              :  276 ±  0 nstype[Missing]            :  258 ±  0 nsMissing                  :  213 ±  0 nsSlotted                  :  203 ±  0 nsMethod                   :  162 ±  0 nsStMethod                 :  163 ±  0 nsClsMethod                :  163 ±  0 ns

With this PRnow,isinstance() checks against runtime-checkable protocols are faster than they were on 3.11 as long as the protocol has 13 members or fewer. Onmain, they are only faster than on 3.11 with 6 members or fewer.

Onmain, the results of thetyping_runtime_protocols pyperformance benchmark are262 us +- 11 us. With this PR, they are178 us +- 9 us.

@AlexWaygoodAlexWaygood added type-featureA feature request or enhancement stdlibPython modules in the Lib dir 3.12only security fixes labelsMay 7, 2023
@AlexWaygood
Copy link
MemberAuthor

(Skipping news because, if this patch is correct, it shouldn't change behaviour at all other than improving performance. And the news entry inhttps://github.com/python/cpython/pull/103195/files should already cover the performance boost.)

@AlexWaygoodAlexWaygood changed the titlegh-103193: cache calls to inspect._shadowed_dictgh-103193: cache calls toinspect._shadowed_dict ininspect.getattr_staticMay 7, 2023
@AlexWaygoodAlexWaygood merged commit1b19bd1 intopython:mainMay 7, 2023
@AlexWaygoodAlexWaygood deleted the simple-lru-cache branchMay 7, 2023 17:45
@AlexWaygood
Copy link
MemberAuthor

Thanks@carljm :D

I listed you as a co-author on the commit; I don't think I would have thought of using the mro tuple as a cache key on my own :)

carljm reacted with heart emoji

jbower-fb pushed a commit to jbower-fb/cpython that referenced this pull requestMay 8, 2023
…getattr_static` (python#104267)Co-authored-by: Carl Meyer <carl@oddbird.net>
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@carljmcarljmcarljm approved these changes

Assignees
No one assigned
Labels
3.12only security fixesperformancePerformance or resource usageskip newsstdlibPython modules in the Lib dirtype-featureA feature request or enhancement
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

3 participants
@AlexWaygood@carljm@bedevere-bot

[8]ページ先頭

©2009-2025 Movatter.jp