Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork3.1k
Speed up make_simplified_union, fix recursive tuple crash#15128
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
The following code optimises make_simplified_union in the common casethat there are exact duplicates in the union. In this regard, this issimilar topython#15104To get this to work, I needed to use partial tuple fallbacks in a coupleplaces (these maybe had the potential to be latent crashes anyway?)There were some interesting things going on with recursive type aliasesand type state assumptionsThis is about a 25% speedup on the pydantic codebase and about a 2%speedup on self check (measured with uncompiled mypy)
for more information, seehttps://pre-commit.ci
This comment has been minimized.
This comment has been minimized.
hauntsaninja commentedApr 25, 2023 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
Looks like there's a major performance regression in |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Diff frommypy_primer, showing the effect of this PR on open source code: pandas-stubs (https://github.com/pandas-dev/pandas-stubs) got 1.14 faster (68.0s -> 59.5s)speedrun.com_global_scoreboard_webapp (https://github.com/Avasam/speedrun.com_global_scoreboard_webapp) got 1.24 faster (43.8s -> 35.2s) |
hauntsaninja commentedApr 26, 2023
(note to self: once this is merged, rename |
JukkaL commentedMay 3, 2023
I did a measurement using |
JukkaL left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Looks good! I'm happy to see additional performance wins. I finally got around to reviewing this now that my jet lag is better.
Left a few additional ideas (optional, since this already seems like a nice improvement).
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
| # If deleted subtypes had more general truthiness, use that | ||
| orig_item=new_items[duplicate_index] | ||
| ifnotorig_item.can_be_trueandti.can_be_true: | ||
| new_items[duplicate_index]=true_or_false(orig_item) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Should we only changecan_be_true, or is it ok to only adjustcan_be_false?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
| ifnotorig_item.can_be_trueandti.can_be_true: | ||
| new_items[duplicate_index]=true_or_false(orig_item) | ||
| elifnotorig_item.can_be_falseandti.can_be_false: | ||
| new_items[duplicate_index]=true_or_false(orig_item) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Similar to above.
hauntsaninja commentedMay 5, 2023 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
This fixes a crash that just got reported#15192, so I added a regression test for it. I added the two micro optimisations suggested. |
Diff frommypy_primer, showing the effect of this PR on open source code: pandas-stubs (https://github.com/pandas-dev/pandas-stubs) got 1.15x faster (66.6s -> 58.0s)speedrun.com_global_scoreboard_webapp (https://github.com/Avasam/speedrun.com_global_scoreboard_webapp) got 1.27x faster (33.3s -> 26.2s) |
hauntsaninja commentedMay 6, 2023 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
Nice, looks like the primer timings are pretty consistent between: |
ilevkivskyi commentedMay 10, 2023
This PR actually introduced a recursive tuple crash I will check maybe there is a simple fix for it. |
ilevkivskyi commentedMay 10, 2023
It looks like there is (a bug is quite obvious). I will submit a PR shortly. |
ilevkivskyi commentedMay 10, 2023
See#15216 |
Uh oh!
There was an error while loading.Please reload this page.
Fixes#15192
The following code optimises make_simplified_union in the common case that there are exact duplicates in the union. In this regard, this is similar to#15104
There's a behaviour change in one unit test. I think it's good? We'll see what mypy_primer has to say.
To get this to work, I needed to use partial tuple fallbacks in a couple places (these maybe had the potential to be latent crashes anyway?) There were some interesting things going on with recursive type aliases and type state assumptions
This is about a 25% speedup on the pydantic codebase and about a 2% speedup on self check (measured with uncompiled mypy)