Add a benchmark for typeshed-stats #268


Open
AlexWaygood wants to merge 4 commits into python:main from AlexWaygood:typeshed-stats

Conversation

@AlexWaygood (Member) commented on Mar 1, 2023 (edited):

This PR adds a benchmark for typeshed-stats, a project of mine that provides a CLI tool to gather statistics and other information on typeshed. The benchmark takes around 5 seconds to complete. As discussed on the CPython core devs' Discord, it may make an interesting benchmark for several reasons, among them:

  • It makes significant use of several newish features of Python that are currently underrepresented in the benchmark suite, such as pattern-matching, pathlib, f-strings, enums, typing and dataclasses.
  • It's real-world code that hasn't been particularly optimised for performance.

Rather than list typeshed-stats as a dependency for this benchmark, I've instead forked my own project and committed a modified version into this PR. typeshed-stats as it exists on PyPI makes network requests using aiohttp, which would lead to indeterminacy in the benchmark; the version of typeshed-stats that I've included in this PR doesn't make network requests. I also made a few other changes to reduce the number of third-party dependencies, such as switching from attrs to dataclasses, and removing the use of rich. Other than that, however, I've tried to keep the changes to a minimum, so that it accurately reflects the "real" version of typeshed-stats. That means that there are a few unused functions in this PR that could potentially be removed, if we think that would be better.

Two more decisions I made when making this PR:

  • typeshed-stats as it exists on PyPI uses tomli on 3.10, and the stdlib tomllib on 3.11. In this PR, I just use tomllib unconditionally. That means that this benchmark has 0 external dependencies, but it also means that the benchmark only runs on 3.11+. If we wanted, I could add a dependency on tomli and change it so that we use tomli unconditionally, which would make it 3.10-compatible.
  • I wanted to benchmark the use of typeshed-stats as a CLI tool, so I've added the whole project to a subdirectory in the benchmark and then listed it as a local dependency in the requirements.txt file for the benchmark. That was the only way I could see of getting the runner.bench_command() function to work, but it doesn't look like any other benchmarks do this currently. Let me know what you think about this!
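For reference, the 3.10-compatible alternative mentioned in the first point is the usual version-gated import; a minimal sketch, assuming tomli would be pinned in the benchmark's requirements.txt:

```python
import sys

# tomllib landed in the stdlib in Python 3.11; tomli is the
# API-compatible backport that would need to be listed in
# requirements.txt for older interpreters.
if sys.version_info >= (3, 11):
    import tomllib
else:
    import tomli as tomllib

# Both modules expose the same loads()/load() API.
config = tomllib.loads('requires-python = ">= 3.11"')
print(config["requires-python"])  # -> >= 3.11
```

This keeps the rest of the code untouched: every call site just uses the `tomllib` name regardless of which module actually backs it.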

@AlexWaygood (Member, Author) commented on Mar 1, 2023 (edited):

The CI is failing for the Python 3.7-3.10 workflows for the same reason as #200 (comment): the requires-python field in the pyproject.toml file for the benchmark doesn't have any effect. I've specified requires-python = ">= 3.11" in the pyproject.toml file for this benchmark, but that hasn't resulted in the benchmark being skipped for <3.11.

The CI is failing on the 3.12 workflow for the same reason that the CI is failing on main: greenlet is not yet compatible with Python 3.12, and greenlet is a dependency for some of the other benchmarks (#263).

@brandtbucher (Member) left a comment:

That seems like a lot of data files to me... GitHub won't even let me review the actual benchmark.

These are just copied verbatim from typeshed, I assume? If so, maybe we could just install a pinned version in our requirements (not sure if typeshed is pip-installable?) or maybe clone the typeshed repo into a temporary location in the benchmark setup (we would need to be careful that this only happens once per pyperformance run, not once per pyperf run/process... not sure how hard that is). Or maybe we could do something clever with git (like making typeshed a submodule, pinned to a specific revision)?

I'd just like to avoid adding ~230k new lines of code (and ~4k files) if at all possible.

@AlexWaygood (Member, Author):

That seems like a lot of data files to me... GitHub won't even let me review the actual benchmark.

Yup, very much aware — I tried to keep a clean git history here to help out with reviewing (added the data files in the first commit, the project files in the second commit, and the actual code for running the benchmark in the third commit).

These are just copied verbatim from typeshed, I assume?

Correct

not sure if typeshed is pip-installable?) or maybe clone the typeshed repo into a temporary location in the benchmark setup

Typeshed isn't pip-installable, no, but very much open to the second option. I floated this issue on Discord before filing the PR, though, and I think @mdboom might have a different opinion on this question.

Note that this is basically the same approach we took with the docutils benchmark (#216), where we just copied-and-pasted all of docutils' docs as data files into this repo.

@AlexWaygood (Member, Author):

Or maybe we could do something clever with git (like making typeshed a submodule, pinned to a specific revision)?

This is an interesting idea. I'll look into it.

@brandtbucher (Member):

I tried to keep a clean git history here to help out with reviewing (added the data files in the first commit, the project files in the second commit, and the actual code for running the benchmark in the third commit).

Ah, nice. Thanks.

Typeshed isn't pip-installable, no, but very much open to the second option. I floated this issue on Discord before filing the PR, though, and I think @mdboom might have a different opinion on this question.

Note that this is basically the same approach we took with the docutils benchmark (#216), where we just copied-and-pasted all of docutils' docs as data files into this repo.

Well, that one seems like a lot less data, haha. We have lots of benchmarks that vendor a few dozen data files in this way... maybe, if we want to do this approach, we could just run against a small subset of typeshed? Not sure how that affects runtime.

But we could wait for @mdboom to weigh in, or see if git submodules are an acceptable solution.

(Side-note: why isn't typeshed pip-installable? Does each tool need to vendor its own huge copy? I imagine there's some good reason...)

@AlexWaygood (Member, Author) commented on Mar 1, 2023 (edited):

Well, that one seems like a lot less data, haha.

You have to understand that my secret ulterior motive is to become the number-one contributor to pyperformance in terms of lines contributed ;)

(Side-note: why isn't typeshed pip-installable? Does each tool need to vendor its own huge copy? I imagine there's some good reason...)

A bunch of reasons. The stubs for the stdlib are ~50k lines of code (so, just under a quarter of typeshed). These do need to be vendored by each tool, because updating typeshed's stdlib stubs often results in internal tests breaking for mypy/pyright/etc. Some tools also apply their own patches to typeshed's stdlib stubs (e.g. take a look at python/mypy#13987 -- quite a few tests failed at first; there had to be 2 followup PRs to revert specific typeshed changes to mypy's vendored version of the stdlib stubs).

The other ~three-quarters of typeshed consists of stubs for various third-party packages that don't have inline types at runtime (yet). These parts of typeshed are pip-installable, but they're packaged into separate stubs packages (types-requests for our requests stubs, types-colorama for our colorama stubs, etc.) and auto-uploaded to PyPI on a regular basis. The machinery for auto-uploading the stubs packages is located in a separate repository for security reasons: https://github.com/typeshed-internal/stub_uploader. This means that people who just need stubs for requests (~1000 lines of stubs) can install just the stubs for requests in isolation; they don't need to pip install ~179,000 lines of stubs that are completely irrelevant to whatever project they're working on.


@AlexWaygood (Member, Author) commented on Mar 1, 2023 (edited):

maybe, if we want to do this approach, we could just run against a small subset of typeshed?

Some of our third-party stubs in typeshed are much larger (in terms of lines of code) than others. If we deleted just typeshed's stubs for pywin32, SQLAlchemy, influxdb-client and openpyxl from the vendored copy of typeshed here, it would reduce the size of this PR by around 52,000 lines of code. I certainly wouldn't mind doing that; we'd still be running typeshed-stats on a version of typeshed with 144 packages and 180,000 lines of code.

But anyway, I'll look into git submodules.

@hugovk (Member):

If you just need to get a big bunch of files from a repo, a couple of methods we've used for Pillow are git cloning with --depth=1, or downloading and unzipping the zipball for a tag or branch. Both are pretty fast, and avoid the fiddliness of submodules.


@mdboom (Contributor):

Sorry I missed this PR the first time around. I did say that usually we just include the data in the repo, but for something this large, maybe it's pushing it. If installing from PyPI works here, that seems like the best approach (since support for installing dependencies is already baked into pyperformance). If not on PyPI, would specifying a git dependency in requirements.txt work? Cloning directly from git may work, as long as it can be done at dependency install time so it doesn't happen repeatedly. Ultimately, I don't care too much about the method as long as the exact version can be pinned (so the benchmark performance doesn't change as the type stubs are changed upstream).
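A pinned git dependency in requirements.txt of the kind suggested here might look roughly like this; the repository URL and the commit placeholder are illustrative, and pip's `name @ git+URL@revision` form only works for pip-installable projects, which (per the discussion above) typeshed itself is not:

```text
# requirements.txt -- pin the dependency to an exact commit so the
# benchmark's workload cannot drift with upstream changes
typeshed_stats @ git+https://github.com/AlexWaygood/typeshed-stats@<commit-sha>
```

Because the revision is part of the requirement, pyperformance's existing dependency-installation machinery would handle both the download and the pinning, with no extra setup code in the benchmark.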


@AlexWaygood (Member, Author) commented on May 9, 2023 (edited):

I said to @brandtbucher I'd investigate git submodules, but never did, so this PR is waiting on me at the moment, as far as I'm concerned, don't worry 🙂

I'll try to get back to this PR soon, though I'm quite busy at the moment...

@AA-Turner (Member):

For comparison, #200 added ~1.5x the lines of this PR, so there's precedent for just committing the data directly, if that would unblock this.

A

Reviewers

@brandtbucher requested changes
@mdboom awaiting requested review

5 participants: @AlexWaygood, @brandtbucher, @hugovk, @mdboom, @AA-Turner
