Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

[WIP] Various build system improvements#101093

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Draft
indygreg wants to merge6 commits intopython:main
base:main
Choose a base branch
Loading
fromindygreg:build-system-fixes

Conversation

indygreg
Copy link
Contributor

I'm creating this PR to track various build system changes that I'd like to make to CPython.

I'm creating a draft PR instead of opening real PRs because general OSS experience has taught me that if I trickle things out slowly without any prior context project maintainers won't see the big picture and/or benefits of the changes. My hope is that by tracking all proposed changes in a Git branch that people can easily diff that I'll have an easier time convincing folks to accept the patches.

Next Steps

CPython maintainer(s) should look at each commit in this branch. I write detailed commit messages. See each commit for more details.

Commits beginning withgh-xxxx: aren't yet assigned issues/PRs.

Then, just tell me which commits to turn into issues/PRs and I'll do so. Keep in mind some commits are dependent on others. For example, the BOLT enhancements (which appear to deliver meaningful pyperformance wins) depends on theoverhaul build rules for optimized binaries commit and its precursors.

I have several more potential patches to include in this branch. Including a bunch that enable cross-compiling for Apple platforms. I figured I'd start with just the build system + BOLT improvements to get people's attention.

Most of this work is derived from my efforts onhttps://github.com/indygreg/python-build-standalone.

cc@corona10 since you seem to have an interest in all the BOLT PRs.

@corona10
Copy link
Member

@indygreg cc@gvanrossum

Due to the holiday weekend in Korea (Lunar new year), I will review relevant PRs by next weekend.
Thanks for submitting these PRs :)

@aaupov
Copy link

Sidenote: BOLT now supports DWARF5, so there's no need to force-gdwarf-4

https://github.com/python/cpython/blob/main/configure.ac#L1950-1952

@corona10
Copy link
Member

Sidenote: BOLT now supports DWARF5, so there's no need to force -gdwarf-4

Nice news! Thanks for let us know :)

@indygreg, Let's drop the flag-gdwarf-4 it was only workaround approach
#95908 (comment)

cc@aaupov

@indygreg
Copy link
ContributorAuthor

No rush on the reviews,@corona10: I'm also busy celebrating the Lunar New Year this weekend.

Some specific guidance I could use help with is how to split up / fold commits and how to associate them with a GitHub Issue. To be honest I find the whole CPython contribution process of managing GitHub Issues in addition to a PR to be confusing. Is there a 1:1 relationship? 1:n?https://devguide.python.org/ is a bit ambiguous on this topic.

I'm lazy/busy, so prefer as little overhead / separate GitHub Issues as possible. But I can go along with whatever. If someone tells me what to do I'll do it.

Also, work that I'd like to do but haven't started working on yet is enabling some ofmakesetup's features to be rolled intoconfigure.ac. e.g. support for building/linking an extension statically. A lot of work in past ~1.5 years towards moving extension module config logic intoconfigure.ac enables this. But I'm not sure ifmakesetup deprecation was an end goal. In my mind it is certainly possible to inlinemakesetup fully intoconfigure.ac.

The end state I'd like to achieve is the ability for CPython's build system to build ~all extensions statically with minimal effort. e.g../configure --extensions-policy=all-static. I'd have to dig it up, but I believe there was a thread somewhere with some CPython maintainers in support of supporting easier static linking of extensions. I just don't know if anyone has opinions on how it should play out.makesetup still has lines that annotate back to Guido's commits made in 1994 and experience has taught me that it is best to ask before taking a sledge hammer to 25+ year old functionality!

@corona10
Copy link
Member

corona10 commentedJan 24, 2023
edited
Loading

@indygreg Let's separate the PR with BOLT itself and configuration enhancement.
Please reference#101282 about the BOLT PRs.

@corona10
Copy link
Member

@indygreg

The end state I'd like to achieve is the ability for CPython's build system to build ~all extensions statically with minimal effort.

One more thing, I have also interested in supporting build all extensions statically with minimal effort for internal purposes.
(Yeah, I have use cases if this build is supported)
To support this, we need to write the PEP first. Can we discuss this through email?
My email isdonghee.na@python.org

@erlend-aasland
Copy link
Contributor

The end state I'd like to achieve is the ability for CPython's build system to build ~all extensions statically with minimal effort.

What wrong with just editingModules/Setup.stdlib in order to achieve this?

@erlend-aasland
Copy link
Contributor

@indygreg, are you planning on following up this PR?

@erlend-aaslanderlend-aasland added the pendingThe issue will be closed if no feedback is provided labelMay 8, 2023
@indygreg
Copy link
ContributorAuthor

@indygreg, are you planning on following up this PR?

I could probably find time to polish off the PGO/BOLT changes if still wanted. Let's treat the broader extension module pieces I referred to as out of scope.

BTW, one of the things the stack of commits in this PR does is it adds BOLT optimization to libpython. Currentmain applies BOLT only to thepython binary. It is unclear to me if the variousCPython performance with BOLT threads I've read are based on the incomplete application of BOLT only topython. When I was developing these commits in January I did see some significant pyperformance changes after BOLT was applied to libpython. That makes sense: there's more code in libpython than in python.

erlend-aasland reacted with thumbs up emoji

@erlend-aasland
Copy link
Contributor

It would be really helpful if you could split this up in separate issues for each isolated change, and for each issue write a few sentences aboutwhy the change is needed, what value it adds, etc. Applying BOLT to libpython makes a lot of sense to me (based on your analysis), but I have no experience when it comes to BOLT :)

@indygreg
Copy link
ContributorAuthor

indygreg commentedMay 14, 2023 via email

It would be really helpful if you could split this up in separate issues for each isolated change, and for each issue write a few sentences about why the change is needed, what value it adds, etc.
Yup. The PR summary called out that I was awaiting instructions/feedback on whether/how to proceed. I wasn’t sure if people were even interested in the refactors.All the details/justification are there in the commit messages. I just need to put in the time to turn the commits into issues/PRs. Maybe I’ll start later today…
erlend-aasland reacted with thumbs up emoji

@erlend-aasland
Copy link
Contributor

All the details/justification are there in the commit messages.

That's great, but it is easier to interact using GitHub Issues as opposed to comments on single commits.

@indygregindygregforce-pushed thebuild-system-fixes branch 2 times, most recently fromff74a68 toc63a03dCompareMay 15, 2023 02:03
@indygreg
Copy link
ContributorAuthor

First 2 PRs from this branch are up:

Need to wait on publishing the next one because it depends on#104491 and I reckon GitHub's already brittle interface forstacking PRs will fall apart with all the automation CPython is using. So I'm not even going to try to do it.

erlend-aasland reacted with thumbs up emoji

Various rules were only ever invoked once and had minimal bodies.I don't see a benefit to the indirection.So this commit inlines rules to simplify the PGO logic.skip news
This commit overhauls the make-based build system's rules forbuilding optimized binaries. Along the way it fixes a myriad ofbugs and shortcomings with the prior approach.The old way of producing optimized binaries had various limitations:* `make [all]` would do work when PGO was enabled because the phony  `profile-opt` rule was non-empty. This prevented no-op PGO builds   from working at all. This meant workflows like `make; make install`   either incurred extra work or failed due to race conditions.* Same thing for BOLT, as its `bolt-opt` rule was also non-empty  and always ran during `make [all]`.* BOLT could not be run multiple times without a full rebuild because  `llvm-bolt` can't instrument binaries that have already received  BOLT optimizations.* It was difficult to run BOLT on its own because of how various make  targets and their dependencies were structured.* I found the old way that configure and make communicated the default  targets to be confusing and hard to understand.There are essentially 2 major changes going on in this commit:1. A rework of the high-level make targets for performing a build and   how they are defined.2. A rework of all the make logic related to profile-based optimization   (read: PGO and BOLT).Build Target Rework======================Before, we essentially had `build_all`, `profile-opt`, `bolt-opt` and`build_wasm` as our 3 targets for performing a build. `all` would aliasto one of these, as appropriate.And there was another definition for which _simple_ make target toevaluate for non-optimized builds. This was likely `build_all` or`all`.In the rework, we introduce 2 new high-level targets:* `build-plain` - Perform a build without optimizations.* `build-optimized` - Perform a build with optimizations.`build-plain` is aliased to `build_all` in all configurations exceptWASM, where it is `build_wasm`.`build-optimized` by default is aliased to a target that prints an errormessage when optimizations aren't enabled. If PGO or BOLT are enabled,it is aliased to their respective target.`build-optimized` is the logical successor to `profile-opt`.I felt it best to delete `profile-opt` completely, as the new `build-*`high-level targets feel more friendly to use. But if people lament itsloss, we can add a `profile-opt: build-optimized` to achieve almost thesame result.Profiled-Based Optimization Rework==================================Most of the make logic related to profile-based optimization (read: PGOand BOLT) has been touched in this change.A major issue with the old way of doing things was we used phony,always-executed make rules. This is a bad practice in make because itundermines no-op builds.Another issue is that the separation between the rules and what orderthey ran in wasn't always clear. Both PGO and BOLT consist of the same4 phase solution: instrument, run, analyze, and apply. However, thesesteps weren't clearly expressed in the make logic. This is especiallytrue for BOLT, which only had 1 make rule.Another issue with BOLT is that it was really easy to get things intoa bad state. e.g. if you applied BOLT to `pythonX.Y` you could notrun BOLT again unless you rebuilt `pythonX.Y` from source.In the new world, we have separate `profile-<tool>-<stage>-stamp`rules defining the 4 distinct `instrument`, `run`, `analyze`, and`apply` stages for both PGO and BOLT. Each of these stages is trackedby a _stamp_ semaphore file so progress can be captured. This shouldall be pretty straightforward.There is some minimal complexity here to handle BOLT's optionaldependency on PGO, as BOLT either depends on `build_all` or`profile-pgo-apply-stamp`.As part of the refactor to BOLT we also preserve the original inputbinary before BOLT is applied. This original file is restored ifBOLT runs again. This greatly simplifies repeated BOLT invocations,as make doesn't perform needless work. However, this is all besteffort, as it is possible for some make target evaluations to stillget things in a bad state.Other Remarks=============If this change perturbs any bugs, they are likely around cleaningbehavior. The cleaning rules are a bit complicated and not clearlydocumented. And I'm unsure which targets CPython developers ofteniterate on. It is highly possible that state cleanup of PGO and/orBOLT files isn't as robust as it needs to be.I explicitly deleted some calls to PGO cleanup because those callsprevented no-op `make [all]` from working. It is certainly possiblesomething somewhere (release automation?) relied on these files beingdeleted when they no longer are. We still have targets to purge profilefiles and it should be trivial to add these to appropriate make rules.
This allows easily customizing the arguments to `llvm-bolt` withouthaving to edit the Makefile.Arguments can be passed to configure and are reflected in configureoutput, which can be useful for log analysis.When defined in the Makefile we use `?=` syntax so the flags can beoverridden via `make VAR=VALUE` syntax or via environment variables.Super useful for iterating on different BOLT flags.
This ensures `LD_LIBRARY_PATH` is set. Without this, I was able totickle a libpythonX.Y.so not found error.
Before, we only supported running BOLT on the main `python` binary.If a shared library was in play, it wouldn't be optimized. That wasleaving a ton of optimization opportunities on the floor.This commit adds support for running BOLT on libpython.Functionality is disabled by default because BOLT asserts on LLVM 15,which is the latest LLVM. I've built LLVM tip and it is able toprocess libpython just fine. So it is known to work.
The generated sysconfig data during builds encodes a PEP-425platform tag. During native (target=host) builds, the bootstrappedcompiler runs Python code in sysconfig to derive an appropriate value.For cross compiles, we fall back to logic in configure (that codelives around the changed lines) to derive an appropriate platformtag, which is exported as an environment variable during builds.And there is a "backdoor" in `sysconfig.py` that causes`sysconfig.get_platform()` to return that value.The logic in configure for deriving an appropriate platform tagis a far cry from what's in `sysconfig.py`. Ideally that logicwould be fully (re)implemented in configure. But that's anon-trivial amount of work.Recognizing that configure makes inadequate platform tag decisionsduring cross-compiles, this commit switches `_PYTHON_HOST_PLATFORM`from a regular output variable to a "precious variable" (in autoconfspeak). This has the side-effect of allowing invokers to define thevariable, effectively allowing them to explicitly set the platformtag during builds to a correct value when configure otherwise wouldn'tset one.
indygreg added a commit to indygreg/cpython that referenced this pull requestMay 20, 2023
(This change is a quick and dirty way to merge some of the build systemimprovements I'm proposing inpythongh-101093 before the 3.12 feature freeze.I wanted to scope bloat myself to fix some longstanding deficiencies inthe build system around profile-guided builds. But I'm getting softresistance to the reviews so close to the freeze deadline and it isobvious that we need a simpler solution to hit the 3.12 deadline. Whilethis change is quick and dirty, it attempts to not make things worse.)Before this change, we only applied bolt to the main python binary.After this change, we apply bolt to libpython if it is configured. Inshared library builds, most of the C code is in libpython so it iscritical to apply bolt to libpython to realize bolt benefits.This change also reworks how bolt instrumentation is applied. Iteffectively removes the readelf based logic added inpythongh-101525 andreplaces it with a mechanism that saves a copy of the pre-bolt binaryand restores that copy when necessary. This allows us to performbolt optimizations without having to manually delete the output binaryto force a new bolt run.We also add a new make target for purging bolt files and hook it upto `clean` so bolt state is purged when appropriate.`.gitignore` rules have been added to ignore files related to bolt.Before and after this refactor, `make` will no-op after a previous run.Both versions should also share common make DAG deficiencies wheretargets fail to trigger as often as they need to or can triggerprematurely in certain scenarios. e.g. after this change you may needto `rm profile-bolt-stamp` to force a bolt run because there aren'tappropriate non-phony targets for bolt's make target to depend on.Fixing this is a non-trivial amount of work that will likely have towait until the 3.13 window.To make it easier to iterate on custom BOLT settings, the flags topass to instrumentation and application are now defined in configureand can be overridden by passing `BOLT_INSTRUMENT_FLAGS` and`BOLT_APPLY_FLAGS`.
indygreg added a commit to indygreg/cpython that referenced this pull requestMay 21, 2023
(This change is a quick and dirty way to merge some of the build systemimprovements I'm proposing inpythongh-101093 before the 3.12 feature freeze.I wanted to scope bloat myself to fix some longstanding deficiencies inthe build system around profile-guided builds. But I'm getting softresistance to the reviews so close to the freeze deadline and it isobvious that we need a simpler solution to hit the 3.12 deadline. Whilethis change is quick and dirty, it attempts to not make things worse.)Before this change, we only applied bolt to the main python binary.After this change, we apply bolt to libpython if it is configured. Inshared library builds, most of the C code is in libpython so it iscritical to apply bolt to libpython to realize bolt benefits.This change also reworks how bolt instrumentation is applied. Iteffectively removes the readelf based logic added inpythongh-101525 andreplaces it with a mechanism that saves a copy of the pre-bolt binaryand restores that copy when necessary. This allows us to performbolt optimizations without having to manually delete the output binaryto force a new bolt run.We also add a new make target for purging bolt files and hook it upto `clean` so bolt state is purged when appropriate.`.gitignore` rules have been added to ignore files related to bolt.Before and after this refactor, `make` will no-op after a previous run.Both versions should also share common make DAG deficiencies wheretargets fail to trigger as often as they need to or can triggerprematurely in certain scenarios. e.g. after this change you may needto `rm profile-bolt-stamp` to force a bolt run because there aren'tappropriate non-phony targets for bolt's make target to depend on.Fixing this is a non-trivial amount of work that will likely have towait until the 3.13 window.To make it easier to iterate on custom BOLT settings, the flags topass to instrumentation and application are now defined in configureand can be overridden by passing `BOLT_INSTRUMENT_FLAGS` and`BOLT_APPLY_FLAGS`.
@erlend-aaslanderlend-aasland removed the pendingThe issue will be closed if no feedback is provided labelJun 22, 2023
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@corona10corona10Awaiting requested review from corona10corona10 is a code owner

@erlend-aaslanderlend-aaslandAwaiting requested review from erlend-aaslanderlend-aasland is a code owner

Assignees

@corona10corona10

Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

5 participants
@indygreg@corona10@aaupov@erlend-aasland@bedevere-bot

[8]ページ先頭

©2009-2025 Movatter.jp