
Python Enhancement Proposals

PEP 774 – Removing the LLVM requirement for JIT builds


Author:
Savannah Ostrowski <savannah at python.org>
Discussions-To:
Discourse thread
Status:
Deferred
Type:
Standards Track
Created:
27-Jan-2025
Python-Version:
3.14
Post-History:
27-Jan-2025
Resolution:
14-Mar-2025


Abstract

Since Python 3.13, CPython can be configured and built with an experimental just-in-time (JIT) compiler via the --enable-experimental-jit flag on Linux and Mac and --experimental-jit on Windows. To build CPython with the JIT enabled, users are required to have LLVM installed on their machine (initially LLVM 16, more recently LLVM 19). LLVM is responsible for generating stencils that are essential to our copy-and-patch JIT (see PEP 744). These stencils are predefined, architecture-specific templates that are used to generate machine code at runtime.

This PEP proposes removing the LLVM build-time dependency for JIT-enabled builds by hosting the generated stencils in the CPython repository. This approach allows us to leverage the checked-in stencils for supported platforms at build time, simplifying the contributor experience and addressing concerns raised at the Python Core Developer Sprint in September 2024. That said, there is a clear tradeoff to consider: the improved developer experience comes at the cost of increased repository size.

It is important to note that this PEP is not a proposal to accept or reject the JIT itself but rather to determine whether the build-time dependency on LLVM is acceptable for JIT builds moving forward. If this PEP is rejected, we will proceed with the status quo, retaining the LLVM build-time requirement. While this dependency has served the JIT development process effectively thus far, it introduces setup complexity and additional challenges that this PEP seeks to alleviate.

Motivation

At the Python Core Developer Sprint that took place in September 2024, there was discussion about the next steps for the JIT; a related discussion also took place on GitHub. As part of that discussion, there was a clear appetite for removing the LLVM requirement for JIT builds in preparation for shipping the JIT off by default in 3.14. The consensus at the sprint was that it would be sufficient to provide pre-generated stencils for non-debug builds for Tier 1 platforms and that checking these files into the CPython repo would be adequate for the limited number of platforms (though more options have been explored; see Rejected Ideas).

Currently, building CPython with the JIT requires LLVM as a build-time dependency. Despite not being exposed to end users, this dependency is suboptimal. Requiring LLVM adds a setup burden for developers and those who wish to build CPython with the JIT enabled. Depending on the operating system, the version of LLVM shipped with the OS may differ from that required by our JIT builds, which introduces additional complexity to troubleshoot and resolve. With few core developers currently contributing to and maintaining the JIT, we also want to make sure that the friction to work on JIT-related code is minimized as much as possible.

With the proposed approach, pre-compiled stencils for supported architectures can be generated in advance, stored in a central location, and automatically used during builds. This approach ensures reproducible builds, making the JIT a more stable and sustainable part of CPython’s future.

Rationale

This PEP proposes checking JIT stencils directly into the CPython repo as the best path forward for eliminating our build-time dependency on LLVM.

This approach:

  • Provides the best end-to-end experience for those looking to build CPython with the JIT
  • Lowers the barrier to entry for those looking to contribute to the JIT
  • Ensures builds remain reproducible and consistent across platforms without relying on external infrastructure or download mechanisms
  • Eliminates variability introduced by network conditions or potential discrepancies between hosted files and the CPython repository state, and
  • Subjects stencils to the same review processes we have for all other JIT-related code

However, this approach does result in a slight increase in overall repository size. Replaying the commits from the past 90 days with stencils added grows the repository by 0.03 MB per stencil file compared to the actual commits. This is a small increase in the context of the overall repository size, which has grown by 2.55 MB in the same time period. For six stencil files, this amounts to an upper bound of 0.18 MB. The current total size of the stencil files for all six platforms is 7.2 MB. [1]

These stencils could become larger in the future with changes to register allocation, which would introduce 5-6 variants per instruction in each stencil file (5-6x larger). However, if we ended up going this route, there are additional modifications we could make to stencil files that could help counteract this size increase (e.g., stripping comments, minimizing the stencils).
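The size figures above can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in this section; the function names are illustrative, and the projection assumes stencil size scales linearly with the number of variants per instruction, which is a simplification of the "5-6x larger" estimate.

```python
# Figures quoted in the text above (all in MB).
PER_STENCIL_GROWTH_MB = 0.03    # repo growth per stencil file over ~90 days
REPO_GROWTH_MB = 2.55           # overall repo growth over the same period
STENCIL_FILES = 6               # one stencil file per supported platform
CURRENT_STENCIL_TOTAL_MB = 7.2  # combined size of the six stencil files


def stencil_growth_upper_bound(files: int = STENCIL_FILES) -> float:
    """Upper bound on the 90-day repo growth attributable to stencils."""
    return round(PER_STENCIL_GROWTH_MB * files, 2)


def growth_share() -> float:
    """Stencil growth as a fraction of the overall repo growth."""
    return stencil_growth_upper_bound() / REPO_GROWTH_MB


def projected_total(variants: int) -> float:
    """Assumption: total stencil size scales linearly with variants."""
    return round(CURRENT_STENCIL_TOTAL_MB * variants, 1)


if __name__ == "__main__":
    print(f"upper bound: {stencil_growth_upper_bound()} MB")
    print(f"share of repo growth: {growth_share():.1%}")
    print(f"with 5-6 variants: {projected_total(5)}-{projected_total(6)} MB")
```

Under these assumptions, stencils would account for roughly 7% of the repository growth observed over the period, and register-allocation variants would bring the checked-in stencils to somewhere between 36 MB and 43.2 MB before any size mitigations.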

Specification

This specification outlines the proposed changes to remove the build-time dependency on LLVM and the contributor experience if this PEP is accepted.

Repository changes

The CPython repository would now host the pre-compiled JIT stencils in a new subdirectory in Tools/jit called stencils/. At present, the JIT is tested and built for six platforms, so to start, we’d check in six stencil files. In the future, we may check in additional stencil files if support for additional platforms is desired or relevant.

cpython/
    Tools/
        jit/
            stencils/
                aarch64-apple-darwin.h
                aarch64-unknown-linux-gnu.h
                i686-pc-windows-msvc.h
                x86_64-apple-darwin.h
                x86_64-pc-windows-msvc.h
                x86_64-pc-linux-gnu.h

Workflow

The workflow changes can be split into two parts, namely building CPython with the JIT enabled and working on the JIT’s implementation.

Building CPython with the JIT

Precompiled JIT stencil files will be stored in the Tools/jit/stencils directory, with each file name corresponding to its target triple as outlined above. At build time, we determine whether to use the checked-in stencils or to generate a new stencil for the user’s platform. Specifically, for contributors with LLVM installed, the build.py script in Tools/jit/stencils will allow them to regenerate the stencil for their platform. Those without LLVM can rely on the precompiled stencil files directly from the repository.
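The build-time decision described above can be sketched as follows. This is an illustration rather than the reference implementation: the function names are hypothetical, probing for a clang executable on PATH is a simplified stand-in for real LLVM detection (which also has to check the version), and the policy of regenerating whenever LLVM is present is an assumption.

```python
import shutil
from pathlib import Path

# Target triples taken from the repository layout listed earlier.
SUPPORTED_TRIPLES = {
    "aarch64-apple-darwin",
    "aarch64-unknown-linux-gnu",
    "i686-pc-windows-msvc",
    "x86_64-apple-darwin",
    "x86_64-pc-windows-msvc",
    "x86_64-pc-linux-gnu",
}


def checked_in_stencil(triple: str, root: Path) -> Path:
    """Path of the checked-in stencil file for a target triple."""
    return root / "Tools" / "jit" / "stencils" / f"{triple}.h"


def llvm_available(tool: str = "clang") -> bool:
    """Hypothetical probe: treat LLVM as installed if `tool` is on PATH."""
    return shutil.which(tool) is not None


def stencil_action(triple: str, root: Path, have_llvm: bool) -> str:
    """Return "regenerate" or "use-checked-in"; raise if neither is possible."""
    if have_llvm:
        # Assumed policy: contributors with LLVM regenerate their stencil.
        return "regenerate"
    if triple in SUPPORTED_TRIPLES and checked_in_stencil(triple, root).is_file():
        return "use-checked-in"
    raise RuntimeError(f"no LLVM and no checked-in stencil for {triple}")
```

A caller would pass `llvm_available()` as `have_llvm`; keeping the probe separate from the decision makes the decision easy to exercise without LLVM installed.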

Working on the JIT’s implementation (or touching JIT files)

In continuous integration (CI), stencil files will be automatically validated and updated when changes are made to JIT-related files. When a pull request is opened that touches these files, the jit.yml workflow, which builds and tests our builds, will run as usual.

However, as part of this, we will introduce a new step that diffs the current stencils in the repo against those generated in CI. If there is a diff for a platform’s stencil file, a patch file for the updated stencil is generated and the step will fail. Each patch is uploaded to GitHub Actions. After CI is finished running across all platforms, the patches are aggregated into a single patch file for convenience. You can download this aggregated patch, apply it locally, and commit the updated stencils back to your branch. Then, the subsequent CI run will pass.
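The diff-and-aggregate step can be illustrated with Python's standard difflib module. This is a minimal sketch, not the workflow's actual code: the function names are hypothetical, and it handles every platform in one call, whereas the CI described above produces one patch per platform job and aggregates them afterwards.

```python
import difflib


def stencil_patch(checked_in: str, generated: str, name: str) -> str:
    """Unified diff turning the checked-in stencil into the generated one.

    Returns an empty string when the two are identical.
    """
    diff = difflib.unified_diff(
        checked_in.splitlines(keepends=True),
        generated.splitlines(keepends=True),
        fromfile=f"a/{name}",
        tofile=f"b/{name}",
    )
    return "".join(diff)


def check_stencils(pairs: dict[str, tuple[str, str]]) -> tuple[bool, str]:
    """Diff each stencil file; return (ok, aggregated_patch).

    `pairs` maps a stencil file name to (checked_in_text, generated_text).
    """
    patches = [
        stencil_patch(old, new, name)
        for name, (old, new) in sorted(pairs.items())
    ]
    aggregated = "".join(p for p in patches if p)
    return aggregated == "", aggregated
```

When `ok` is false, a CI step built on this sketch would save `aggregated` as an artifact and exit with a non-zero status; the contributor could then apply the patch locally (for example with `git apply`) and push the updated stencils.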

Reference Implementation

Key parts of the reference implementation include:

Ignoring the stencils themselves and any necessary JIT README changes, the changes to the source code to support reproducible stencil generation and hosting are minimal (around 150 lines of changes).

Rejected Ideas

Several alternative approaches were considered as part of the research and exploration for this PEP. However, each of the ideas below involves infrastructural cost, maintenance burden, or a worse overall developer experience.

Using Git submodules

Git submodules are a poor developer experience for hosting stencils because they create a different kind of undesirable friction. For instance, any updates to the JIT would necessitate regenerating the stencils and committing them to a separate repository. This introduces a convoluted process: you must update the stencils in the submodule repository, commit those changes, and then update the submodule reference in the main CPython repository. This disconnect adds unnecessary complexity and overhead, making the process brittle and error-prone for contributors and maintainers.

Using Git subtrees

When using subtrees, the embedded repository becomes part of the main repository, similar to what’s being proposed in this PEP. However, subtrees require additional tooling and steps for maintenance, which adds unnecessary complexity to workflows.

Hosting in a separate repository

While splitting JIT stencils into a separate repository avoids the storage overhead associated with hosting the stencils, it adds complexity to the build process. Additional tooling would be required to fetch the stencils, potentially creating additional and unnecessary failure points in the workflow. This separation also makes it harder to ensure consistency between the stencils and the CPython source tree, as updates must be coordinated across the repositories.

Hosting in cloud storage

Hosting stencils in cloud storage like S3 buckets or GitHub raw storage introduces external dependencies, complicating offline development workflows. Also, depending on the provider, this type of hosting comes with additional cost, which we’d like to avoid.

Using Git LFS

Git Large File Storage (LFS) adds a tool dependency for contributors, complicating the development workflow, especially for those who may not already use Git LFS. Git LFS does not work well with offline workflows since files managed by LFS require an internet connection to fetch when checking out specific commits, which is disruptive for even basic Git workflows. Git LFS has some free quota, but there are additional costs for exceeding that quota, which are also undesirable.

Maintain the status quo with LLVM as a build-time dependency

Retaining LLVM as a build-time dependency upholds the existing barriers to adoption and contribution. Ultimately, this option fails to address the core challenges of accessibility and simplicity, and fails to eliminate the dependency which was deemed undesirable at the Python Core Developer Sprint in the fall (the impetus for this PEP), making it a poor long-term solution.

Footnotes

[1]
Calculated using this Gist. This script replays commits for roughly the past 90 days, generates the stencil file for the platform for each commit, and then commits the stencil file into a copy of the repository if it changes. The calculation compares the repository size before and after running git gc --aggressive, which is used to pack the repo (similar to what GitHub does on repo clone).
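As an illustration of the measurement (not the script from the Gist), the packed size of a repository can be read from `git count-objects -v` after an aggressive repack. The helper names below are hypothetical; the `size-pack` field is reported by Git in KiB.

```python
import subprocess


def parse_size_pack_kib(count_objects_output: str) -> int:
    """Extract `size-pack` (in KiB) from `git count-objects -v` output."""
    for line in count_objects_output.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "size-pack":
            return int(value)
    raise ValueError("size-pack not found in output")


def packed_size_mib(repo: str) -> float:
    """Aggressively repack `repo`, then report its pack size in MiB."""
    subprocess.run(
        ["git", "-C", repo, "gc", "--aggressive", "--quiet"], check=True
    )
    result = subprocess.run(
        ["git", "-C", repo, "count-objects", "-v"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_size_pack_kib(result.stdout) / 1024
```

Calling `packed_size_mib` on the original copy and on the copy with stencils committed, then subtracting, gives the kind of before/after difference described in this footnote.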

Copyright

This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.


Source: https://github.com/python/peps/blob/main/peps/pep-0774.rst

Last modified: 2025-05-26 16:58:47 GMT

