Precompiling
Precompiling is compiling Python source files (.py files) into bytecode (.pyc files) at build time instead of runtime. Doing it at build time can improve performance by skipping that work at runtime.
Precompiling is disabled by default, so you must enable it using flags or attributes to use it.
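As a rough illustration of the compile step itself, here is a minimal sketch using the standard library's py_compile module. It is not how rules_python invokes its precompiler; the helper name precompile and the file hello.py are made up for this example.

```python
import pathlib
import py_compile
import tempfile


def precompile(source_path: pathlib.Path) -> pathlib.Path:
    """Compile one .py file to bytecode and return the path of the .pyc file."""
    # doraise=True raises py_compile.PyCompileError on a syntax error
    # instead of only printing the error to stderr.
    pyc_path = py_compile.compile(str(source_path), doraise=True)
    return pathlib.Path(pyc_path)


with tempfile.TemporaryDirectory() as tmp:
    src = pathlib.Path(tmp) / "hello.py"
    src.write_text("def greet():\n    return 'hello'\n")

    pyc = precompile(src)
    pyc_exists = pyc.exists()
    pyc_suffix = pyc.suffix
    # The name embeds the interpreter's cache tag, e.g. hello.cpython-XY.pyc.
    print(pyc.name)
```

Without precompiling, the interpreter does the equivalent work the first time each module is imported.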
Overhead of precompiling
While precompiling helps runtime performance, it has two main costs:
Larger runfiles (both file count and disk usage). Precompiling approximately doubles the number of runfiles because for every .py file there is also a .pyc file. Compiled files are generally around the same size as the source files, so disk usage approximately doubles as well.
An extra action at build time. While compiling itself isn’t that expensive, the overhead can become noticeable as more files need to be compiled.
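The file-count cost can be seen outside of Bazel with a small sketch. It uses the standard library's compileall module on a throwaway directory; the module names and the count_files helper are hypothetical, and the __pycache__ layout shown is the interpreter's default, not the runfiles layout.

```python
import compileall
import pathlib
import tempfile


def count_files(root: pathlib.Path) -> dict:
    """Count .py and .pyc files under root and sum their sizes in bytes."""
    py_files = list(root.rglob("*.py"))
    pyc_files = list(root.rglob("*.pyc"))
    return {
        "py_count": len(py_files),
        "pyc_count": len(pyc_files),
        "py_bytes": sum(p.stat().st_size for p in py_files),
        "pyc_bytes": sum(p.stat().st_size for p in pyc_files),
    }


with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    for i in range(3):
        (root / f"mod{i}.py").write_text(f"VALUE = {i}\n")

    before = count_files(root)
    # Compile every .py file under root; quiet=1 suppresses the file listing.
    compileall.compile_dir(str(root), quiet=1)
    after = count_files(root)

print(before)
print(after)
```

The byte totals are printed rather than asserted because the size ratio depends on the source; very small files in particular carry a fixed header overhead in their .pyc.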
Binary-level opt-in
Binary-level opt-in allows enabling precompiling on a per-target basis. This is useful for situations such as:
Globally enabling precompiling in your .bazelrc isn’t feasible. This may be because some targets don’t work with precompiling, e.g. because they’re too big.
Enabling precompiling for build tools (exec config targets) separately from target-config programs.
To use this approach, set the pyc_collection attribute on the binaries/tests that should or should not use precompiling. Then change the --precompile default.
The default for the pyc_collection attribute is controlled by the flag --@rules_python//python/config_settings:precompile, so you can use an opt-in or opt-out approach by setting its value:
targets must opt-out: --@rules_python//python/config_settings:precompile=enabled
targets must opt-in: --@rules_python//python/config_settings:precompile=disabled
Pyc-only builds
A pyc-only build (aka a “sourceless” build) is one where only .pyc files are included; the source .py files are not included.
To enable this, set the --@rules_python//python/config_settings:precompile_source_retention=omit_source flag on the command line or the precompile_source_retention=omit_source attribute on specific targets.
The advantages of pyc-only builds are:
Fewer total files in a binary.
Imports may be slightly faster.
The disadvantages are:
Error messages will be less precise because the precise line and offset information isn’t in a pyc file.
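pyc files are specific to the Python version that produced them; the bytecode format can change between feature releases (for example, 3.10 to 3.11).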
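To make the sourceless idea concrete, the following sketch compiles a module, deletes its source, and imports it from the .pyc alone. It uses only the standard library and is independent of Bazel; the module name sourceless_demo is hypothetical. It also compares the file's first four bytes with the running interpreter's magic number, which is the mechanism that ties a pyc file to a Python version.

```python
import importlib
import importlib.util
import pathlib
import py_compile
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    src = root / "sourceless_demo.py"
    pyc = root / "sourceless_demo.pyc"
    src.write_text("ANSWER = 42\n")

    # Write the bytecode next to the source rather than into __pycache__;
    # that is where the import system looks when no .py file is present.
    py_compile.compile(str(src), cfile=str(pyc), doraise=True)
    src.unlink()

    # The first four bytes of a pyc identify the bytecode format version.
    header_matches = pyc.read_bytes()[:4] == importlib.util.MAGIC_NUMBER

    sys.path.insert(0, str(root))
    try:
        importlib.invalidate_caches()
        module = importlib.import_module("sourceless_demo")
        answer = module.ANSWER
        loaded_from = pathlib.Path(module.__file__).name
    finally:
        sys.path.remove(str(root))
        sys.modules.pop("sourceless_demo", None)

print(answer, loaded_from, header_matches)
```

An interpreter whose magic number differs from the one in the file refuses to load it, so a pyc-only binary has to run on the Python version it was compiled for.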
Note
pyc files are not a form of hiding source code. They are trivial to decompile, and decompiling them can recover something close to the original source.
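A small sketch shows how much survives in a pyc file even without a decompiler. It assumes the 16-byte pyc header used by Python 3.7 and later; the module demo_module.py and its contents are made up for this example.

```python
import marshal
import pathlib
import py_compile
import tempfile
import types

with tempfile.TemporaryDirectory() as tmp:
    src = pathlib.Path(tmp) / "demo_module.py"
    src.write_text(
        "def check(password):\n"
        "    return password == 'hunter2'\n"
    )
    pyc_path = py_compile.compile(str(src), doraise=True)
    data = pathlib.Path(pyc_path).read_bytes()

# Skip the header (magic number, flags, and timestamp/size or hash);
# the remainder is a marshalled code object for the module.
module_code = marshal.loads(data[16:])

# The function body is stored as a nested code object in the constants.
check_code = next(
    const for const in module_code.co_consts
    if isinstance(const, types.CodeType)
)

print(check_code.co_name)      # function name
print(check_code.co_varnames)  # argument and local variable names
print(check_code.co_consts)    # literal values used in the body
```

Function names, argument names, and string literals are all stored in plain form, which is why omitting sources should not be treated as obfuscation.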
Advanced precompiler customization
The default implementation of the precompiler is a persistent, multiplexed, sandbox-aware, cancellation-enabled, json-protocol worker that uses the same interpreter as the target toolchain. This works well for local builds, but may not work as well for remote execution builds. To customize the precompiler, two mechanisms are available:
The exec tools toolchain allows customizing the precompiler binary used with the precompiler attribute. Arbitrary binaries are supported.
The execution requirements can be customized using --@rules_python//tools/precompiler:execution_requirements. This is a list flag that can be repeated. Each entry is a key=value pair that is added to the execution requirements of the PyCompile action. Note that this flag is specific to the rules_python precompiler. If a custom binary is used, this flag will have to be propagated from the custom binary using the testing.ExecutionInfo provider; refer to the py_interpreter_program example.
The default precompiler implementation is an asynchronous/concurrent implementation. If you find it has bugs or hangs, please report them. In the meantime, the flag --worker_extra_flag=PyCompile=--worker_impl=serial can be used to switch to a synchronous/serial implementation that may not perform as well, but is less likely to have issues.
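For readers unfamiliar with persistent workers, here is a minimal sketch of what a serial, json-protocol worker loop looks like. It is not the rules_python precompiler. It assumes one JSON request per line with the arguments and requestId fields of Bazel's worker protocol, and the two-element argument layout (source path, output path) is purely hypothetical.

```python
import io
import json
import pathlib
import py_compile
import tempfile


def handle_request(request: dict) -> dict:
    """Compile one file and build the response for a single work request."""
    request_id = request.get("requestId", 0)
    try:
        # Hypothetical argument layout for this sketch: [source, output].
        src, dest = request.get("arguments", [])
        py_compile.compile(src, cfile=dest, doraise=True)
        return {"exitCode": 0, "output": "", "requestId": request_id}
    except (ValueError, OSError, py_compile.PyCompileError) as exc:
        return {"exitCode": 1, "output": str(exc), "requestId": request_id}


def serve(requests, responses) -> None:
    """Serial loop: read a request, finish it, reply, then read the next."""
    for line in requests:
        line = line.strip()
        if not line:
            continue
        response = handle_request(json.loads(line))
        responses.write(json.dumps(response) + "\n")
        responses.flush()


# Drive the loop with in-memory streams; a real worker would be handed
# sys.stdin and sys.stdout by the build tool.
with tempfile.TemporaryDirectory() as tmp:
    src = pathlib.Path(tmp) / "a.py"
    dest = pathlib.Path(tmp) / "a.pyc"
    src.write_text("x = 1\n")

    request = {"arguments": [str(src), str(dest)], "requestId": 0}
    stdin = io.StringIO(json.dumps(request) + "\n")
    stdout = io.StringIO()
    serve(stdin, stdout)

    reply = json.loads(stdout.getvalue())
    compiled = dest.exists()

print(reply)
```

A multiplexed worker differs in that it accepts several requests with distinct requestId values at once and may answer them out of order, which is where the extra concurrency risk comes from.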
The execution_requirements keys of most relevance are:
supports-workers: 1 or 0, to indicate if a regular persistent worker is desired.
supports-multiplex-workers: 1 or 0, to indicate if a multiplexed persistent worker is desired.
requires-worker-protocol: json or proto; the rules_python precompiler currently only supports json.
supports-multiplex-sandboxing: 1 or 0, to indicate if sandboxing of the worker is supported.
supports-worker-cancellation: 1 or 0, to indicate if requests to the worker can be cancelled.
Note that any execution requirement can be specified in the flag, not only the keys listed above.
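The key=value shape of the flag entries can be sketched as a small parser. This only illustrates how repeated entries map onto a dictionary of execution requirements; the function name is hypothetical and this is not the parsing code used by rules_python.

```python
def parse_execution_requirements(entries):
    """Turn a list of 'key=value' strings into a dict of requirements."""
    requirements = {}
    for entry in entries:
        key, sep, value = entry.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {entry!r}")
        requirements[key] = value
    return requirements


# One entry per repetition of the flag on the command line.
entries = [
    "supports-workers=1",
    "supports-multiplex-workers=0",
    "requires-worker-protocol=json",
]
parsed = parse_execution_requirements(entries)
print(parsed)
```

Values stay strings here, matching how execution requirements are expressed as string pairs on an action.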
Known issues, caveats, and idiosyncrasies
Precompiling requires Bazel 7+ with the Pystar rule implementation enabled.
Mixing rules_python PyInfo with Bazel builtin PyInfo will result in pyc files being dropped.
Precompiled files may not be used in certain cases prior to Python 3.11. This occurs because Python adds the directory of the binary’s main .py file to sys.path, which causes the module to be found in the workspace source directory instead of within the binary’s runfiles directory (where the pyc files are). This can usually be worked around by removing sys.path[0] (or otherwise ensuring the runfiles directory comes before the repo’s source directory in sys.path).
The pyc filename does not include the optimization level (e.g., foo.cpython-39.opt-2.pyc). This works fine (it’s all bytecode), but also means the interpreter -O argument can’t be used; doing so will cause the interpreter to look for the non-existent opt-N named files.
Targets with the same source files and different exec properties will result in action conflicts. This most commonly occurs when a py_binary and a py_library have the same source files. To fix this, modify both targets so they have the same exec properties. If this is difficult because unsupported exec groups end up being passed to the Python rules, please file an issue to have those exec groups added to the Python rules.
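Two of the items above can be illustrated with standard-library calls. The first part is a hedged sketch of the sys.path[0] workaround; the helper without_script_dir and the example paths are hypothetical, and a real program would apply it to sys.path near the top of its main file. The second part uses importlib.util.cache_from_source to show the opt-N filename the interpreter expects when an optimization level is in effect.

```python
import importlib.util
import os


def without_script_dir(path_entries, main_file):
    """Return path_entries minus a leading entry that is main_file's directory.

    Symlinks are resolved first, mirroring how the main file's real
    location can end up at the front of sys.path.
    """
    script_dir = os.path.dirname(os.path.realpath(main_file))
    if path_entries and os.path.realpath(path_entries[0]) == script_dir:
        return list(path_entries[1:])
    return list(path_entries)


# Hypothetical layout: the source directory precedes the runfiles directory.
entries = ["/work/src/app", "/runfiles/root"]
cleaned = without_script_dir(entries, "/work/src/app/main.py")
unchanged = without_script_dir(entries, "/elsewhere/main.py")
# In a real main file: sys.path[:] = without_script_dir(sys.path, __file__)

# Filenames looked up without and with an optimization level.
plain_name = importlib.util.cache_from_source("foo.py", optimization="")
opt2_name = importlib.util.cache_from_source("foo.py", optimization=2)

print(cleaned)
print(plain_name)
print(opt2_name)
```

Because the precompiled files use the plain name, a lookup for the opt-2 name finds nothing, which is the mismatch described in the -O item.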