Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork32.1k
gh-101525: Use only safe identical code folding with BOLT#134642
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
base:main
Are you sure you want to change the base?
Conversation
"Identical code folding" (ICF) is the feature of an optimizer to find that twofunctions have the same code and that they can therefore be deduplicatedin the binary. While this is usually safe, it can cause observablebehavior differences if the program relies on the fact that the twofunctions have different addresses.CPython relies on this in (at least) Objects/typeobject.c, which definestwo functions wrap_binaryfunc() and wrap_binaryfunc_l() with the sameimplementation, and stores their addresses in the slotdefs array. Ifthese two functions have the same address, update_one_slot() in thatfile will fill in slots it shouldn't, causing, for instances,classes defined in Python that inherit from some built-in types tomisbehave.As of LLVM 20 (llvm/llvm-project#116275), BOLT has a "safe ICF" mode,where it looks to see if there are any uses of a function symbol outsidefunction calls (e.g., relocations in data sections) and skips ICF onsuch functions. The intent is that this avoids observable behaviordifferences but still saves storage as much as possible.This version is about two months old at the time of writing. To supportolder LLVM versions, we have to turn off ICF entirely.This problem was previously noticed for Windows/MSVC inpython#53093 (andagain inpython#24098), where the default behavior of PGO is to enable ICF(which they expand to "identical COMDAT folding") and we had to turn itoff.
[py_cv_bolt_icf_safe=no | ||
${LLVM_BOLT} -icf=safe -o conftest.bolt conftest$EXEEXT >&AS_MESSAGE_LOG_FD 2>&1 dnl | ||
&& py_cv_bolt_icf_safe=yes], | ||
[AC_MSG_FAILURE([could not compile empty test program])]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
[AC_MSG_FAILURE([could notcompile empty test program])]) | |
[AC_MSG_FAILURE([could notlink empty test program with -icf=safe])]) |
It'd be good to be a bit more descriptive than "compilation failed".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
The autoconf syntax here is a little confusing. This message isn't from the attempt to optimize with-icf=safe
, it's from actually compiling the test program. BOLT is a post-link optimizer, so we give it a compiled and linked executable as an argument. (There isalso ICF support in many linkers, to make things more confusing.) I'm using autoconf's function for running test-compiles for the side effect of creating a binary to pass to BOLT, but we need to handle the case where the compile, for whatever reason, fails. I don't expect anyone to hit this error because if you couldn't compile programs you probably couldn't get this far in ./configure anyway, but it's good to have something for the "can't happen" case.
The pseudo-Python equivalent of this check is
py_bolt_icf_flag="-icf=safe"@autoconf.cache_check("py_cv_bolt_icf_safe",f"whether{LLVM_BOLT} supports safe identical code folding",)defcheck_bolt_icf_safe():saved_cflags,saved_ldflags=CFLAGS,LDFLAGSCFLAGS,LDFLAGS=CFLAGS_NODIST,LDFLAGS_NODISTifautoconf.test_link(autoconf.make_program("","")):py_cv_bolt_icf_safe=Falsep=subprocess.run( [LLVM_BOLT,"-icf=safe""-o","conftest.bolt",f"conftest{EXEEXT}"],stdout=autoconf.message_log,stderr=autoconf.message_log, )ifp.returncode==0:py_cv_bolt_icf_safe=Trueelse:raiseRuntimeError("could not compile empty test program")CFLAGS,LDFLAGS=saved_cflags,saved_ldflagsreturnpy_cv_bolt_icf_safeifnotcheck_bolt_icf_safe():py_bolt_icf_flag=""
It does occur to me, writing this out, that youcould theoretically hit this failure case if your compiler and linker don't support the flags needed to produce a BOLT-able binary (Wl,--emit-relocs -fno-pie -no-pie
), because this is the first time we're asking autoconf to use those flags in a test-compile. I have no idea why you would ask for BOLT if you don't have a vaguely compatible compiler, though, and you'd previously get a random compile error at some later point in the process, so I don't think this is making anything worse... but we could change the message to "could not compile and link test program with flags needed for BOLT" or something.
I would also accept the argument that I need wider indentation or some comments or something to make the autoconf readable.
astral-sh/uv#13610 reported a misbehavior that is the result of asubclass of str incorrectly having its ->tp_as_number->nb_add slotfilled in with the value of PyUnicode_Type->tp_as_sequence->sq_concat.There are some times when this is an appropriate thing to do iwhensubclassing, but this is not one of them. The logic to prevent it inthis case relies on two helper functions in the file, wrap_binaryfuncand wrap_binaryfunc_l, having different addresses, even though theycontain identical code.For some reason BOLT does not do this optimization in the shared library(even though those are static functions and not exported), so we onlystarted seeing this in the static build.BOLT in LLVM 20+ supports "safe" code folding, which uses heuristics aboutrelocations to determine whether a function's address is used in any way otherthan a call. This seems to be enough to fix the issue. Add a patch to switchto -icf=safe, submitted upstream aspython/cpython#134642
corona10 left a comment• edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Thanks for digging this issue, Now BOLT build be more stable :)
LGTM.
I like this change and it's real issue:astral-sh/python-build-standalone#622
But I am not sure this is best autoconf that we add, I will wait@erlend-aasland 's review also.
Erlend, the change looks good. Can you review, focusing on the autoconf code itself?
saved_cflags="$CFLAGS" | ||
saved_ldflags="$LDFLAGS" | ||
CFLAGS="$CFLAGS_NODIST" | ||
LDFLAGS="$LDFLAGS_NODIST" | ||
AC_LINK_IFELSE( | ||
[AC_LANG_PROGRAM([[]], [[]])], | ||
[py_cv_bolt_icf_safe=no | ||
${LLVM_BOLT} -icf=safe -o conftest.bolt conftest$EXEEXT >&AS_MESSAGE_LOG_FD 2>&1 dnl | ||
&& py_cv_bolt_icf_safe=yes], | ||
[AC_MSG_FAILURE([could not compile empty test program])]) | ||
CFLAGS="$saved_cflags" | ||
LDFLAGS="$saved_ldflags" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Consider using ourWITH_SAVE_ENV
macro; it automatically saves and restoresCFLAGS
,CPPFLAGS
,LDFLAGS
, andLIBS
:
saved_cflags="$CFLAGS" | |
saved_ldflags="$LDFLAGS" | |
CFLAGS="$CFLAGS_NODIST" | |
LDFLAGS="$LDFLAGS_NODIST" | |
AC_LINK_IFELSE( | |
[AC_LANG_PROGRAM([[]], [[]])], | |
[py_cv_bolt_icf_safe=no | |
${LLVM_BOLT} -icf=safe -o conftest.bolt conftest$EXEEXT >&AS_MESSAGE_LOG_FD 2>&1 dnl | |
&& py_cv_bolt_icf_safe=yes], | |
[AC_MSG_FAILURE([could not compile empty test program])]) | |
CFLAGS="$saved_cflags" | |
LDFLAGS="$saved_ldflags" | |
WITH_SAVE_ENV([ | |
CFLAGS="$CFLAGS_NODIST" | |
LDFLAGS="$LDFLAGS_NODIST" | |
AC_LINK_IFELSE( | |
[AC_LANG_PROGRAM([], [])], | |
[py_cv_bolt_icf_safe=no | |
${LLVM_BOLT} -icf=safe -o conftest.bolt conftest$EXEEXT >&AS_MESSAGE_LOG_FD 2>&1 dnl | |
&& py_cv_bolt_icf_safe=yes], | |
[AC_MSG_FAILURE([could not compile empty test program])]) | |
]) |
Moreover, the double brackets ([[]]
) are not needed; a single pair ([]
) is sufficient.
Uh oh!
There was an error while loading.Please reload this page.
"Identical code folding" (ICF) is the feature of an optimizer to find that two functions have the same code and that they can therefore be deduplicated in the binary. While this is usually safe, it can cause observable behavior differences if the program relies on the fact that the two functions have different addresses.
CPython relies on this in (at least) Objects/typeobject.c, which defines two functions wrap_binaryfunc() and wrap_binaryfunc_l() with the same implementation, and stores their addresses in the slotdefs array. If these two functions have the same address, update_one_slot() in that file will fill in slots it shouldn't, causing, for instances, classes defined in Python that inherit from some built-in types to misbehave.
As of LLVM 20 (llvm/llvm-project#116275), BOLT has a "safe ICF" mode, where it looks to see if there are any uses of a function symbol outside function calls (e.g., relocations in data sections) and skips ICF on such functions. The intent is that this avoids observable behavior differences but still saves storage as much as possible.
This version is about two months old at the time of writing. To support older LLVM versions, we have to turn off ICF entirely.
This problem was previously noticed for Windows/MSVC in#53093 (and again in#24098), where the default behavior of PGO is to enable ICF (which they expand to "identical COMDAT folding") and we had to turn it off.