Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork32k
gh-128972: Add_Py_ALIGN_AS
and revertPyASCIIObject
memory layout.#133085
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
… layout.Add `_Py_ALIGN_AS` as per C API WG vote:capi-workgroup/decisions#61This patch only adds it to free-threaded builds; the `#ifdef Py_GIL_DISABLED`can be removed in the future.Use this to revert `PyASCIIObject` memory layout for non-free-threaded builds.The long-term plan is to deprecate the entire struct; until that happensit's better to keep it unchanged, as courtesy to people that rely on it despiteit not being stable ABI.
/* Ensure 4 byte alignment for PyUnicode_DATA(), see gh-63736 on m68k. | ||
In the non-free-threaded build, we'll use explicit padding instead */ | ||
_Py_ALIGN_AS(4) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
It does not ensure alignment forPyUnicode_DATA()
, it ensures alignment forstate
.
To ensure alignment forPyUnicode_DATA()
, you need to add_Py_ALIGN_AS
before thedata
field inPyUnicodeObject
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Am I missing something? This sounds like a distinction without a difference.
- The
PyUnicode_DATA()
macro depends on the size ofPyASCIIObject
. - The size is a multiple of the alignment.
- The alignment of
PyASCIIObject
must be at least the alignment of every member, so it must be at least 4 bytes with the_Py_ALIGN_AS(4)
macro.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
The alignment ofPyASCIIObject
should be the same as the alignment of PyObject.
The definition of the_PyUnicode_COMPACT_DATA()
macro and the code that calculates the size of thePyASCIIObject
objects should be changed to guarantee the alignment of data.
Perhaps_Py_ALIGN_AS(4)
should be added inPyObject_HEAD
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Huh? Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
What exactly are you asking about?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Why? AFAIK, forsizeof(PyASCIIObject)
it doesn't really matter.
I'd rather avoid setting a 4-byte alignment for 64-bit pointers.
FWIW, since 3.3 the header says "the data immediately follow the structure". We can't switch users to_Py_SIZE_ROUND_UP
, we need to make sure the size is aligned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
The data immediately following the structure (PyASCIIObject or PyCompactUnicodeObject) has the same alignment as the structure. Adding_Py_ALIGN_AS()
before thestate
member does not change the alignment of the structure. If we add a byte at the end of the structure, the alignment of the data immediately following the structure will change. It works for current structures by accident, but we can just add padding for the same effect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I'd rather avoid setting a 4-byte alignment for 64-bit pointers.
_Alignas(4)
sets theminimum alignment -- it only makes the alignmentstricter. It won't reduce the alignment of 64-bit pointers.
Adding
_Py_ALIGN_AS()
before the state member does not change the alignment of the structure.
It does. Seehttps://gcc.godbolt.org/z/evh3hjxM8, for example. Arrays of structs (bothFoo arr[N]
andmalloc(sizeof(Foo) * N)
) would not work if that wasn't the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Hmm, it seems that the alignment of a structure is determined by the largest alignment of its members. I didn't find an explicit statement in the C11 standard, but it seems logical. If the structure has less alignment, then we can't guarantee the alignment of the members.
So we can apply _Py_ALIGN_AS() to any member, and it will work independently of the compiler.
The original issue (#63736) was because on some platformsPy_ssize_t
and pointers have alignment less than 4.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Yeah.
FWIW, I also didn't find an explicit statement that structsize is a multiple of its alignment, but it's implied by the fact thatmalloc(sizeof(element_type) * n)
allocates an array.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
LGTM.
/* Ensure 4 byte alignment for PyUnicode_DATA(), see gh-63736 on m68k. | ||
In the non-free-threaded build, we'll use explicit padding instead */ | ||
_Py_ALIGN_AS(4) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Hmm, it seems that the alignment of a structure is determined by the largest alignment of its members. I didn't find an explicit statement in the C11 standard, but it seems logical. If the structure has less alignment, then we can't guarantee the alignment of the members.
So we can apply _Py_ALIGN_AS() to any member, and it will work independently of the compiler.
The original issue (#63736) was because on some platformsPy_ssize_t
and pointers have alignment less than 4.
bedevere-bot commentedMay 1, 2025
🤖 New build scheduled with the buildbot fleet by@encukou for commit95ffa4c 🤖 Results will be shown at: https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F133085%2Fmerge If you want to schedule another build, you need to add the🔨 test-with-buildbots label again. |
bedevere-bot commentedMay 2, 2025
🤖 New build scheduled with the buildbot fleet by@encukou for commit5b6685e 🤖 Results will be shown at: https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F133085%2Fmerge If you want to schedule another build, you need to add the🔨 test-with-buildbots label again. |
987e45e
intopython:mainUh oh!
There was an error while loading.Please reload this page.
Thank you for the reviews! |
Uh oh!
There was an error while loading.Please reload this page.
Add
_Py_ALIGN_AS
as per C API WG vote:capi-workgroup/decisions#61This patch only adds it to free-threaded builds; the
#ifdef Py_GIL_DISABLED
should be removed in the future.Use this to revert
PyASCIIObject
memory layout for non-free-threaded builds. The long-term plan is to deprecate the entire struct; until that happens it's better to keep it unchanged, as courtesy to people that rely on it despite it not being stable ABI.