Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork33.4k
GH-100240: Generic freelist, applied to ints#101453
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
markshannon left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I know this is a draft, but I've taken the liberty of reviewing anyway 🙂
Objects/longobject.c Outdated
| result= (PyLongObject*)_PyInterpreterState_FreelistAlloc(interp,sizeof(PyLongObject)); | ||
| } | ||
| #else | ||
| if (size==0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Don't change it for this PR, but this shouldn't be needed, as there should only be one zero. I think this indicates a bug in the caller.
The above assert,assert(size >= 0) should then beassert(size > 0)
Uh oh!
There was an error while loading.Please reload this page.
Include/internal/pycore_interp.h Outdated
| } | ||
| staticinlinevoid | ||
| _PyInterpreterState_FreelistFree(PyInterpreterState*interp,PyObject*op,intsize) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
What issize?. The assumption is that it is size in bytes, but I think you are using size in machine words.
Also needs to bePy_ssize_t.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
it's the results of sizeof(PyLongObject), that should be bytes no?
Uh oh!
There was an error while loading.Please reload this page.
Objects/longobject.c Outdated
| staticvoid | ||
| int_dealloc(PyLongObject*op) | ||
| { | ||
| #ifWITH_FREELISTS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Should we move the#if WITH_FREELISTS intoPyInterpreterState_FreelistFree to keep it localized?
Personally, I think we should just removeWITH_FREELISTS, but that needs a wider discussion.
| } | ||
| uint32_ti=0; | ||
| for (;i<list->space>>1;i++) { | ||
| void*ptr=PyObject_Malloc(list->size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Note for future PR: we should add the bulk allocate/free capability to the allocator.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
iritkatriel commentedFeb 3, 2023
benchmarks are showing this as 2% slower overall. I guess we need to tune it. https://github.com/faster-cpython/benchmarking/tree/main/results/bm-20230131-3.12.0a4%2B-fe65f49 |
markshannon commentedFeb 3, 2023
It looks like this PR is allocating from the freelist, but freeing to the underlying allocator in the specialized |
Include/internal/pycore_interp.h Outdated
| PyAPI_FUNC(int)_PyInterpreterState_IDIncref(PyInterpreterState*); | ||
| PyAPI_FUNC(void)_PyInterpreterState_IDDecref(PyInterpreterState*); | ||
| #defineSIZE_TO_FREELIST_INDEX(size) ((size-4)/2) |
markshannonFeb 3, 2023 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Why divide by 2 and not2 * sizeof(void *), which is the quantum of allocation for malloc and ob_malloc?
(Assuming that the size is in bytes)
Could you use the term "size class" or equivalent, instead of "freelist index" The mapping here is from sizes to classes (not all size classes get free lists).
Any reason not to make this an inline function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Alternatively, put an assert in this function, and put a size check in the caller.
Uh oh!
There was an error while loading.Please reload this page.
markshannon commentedFeb 6, 2023
I see you've added stats, have you recorded the stats yet? |
iritkatriel commentedFeb 6, 2023
Not yet. I have a bug I'm trying to figure out. It crashes in tests that use subprocesses. |
Include/internal/pycore_interp.h Outdated
| PyAPI_FUNC(void)_PyInterpreterState_IDDecref(PyInterpreterState*); | ||
| #defineFREELIST_QUANTUM (2*sizeof(void*)) | ||
| #defineSIZE_TO_FREELIST_INDEX(size) ((size-4)/FREELIST_QUANTUM) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Why this formula?
You want to allocate all sizes from(n-1)*FREELIST_QUANTUM + 1 ton*FREELIST_QUANTUM (inclusive) from the same freelist.
I would expect the formula to be(size + FREELIST_QUANTUM -1)/FREELIST_QUANTUM, or(size-1)/FREELIST_QUANTUM.
And because C division rounds towards 0, not -infinity, you want to use>>LOG_BASE_2_OF_FREELIST_QUANTUM, not/FREELIST_QUANTUM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I haven't changed this yet. I'm trying to get it to work with just ints.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
There already are some macros in pycore_obmalloc.h that seem to do something like this:
/* * Alignment of addresses returned to the user. 8-bytes alignment works * on most current architectures (with 32-bit or 64-bit address buses). * The alignment value is also used for grouping small requests in size * classes spaced ALIGNMENT bytes apart. * * You shouldn't change this unless you know what you are doing. */#if SIZEOF_VOID_P > 4#define ALIGNMENT 16 /* must be 2^N */#define ALIGNMENT_SHIFT 4#else#define ALIGNMENT 8 /* must be 2^N */#define ALIGNMENT_SHIFT 3#endif /* Return the number of bytes in size class I, as a uint. */#define INDEX2SIZE(I) (((pymem_uint)(I) + 1) << ALIGNMENT_SHIFT)Uh oh!
There was an error while loading.Please reload this page.
iritkatriel commentedFeb 6, 2023
|
Uh oh!
There was an error while loading.Please reload this page.
iritkatriel commentedFeb 8, 2023
test_embed is failing because of a leak (I'll check why). The other tests pass. |
iritkatriel commentedFeb 8, 2023
The leak was because an int was freed and inserted to the freelist after it has been cleared in interpreter_clear. I set space and capacity to 0 in interpreter_clear to prevent this. |
iritkatriel commentedFeb 17, 2023
Dropping this for now. |
Uh oh!
There was an error while loading.Please reload this page.
@markshannon This is basically the freelist from your branch, made a bit more generic, and applied to ints. We should think about which sizes we want to include.
(I'm trying to run benchmarks, but there is some issue with the machine at the moment).