Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

GH-100240: Generic freelist, applied to ints#101453

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Closed
iritkatriel wants to merge19 commits intopython:mainfromiritkatriel:int-freelist

Conversation

@iritkatriel
Copy link
Member

@iritkatrieliritkatriel commentedJan 31, 2023
edited by bedevere-bot
Loading

@markshannon This is basically the freelist from your branch, made a bit more generic, and applied to ints. We should think about which sizes we want to include.

(I'm trying to run benchmarks, but there is some issue with the machine at the moment).

@iritkatrieliritkatriel marked this pull request as draftJanuary 31, 2023 10:36
@iritkatrieliritkatriel changed the titlefreelist for intsgeneric freelist, applied to intsJan 31, 2023
Copy link
Member

@markshannonmarkshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I know this is a draft, but I've taken the liberty of reviewing anyway 🙂

iritkatriel reacted with thumbs up emoji
result= (PyLongObject*)_PyInterpreterState_FreelistAlloc(interp,sizeof(PyLongObject));
}
#else
if (size==0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Don't change it for this PR, but this shouldn't be needed, as there should only be one zero. I think this indicates a bug in the caller.
The above assert,assert(size >= 0) should then beassert(size > 0)

}

staticinlinevoid
_PyInterpreterState_FreelistFree(PyInterpreterState*interp,PyObject*op,intsize) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

What issize?. The assumption is that it is size in bytes, but I think you are using size in machine words.
Also needs to bePy_ssize_t.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

it's the results of sizeof(PyLongObject), that should be bytes no?

staticvoid
int_dealloc(PyLongObject*op)
{
#ifWITH_FREELISTS
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Should we move the#if WITH_FREELISTS intoPyInterpreterState_FreelistFree to keep it localized?

Personally, I think we should just removeWITH_FREELISTS, but that needs a wider discussion.

}
uint32_ti=0;
for (;i<list->space>>1;i++) {
void*ptr=PyObject_Malloc(list->size);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Note for future PR: we should add the bulk allocate/free capability to the allocator.

@iritkatriel
Copy link
MemberAuthor

benchmarks are showing this as 2% slower overall. I guess we need to tune it.

https://github.com/faster-cpython/benchmarking/tree/main/results/bm-20230131-3.12.0a4%2B-fe65f49

@markshannon
Copy link
Member

It looks like this PR is allocating from the freelist, but freeing to the underlying allocator in the specializedBINARY_OP instructions.
Which could explain the large slowdown in spectralnorm.

PyAPI_FUNC(int)_PyInterpreterState_IDIncref(PyInterpreterState*);
PyAPI_FUNC(void)_PyInterpreterState_IDDecref(PyInterpreterState*);

#defineSIZE_TO_FREELIST_INDEX(size) ((size-4)/2)
Copy link
Member

@markshannonmarkshannonFeb 3, 2023
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Why divide by 2 and not2 * sizeof(void *), which is the quantum of allocation for malloc and ob_malloc?
(Assuming that the size is in bytes)
Could you use the term "size class" or equivalent, instead of "freelist index" The mapping here is from sizes to classes (not all size classes get free lists).

Any reason not to make this an inline function?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Alternatively, put an assert in this function, and put a size check in the caller.

@markshannonmarkshannon changed the titlegeneric freelist, applied to intsGH-100240: Generic freelist, applied to intsFeb 3, 2023
@markshannon
Copy link
Member

I see you've added stats, have you recorded the stats yet?

@iritkatriel
Copy link
MemberAuthor

I see you've added stats, have you recorded the stats yet?

Not yet. I have a bug I'm trying to figure out. It crashes in tests that use subprocesses.

PyAPI_FUNC(void)_PyInterpreterState_IDDecref(PyInterpreterState*);

#defineFREELIST_QUANTUM (2*sizeof(void*))
#defineSIZE_TO_FREELIST_INDEX(size) ((size-4)/FREELIST_QUANTUM)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Why this formula?

You want to allocate all sizes from(n-1)*FREELIST_QUANTUM + 1 ton*FREELIST_QUANTUM (inclusive) from the same freelist.
I would expect the formula to be(size + FREELIST_QUANTUM -1)/FREELIST_QUANTUM, or(size-1)/FREELIST_QUANTUM.

And because C division rounds towards 0, not -infinity, you want to use>>LOG_BASE_2_OF_FREELIST_QUANTUM, not/FREELIST_QUANTUM.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I haven't changed this yet. I'm trying to get it to work with just ints.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

There already are some macros in pycore_obmalloc.h that seem to do something like this:

/*  * Alignment of addresses returned to the user. 8-bytes alignment works * on most current architectures (with 32-bit or 64-bit address buses). * The alignment value is also used for grouping small requests in size * classes spaced ALIGNMENT bytes apart. * * You shouldn't change this unless you know what you are doing. */#if SIZEOF_VOID_P > 4#define ALIGNMENT              16               /* must be 2^N */#define ALIGNMENT_SHIFT         4#else#define ALIGNMENT               8               /* must be 2^N */#define ALIGNMENT_SHIFT         3#endif   /* Return the number of bytes in size class I, as a uint. */#define INDEX2SIZE(I) (((pymem_uint)(I) + 1) << ALIGNMENT_SHIFT)

@iritkatriel
Copy link
MemberAuthor

@iritkatriel
Copy link
MemberAuthor

test_embed is failing because of a leak (I'll check why). The other tests pass.

@iritkatriel
Copy link
MemberAuthor

test_embed is failing because of a leak (I'll check why). The other tests pass.

The leak was because an int was freed and inserted to the freelist after it has been cleared in interpreter_clear. I set space and capacity to 0 in interpreter_clear to prevent this.

@iritkatriel
Copy link
MemberAuthor

Dropping this for now.

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@markshannonmarkshannonmarkshannon left review comments

@ericsnowcurrentlyericsnowcurrentlyAwaiting requested review from ericsnowcurrentlyericsnowcurrently will be requested when the pull request is marked ready for reviewericsnowcurrently is a code owner

@ZeroIntensityZeroIntensityAwaiting requested review from ZeroIntensityZeroIntensity will be requested when the pull request is marked ready for reviewZeroIntensity is a code owner

Assignees

No one assigned

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

3 participants

@iritkatriel@markshannon@bedevere-bot

[8]ページ先頭

©2009-2025 Movatter.jp