Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork32.4k
gh-115103: Implement delayed memory reclamation (QSBR)#115180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
This adds a safe memory reclamation scheme based on FreeBSD's "GUS" andquiescent state based reclamation (QSBR). The API provides a mechanismfor callers to detect when it is safe to free memory that may beconcurrently accessed by readers.
Notes for reviewers:
|
if (QSBR_LT(rd_seq, min_seq)) { | ||
// It's okay if the compare-exchange failed: another thread updated it | ||
(void)_Py_atomic_compare_exchange_uint64(&shared->rd_seq, &rd_seq, min_seq); | ||
rd_seq = min_seq; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
If the compare/exchange fails is it worth returning the greater ofrd_seq
andmin_seq
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
It doesn't matter for correctness and I'm not sure it's worth any extra complexity for performance.
Python/qsbr.c Outdated
// Starting size of the array of qsbr thread states | ||
#define MIN_ARRAY_SIZE 8 | ||
// The shared write sequence is always odd and incremented by two. Detached |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Can we document why the shared write sequence is always odd? I'm assuming that's just to avoid issues with it wrapping around and colliding withQSBR_OFFLINE
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Yeah, that was the original motivation, but as I wrote in another comment, with 64-bit counters we don't really have to worry about wrap around.
I'm still tempted to keep this numbering scheme and theQBSR_LT
wrap-around safe comparisons, but let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I think it's fine to keep it this way, it's just worth a comment on why it's that way :)
@@ -169,6 +169,10 @@ extern PyTypeObject _PyExc_MemoryError; | |||
{ .threshold = 10, }, \ | |||
}, \ | |||
}, \ | |||
.qsbr = { \ | |||
.wr_seq = 1, \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Can these be inited toQSBR_INITIAL
(it seems like includingpycore_qsbr.h
shouldn't be an issue?)
// If there are no free entries, we pause all threads, grow the array, | ||
// and update the pointers in PyThreadState to entries in the new array. | ||
if (qsbr == NULL) { | ||
_PyEval_StopTheWorld(interp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
It seems like this isn't safe to do with the mutex locked and not-detached? If another thread is trying to reserve at the same time it won't be detached, but it also won't be responding to the request for the eval breaker.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
You are right -- I mixed up the locking discipline. The outer lock should not use_Py_LOCK_DONT_DETACH
. I think the locking discipline with stop-the-world should be:
- A lock can be acquired before starting a stop-the-world pause or after starting it, but not both (consistent lock ordering)
- If acquired before the stop-the-world, the acquisition should use
_Py_LOCK_DETACH
(the default) - If acquired within the stop-the-world, the acquisition should use
_Py_LOCK_DONT_DETACH
HEAD_LOCK(runtime)
is an example of case 3. This is case 2.
The reasoning for case (3) is a bit subtle: locks can be directly handed off to a detached thread, so with a normalPyMutex_Lock(&m)
a thread can both hold the mutexm
and have paused for another thread's stop-the-world.
Python/qsbr.c Outdated
// Initialize (or reintialize) the freelist of QSBR thread states | ||
static void | ||
initialize_freelist(struct _qsbr_shared *shared) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Maybeinitialize_new_array
or something like that? It's just that it's not only initing the free lists, it's also initializing all of the thread state pointers into the array (and thenew part covers the re-initialization)
uint64_t | ||
_Py_qsbr_advance(struct _qsbr_shared *shared) | ||
{ | ||
return _Py_atomic_add_uint64(&shared->wr_seq, QSBR_INCR) + QSBR_INCR; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
The FreeBSD version has logic to avoid making sure thatwr_seq
doesn't get too far ahead fromrd_seq
and if so blocks until it catches up to avoid extreme wrap around issues... Is there a reason we don't need that? :). Or should we at least add a TODO and/or an assertion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
We are using 64-bit sequence counters to avoid worrying about wrap around. FreeBSD uses 32-bit counters. (2^62 ns = ~146 years; 2^30 ns = ~1 sec)
We could consider using 32-bit counters and handling wrap around in the future. It wouldn't matter much for x86-64 and aarch64, but would probably be more efficient on some other platforms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Okay, that makes sense, FreeBSD's is buried in a typedef so I hadn't noticed it was 32-bit.
return true; | ||
} | ||
rd_seq = qsbr_poll_scan(qsbr->shared); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This is probably just related to future scaling, but I wonder if we could pass in the goal and bail on the full-scan if any thread hasn't reached it yet. It might prevent us from updating the shared for someone else who would benefit from it though...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
It's not clear to me if that case will happen often enough to be worth handling. Unfortunately, we can't really measure or test for that until we disable the GIL. With the GIL, only one thread is ever attached and active.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
LGTM!
@sethmlarson, does this need to be included in the SBOM? This includes code derived from a FreeBSD component, but there isn't really any shared code. In other words, it's derivative enough that I think the copyright notice should be preserved, but different enough that if there were any security vulnerabilities in the FreeBSD component, they would not be relevant here. |
@colesbury Thanks for the ping! Great question, the SBOM for CPython actually doesn't concern itself with copyright or licensing right now, it's primary use-case is for vulnerability management and tracking. Given this I don't think we need to update the SBOM for this contribution. |
…115180)This adds a safe memory reclamation scheme based on FreeBSD's "GUS" andquiescent state based reclamation (QSBR). The API provides a mechanismfor callers to detect when it is safe to free memory that may beconcurrently accessed by readers.
…115180)This adds a safe memory reclamation scheme based on FreeBSD's "GUS" andquiescent state based reclamation (QSBR). The API provides a mechanismfor callers to detect when it is safe to free memory that may beconcurrently accessed by readers.
…115180)This adds a safe memory reclamation scheme based on FreeBSD's "GUS" andquiescent state based reclamation (QSBR). The API provides a mechanismfor callers to detect when it is safe to free memory that may beconcurrently accessed by readers.
Uh oh!
There was an error while loading.Please reload this page.
This adds a safe memory reclamation scheme for the free-threaded build based on FreeBSD's "GUS" and quiescent state based reclamation (QSBR). The API provides a mechanism for callers to detect when it is safe to free memory that may be concurrently accessed by readers.
Related PRs that build on this:
📚 Documentation preview 📚:https://cpython-previews--115180.org.readthedocs.build/