- Notifications
You must be signed in to change notification settings - Fork5.2k
Initial checkin for regions#45172
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
ghost commentedNov 24, 2020
Tagging subscribers to this area: @dotnet/gc Issue DetailsI have not done a full functional test run - there are things I know don't work yet. But being I added an env var for now to specify the range for regions (GCRegionsRange). We can have Each generation has at least one region; each region can only belong to one generation. Main changes -
Some key implementation differences between regions and segments -
Other things worth mentioning -
This is the env I used for testing - complus_gcConcurrent=0 command line I used for GCPerfSim - -tc 2 -tagb 16 -tlgb 0.05 -lohar 0 -sohsi 10 -lohsi 0 -pohsi 0 -sohpi 0 -lohpi 0 -pohpi 0 -sohfi 0 -lohfi 0 -pohfi 0 -allocType reference -testKind time -printEveryNthIter 300000 -tc 2 -tagb 32 -tlgb 0.05 -lohar 100 -sohsi 10 -lohsi 100 -pohsi 0 -sohpi 0 -lohpi 0 -pohpi 0 -sohfi 0 -lohfi 0 -pohfi 0 -allocType reference -testKind time -printEveryNthIter 300000
|
b83cba8 to9fb58c6Compare3d20621 to0ecfc50CompareMaoni0 commentedNov 25, 2020 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
the one test failure from "Libraries Build windows net48 x86 Release" was a cryptography test that's now been disabled 'cause the machines don't have the right patch. tracked by issue#45168. the only other failures are from mono tests which are also known. |
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
src/coreclr/src/gc/gc.cpp Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Nit: why have a separate if with a swapped but I think identical condition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
no reason other than this is a refactoring change and I simply moved the code to be inside a method instead of duplicated in 2 places :) I can definitely make it nicer (including your comment below). I have a cleanup PR coming up so I could do it then unless there's something else I need to change in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Ok.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
src/coreclr/src/gc/gc.cpp Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
So this is just calling the helper instead of repeating the code here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
yes, I think when I wrote the original code I had something else going that made it not exactly the same. but that difference was gone so I should have converted it to calling the method.
In reply to:534273646 [](ancestors = 534273646)
src/coreclr/src/gc/gc.cpp Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Nit: you could combine this with the previous line containing the declaration - easier to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
will do.
src/coreclr/src/gc/gc.cpp Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Is this really signed or unsigned? I see we compute it as a sum of differences of pointers, so ptrdiff_t is appropriate, but then we return it as a size_t, so we should probably add an assert to make sure it's >= 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
this is a bit involved... let me see if I can find you online to explain more. basically there was a perf bug in the old code (which only affects a certain scenario and doesn't cause a big problem so I just left the code as is - we can fix it but I didn't want to fix it in this PR 'cause I really tried to not change functionally for segments.
src/coreclr/src/gc/gc.cpp Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I don't understand this change - looks like it's a bug fix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
yes this was a bug fix. I noticed it when I was testing regions and I knew the cards were not correct yet the last GC's verify_heap didn't find it which was because we should always update found_card_p and need_card_p every time we see a new card instead of only updating them when found_card_p is FALSE.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Was this change just for your testing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
yes.. I shouldn't have checked this in.. and I'll revert it. I just noticed that I'll need to update the PR anyway (because the merge conflicts are too complicated to resolve online). so I'll also include changes that you suggested in other comments.
…d the ones I ran below).I have not done a full functional test run - there are things I know don't work yet. But beingable to run some tests will allow multiple people to work on this in parallel.I added an env var for now to specify the range for regions. We can have a default reserve rangethat's very large. This simplies a lot of things. Right now I'm only enabling the feature on 64-bitbecause I've only been testing on 64-bit. For 32-bit we would likely still need to have things like grow_brick_card_tables.Each generation has at least one region; each region can only belong to one generation.---Main changes -+ I'm using the current heap_segment to represent a region instead of doing a ton of renaming because each region also needs to keep track of the same allocated/committed/etc. So when USE_REGIONS is defined you could think region whenever you see segment. But gen/planned gen are now maintained by each region since we no long have a contiguous range for ephemeral gens.+ Added a simple region allocator that finds free blocks for region alloc requests and coalesce adjacent free blocks.+ Places that need to check for ephemeral generations are changed to check for gen_num/plan_gen_num accordingly, eg, gc_mark, mark_through_cards, relocate_address.+ plan phase is changed quite a bit to accommodate regions, eg, we set each region's plan as we've planned them; we need to go through eph regions for allocating survivors, instead of detecting crossing gen0/1 boundary by the survivors position relative to their gen start, we plan the next generation (if necessary) when we run out of regions of the current generation we are going through. This actually avoids some of the complexity with segments.+ sweep phase is heavily changed since we don't have a gen limit to detect eph gen boundaries anymore.+ Rewrote the code that threads segments together for regions after compact/sweep is done.+ Changed the SOH allocator to acquire new regions when needed, still maintain ephemeral_heap_segment but now it's used to indicate the region we are currently allocating in.+ Got rid of the following that doesn't apply to regions - ephemeral_low/ephemeral_high demotion_low/demotion_high Some of these were replaced for regions; others haven't been (like ephemeral_low/ephemeral_high) which will be done in separate work items.+ Added some internal stress mechanisms like selectively pinning some objects and creating ro segs in GC.+ I have not changed the write barrier code so for WKS GC cards will be set unconditionally like in SVR GC.+ Changed verify_heap to verify regions.+ Perf changes, eg, to determine compaction.+ Some changes for BGC; it's still incomplete but I've changed places where it needs to avoid reading into the ephemeral regions for concurrent like BGC revisit/overflow. Overflow can be optimized to be per region (this optimization applies to segs as well but more gains with regions).Some key implementation differences between regions and segments -+ We no longer have the expand heap code paths which simplies things by a lot.+ We no longer have a generation start object - each generation has a list of regions that belong to that generation. Any empty region can be removed off of this list. This is different from when we had a generation start which meant the first segment would never be empty and we don't reuse it for new eph seg.+ With segments in plan phase we may not have allocated gen1/0 start while going through survivors since there may not be survivors in gen1/0... with regions it's different because each time I finish a generation I will call process_last_np to allocate the new gen. It's more consistent.+ consing_gen in plan phase really doesn't need to be a generation. With segments it would change that generation's alloc seg and we'd switch to gen1 when we need to plan ephemeral generations. But this is unnecessary because all we need is the alloc_ptr/limit pair + alloc seg (or alloc region for regions). I've made some progress simplying this so this never changes (always the condemned gen) and always represents the alloc_ptr/limit/region. We can totally get rid of consing_gen and only maintain those 3 things.=======Other things worth mentioning -+ I intentionally left quite a bit of logging for the new code. Some can be removed when the code is more baked.+ I have some "REGIONS TODO"s in the code that are things needed to changed about this PR. Of course there are many optimizations that will be done - seperated PRs will be submitted for those.=======This is the env I used for testing -complus_gcConcurrent=0complus_GCLogEnabled=1complus_GCLogFile=c:\temp\gclogcomplus_GCLogFileSize=100complus_GCName=clrgc.dllcomplus_GCRegionsRange=20000000complus_heapverify=1complus_StressLog=0command line I used for GCPerfSim --tc 2 -tagb 16 -tlgb 0.05 -lohar 0 -sohsi 10 -lohsi 0 -pohsi 0 -sohpi 0 -lohpi 0 -pohpi 0 -sohfi 0 -lohfi 0 -pohfi 0 -allocType reference -testKind time -printEveryNthIter 300000-tc 2 -tagb 32 -tlgb 0.05 -lohar 100 -sohsi 10 -lohsi 100 -pohsi 0 -sohpi 0 -lohpi 0 -pohpi 0 -sohfi 0 -lohfi 0 -pohfi 0 -allocType reference -testKind time -printEveryNthIter 300000
I have not done a full functional test run - there are things I know don't work yet. But being
able to run some tests will allow multiple people to work on this in parallel. I've included the
ones I ran below).
I added an env var for now to specify the range for regions (GCRegionsRange). We can have
a default reserve range that's very large. This simplies a lot of things. Right now I'm only enabling
the feature on 64-bit because I've only been testing on 64-bit. For 32-bit we would likely still
need to have things like grow_brick_card_tables.
Each generation has at least one region; each region can only belong to one generation.
Main changes -
because each region also needs to keep track of the same allocated/committed/etc. So when
USE_REGIONS is defined you could think region whenever you see segment. But gen/planned gen
are now maintained by each region since we no long have a contiguous range for ephemeral gens.
adjacent free blocks.
accordingly, eg, gc_mark, mark_through_cards, relocate_address.
we've planned them; we need to go through eph regions for allocating survivors, instead of
detecting crossing gen0/1 boundary by the survivors position relative to their gen start, we
plan the next generation (if necessary) when we run out of regions of the current generation
we are going through. This actually avoids some of the complexity with segments.
anymore.
but now it's used to indicate the region we are currently allocating in.
ephemeral_low/ephemeral_high
demotion_low/demotion_high
Some of these were replaced for regions; others haven't been (like ephemeral_low/ephemeral_high)
which will be done in separate work items.
in GC.
SVR GC.
into the ephemeral regions for concurrent like BGC revisit/overflow. Overflow can be optimized to
be per region (this optimization applies to segs as well but more gains with regions).
Some key implementation differences between regions and segments -
that generation. Any empty region can be removed off of this list. This is different from when we
had a generation start which meant the first segment would never be empty and we don't reuse it for new
eph seg.
there may not be survivors in gen1/0... with regions it's different because each time I finish a generation
I will call process_last_np to allocate the new gen. It's more consistent.
generation's alloc seg and we'd switch to gen1 when we need to plan ephemeral generations. But this is
unnecessary because all we need is the alloc_ptr/limit pair + alloc seg (or alloc region for regions).
I've made some progress simplying this so this never changes (always the condemned gen) and always
represents the alloc_ptr/limit/region. We can totally get rid of consing_gen and only maintain those
3 things.
Other things worth mentioning -
is more baked.
there are many optimizations that will be done - seperated PRs will be submitted for those.
This is the env I used for testing -
complus_gcConcurrent=0
complus_GCLogEnabled=1
complus_GCLogFile=c:\temp\gclog
complus_GCLogFileSize=100
complus_GCName=clrgc.dll
complus_GCRegionsRange=20000000
complus_heapverify=1
complus_StressLog=0
command line I used for GCPerfSim -
-tc 2 -tagb 16 -tlgb 0.05 -lohar 0 -sohsi 10 -lohsi 0 -pohsi 0 -sohpi 0 -lohpi 0 -pohpi 0 -sohfi 0 -lohfi 0 -pohfi 0 -allocType reference -testKind time -printEveryNthIter 300000
-tc 2 -tagb 32 -tlgb 0.05 -lohar 100 -sohsi 10 -lohsi 100 -pohsi 0 -sohpi 0 -lohpi 0 -pohpi 0 -sohfi 0 -lohfi 0 -pohfi 0 -allocType reference -testKind time -printEveryNthIter 300000