Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Initial checkin for regions#45172

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
Maoni0 merged 1 commit intodotnet:masterfromMaoni0:regions
Dec 4, 2020
Merged

Initial checkin for regions#45172

Maoni0 merged 1 commit intodotnet:masterfromMaoni0:regions
Dec 4, 2020

Conversation

@Maoni0
Copy link
Member

I have not done a full functional test run - there are things I know don't work yet. But being
able to run some tests will allow multiple people to work on this in parallel. I've included the
ones I ran below).

I added an env var for now to specify the range for regions (GCRegionsRange). We can have
a default reserve range that's very large. This simplies a lot of things. Right now I'm only enabling
the feature on 64-bit because I've only been testing on 64-bit. For 32-bit we would likely still
need to have things like grow_brick_card_tables.

Each generation has at least one region; each region can only belong to one generation.


Main changes -

  • I'm using the current heap_segment to represent a region instead of doing a ton of renaming
    because each region also needs to keep track of the same allocated/committed/etc. So when
    USE_REGIONS is defined you could think region whenever you see segment. But gen/planned gen
    are now maintained by each region since we no long have a contiguous range for ephemeral gens.
  • Added a simple region allocator that finds free blocks for region alloc requests and coalesce
    adjacent free blocks.
  • Places that need to check for ephemeral generations are changed to check for gen_num/plan_gen_num
    accordingly, eg, gc_mark, mark_through_cards, relocate_address.
  • plan phase is changed quite a bit to accommodate regions, eg, we set each region's plan as
    we've planned them; we need to go through eph regions for allocating survivors, instead of
    detecting crossing gen0/1 boundary by the survivors position relative to their gen start, we
    plan the next generation (if necessary) when we run out of regions of the current generation
    we are going through. This actually avoids some of the complexity with segments.
  • sweep phase is heavily changed since we don't have a gen limit to detect eph gen boundaries
    anymore.
  • Rewrote the code that threads segments together for regions after compact/sweep is done.
  • Changed the SOH allocator to acquire new regions when needed, still maintain ephemeral_heap_segment
    but now it's used to indicate the region we are currently allocating in.
  • Got rid of the following that doesn't apply to regions -
    ephemeral_low/ephemeral_high
    demotion_low/demotion_high
    Some of these were replaced for regions; others haven't been (like ephemeral_low/ephemeral_high)
    which will be done in separate work items.
  • Added some internal stress mechanisms like selectively pinning some objects and creating ro segs
    in GC.
  • I have not changed the write barrier code so for WKS GC cards will be set unconditionally like in
    SVR GC.
  • Changed verify_heap to verify regions.
  • Perf changes, eg, to determine compaction.
  • Some changes for BGC; it's still incomplete but I've changed places where it needs to avoid reading
    into the ephemeral regions for concurrent like BGC revisit/overflow. Overflow can be optimized to
    be per region (this optimization applies to segs as well but more gains with regions).

Some key implementation differences between regions and segments -

  • We no longer have the expand heap code paths which simplies things by a lot.
  • We no longer have a generation start object - each generation has a list of regions that belong to
    that generation. Any empty region can be removed off of this list. This is different from when we
    had a generation start which meant the first segment would never be empty and we don't reuse it for new
    eph seg.
  • With segments in plan phase we may not have allocated gen1/0 start while going through survivors since
    there may not be survivors in gen1/0... with regions it's different because each time I finish a generation
    I will call process_last_np to allocate the new gen. It's more consistent.
  • consing_gen in plan phase really doesn't need to be a generation. With segments it would change that
    generation's alloc seg and we'd switch to gen1 when we need to plan ephemeral generations. But this is
    unnecessary because all we need is the alloc_ptr/limit pair + alloc seg (or alloc region for regions).
    I've made some progress simplying this so this never changes (always the condemned gen) and always
    represents the alloc_ptr/limit/region. We can totally get rid of consing_gen and only maintain those
    3 things.

Other things worth mentioning -

  • I intentionally left quite a bit of logging for the new code. Some can be removed when the code
    is more baked.
  • I have some "REGIONS TODO"s in the code that are things needed to changed about this PR. Of course
    there are many optimizations that will be done - seperated PRs will be submitted for those.

This is the env I used for testing -

complus_gcConcurrent=0
complus_GCLogEnabled=1
complus_GCLogFile=c:\temp\gclog
complus_GCLogFileSize=100
complus_GCName=clrgc.dll
complus_GCRegionsRange=20000000
complus_heapverify=1
complus_StressLog=0

command line I used for GCPerfSim -

-tc 2 -tagb 16 -tlgb 0.05 -lohar 0 -sohsi 10 -lohsi 0 -pohsi 0 -sohpi 0 -lohpi 0 -pohpi 0 -sohfi 0 -lohfi 0 -pohfi 0 -allocType reference -testKind time -printEveryNthIter 300000

-tc 2 -tagb 32 -tlgb 0.05 -lohar 100 -sohsi 10 -lohsi 100 -pohsi 0 -sohpi 0 -lohpi 0 -pohpi 0 -sohfi 0 -lohfi 0 -pohfi 0 -allocType reference -testKind time -printEveryNthIter 300000

GrabYourPitchforks reacted with hooray emoji
@ghost
Copy link

Tagging subscribers to this area: @dotnet/gc
See info in area-owners.md if you want to be subscribed.

Issue Details

I have not done a full functional test run - there are things I know don't work yet. But being
able to run some tests will allow multiple people to work on this in parallel. I've included the
ones I ran below).

I added an env var for now to specify the range for regions (GCRegionsRange). We can have
a default reserve range that's very large. This simplies a lot of things. Right now I'm only enabling
the feature on 64-bit because I've only been testing on 64-bit. For 32-bit we would likely still
need to have things like grow_brick_card_tables.

Each generation has at least one region; each region can only belong to one generation.


Main changes -

  • I'm using the current heap_segment to represent a region instead of doing a ton of renaming
    because each region also needs to keep track of the same allocated/committed/etc. So when
    USE_REGIONS is defined you could think region whenever you see segment. But gen/planned gen
    are now maintained by each region since we no long have a contiguous range for ephemeral gens.
  • Added a simple region allocator that finds free blocks for region alloc requests and coalesce
    adjacent free blocks.
  • Places that need to check for ephemeral generations are changed to check for gen_num/plan_gen_num
    accordingly, eg, gc_mark, mark_through_cards, relocate_address.
  • plan phase is changed quite a bit to accommodate regions, eg, we set each region's plan as
    we've planned them; we need to go through eph regions for allocating survivors, instead of
    detecting crossing gen0/1 boundary by the survivors position relative to their gen start, we
    plan the next generation (if necessary) when we run out of regions of the current generation
    we are going through. This actually avoids some of the complexity with segments.
  • sweep phase is heavily changed since we don't have a gen limit to detect eph gen boundaries
    anymore.
  • Rewrote the code that threads segments together for regions after compact/sweep is done.
  • Changed the SOH allocator to acquire new regions when needed, still maintain ephemeral_heap_segment
    but now it's used to indicate the region we are currently allocating in.
  • Got rid of the following that doesn't apply to regions -
    ephemeral_low/ephemeral_high
    demotion_low/demotion_high
    Some of these were replaced for regions; others haven't been (like ephemeral_low/ephemeral_high)
    which will be done in separate work items.
  • Added some internal stress mechanisms like selectively pinning some objects and creating ro segs
    in GC.
  • I have not changed the write barrier code so for WKS GC cards will be set unconditionally like in
    SVR GC.
  • Changed verify_heap to verify regions.
  • Perf changes, eg, to determine compaction.
  • Some changes for BGC; it's still incomplete but I've changed places where it needs to avoid reading
    into the ephemeral regions for concurrent like BGC revisit/overflow. Overflow can be optimized to
    be per region (this optimization applies to segs as well but more gains with regions).

Some key implementation differences between regions and segments -

  • We no longer have the expand heap code paths which simplies things by a lot.
  • We no longer have a generation start object - each generation has a list of regions that belong to
    that generation. Any empty region can be removed off of this list. This is different from when we
    had a generation start which meant the first segment would never be empty and we don't reuse it for new
    eph seg.
  • With segments in plan phase we may not have allocated gen1/0 start while going through survivors since
    there may not be survivors in gen1/0... with regions it's different because each time I finish a generation
    I will call process_last_np to allocate the new gen. It's more consistent.
  • consing_gen in plan phase really doesn't need to be a generation. With segments it would change that
    generation's alloc seg and we'd switch to gen1 when we need to plan ephemeral generations. But this is
    unnecessary because all we need is the alloc_ptr/limit pair + alloc seg (or alloc region for regions).
    I've made some progress simplying this so this never changes (always the condemned gen) and always
    represents the alloc_ptr/limit/region. We can totally get rid of consing_gen and only maintain those
    3 things.

Other things worth mentioning -

  • I intentionally left quite a bit of logging for the new code. Some can be removed when the code
    is more baked.
  • I have some "REGIONS TODO"s in the code that are things needed to changed about this PR. Of course
    there are many optimizations that will be done - seperated PRs will be submitted for those.

This is the env I used for testing -

complus_gcConcurrent=0
complus_GCLogEnabled=1
complus_GCLogFile=c:\temp\gclog
complus_GCLogFileSize=100
complus_GCName=clrgc.dll
complus_GCRegionsRange=20000000
complus_heapverify=1
complus_StressLog=0

command line I used for GCPerfSim -

-tc 2 -tagb 16 -tlgb 0.05 -lohar 0 -sohsi 10 -lohsi 0 -pohsi 0 -sohpi 0 -lohpi 0 -pohpi 0 -sohfi 0 -lohfi 0 -pohfi 0 -allocType reference -testKind time -printEveryNthIter 300000

-tc 2 -tagb 32 -tlgb 0.05 -lohar 100 -sohsi 10 -lohsi 100 -pohsi 0 -sohpi 0 -lohpi 0 -pohpi 0 -sohfi 0 -lohfi 0 -pohfi 0 -allocType reference -testKind time -printEveryNthIter 300000

Author:Maoni0
Assignees:-
Labels:

area-GC-coreclr

Milestone:-

@Maoni0
Copy link
MemberAuthor

Maoni0 commentedNov 25, 2020
edited
Loading

the one test failure from "Libraries Build windows net48 x86 Release" was a cryptography test that's now been disabled 'cause the machines don't have the right patch. tracked by issue#45168.

the only other failures are from mono tests which are also known.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Nit: why have a separate if with a swapped but I think identical condition?

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

no reason other than this is a refactoring change and I simply moved the code to be inside a method instead of duplicated in 2 places :) I can definitely make it nicer (including your comment below). I have a cleanup PR coming up so I could do it then unless there's something else I need to change in this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Ok.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

So this is just calling the helper instead of repeating the code here?

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

yes, I think when I wrote the original code I had something else going that made it not exactly the same. but that difference was gone so I should have converted it to calling the method.


In reply to:534273646 [](ancestors = 534273646)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Nit: you could combine this with the previous line containing the declaration - easier to read.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

will do.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Is this really signed or unsigned? I see we compute it as a sum of differences of pointers, so ptrdiff_t is appropriate, but then we return it as a size_t, so we should probably add an assert to make sure it's >= 0.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

this is a bit involved... let me see if I can find you online to explain more. basically there was a perf bug in the old code (which only affects a certain scenario and doesn't cause a big problem so I just left the code as is - we can fix it but I didn't want to fix it in this PR 'cause I really tried to not change functionally for segments.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I don't understand this change - looks like it's a bug fix?

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

yes this was a bug fix. I noticed it when I was testing regions and I knew the cards were not correct yet the last GC's verify_heap didn't find it which was because we should always update found_card_p and need_card_p every time we see a new card instead of only updating them when found_card_p is FALSE.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Was this change just for your testing?

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

yes.. I shouldn't have checked this in.. and I'll revert it. I just noticed that I'll need to update the PR anyway (because the merge conflicts are too complicated to resolve online). so I'll also include changes that you suggested in other comments.

…d the ones I ran below).I have not done a full functional test run - there are things I know don't work yet. But beingable to run some tests will allow multiple people to work on this in parallel.I added an env var for now to specify the range for regions. We can have a default reserve rangethat's very large. This simplies a lot of things. Right now I'm only enabling the feature on 64-bitbecause I've only been testing on 64-bit. For 32-bit we would likely still need to have things like grow_brick_card_tables.Each generation has at least one region; each region can only belong to one generation.---Main changes -+ I'm using the current heap_segment to represent a region instead of doing a ton of renaming  because each region also needs to keep track of the same allocated/committed/etc. So when  USE_REGIONS is defined you could think region whenever you see segment. But gen/planned gen  are now maintained by each region since we no long have a contiguous range for ephemeral gens.+ Added a simple region allocator that finds free blocks for region alloc requests and coalesce  adjacent free blocks.+ Places that need to check for ephemeral generations are changed to check for gen_num/plan_gen_num  accordingly, eg, gc_mark, mark_through_cards, relocate_address.+ plan phase is changed quite a bit to accommodate regions, eg, we set each region's plan as  we've planned them; we need to go through eph regions for allocating survivors, instead of  detecting crossing gen0/1 boundary by the survivors position relative to their gen start, we  plan the next generation (if necessary) when we run out of regions of the current generation  we are going through. This actually avoids some of the complexity with segments.+ sweep phase is heavily changed since we don't have a gen limit to detect eph gen boundaries  anymore.+ Rewrote the code that threads segments together for regions after compact/sweep is done.+ Changed the SOH allocator to acquire new regions when needed, still maintain ephemeral_heap_segment  but now it's used to indicate the region we are currently allocating in.+ Got rid of the following that doesn't apply to regions -  ephemeral_low/ephemeral_high  demotion_low/demotion_high  Some of these were replaced for regions; others haven't been (like ephemeral_low/ephemeral_high)  which will be done in separate work items.+ Added some internal stress mechanisms like selectively pinning some objects and creating ro segs  in GC.+ I have not changed the write barrier code so for WKS GC cards will be set unconditionally like in  SVR GC.+ Changed verify_heap to verify regions.+ Perf changes, eg, to determine compaction.+ Some changes for BGC; it's still incomplete but I've changed places where it needs to avoid reading  into the ephemeral regions for concurrent like BGC revisit/overflow. Overflow can be optimized to  be per region (this optimization applies to segs as well but more gains with regions).Some key implementation differences between regions and segments -+ We no longer have the expand heap code paths which simplies things by a lot.+ We no longer have a generation start object - each generation has a list of regions that belong to  that generation. Any empty region can be removed off of this list. This is different from when we  had a generation start which meant the first segment would never be empty and we don't reuse it for new  eph seg.+ With segments in plan phase we may not have allocated gen1/0 start while going through survivors since  there may not be survivors in gen1/0... with regions it's different because each time I finish a generation  I will call process_last_np to allocate the new gen. It's more consistent.+ consing_gen in plan phase really doesn't need to be a generation. With segments it would change that  generation's alloc seg and we'd switch to gen1 when we need to plan ephemeral generations. But this is  unnecessary because all we need is the alloc_ptr/limit pair + alloc seg (or alloc region for regions).  I've made some progress simplying this so this never changes (always the condemned gen) and always  represents the alloc_ptr/limit/region. We can totally get rid of consing_gen and only maintain those  3 things.=======Other things worth mentioning -+ I intentionally left quite a bit of logging for the new code. Some can be removed when the code  is more baked.+ I have some "REGIONS TODO"s in the code that are things needed to changed about this PR. Of course  there are many optimizations that will be done - seperated PRs will be submitted for those.=======This is the env I used for testing -complus_gcConcurrent=0complus_GCLogEnabled=1complus_GCLogFile=c:\temp\gclogcomplus_GCLogFileSize=100complus_GCName=clrgc.dllcomplus_GCRegionsRange=20000000complus_heapverify=1complus_StressLog=0command line I used for GCPerfSim --tc 2 -tagb 16 -tlgb 0.05 -lohar 0 -sohsi 10 -lohsi 0 -pohsi 0 -sohpi 0 -lohpi 0 -pohpi 0 -sohfi 0 -lohfi 0 -pohfi 0 -allocType reference -testKind time -printEveryNthIter 300000-tc 2 -tagb 32 -tlgb 0.05 -lohar 100 -sohsi 10 -lohsi 100 -pohsi 0 -sohpi 0 -lohpi 0 -pohpi 0 -sohfi 0 -lohfi 0 -pohfi 0 -allocType reference -testKind time -printEveryNthIter 300000
@Maoni0Maoni0 merged commitacfd477 intodotnet:masterDec 4, 2020
@ghostghost locked asresolvedand limited conversation to collaboratorsJan 3, 2021
Sign up for freeto subscribe to this conversation on GitHub. Already have an account?Sign in.

Reviewers

1 more reviewer

@PeterSolMSPeterSolMSPeterSolMS approved these changes

Reviewers whose approvals may not affect merge requirements

Assignees

No one assigned

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

3 participants

@Maoni0@PeterSolMS@Dotnet-GitSync-Bot

[8]ページ先頭

©2009-2025 Movatter.jp