datas tuning fix #98743


Merged
Maoni0 merged 2 commits into dotnet:main from Maoni0:datas_tuning
Feb 23, 2024

Conversation

@Maoni0 (Member) commented Feb 21, 2024 (edited)

  • Change the HC (heap count) adjustment based on history and how successful the previous adjustment was -
    • record the history of tcp (throughput cost percent) values in a small buffer, look at the trend of this
      buffer to detect whether things look stable or are trending up/down (and if so, how fast), and decide whether to grow/shrink according to our calculation
    • previously we barely ever shrank the HC; with this change we shrink as needed
    • if we just grew and the calculation says to grow again, we grow more aggressively
    • if we just shrank but the tcp didn't come down, and the calculation says to shrink again, we should avoid shrinking for a while
  • One of the reasons for outliers is that something temporarily affected GC work. We pick the min tcp if the survival is very stable to avoid counting these outliers.
  • Added simple gen2 handling for BGC.
  • Bug fixes -
    • When we change the heap count, we should not refresh the budget of all new heaps, which would cause a spike in heap size. If the budget is already partially used, we should use up the existing budget and let the next GC refresh it.
    • Don't carry stcp over when HC is changed - it doesn't make sense since the estimated stcp is bogus
    • Don't add the first sample as it's artificially skewed by startup time

There are a few issues with these that will be addressed in future checkins -

  • the aggressiveness factor needs to be capped, and history that is too distant needs to be discarded
  • growth is too aggressive for large tcps, which causes an initial spike when we look at heap counts
  • recognize when the slope direction changes, i.e., trending upward <-> downward, and discard older entries as appropriate

@ghost added the area-GC-coreclr label Feb 21, 2024
@ghost assigned Maoni0 Feb 21, 2024

}

float mean (float* arr, int size)
{
Member

Worth checking if size > 0 as a precondition check?

Member (Author)

I've added an assert in slope which makes more sense.

Contributor

kind of similar to log_with_base, is the assertion/condition intended for mean or callers of mean? If it's a precondition for mean, then I would expect the precondition check to be in mean (or both mean and the callers).

Or if mean is supposed to support some callers with a negative size, then the final return probably needs to be something like return (size > 0) ? (sum / size) : 0
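
To make the two options in this thread concrete, here is a minimal sketch of a mean helper that owns its precondition and still has a defined result when asserts are compiled out. The name mean_checked and the 0 fallback are assumptions for illustration; the PR itself only added an assert in slope.

```cpp
#include <cassert>

// Hypothetical variant of mean(): the precondition is checked inside the
// helper rather than only in its callers.
float mean_checked(const float* arr, int size)
{
    assert(arr != nullptr);
    assert(size > 0);       // precondition lives with the function
    if (size <= 0)
    {
        return 0.0f;        // defensive fallback when asserts are disabled
    }

    float sum = 0.0f;
    for (int i = 0; i < size; i++)
    {
        sum += arr[i];
    }
    return sum / size;
}
```

Whether the fallback is wanted depends on whether any caller is expected to pass an empty range; if none is, the assert alone documents the contract.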


size_t gc_heap::get_num_completed_gcs ()
float log_with_base (float x, float base)
{
Member

Is it worth asserting if x and base > 0?

Member (Author)

it's actually meant to have x > base and should be enforced. but I can still add an assert.

Contributor

log_b(x) is fine for x <= base (e.g., log_2(2) = 1, log_4(2) = 1/2)

I think you're saying (by "should be enforced") that current call site(s) expect x > base. log_with_base is a very reasonable helper function that could get used elsewhere without such a restriction. Or maybe you want to rename it to show that it is intended as a helper for a specific context rather than a general log helper?
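
For reference, a general-purpose version of such a helper can be sketched with the change-of-base identity log_b(x) = ln(x) / ln(b). The name log_with_base_checked and the exact asserts are assumptions, not the code in this PR; the point is that only x > 0, base > 0 and base != 1 are mathematically required, and x <= base is fine.

```cpp
#include <cassert>
#include <cmath>

// Sketch of a general log helper using the change-of-base identity.
// x may be smaller than, equal to, or larger than base.
float log_with_base_checked(float x, float base)
{
    assert(x > 0.0f);
    assert(base > 0.0f && base != 1.0f);
    return std::log(x) / std::log(base);
}
```

A call-site-specific restriction such as x > base would then belong either at the call site or in a helper whose name reflects that narrower purpose.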

@Maoni0 changed the title from "[WIP] datas tuning fix" to "datas tuning fix" Feb 22, 2024
@Maoni0 (Member, Author)

v2-base is the baseline and v2-rc3 is this change, 4 runs each.

[3 images]

uint64_t elapsed_between_gcs; // time between gcs in microseconds (this should really be between_pauses)
uint64_t gc_pause_time; // pause time for this GC
uint64_t msl_wait_time;
size_t gc_survived_size;
Contributor

Suggested change
- size_t gc_survived_size;
+ size_t gc_survived_size; // total survived size across all relevant generations for this GC

i.e., it's -not- gen0 to be consistent in what is being recorded

//
// We need to observe the history of tcp's so record them in a small buffer.
//
float recorded_tcp_rearranged[recorded_tcp_array_size];
Contributor

You've mentioned this before, but this is doable without copying the data (though I think the real concern would be avoiding the additional concept of "rearranged" data rather than the copy of a small amount of data, which could easily be negligible in cost).

Encapsulating the data in a circular buffer with an iterator would probably accomplish this - it probably makes sense to do this as a follow-up PR, which I can do.
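
One possible shape for that follow-up is sketched below: a fixed-capacity ring buffer that exposes its entries oldest-first through indexing, so no second "rearranged" array is needed. The class name ring_buffer and its members are hypothetical and are not taken from the runtime sources.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity ring buffer; illustrative only.
template <typename T, std::size_t N>
class ring_buffer
{
public:
    void add(T value)
    {
        data[next] = value;
        next = (next + 1) % N;  // wrap around to the start
        total++;                // counts every add, so it can exceed N
    }

    // Number of valid entries currently stored (at most N).
    std::size_t size() const { return (total < N) ? total : N; }

    // Index 0 is the oldest stored entry, size() - 1 is the newest.
    const T& operator[](std::size_t i) const
    {
        assert(i < size());
        std::size_t oldest = (total < N) ? 0 : next;
        return data[(oldest + i) % N];
    }

private:
    T data[N] = {};
    std::size_t next = 0;
    std::size_t total = 0;
};
```

With this shape, a trend calculation can loop over indices 0 to size() - 1 and read the values in chronological order directly.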

float recorded_tcp_rearranged[recorded_tcp_array_size];
float recorded_tcp[recorded_tcp_array_size];
int recorded_tcp_index;
int total_recorded_tcp;
Contributor

Suggested change
- int total_recorded_tcp;
+ int total_recorded_tcp; // can exceed the array size

Comment on lines +4268 to +4272
recorded_tcp_index++;
if (recorded_tcp_index == recorded_tcp_array_size)
{
    recorded_tcp_index = 0;
}
Contributor

Suggested change
- recorded_tcp_index++;
- if (recorded_tcp_index == recorded_tcp_array_size)
- {
-     recorded_tcp_index = 0;
- }
+ recorded_tcp_index = (recorded_tcp_index + 1) % recorded_tcp_array_size;

if (total_recorded_tcp >= recorded_tcp_array_size)
{
    int earlier_entry_size = recorded_tcp_array_size - recorded_tcp_index;
    memcpy (recorded_tcp_rearranged, (recorded_tcp + recorded_tcp_index), (earlier_entry_size * sizeof (float)));
Contributor

Can we use std::copy in this project to avoid the manual byte size computation?
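
If std::copy is acceptable in this code base (an open question in this thread), the rearrangement of a full buffer could look roughly like the sketch below. The function name rearrange_recorded, its parameters, and the array size of 4 are assumptions for illustration; only the idea of copying the older tail first and then the newer head comes from the diff.

```cpp
#include <algorithm>
#include <cassert>

constexpr int recorded_array_size = 4; // hypothetical size for the sketch

// Copies a full ring buffer into chronological order. next_index is the
// slot the next write would use, which holds the oldest entry.
int rearrange_recorded(const float* recorded, int next_index, float* rearranged)
{
    assert(next_index >= 0 && next_index < recorded_array_size);
    // Older entries: from next_index to the end of the array.
    float* out = std::copy(recorded + next_index,
                           recorded + recorded_array_size,
                           rearranged);
    // Newer entries: from the start of the array up to next_index.
    out = std::copy(recorded, recorded + next_index, out);
    return static_cast<int>(out - rearranged);
}
```

std::copy works in element counts, so the sizeof (float) multiplication disappears and the returned iterator gives the copied count directly.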

return copied_count;
}

int highest_avg_recorded_tcp (int count, float avg, float* highest_avg)
Contributor

This name is a bit confusing to me. It looks like it returns the average and count of the elements above a limit (which happens to be the average, given the name of the parameter, but it isn't relevant to this function that it's the average).
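
To illustrate the naming concern, here is a sketch of the same kind of helper with the threshold named for what it is to the function, a limit, rather than for how the caller computes it. The name count_and_avg_above, the strict > comparison, and the 0 result when nothing is above the limit are all assumptions; the body of the real function is not shown in this thread.

```cpp
#include <cassert>

// Hypothetical renamed helper: returns how many entries are above `limit`
// and reports their average through `avg_above`.
int count_and_avg_above(const float* values, int count, float limit, float* avg_above)
{
    assert(avg_above != nullptr);
    float sum = 0.0f;
    int above = 0;
    for (int i = 0; i < count; i++)
    {
        if (values[i] > limit)
        {
            sum += values[i];
            above++;
        }
    }
    *avg_above = (above > 0) ? (sum / above) : 0.0f;
    return above;
}
```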

float highest_sum = 0.0;
int highest_count = 0;

for (int i = 0; i < count; i++)
Contributor

I think this is using the count oldest elements in the buffer - should it be newest?

Contributor

note - count is the entire buffer (as returned by the rearrange method and passed back in here), so there isn't a correctness issue here

float recorded_tcp_rearranged[recorded_tcp_array_size];
float recorded_tcp[recorded_tcp_array_size];
int recorded_tcp_index;
int total_recorded_tcp;
Contributor

recorded_tcp_count to be consistent with other naming?

// each time our calculation tells us to shrink.
int dec_failure_count;
int dec_failure_recheck_threshold;

Contributor

For later - I think it would be interesting to share the increment/decrement cases to avoid some duplication. It would have to be parameterized in some way so that the behavior could be customized. Anyways, there's no requested change here right now.

float below_target_accumulation;
float below_target_threshold;

// Currently only used for dprintf.
Contributor

#ifdef this?

// Recording the gen2 GC indices so we know how far apart they are. Currently unused
// but we should consider how much value there is if they are very far apart.
size_t gc_index;
// This is (gc_elapsed_time / time inbetween this and the last gen2 GC)
Contributor

nit - "in between" or even just "between"

// at the beginning of a BGC and the PM triggered full GCs
// fall into this case.
PER_HEAP_ISOLATED_FIELD_DIAG_ONLY uint64_t suspended_start_time;
// Right now this is diag only but may be used functionally later.
Contributor

I don't think this comment really adds anything

Comment on lines +22052 to +22053
dynamic_heap_count_data.sample_index = (dynamic_heap_count_data.sample_index + 1) % dynamic_heap_count_data_t::sample_size;
(dynamic_heap_count_data.current_samples_count)++;
Contributor

It bugs me a bit that the sample and recorded tcp handling are different (one inline here, the other in helper methods), but I think that's for another day.

}
}

float avg_x = (float)sum_x / n;
Contributor

Suggested change
- float avg_x = (float)sum_x / n;
+ float avg_x = ((float)sum_x) / n;

or the static_cast<float>(sum_x) / n format, which requires parentheses

Contributor (@markples, Feb 23, 2024, edited)

also below, though I don't think those explicit casts are needed since avg_x is a float. fine to be careful though of course.

Contributor (@markples, Feb 23, 2024, edited)

also this is just (n+1) / 2.0f, though the loop is still needed for dprintf

// Change it to a desired number if you want to print.
int max_times_to_print_tcp = 0;

// Return the slope, and the average values in the avg arg.
Contributor

Is there a name for the slope that is being calculated here? I see that it's a weighted sum based on distance from the middle, but I'm not familiar with that. For example, I don't think this is the slope of a typical regression line? (which is fine, though I guess I'm a bit curious about the mathematical properties of this)
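
For comparison only, the textbook least-squares regression slope over x = 1..n is sketched below. It is also a weighted sum of the y values with weights proportional to the distance of x from the middle, since slope = Σ(x_i - avg_x)(y_i - avg_y) / Σ(x_i - avg_x)². Whether the slope in this PR matches it is exactly the open question here, so this should not be read as the PR's implementation; the name least_squares_slope is hypothetical.

```cpp
#include <cassert>

// Ordinary least-squares slope of y[0..n-1] against x = 1..n.
// Also returns the mean of y through avg_y when it is non-null.
float least_squares_slope(const float* y, int n, float* avg_y)
{
    assert(y != nullptr);
    assert(n > 1);

    float avg_x = (n + 1) / 2.0f;   // mean of 1..n, no loop needed
    float sum_y = 0.0f;
    for (int i = 0; i < n; i++)
    {
        sum_y += y[i];
    }
    float mean_y = sum_y / n;

    float num = 0.0f;
    float den = 0.0f;
    for (int i = 0; i < n; i++)
    {
        float dx = (i + 1) - avg_x; // signed distance from the middle
        num += dx * (y[i] - mean_y);
        den += dx * dx;
    }

    if (avg_y != nullptr)
    {
        *avg_y = mean_y;
    }
    return num / den;
}
```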

}

float median_throughput_cost_percent = median_of_3 (throughput_cost_percents[0], throughput_cost_percents[1], throughput_cost_percents[2]);
float avg_throughput_cost_percent = (float)((throughput_cost_percents[0] + throughput_cost_percents[1] + throughput_cost_percents[2]) / 3.0);
Contributor

nit - might be able to drop the (float) if you used 3.0f
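
The nit follows from the usual arithmetic conversions: dividing a float by the double literal 3.0 yields a double, while dividing by 3.0f keeps the expression in float. A small sketch (avg_of_3 is a hypothetical name):

```cpp
#include <type_traits>

// float / double promotes to double, so assigning the result to a float
// is a narrowing conversion that the (float) cast makes explicit.
static_assert(std::is_same<decltype((1.0f + 2.0f + 3.0f) / 3.0), double>::value,
              "dividing by a double literal yields double");

// float / float stays float, so no cast is needed.
static_assert(std::is_same<decltype((1.0f + 2.0f + 3.0f) / 3.0f), float>::value,
              "dividing by a float literal yields float");

float avg_of_3(float a, float b, float c)
{
    return (a + b + c) / 3.0f;
}
```

The two forms can differ in the last bits of the result, because one divides in double precision and rounds once at the end while the other divides in float.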

Comment on lines +25583 to +25590
if (dynamic_heap_count_data.dec_failure_count)
{
    (dynamic_heap_count_data.dec_failure_count)++;
}
else
{
    dynamic_heap_count_data.dec_failure_count = 1;
}
Contributor

I don't think this if is necessary.
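
A short sketch of why the branch is redundant: when the count is 0, incrementing it gives 1, which is what the else arm assigns. The struct heap_count_data below is a hypothetical stand-in that models only the one field involved.

```cpp
// Hypothetical stand-in for the relevant part of dynamic_heap_count_data.
struct heap_count_data
{
    int dec_failure_count = 0;
};

// The form in the diff: branch on whether the count is already non-zero.
void record_failure_branching(heap_count_data& d)
{
    if (d.dec_failure_count)
    {
        (d.dec_failure_count)++;
    }
    else
    {
        d.dec_failure_count = 1;
    }
}

// Equivalent form: 0 + 1 == 1, so a plain increment covers both arms.
void record_failure(heap_count_data& d)
{
    d.dec_failure_count++;
}
```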


if (shrink_p && step_down_int && (new_n_heaps > step_down_int))
{
// TODO - if we see that it wants to shrink by 1 heap too many times, we do want to shrink.
Contributor

also if n_heaps is small, then 1 is significant

(well, significant to the heap count, if the gc heap is a small fraction of overall memory, which it might be if the heap count is small, then the memory savings could still be insignificant)

Contributor @markples left a comment

My review is very late for the preview release. These aren't necessary right now and can be addressed in a future PR.

@Maoni0 merged commit b07134a into dotnet:main Feb 23, 2024
@github-actions (bot) locked and limited conversation to collaborators Mar 25, 2024
@sebastienros (Member) commented Apr 5, 2024 (edited)

/cc @MichalStrehovsky @eerhardt

Just for visibility, as people started asking about this: I believe this introduced a slight RPS regression in the Native AOT benchmarks, on both Windows and Linux.

[image]

And we can see an improvement in max working set.

[image]

NB: The unstable results are unrelated and were tracked in #98021


Reviewers

@mrsharm left review comments
@mangod9 approved these changes
@markples approved these changes

Assignees

@Maoni0
