Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Replace LINQ's ArrayBuilder, LargeArrayBuilder, and SparseArrayBuilder with new SegmentedArrayBuilder#96570

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
stephentoub merged 3 commits intodotnet:mainfromstephentoub:linqtoarray
Jan 8, 2024

Conversation

@stephentoub
Copy link
Member

Following up on#90459 (comment). FYI,@neuecc.

There's a lot of code involved in these, some of which we special-case to only build into some of the targets. We can get most of the benefit with a simpler solution, which this attempts to provide. This commit removes those three types and replaces them with a SegmentedArrayBuilder. It uses stack space for the first N items (currently 8, hopefully that won't be an issue if some value type T is ridiculously large), and then uses an InlineArray of T[]s to store pool-rented arrays for each new segment it needs to allocate. Calling ToArray then produces an array of the exact right size, copying in the elements from the stack-allocated span and each of the constiuent arrays, then returning everything necessary to the pool.

The changes to ToArray also obviate the need for the old Buffer type. It existed to avoid one array allocation at the end of loading an enumerable, where we'd doubled and doubled and doubled in size, and then eventually have an array with all the data but some extra space. Now that we don't have such an array, we don't need Buffer, and can just the normal Enumerable.ToArray.

The one thing the new scheme doesn't handle as well is when there are multiple sources being added (specifically in Concat / SelectMany). Previously, the code used a complicated scheme to reserve space in the output for partial sources known to be ICollections, so it could use ICollection.CopyTo for each consistuent source. But CopyTo doesn't support partial copies, which means we can only use it if there's enough room available in an individual segment. The implementation thus tries to use it, and falls back to normal enumeration if it can't. For larger enumerations where the cost would end up being more apparent, the expectation is we'd quickly grow to a segment size where the subsequent appends are able to fit into the current segment and can thus still use the ICollection.CopyTo path. Hopefully.

My main motivation here was to reduce the complexity of having all of the various builders.

MethodToolchaincountMeanRatioAllocatedAlloc Ratio
Iterator_ToArray\main\corerun.exe177.21 ns1.00144 B1.00
Iterator_ToArray\pr\corerun.exe169.04 ns0.9088 B0.61
Iterator_ToArray\main\corerun.exe5135.72 ns1.00264 B1.00
Iterator_ToArray\pr\corerun.exe5104.83 ns0.77120 B0.45
Iterator_ToArray\main\corerun.exe50583.05 ns1.001200 B1.00
Iterator_ToArray\pr\corerun.exe50530.17 ns0.91480 B0.40
usingBenchmarkDotNet.Attributes;usingBenchmarkDotNet.Running;BenchmarkSwitcher.FromAssembly(typeof(Tests<>).Assembly).Run(args);publicpartialclassTests<T>whereT:new(){privateT_value=new();[Benchmark][Arguments(1)][Arguments(5)][Arguments(50)]publicT[]Iterator_ToArray(intcount)=>GetItems(count).ToArray();privateIEnumerable<T>GetItems(intcount){for(inti=0;i<count;i++)yieldreturn_value;}}[HideColumns("Error","StdDev","Median","RatioSD","Job")][MemoryDiagnoser(false)]publicclassInt32Tests:Tests<int>{}[HideColumns("Error","StdDev","Median","RatioSD","Job")][MemoryDiagnoser(false)]publicclassObjectTests:Tests<object>{}

Hellevar and xljiulang reacted with thumbs up emojiKeterSCP, PaulusParssinen, ksharperd, KirillKornienko, PauloHMattos, and pentp reacted with hooray emojieiriktsarpalis reacted with heart emojiHellevar and am11 reacted with rocket emojifilipnavara and neon-sunset reacted with eyes emoji
@stephentoubstephentoub added this to the9.0.0 milestoneJan 5, 2024
@ghost
Copy link

Tagging subscribers to this area: @dotnet/area-system-linq
See info inarea-owners.md if you want to be subscribed.

Issue Details

Following up on#90459 (comment). FYI,@neuecc.

There's a lot of code involved in these, some of which we special-case to only build into some of the targets. We can get most of the benefit with a simpler solution, which this attempts to provide. This commit removes those three types and replaces them with a SegmentedArrayBuilder. It uses stack space for the first N items (currently 8, hopefully that won't be an issue if some value type T is ridiculously large), and then uses an InlineArray of T[]s to store pool-rented arrays for each new segment it needs to allocate. Calling ToArray then produces an array of the exact right size, copying in the elements from the stack-allocated span and each of the constiuent arrays, then returning everything necessary to the pool.

The changes to ToArray also obviate the need for the old Buffer type. It existed to avoid one array allocation at the end of loading an enumerable, where we'd doubled and doubled and doubled in size, and then eventually have an array with all the data but some extra space. Now that we don't have such an array, we don't need Buffer, and can just the normal Enumerable.ToArray.

The one thing the new scheme doesn't handle as well is when there are multiple sources being added (specifically in Concat / SelectMany). Previously, the code used a complicated scheme to reserve space in the output for partial sources known to be ICollections, so it could use ICollection.CopyTo for each consistuent source. But CopyTo doesn't support partial copies, which means we can only use it if there's enough room available in an individual segment. The implementation thus tries to use it, and falls back to normal enumeration if it can't. For larger enumerations where the cost would end up being more apparent, the expectation is we'd quickly grow to a segment size where the subsequent appends are able to fit into the current segment and can thus still use the ICollection.CopyTo path. Hopefully.

My main motivation here was to reduce the complexity of having all of the various builders.

MethodToolchaincountMeanRatioAllocatedAlloc Ratio
Iterator_ToArray\main\corerun.exe177.21 ns1.00144 B1.00
Iterator_ToArray\pr\corerun.exe169.04 ns0.9088 B0.61
Iterator_ToArray\main\corerun.exe5135.72 ns1.00264 B1.00
Iterator_ToArray\pr\corerun.exe5104.83 ns0.77120 B0.45
Iterator_ToArray\main\corerun.exe50583.05 ns1.001200 B1.00
Iterator_ToArray\pr\corerun.exe50530.17 ns0.91480 B0.40
usingBenchmarkDotNet.Attributes;usingBenchmarkDotNet.Running;BenchmarkSwitcher.FromAssembly(typeof(Tests<>).Assembly).Run(args);publicpartialclassTests<T>whereT:new(){privateT_value=new();[Benchmark][Arguments(1)][Arguments(5)][Arguments(50)]publicT[]Iterator_ToArray(intcount)=>GetItems(count).ToArray();privateIEnumerable<T>GetItems(intcount){for(inti=0;i<count;i++)yieldreturn_value;}}[HideColumns("Error","StdDev","Median","RatioSD","Job")][MemoryDiagnoser(false)]publicclassInt32Tests:Tests<int>{}[HideColumns("Error","StdDev","Median","RatioSD","Job")][MemoryDiagnoser(false)]publicclassObjectTests:Tests<object>{}
Author:stephentoub
Assignees:-
Labels:

area-System.Linq

Milestone:9.0.0

There's a lot of code involved in these, some of which we special-case to only build into some of the targets, and it's unnecessarily complicated.  We can get most of the benefit with a simpler solution, which this attempts to provide. This commit removes those three types and replaces them with a SegmentedArrayBuilder.The changes to ToArray also obviate the need for the old Buffer type. It existed to avoid one array allocation at the end of loading an enumerable, where we'd doubled and doubled and doubled in size, and then eventually have an array with all the data but some extra space. Now that we don't have such an array, we don't need Buffer, and can just the normal Enumerable.ToArray.The one thing the new scheme doesn't handle as well is when there are multiple sources being added (specifically in Concat / SelectMany). Previously, the code used a complicated scheme to reserve space in the output for partial sources known to be ICollections, so it could use ICollection.CopyTo for each consistuent source. But CopyTo doesn't support partial copies, which means we can only use it if there's enough room available in an individual segment. The implementation thus tries to use it, and falls back to normal enumeration if it can't. For larger enumerations where the cost would end up being more apparent, the expectation is we'd quickly grow to a segment size where the subsequent appends are able to fit into the current segment. Hopefully.
@eiriktsarpaliseiriktsarpalis added the tenet-performancePerformance related issue labelJan 8, 2024
Copy link
Member

@eiriktsarpaliseiriktsarpalis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Good candidate for this year's perf blog it seems? :-)

stephentoub reacted with laugh emoji
@eerhardt
Copy link
Member

@stephentoub, we are seeing a slight app size regression in the ASP.NET benchmarks that I think might be caused by this change.

The regression happened between1063fc2...b4ec422.

I got themstat files before and after, comparing them in sizoscope shows:

image

Is this expected? The size regression is about 2% (roughly 200KB of 10MB).

Here are the the mstat files I got from the benchmark server, in case someone wants to take a look.
BasicMinimalApi.zip

@eerhardteerhardt mentioned this pull requestJan 10, 2024
@stephentoub
Copy link
MemberAuthor

Is this expected?

I actually expected the opposite.

Do you know where those additional interface implementations are coming from?

The array pool ones are expected, in that we're renting arrays as part of ToArray, so we'll get ArrayPool types for T[]s being created by ToArray.

@stephentoub
Copy link
MemberAuthor

stephentoub commentedJan 10, 2024
edited
Loading

Ah, looked at your sizeoscope files... they're all because of using ArrayPool in ToArray now. ArrayPool itself has this, for example:


so for every distinctT used with the pool, you also get a distinctSharedArrayPool<T>.Partitions.Partition[] type and all of the interfaces it implements.

@stephentoub
Copy link
MemberAuthor

stephentoub commentedJan 11, 2024
edited
Loading

@MichalStrehovsky, is there any way (in corelib) to say "i promise this T[] will never be cast to any of the interfaces it implements, feel free not to keep that code around if it wouldn't otherwise be needed"? Or is there any kind of analysis we could be doing to automatically see that it's never used in that capacity?

@MichalStrehovsky
Copy link
Member

@MichalStrehovsky, is there any way (in corelib) to say "i promise this T[] will never be cast to any of the interfaces it implements, feel free not to keep that code around if it wouldn't otherwise be needed"?

We don't have that. We could build that; it sounds a bit like it would open a can of worms though.

A different can of worms (with different problems) for consideration:

staticT[]AllocateNewFrugalArray<T>(intcount)=>typeof(T).IsValueType?newT[count]:Unsafe.As<T[]>(newobject[count]);

Or is there any kind of analysis we could be doing to automatically see that it's never used in that capacity?

We already do this analysis, but interfaces on reference type arrays are difficult to get rid of due to array covariance. The app might not be usingIList<string>, but if it usesIList<object>, that could end up callingIList<string> methods onstring[]. It doesn't even matter thatIList<T> is not variant because the interface has magic behaviors on arrays.

@stephentoub
Copy link
MemberAuthor

Thanks. I should be able to just use object[].

@neuecc
Copy link

thanks for great implementation!

@github-actionsgithub-actionsbot locked and limited conversation to collaboratorsFeb 11, 2024
Sign up for freeto subscribe to this conversation on GitHub. Already have an account?Sign in.

Reviewers

@eiriktsarpaliseiriktsarpaliseiriktsarpalis approved these changes

Assignees

@stephentoubstephentoub

Labels

area-System.Linqtenet-performancePerformance related issue

Projects

None yet

Milestone

9.0.0

Development

Successfully merging this pull request may close these issues.

5 participants

@stephentoub@eerhardt@MichalStrehovsky@neuecc@eiriktsarpalis

[8]ページ先頭

©2009-2025 Movatter.jp