- Notifications
You must be signed in to change notification settings - Fork4.2k
Use runtime helperCreateSpan forstackalloc of non-byte arrays#71261
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
CreateSpan forstackalloc of non-byte arrays intoReadOnlySpanCreateSpan forstackalloc of non-byte arrays9dd1ba3 to5261199Compare5261199 tob452022CompareUh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
src/Compilers/CSharp/Portable/CodeGen/EmitStackAllocInitializer.cs OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
src/Compilers/CSharp/Portable/CodeGen/EmitStackAllocInitializer.cs OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
src/Compilers/CSharp/Portable/CodeGen/EmitStackAllocInitializer.cs OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
src/Compilers/CSharp/Portable/Lowering/LocalRewriter/LocalRewriter_Conversion.cs OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
src/Compilers/CSharp/Portable/Lowering/LocalRewriter/LocalRewriter_Conversion.cs OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
src/Compilers/CSharp/Portable/Lowering/LocalRewriter/LocalRewriter_Conversion.cs OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
src/Compilers/CSharp/Test/Emit/CodeGen/CodeGenStackAllocInitializerTests.cs OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
src/Compilers/CSharp/Test/Emit/CodeGen/CodeGenStackAllocInitializerTests.csShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
src/Compilers/CSharp/Portable/CodeGen/EmitStackAllocInitializer.cs OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
src/Compilers/CSharp/Portable/CodeGen/EmitStackAllocInitializer.cs OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
src/Compilers/CSharp/Portable/CodeGen/EmitStackAllocInitializer.cs OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
src/Compilers/CSharp/Portable/CodeGen/EmitStackAllocInitializer.cs OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
src/Compilers/CSharp/Portable/CodeGen/EmitStackAllocInitializer.cs OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
| _builder.EmitOpCode(ILOpCode.Call,0); | ||
| EmitSymbolToken(getPinnableReference,syntaxNode,optArgList:null); | ||
| _builder.EmitIntConstant(data.Length); | ||
| _builder.EmitOpCode(ILOpCode.Cpblk,-3); |
AlekseyTsDec 15, 2023 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
We emitcpblk in a very similar way as above (underif (elementType.EnumUnderlyingTypeOrSelf().SpecialType.SizeInBytes() == 1)), so I assumed the alignment is fine.
ECMA-335 also says:
All such structures, allocated by the CLI, are naturally aligned for the current platform
and the field has a structure type, so I think it should be aligned properly.
AlekseyTsDec 18, 2023 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
and the field has a structure type, so I think it should be aligned properly.
A blob of bytes is not an instance of a structure type. We are not copying data from a regular field. We are copying data from a blob stored in a module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I thought we are emitting a special struct (e.g.,__StaticArrayInitTypeSize=12) and a field in<PrivateImplementationDetails> which is of the type of that struct, initialized with a blob of data, and we are using that field - so that field should be naturally aligned as it's of the struct type. PlusCreateSpan can return pointer to a different blob of data with fixed-up endianness, but it should be also naturally aligned as it allocates (and caches) an array in that case.
AlekseyTsDec 20, 2023 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Here is what I think:
- We are not working with a regular field. We are working with a field which is mapped into a region of memory inside module's image, so-called an RVA field.
- Alignment of that region cannot change at runtime, it is the alignment that was used during emit.
- Some operations have certain alignment requirements. For example, we already established that
CreateSpanAPI has them. - A comment for
PrivateImplementationDetails.CreateDataFieldexplains how compiler enforces the right alignment for an RVA field. Here is a verbatim quote: "a type is generated of the same size as
the data, and that type needs its .pack set to the alignment required for the underlying data. While that
.pack value isn't required by anything else in the compiler (the compiler always aligns RVA fields at 8-byte
boundaries, which accomodates any element type that's relevant), it is necessary for IL rewriters. Such rewriters
also need to ensure an appropriate alignment is maintained for the RVA field, and while they could also simplify
by choosing a worst-case alignment as does the compiler, they may instead use the .pack value as the alignment
to use for that field, since it's an opaque blob with no other indication as to what kind of data is
stored and what alignment might be required." - It looks like we already addressed the
CreateSpanrequired alignment, which is thenatural boundary of the span's element type. - The span produced by
CreateSpanpoints into the same blob when the data are not moved or copied anywhere else. Therefore, still have the original alignment. - By comparison to
CreateSpan,cpblkcommand has a different alignment requirement: "aligned tothe natural size of the machine". Does the RVA meet this requirement? The spec doesn't provide any specific number. As of commit 10, we are using different alignment for different element types. It is quite possible that some of the numbers aren't matching "the natural size of the machine". Perhaps we should always use 8-byte alignment. It would be good to get some guidance from the runtime team here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
@stephentoub can you please advise how we ensure we meetcpblk's alignment requirements here? Thanks.
Btw, looks like this code creates an RVA field with alignment 1 and uses it withcpblk, which, if correct, would suggestcpblk doesn't need any special alignment of the field:
| elseif(elementType.EnumUnderlyingTypeOrSelf().SpecialType.SizeInBytes()==1) | |
| { | |
| // Initialize the stackalloc by copying the data from a metadata blob | |
| varfield=_builder.module.GetFieldForData(data,alignment:1,inits.Syntax,_diagnostics.DiagnosticBag); | |
| _builder.EmitOpCode(ILOpCode.Dup); | |
| _builder.EmitOpCode(ILOpCode.Ldsflda); | |
| _builder.EmitToken(field,inits.Syntax,_diagnostics.DiagnosticBag); | |
| _builder.EmitIntConstant(data.Length); | |
| _builder.EmitOpCode(ILOpCode.Cpblk,-3); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
You can prefix the cpblk instruction with unaligned in order to avoid the alignment requirement, which IIRC wants the pointers to be aligned to the machine's natural size. It's possible the cited existing code should be using unaligned as well, or maybe the requirement is less about the machine's natural word size and more about the size of the type in use (or maybe this is a place where ECMA and coreclr differ).@jkotas?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
If you would like to be compliant with ECMA, emit theunaligned prefix. It is not required to make things work on current runtimes.cpblk implementations in current runtimes do not make any assumptions about the alignment of the inputs.
src/Compilers/CSharp/Portable/CodeGen/EmitStackAllocInitializer.cs OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
AlekseyTs commentedDec 15, 2023
Done with review pass (commit 1) |
src/Compilers/CSharp/Portable/Lowering/LocalRewriter/LocalRewriter_Conversion.cs OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
src/Compilers/CSharp/Test/Emit/CodeGen/CodeGenStackAllocInitializerTests.csShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
src/Compilers/CSharp/Test/Emit/CodeGen/CodeGenStackAllocInitializerTests.cs OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
| EmitSymbolToken(spanGetItem,syntaxNode,optArgList:null); | ||
| _builder.EmitIntConstant(data.Length); | ||
| if(sizeInBytes!=8) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Looking at theunaligned. section of ECMA-335
shall use unaligned. if the alignment is not known at compile time to be 8-byte.
From that, I infer that, if we align the RVA field type at 8 bytes, we will be compliant without theunaligned prefix. Should we use that option instead, rather than having conditional IL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Are there any downsides to conditional IL? On the other hand, it seems that aligning properly to 1/2/4/8 bytes instead of always to 8 bytes saves space.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Are there any downsides to conditional IL?
IL is bigger. Compiler's code is more complex. Test matrix is bigger, more scenarios need special attention.
it seems that aligning properly to 1/2/4/8 bytes instead of always to 8 bytes saves space.
I do not think it actually does. At least not for the code emitted by our compilers because, I think, compiler aligns all blobs at 8 bytes regardless of the requested alignment, this is mentioned in one of the comments in code. The alignment that we explicitly set on the types is important for IL rewriting scenarios.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
The alignment that we explicitly set on the types is important for IL rewriting scenarios.
I think this means it improves size for AOT and similar precompiled code (right?) which seems important as well.
AlekseyTsJan 10, 2024 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I think this means it improves size for AOT and similar precompiled code (right?) which seems important as well.
To be honest, I do not know the answer to these questions, And, even if that was the case, I am not sure if that is something that we should care about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I do not think it makes sense to deoptimize the 1-byte case like that.
To make the two cases as similar as possible, you can emitunaligned.1 cpblk in both cases and emit the blob withelementType.EnumUnderlyingTypeOrSelf().SpecialType.SizeInBytes() alignment in both cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
did you want that to be aligned to 8 bytes as well?
Yes, this is what I would do. I would align both at 8 bytes explicitly (we are aligning blobs at 8 bytes regardless), and would completely get rid of the.unaligned instruction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I would align both at 8 bytes explicitly (we are aligning blobs at 8 bytes regardless), and would completely get rid of the .unaligned instruction.
It is size de-optimization for trimmer and AOT that are scenarios where we care about the side the most.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
It is size de-optimization for trimmer and AOT that are scenarios where we care about the side the most.
Perhaps I misinterpreted you first message on this discussion thread.
If the blob size optimization is important for AOT, then, I think we should keep the.unaligned usage on both code paths. We can get there by reverting the last commit. Sorry for the confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Perhaps I misinterpreted you first message on this discussion thread.
I was commenting on why we set the alignment and why it would not work to omit it on the types. Sorry for the confusion.
src/Compilers/CSharp/Test/Emit/CodeGen/CodeGenStackAllocInitializerTests.csShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
AlekseyTs left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
LGTM (commit 14)
jkotas commentedJan 10, 2024 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
Is the JIT able to optimize the new pattern well? For example, what is performance of |
jjonescz commentedJan 11, 2024
Yes. Here's a benchmark:https://github.com/jjonescz/RoslynIssue69325
[Benchmark]publicvoidRun_3()=>Use(stackallocint[]{1,2,3});[Benchmark]publicvoidRun_1000()=>Use(stackallocint[]{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255,256,257,258,259,260,261,262,263,264,265,266,267,268,269,270,271,272,273,274,275,276,277,278,279,280,281,282,283,284,285,286,287,288,289,290,291,292,293,294,295,296,297,298,299,300,301,302,303,304,305,306,307,308,309,310,311,312,313,314,315,316,317,318,319,320,321,322,323,324,325,326,327,328,329,330,331,332,333,334,335,336,337,338,339,340,341,342,343,344,345,346,347,348,349,350,351,352,353,354,355,356,357,358,359,360,361,362,363,364,365,366,367,368,369,370,371,372,373,374,375,376,377,378,379,380,381,382,383,384,385,386,387,388,389,390,391,392,393,394,395,396,397,398,399,400,401,402,403,404,405,406,407,408,409,410,411,412,413,414,415,416,417,418,419,420,421,422,423,424,425,426,427,428,429,430,431,432,433,434,435,436,437,438,439,440,441,442,443,444,445,446,447,448,449,450,451,452,453,454,455,456,457,458,459,460,461,462,463,464,465,466,467,468,469,470,471,472,473,474,475,476,477,478,479,480,481,482,483,484,485,486,487,488,489,490,491,492,493,494,495,496,497,498,499,500,501,502,503,504,505,506,507,508,509,510,511,512,513,514,515,516,517,518,519,520,521,522,523,524,525,526,527,528,529,530,531,532,533,534,535,536,537,538,539,540,541,542,543,544,545,546,547,548,549,550,551,552,553,554,555,556,557,558,559,560,561,562,563,564,565,566,567,568,569,570,571,572,573,574,575,576,577,578,579,580,581,582,583,584,585,586,587,588,589,590,591,592,593,594,595,596,597,598,599,600,601,602,603,604,605,606,607,608,609,610,611,612,613,614,615,616,617,618,619,620,621,622,623,624,625,626,627,628,629,630,631,632,633,634,635,636,637,638,639,640,641,642,643,644,645,646,647,648,649,650,651,652,653,654,655,656,657,658,659,660,661,662,663,664,665,666,667,668,669,670,671,672,673,674,675,676,677,678,679,680,681,682,683,684,685,686,687,688,689,690,691,692,693,694,695,696,697,698,699,700,701,702,703,704,705,706,707,708,709,710,711,712,713,714,715,716,717,718,719,720,721,722,723,724,725,726,727,728,729,730,731,732,733,734,735,736,737,738,739,740,741,742,743,744,745,746,747,748,749,750,751,752,753,754,755,756,757,758,759,760,761,762,763,764,765,766,767,768,769,770,771,772,773,774,775,776,777,778,779,780,781,782,783,784,785,786,787,788,789,790,791,792,793,794,795,796,797,798,799,800,801,802,803,804,805,806,807,808,809,810,811,812,813,814,815,816,817,818,819,820,821,822,823,824,825,826,827,828,829,830,831,832,833,834,835,836,837,838,839,840,841,842,843,844,845,846,847,848,849,850,851,852,853,854,855,856,857,858,859,860,861,862,863,864,865,866,867,868,869,870,871,872,873,874,875,876,877,878,879,880,881,882,883,884,885,886,887,888,889,890,891,892,893,894,895,896,897,898,899,900,901,902,903,904,905,906,907,908,909,910,911,912,913,914,915,916,917,918,919,920,921,922,923,924,925,926,927,928,929,930,931,932,933,934,935,936,937,938,939,940,941,942,943,944,945,946,947,948,949,950,951,952,953,954,955,956,957,958,959,960,961,962,963,964,965,966,967,968,969,970,971,972,973,974,975,976,977,978,979,980,981,982,983,984,985,986,987,988,989,990,991,992,993,994,995,996,997,998,999,1000});[MethodImpl(MethodImplOptions.NoInlining)]staticvoidUse(Span<int>span){} |
Uh oh!
There was an error while loading.Please reload this page.
Resolves#69325.
Based on#57123 by@svick.
RuntimeHelpers.CreateSpanon constant data forstackallocof non-bytearrays.stackallocisReadOnlySpan, result of the helper is simply used. Otherwise, the data are copied to stack viaReadOnlySpan.GetPinnableReferenceandcpblkinstruction.Benchmark (source)