Foreword 🍓
.NET provides several types, some of them under the System.Runtime.Intrinsics namespace, that let our code use SIMD hardware instructions to process several data elements in parallel.
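    using System.Runtime.Intrinsics;

    // The element type is given as a type argument (int is used here as an example).
    Vector512<int> v512;
    Vector256<int> v256;
    Vector128<int> v128;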
The numeric suffix (512, 256, 128) indicates the size in bits of the vector that the hardware can process in a single operation.
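To make the size suffix concrete: the number of elements (lanes) in a vector is the vector width divided by the element size. The following is a minimal sketch; the variable names are illustrative and not part of the original article.

```csharp
using System;
using System.Runtime.Intrinsics;

// Count is the number of elements of T that fit in one vector:
// vector width in bits / element size in bits.
int intsIn128 = Vector128<int>.Count;        // 128 / 32
int intsIn256 = Vector256<int>.Count;        // 256 / 32
int intsIn512 = Vector512<int>.Count;        // 512 / 32 (Vector512 needs .NET 8 or later)
int bytesIn256 = Vector256<byte>.Count;      // 256 / 8
int doublesIn256 = Vector256<double>.Count;  // 256 / 64

Console.WriteLine($"int lanes: {intsIn128}, {intsIn256}, {intsIn512}");
Console.WriteLine($"byte lanes in 256 bits: {bytesIn256}, double lanes in 256 bits: {doublesIn256}");
```

These counts depend only on the vector type and the element type, not on the machine, so they can be used to size loops even before checking for hardware support.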
This speeds up operations that compute aggregates, especially loops over large arrays.
To find out whether the hardware supports this kind of register, we can check the static read-only property IsHardwareAccelerated:
    if (Vector256.IsHardwareAccelerated)
    {
        _is256 = true;
        ...
    }
The code above tests whether 256-bit vector operations are hardware accelerated on the current machine, that is, whether the JIT compiles them to SIMD instructions (JIT intrinsics).
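The same property exists on Vector128 and Vector512, so one possible way to choose a code path is to probe from the widest size down. This is only a sketch: the helper name WidestAcceleratedBits is hypothetical and not from the original code.

```csharp
using System;
using System.Runtime.Intrinsics;

// Hypothetical helper: returns the widest vector size (in bits) that the runtime
// reports as hardware accelerated, or 0 when only scalar code should be used.
static int WidestAcceleratedBits()
{
    if (Vector512.IsHardwareAccelerated) return 512; // Vector512 needs .NET 8 or later
    if (Vector256.IsHardwareAccelerated) return 256;
    if (Vector128.IsHardwareAccelerated) return 128;
    return 0;
}

int bits = WidestAcceleratedBits();
Console.WriteLine(bits == 0
    ? "No SIMD acceleration reported"
    : $"Widest accelerated vector: {bits} bits");
```

The printed value depends on the CPU and on the runtime configuration, so it will differ between machines.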
Exploring 🧠
Suppose we want to calculate the maximum and minimum of a sequence of integers in a single pass using Vector256.
The process consists of a loop that moves forward in 256-bit chunks, updating the running maximum and minimum on each step.
    (T Min, T Max) MinMax256<T>(ReadOnlySpan<T> source)
        where T : struct, INumber<T>
    {
    }
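First we initialize references to the current element, to the end of the span (last, which points one position past the final element), and to the last position from which a whole vector can still be loaded (the to variable). We also load the first 256-bit chunk as the initial minimum and maximum, which assumes that source holds at least Vector256<T>.Count elements.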
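    // Start of the data, one past its end, and the last position where a full vector fits.
    ref T current = ref MemoryMarshal.GetReference(source);
    ref T last = ref Unsafe.Add(ref current, source.Length);
    ref T to = ref Unsafe.Add(ref last, -Vector256<T>.Count);

    // Seed both accumulators with the first chunk.
    Vector256<T> minElement = Vector256.LoadUnsafe(ref current);
    Vector256<T> maxElement = minElement;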
Then we start the loop. Inside it, we load the data in 256-bit chunks by calling Vector256.LoadUnsafe:
    while (Unsafe.IsAddressLessThan(ref current, ref to))
    {
        Vector256<T> tempElement = Vector256.LoadUnsafe(ref current);
        minElement = Vector256.Min(minElement, tempElement);
        maxElement = Vector256.Max(maxElement, tempElement);
        current = ref Unsafe.Add(ref current, Vector256<T>.Count);
    }
We use the static Min and Max methods of Vector256, which compare two vectors lane by lane, and store the results in minElement and maxElement.
Finally, we advance the position reference (current) by Vector256<T>.Count elements, that is, by 256 bits.
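To see how many times this loop runs and how much is left for the scalar code, the same bounds can be replayed with plain integer offsets. This is a sketch for illustration only; SplitWork is a hypothetical helper that mirrors the loop condition and step shown above.

```csharp
using System;
using System.Runtime.Intrinsics;

// Hypothetical helper: replays the loop bounds with integer offsets instead of refs.
static (int VectorIterations, int TailElements) SplitWork(int length, int lanes)
{
    int offset = 0;            // plays the role of `current`
    int to = length - lanes;   // plays the role of `to`
    int iterations = 0;
    while (offset < to)        // same strict comparison as Unsafe.IsAddressLessThan
    {
        iterations++;
        offset += lanes;       // same step as Unsafe.Add(ref current, Vector256<T>.Count)
    }
    return (iterations, length - offset);
}

int lanesPerVector = Vector256<int>.Count; // 8 ints per 256-bit vector
var exact = SplitWork(10_000, lanesPerVector);
var ragged = SplitWork(10_003, lanesPerVector);
Console.WriteLine($"10_000 ints: {exact.VectorIterations} vector iterations, {exact.TailElements} tail elements");
Console.WriteLine($"10_003 ints: {ragged.VectorIterations} vector iterations, {ragged.TailElements} tail elements");
```

For 10_000 ints this gives 1249 vector iterations and 8 tail elements: because the comparison is strict, a final chunk that fits exactly is left to the scalar loop. For 10_003 ints it gives 1250 vector iterations and 3 tail elements.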
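Once we have passed the to position, the loop ends. At this point minElement and maxElement hold one partial result per lane, so we reduce each vector to a single value by comparing its elements one by one: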
    T min = minElement[0];
    T max = maxElement[0];
    for (int i = 1; i < Vector256<T>.Count; i++)
    {
        T tempMin = minElement[i];
        if (tempMin < min)
        {
            min = tempMin;
        }
        T tempMax = maxElement[i];
        if (tempMax > max)
        {
            max = tempMax;
        }
    }
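The lane-by-lane reduction can be tried in isolation on a vector with known contents. In this sketch a single hand-built vector stands in for both minElement and maxElement, which is a simplification made for the example.

```csharp
using System;
using System.Runtime.Intrinsics;

// Eight known values, one per 32-bit lane of a 256-bit vector.
Vector256<int> lanes = Vector256.Create(5, 3, 9, 1, 7, 2, 8, 4);

int min = lanes[0];
int max = lanes[0];
for (int i = 1; i < Vector256<int>.Count; i++)
{
    int value = lanes[i]; // the indexer reads a single lane
    if (value < min) { min = value; }
    if (value > max) { max = value; }
}

Console.WriteLine($"min = {min}, max = {max}"); // prints "min = 1, max = 9"
```

This step runs once per call rather than once per chunk, so a simple scalar loop over the lanes is usually good enough here.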
After that we calculate the remaining elements if any:
    while (Unsafe.IsAddressLessThan(ref current, ref last))
    {
        if (current < min)
        {
            min = current;
        }
        if (current > max)
        {
            max = current;
        }
        current = ref Unsafe.Add(ref current, 1);
    }
And that's all, we return the results:
    return (min, max);
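Putting the fragments together gives something like the following sketch. It follows the steps described above, with three additions that are assumptions of this example and not part of the original walkthrough: an exception for empty input, a fallback to the scalar loop when the span is shorter than one vector, and a check of Vector256.IsHardwareAccelerated and Vector256<T>.IsSupported before taking the vector path.

```csharp
using System;
using System.Numerics;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;

static (T Min, T Max) MinMax256<T>(ReadOnlySpan<T> source) where T : struct, INumber<T>
{
    // Added guard (assumption of this sketch): an empty span has no minimum or maximum.
    if (source.IsEmpty)
    {
        throw new ArgumentException("The span is empty.", nameof(source));
    }

    ref T current = ref MemoryMarshal.GetReference(source);
    ref T last = ref Unsafe.Add(ref current, source.Length); // one past the final element
    T min = current;
    T max = current;

    // Added guard (assumption of this sketch): use vectors only when they are
    // accelerated, T is a supported element type, and at least one full vector fits.
    if (Vector256.IsHardwareAccelerated && Vector256<T>.IsSupported
        && source.Length >= Vector256<T>.Count)
    {
        ref T to = ref Unsafe.Add(ref last, -Vector256<T>.Count);
        Vector256<T> minElement = Vector256.LoadUnsafe(ref current);
        Vector256<T> maxElement = minElement;

        while (Unsafe.IsAddressLessThan(ref current, ref to))
        {
            Vector256<T> tempElement = Vector256.LoadUnsafe(ref current);
            minElement = Vector256.Min(minElement, tempElement);
            maxElement = Vector256.Max(maxElement, tempElement);
            current = ref Unsafe.Add(ref current, Vector256<T>.Count);
        }

        // Reduce the per-lane results to scalars.
        for (int i = 0; i < Vector256<T>.Count; i++)
        {
            if (minElement[i] < min) { min = minElement[i]; }
            if (maxElement[i] > max) { max = maxElement[i]; }
        }
    }

    // Scalar tail, or the whole span when the vector path was skipped.
    while (Unsafe.IsAddressLessThan(ref current, ref last))
    {
        if (current < min) { min = current; }
        if (current > max) { max = current; }
        current = ref Unsafe.Add(ref current, 1);
    }

    return (min, max);
}

// Usage: pseudo-random data whose length is not a multiple of the lane count.
var random = new Random(42);
int[] data = new int[10_003];
for (int i = 0; i < data.Length; i++)
{
    data[i] = random.Next(-1_000_000, 1_000_000);
}

var result = MinMax256<int>(data);
Console.WriteLine($"min = {result.Min}, max = {result.Max}");
```

The type argument is written explicitly in the call so that the int[] converts to ReadOnlySpan<int>. Comparing the result against a plain scalar scan of the same array is a cheap way to check the vector path before benchmarking it.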
Benchmark 🔥
A quick test with BenchmarkDotNet, calculating the maximum and minimum of an array of 10_000 integers, shows the Vector256 version on .NET 8 running about 146 times faster than the LINQ version on .NET Framework 4.8.
💡 Ryzen 7 1700, 1 CPU
.NET SDK=8.0.100-rc.1.23455.8
Method | Mean (ns) |
---|---|
🐢 MinMaxLinq .NET Framework 4.8 | 118,675.226 |
⚡ MinMaxSimd .NET 8.0 | 808.150 |
Farewell
All the code, with a more elaborate example, is hosted on GitHub. Be happy and love your family 💖
NetDefender / SimdIteration
SIMD tests
References
System.Runtime.Intrinsics Namespace | Microsoft Learn
Contains types used to create and convey register states in various sizes and formats for use with instruction set extensions. For instructions on how to manipulate these registers, see System.Runtime.Intrinsics.X86 and System.Runtime.Intrinsics.Arm.
System.Runtime.Intrinsics work planned for .NET 8 #79005
This is a work in progress as we develop our .NET 8 plans. This list is expected to change throughout the release cycle according to ongoing planning and discussions, with possible additions and subtractions to the scope.
Summary
During .NET 8, we will be focusing on AVX-512, an effort that includes the addition of a new intrinsic type Vector512 as well as Vector<T> improvements. Beyond that major theme, we will invest in quality, enhancements and new APIs. This is an ambitious set of work, so it's likely that several of the items below will be pushed out beyond .NET 8. It is also likely additional items will be added throughout the year.
Planned for .NET 8
AVX-512
- [ ] https://github.com/dotnet/runtime/issues/63331
- [x] https://github.com/dotnet/runtime/issues/73262
- [ ] https://github.com/dotnet/runtime/issues/73604
- [x] https://github.com/dotnet/runtime/issues/74613
- [x] https://github.com/dotnet/runtime/issues/74813
- [ ] https://github.com/dotnet/runtime/issues/76244
- [ ] https://github.com/dotnet/runtime/issues/76579
- [x] https://github.com/dotnet/runtime/issues/76593
Quality
Enhancements / New APIs
Hardware Intrinsics in .NET Core - .NET Blog
