- Notifications
You must be signed in to change notification settings - Fork5.2k
Use multi-reg load/store for EncodeToUtf8#95513
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
SwapnilGaikwad commentedDec 1, 2023
Hi@kunalspathak , this is an initial version of encode to UTF8 using multi-register load/stores. This currently fails some asserts in LSRA phase while doing the crossgen for SPC. It fails while emitting |
src/libraries/System.Private.CoreLib/src/System/Buffers/Text/Base64Encoder.cs OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
danmoseley commentedDec 1, 2023
Does this need an entry in their part notices file? |
Change-Id: Ie56b1786cdf8ac8d2067c0ba1fdfd3924dd9ca13
SwapnilGaikwad commentedDec 12, 2023
Sorry@danmoseley , I didn't understand which part notices file you're referring to. |
SwapnilGaikwad commentedDec 12, 2023
Initial benchmarking on N1 system show some good performance results. |
| } | ||
| end=srcMax-16; | ||
| if((Ssse3.IsSupported||AdvSimd.Arm64.IsSupported)&&BitConverter.IsLittleEndian&&(end>=src)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I'm not sure if we should remove the AdvSimd check here.
With this PR, it will use the vector128 version if the buffer length is<48 && >16.
That's probably the best option for speed, but results in a bigger library.
kunalspathak commentedDec 12, 2023
@SwapnilGaikwad - do you mind sharing the disassembly? |
SwapnilGaikwad commentedDec 13, 2023 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
A separately compiled Full assembly |
kunalspathak commentedDec 13, 2023
Thanks@SwapnilGaikwad for sharing the disassembly. I wanted to see the code quality when consecutive registers are involved. |
| Vector128<byte>res4; | ||
| Vector128<byte>tbl_enc1=Vector128.Create("ABCDEFGHIJKLMNOP"u8).AsByte(); | ||
| Vector128<byte>tbl_enc2=Vector128.Create("QRSTUVWXYZabcdef"u8).AsByte(); | ||
| Vector128<byte>tbl_enc3=Vector128.Create("ghijklmnopqrstuv"u8).AsByte(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
We can load this encoding table fromEncodingMap fromLine #774. This could help to reduce the code size but loading from memory/cache would be slightly slower than the regs. Benchmarks didn't show signficant difference.from-mem-artifacts load data fromEncodingMap.
| Method | Toolchain | NumberOfBytes | Mean | Error | StdDev | Median | Min | Max | Ratio | MannWhitney(2%) ||-------------------------------- |-------------------------------------------------------------------------------------- |-------------- |---------:|--------:|--------:|---------:|---------:|---------:|------:|---------------- || Base64Encode | /main/artifacts/tests/coreclr/linux.arm64.Release/Tests/Core_Root/corerun | 1000 | 363.8 ns | 0.14 ns | 0.12 ns | 363.8 ns | 363.5 ns | 363.9 ns | 1.00 | Base || Base64Encode | /runtime/artifacts/tests/coreclr/linux.arm64.Release/Tests/Core_Root/corerun | 1000 | 196.8 ns | 0.02 ns | 0.02 ns | 196.8 ns | 196.8 ns | 196.9 ns | 0.54 | Faster || Base64Encode | /runtime/from-mem-artifacts/tests/coreclr/linux.arm64.Release/Tests/Core_Root/corerun | 1000 | 195.3 ns | 0.08 ns | 0.07 ns | 195.3 ns | 195.2 ns | 195.5 ns | 0.54 | Faster || | | | | | | | | | | || Base64EncodeDestinationTooSmall | /main/artifacts/tests/coreclr/linux.arm64.Release/Tests/Core_Root/corerun | 1000 | 368.2 ns | 0.30 ns | 0.28 ns | 368.2 ns | 367.8 ns | 368.7 ns | 1.00 | Base || Base64EncodeDestinationTooSmall | /runtime/artifacts/tests/coreclr/linux.arm64.Release/Tests/Core_Root/corerun | 1000 | 191.5 ns | 0.04 ns | 0.04 ns | 191.5 ns | 191.5 ns | 191.6 ns | 0.52 | Faster || Base64EncodeDestinationTooSmall | /runtime/from-mem-artifacts/tests/coreclr/linux.arm64.Release/Tests/Core_Root/corerun | 1000 | 194.0 ns | 0.22 ns | 0.20 ns | 193.9 ns | 193.7 ns | 194.4 ns | 0.53 | Faster |Would you suggest to read encoding table fromEncodingMap?
kunalspathak commentedDec 14, 2023
/azp run runtime-coreclr libraries-jitstress |
kunalspathak commentedDec 14, 2023
/azp run runtime-coreclr libraries-jitstress2-jitstressregs |
| Azure Pipelines successfully started running 1 pipeline(s). |
1 similar comment
| Azure Pipelines successfully started running 1 pipeline(s). |
kunalspathak commentedDec 15, 2023
/azp run runtime-coreclr libraries-jitstressregs |
| Azure Pipelines successfully started running 1 pipeline(s). |
kunalspathak commentedDec 15, 2023
I guess he is talking about the reference made inhttps://github.com/dotnet/runtime/pull/95513/files#diff-b3b9edcf4c0d62e78954d826c44005cffb306b6ccf155f1a9228669229b7e765R496, but not sure where exactly to add this.@danmoseley - can you please confirm? |
ghost commentedDec 31, 2023
Tagging subscribers to this area: @dotnet/area-system-buffers Issue DetailsThis implements the encode to UTF8 algorithmhere.
|
kunalspathak commentedJan 5, 2024
ping@danmoseley |
danmoseley commentedJan 5, 2024
Oops, yes, that was what caught my eye. Generally if we use significant ideas/code from elsewhere we add a credit in THIRD-PARTY-NOTICES.TXT at the root. Up to you. |
a74nh commentedJan 9, 2024
Looks like there is already an entry in there, but it's not immediately obvious. runtime/THIRD-PARTY-NOTICES.TXT Line 345 in57bcccd
Contains an extra copy of the license fromhttps://github.com/aklomp/base64/blob/master/LICENSE |
kunalspathak left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
LGTM
| [CompExactlyDependsOn(typeof(AdvSimd.Arm64))] | ||
| privatestaticunsafevoidAdvSimdEncode(refbyte*srcBytes,refbyte*destBytes,byte*srcEnd,intsourceLength,intdestLength,byte*srcStart,byte*destStart) | ||
| { | ||
| // C# implementatino of https://github.com/aklomp/base64/blob/3a5add8652076612a8407627a42c768736a4263f/lib/arch/neon64/enc_loop.c |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
nit
| // C#implementatino of https://github.com/aklomp/base64/blob/3a5add8652076612a8407627a42c768736a4263f/lib/arch/neon64/enc_loop.c | |
| // C#implementation of https://github.com/aklomp/base64/blob/3a5add8652076612a8407627a42c768736a4263f/lib/arch/neon64/enc_loop.c |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
you can do this in a follow-up PR
This reverts commitfdb03ca.
* Use multi-reg load/store for EncodeToUtf8* Use the fixed version of multi-reg store* Update variable naming
…#96944)* Revert "[libs] Skip AdvSimdEncode on Mono (dotnet#96829)"This reverts commit1a76e37.* Revert "Use multi-reg load/store for EncodeToUtf8 (dotnet#95513)"This reverts commitfdb03ca.* Wrap load/store vector APIs in '#if false'* Disable load/store vector tests* remove the trailing space
richlander commentedMay 7, 2024
I assume this is the API being discussed. If so, it would be good to put in the initial comment so that it is easy for folks to fine. https://learn.microsoft.com/dotnet/api/system.buffers.text.base64.encodetoutf8 |
This implements the encode to UTF8 algorithmhere.