- Notifications
You must be signed in to change notification settings - Fork5.2k
Change Vector2/3/4, Quaternion, Plane, Vector<T>, and Vector64/128/256/512<T> to be implemented in managed where trivially possible#102301
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
Tagging subscribers to this area:@JulieLeeMSFT,@jakobbotsch |
…ented in managed where trivially possible
This comment was marked as outdated.
This comment was marked as outdated.
tannergooding commentedJun 3, 2024
tannergooding commentedJun 3, 2024
Things are already looking much better than before.
There is an assert around |
…PIs as intrinsic for the profitability boost
| /// <returns><paramref name="value" /> reinterpreted as a new <see cref="Plane" />.</returns> | ||
| internalstaticPlaneAsPlane(thisVector4value) | ||
| { | ||
| #ifMONO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Is this perf problem or functionality problem?
Either way, it is likely going to be fixed as side-effect of#102988 .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Possibly a functionality problem that I've not dug into yet. The previous tested commit had seen some failures related to Quaternion/Vector4/Plane and their equality tests, but only on Mono.
My current guess is that there's something subtly incorrect in the Mono handling that breaks for SIMD here and it will need a fix before BitCast can be used, but I'd like to try and ensure CoreCLR is clean without regressions before I do any more in depth Mono changes.
tannergooding commentedJun 5, 2024
tannergooding commentedJun 5, 2024 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
One regression is in - vmovups xmm0, xmmword ptr [rdi]+ vmovss xmm0, dword ptr [rdi]+ vinsertps xmm0, xmm0, dword ptr [rdi+0x04], 16+ vinsertps xmm0, xmm0, dword ptr [rdi+0x08], 32+ vinsertps xmm0, xmm0, dword ptr [rdi+0x0C], 48 This one is because while we have There's a small regression in G_M1352_IG02: mov rax, rsi mov ecx, edx- vmovups xmm0, xmmword ptr [rax] cmp ecx, 128 jge SHORT G_M1352_IG04- ;; size=17 bbWeight=1 PerfScore 5.75+;; size=13 bbWeight=1 PerfScore 1.75 G_M1352_IG03:- vmovd xmm1, rdi- vpxor xmm0, xmm1, xmm0+ vmovd xmm0, rdi+ vpxor xmm0, xmm0, xmmword ptr [rax] jmp G_M1352_IG07 align [0 bytes for IG05]- ;; size=13 bbWeight=0.50 PerfScore 2.17+;; size=13 bbWeight=0.50 PerfScore 3.50 G_M1352_IG04:+ vmovups xmm0, xmmword ptr [rsi] vmovups xmm1, xmmword ptr [rsi+0x10] Nothing major, just a place where we don't share a load anymore. In general the JIT ends up needing to create many additional Most of the corelib regressions are in the xplat APIs that don't have any accelerated implementation today, like All in all, I think we are in a position where taking this is feasible and it doesn't look to regress any meaningful scenarios, only the already unaccelerated edge cases. -- Mono is being left "as is" for now since their inliner isn't as robust as the RyuJIT one, but we should be able to independently investigate removing the Mono support for the xplat APIs as well so they only need to support the platform specific APIs. CC. @dotnet/jit-contrib |
EgorBo left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Nice clean up in jit! 👍
LoopedBard3 commentedJun 11, 2024 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
lewing commentedJun 11, 2024 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
wasm regressions heredotnet/perf-autofiling-issues#36108 (aot) and improvementsdotnet/perf-autofiling-issues#36093 (interp) |
tannergooding commentedJun 11, 2024
These should be resolved with#103177
Little bit surprised that
👍, fairly significant improvements at that. I know that |
lewing commentedJun 12, 2024
I find it surprising too,@radekdoulik can you take a look at our codegen here please
Yes, nice improvements |
matouskozak commentedJun 14, 2024 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
Possibly related noticeable size improvements on Mono iOS HelloWorlddotnet/perf-autofiling-issues#35768 |
Uh oh!
There was an error while loading.Please reload this page.
This is a basic prototype for#102275 and covers the all vectorized types where the handling can be trivially done in managed (such as by deferring to another intrinsic API).
As part of that, it also removes
MethodImpl(MethodImplOptions.AggressiveInlining)from various APIs where the method IL size is approx. less than the always inline threshold for RyuJIT (this was estimated by examining a few functions and finding that 2 or less calls with no complex logic typically fits the need).