- Notifications
You must be signed in to change notification settings - Fork5.2k
Expose the FusedMultiplyAdd and MultiplyAddEstimate APIs on relevant vector and scalar types#102181
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
ghost commentedMay 14, 2024
Note regarding the |
tannergooding commentedMay 15, 2024
CC. @dotnet/jit-contrib for the JIT side. This is one of the few remaining cases that should need JIT side work, most of the remaining APIs in#93513 will be implemented purely in managed code. The |
EgorBo commentedMay 15, 2024
Should we rather introduce a new attribute to prevent methods from being pre-jitted (and inlined into pre-jitted code)? |
| // AdvSimd.FusedMultiplyAdd expects (addend, left, right), while the APIs take (left, right, addend) | ||
| // We expect op1 and op2 to have already been spilled | ||
| std::swap(op1, op3); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I am just curious - who's responsible to spill them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Ah, nvm, I see
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Typically it's the importer code, prior to calling this method. Latter phases that might call such an API (like morph) are responsible for ensuring the swap is safe in the face of potential side effects.
There's, unfortunately, not really a way for us to do such a swap safely in the API itself (at least that I know of) nor to know if the caller has actually done it. So the typical approach has been to do the swap and comment that callers should be doing the validation.
| impSpillSideEffect(true, verCurrentState.esStackDepth - | ||
| 3DEBUGARG("Spilling op1 side effects for MultiplyAddEstimate")); | ||
| impSpillSideEffect(true, verCurrentState.esStackDepth - |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Isn't better/simpler to useimpSpillSideEffects(true, CHECK_SPILL_ALL DEBUGARG("spilling side-effects")); ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
It's not something we've been doing in other scenarios.
AFAIR it comes down to not spilling values on the stack that aren't impacted. For example, we take 3 args, but the stack could have 4+ on it (spilling these is unnecessary since they are still processed in order with respect to our ownop1/op2/op3; otherwise everyone who pops the stack would need to consider the need to spill such additional entries) and we don't need to ever spill the stack top (op3).
So we're doing this to ensure only the minimum number of items that need to be spilled are spilled.
tannergooding commentedMay 15, 2024
We technically have one already, it's I'm also not sure the number of -- I do want to get to a point, someday, where we can move a lot of complexity out of the JIT and into the BCL instead. Ideally just the platform specific intrinsics and core xplat APIs which were exposed instead of platform specific APIs (like |
tannergooding commentedMay 15, 2024
CC.@stephentoub for the libraries side. The one thing to note is that given how |
tannergooding commentedMay 15, 2024
I've logged#102275 to track prototyping this work and seeing if we can identify some of the places that are still preventing it from happening. This seems like a good thing to try and do around the time .NET 9 RC1 snaps and we have a bit of a small break before .NET 10 work really gets going. |
…vector and scalar types (dotnet#102181)* Expose FusedMultiplyAdd and MultiplyAddEstimate on the scalar and vector types* Adding tests covering FusedMultiplyAdd and MultiplyAddEstimate for the vector types* Ensure TensorPrimitives uses the xplat APIs on .NET 9+* Apply formatting patch* Fix an accidental change to GenericVectorTests* Ensure Arm64 passes fma operands in the correct order* Apply formatting patch* Ensure all the Arm64 code paths are spilling and swapping operands* Apply formatting patch* Don't pop the stack value twice
Thisresolves#98053 and makes progress towards#93513