[mono] Implement AdvSimd #49260
1bef631 to 7d9469c
(Insert meaningful description here)
Remove `MonoLLVMModule::intrins_by_id`, which doesn't do anything other than serve as a lookup table for data contained in `intrins_id_to_intrins`
Don't emit table-driven intrinsics when the corresponding intrinsic group isn't fully supported.
… ShiftArithmeticSaturateScalar, ShiftLeftLogicalSaturate and ShiftLeftLogicalSaturateScalar
Fix ShiftLeftLogicalSaturate and ShiftLeftLogicalSaturateScalar: decompose it into a promotion of the second argument into a vector followed by an overloaded invocation of @llvm.aarch64.neon.uqshl or @llvm.aarch64.neon.sqshl
…teScalar
ShiftLeftLogicalSaturateUnsignedScalar: move scalar-op-from-vector-op code into shared functions
MultiplyDoublingSaturateHighScalar, MultiplyDoublingScalarBySelectedScalarSaturateHigh, MultiplyDoublingWideningSaturateScalarBySelectedScalar, MultiplyDoublingWideningScalarBySelectedScalarAndAddSaturate, MultiplyDoublingWideningScalarBySelectedScalarAndSubtractSaturate, MultiplyRoundedDoublingByScalarSaturateHigh, MultiplyRoundedDoublingBySelectedScalarSaturateHigh, MultiplyRoundedDoublingSaturateHighScalar, MultiplyRoundedDoublingScalarBySelectedScalarSaturateHigh - remove unnecessary special cases
MultiplyDoublingWideningSaturateScalar - add support for the special-case scalar LLVM intrinsic for sqdmull
…pe when loading a single element
…num) to a separate header
fanyang-mono left a comment
Thanks for this massive change!
static void set_nonnull_load_flag (LLVMValueRef v);
enum {
INTRIN_scalar = 1 << 0,
Is there any particular reason we are defining some of these with constant bit shifts, some with decimal literals, and some with hex literals?
They're hints to the reader: the enumeration constants given values by constant bit shifts are meant to be used as bit selectors in a bit set, the enumeration constants given values by decimal literals are meant to be used to bound loop ranges, and the enumeration constants given values by hex literals are meant to be used as logical masks.
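
The convention described in this reply can be sketched as follows. This is only an illustration: every name prefixed with `EX_` or `ex_` is hypothetical and does not come from the Mono sources; only the `1 << 0` style for `INTRIN_scalar` appears in the quoted diff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants illustrating the three literal styles. */
enum {
	/* Constant bit shifts: single-bit selectors in a bit set. */
	EX_FLAG_scalar = 1 << 4,
	EX_FLAG_widen  = 1 << 5,
	/* Decimal literal: a count used to bound loop ranges. */
	EX_KIND_COUNT  = 3,
	/* Hex literal: a logical mask applied with bitwise AND. */
	EX_KIND_MASK   = 0xf
};

/* Test one selector bit in a flag set. */
static int
ex_has_flag (uint32_t flags, uint32_t bit)
{
	return (flags & bit) != 0;
}

/* Extract the kind field from the low bits using the mask. */
static unsigned
ex_kind_index (uint32_t packed)
{
	unsigned kind = packed & (unsigned) EX_KIND_MASK;
	assert (kind < (unsigned) EX_KIND_COUNT);
	return kind;
}

/* Sum a per-kind table; the decimal constant bounds the loop. */
static int
ex_sum_per_kind (const int counts [EX_KIND_COUNT])
{
	int total = 0;
	for (int i = 0; i < EX_KIND_COUNT; ++i)
		total += counts [i];
	return total;
}
```

Each constant's spelling matches its use: the shifted values are OR-ed and tested, the decimal value appears only as a bound, and the hex value appears only on the right of `&`.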
…calar or scalar-in-vector return value in a Vector64
Remove OP_ARM64_ZERO_UPPER, which is unused
… ops
undef can apparently pass through intrinsic functions during optimization, so bias towards slightly worse but correct codegen for now
This change adds AdvSimd and AdvSimd.Arm64 support to LLVM-enabled Mono.
Most aarch64 LLVM intrinsic functions are overloaded and have names determined
by an invariant base string prepended to a string representation of one or two
type parameters. Intrinsic functions used by an LLVM module must have a
declaration somewhere in memory when JITting or somewhere in the output bitcode
file when AOTing. Currently Mono maintains a hash table that maps internal
intrinsic IDs to LLVM intrinsic declarations. These IDs have been extended: a
simplified type representation is added to the key's upper bits. This
representation is not especially compact, and currently uses 9 bits to label 18
states, but it's easy to look at in a debugger. (A simple base-18 encoding
could encode three parameters in 13 bits.)
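
As a rough check of the base-18 remark, the sketch below packs three type labels into one integer and places the result above a base intrinsic ID. It is not the encoding this change uses (that one is described as 9 bits labelling 18 states); the 16-bit width for the base ID and all `ex_`/`EX_` names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define EX_TYPE_STATES 18u  /* number of type labels, from the description */
#define EX_ID_BITS     16u  /* assumed width of the untagged intrinsic ID */
#define EX_TAG_BITS    13u  /* 18^3 = 5832 <= 2^13 = 8192 */

/* Pack three type labels (each in 0..17) as base-18 digits. */
static uint32_t
ex_pack_base18 (uint32_t a, uint32_t b, uint32_t c)
{
	assert (a < EX_TYPE_STATES && b < EX_TYPE_STATES && c < EX_TYPE_STATES);
	return (a * EX_TYPE_STATES + b) * EX_TYPE_STATES + c;
}

/* Store the packed tag in the bits above the base ID. */
static uint32_t
ex_tag_id (uint32_t base_id, uint32_t tag)
{
	assert (base_id < (1u << EX_ID_BITS));
	assert (tag < (1u << EX_TAG_BITS));
	return (tag << EX_ID_BITS) | base_id;
}

/* Recover the base ID from a tagged key. */
static uint32_t
ex_base_id (uint32_t key)
{
	return key & ((1u << EX_ID_BITS) - 1u);
}

/* Recover the packed tag from a tagged key. */
static uint32_t
ex_tag (uint32_t key)
{
	return key >> EX_ID_BITS;
}
```

The largest packed value is 17 * 324 + 17 * 18 + 17 = 5831, which fits in 13 bits. The trade-off noted in the description still applies: a packed tag has to be decoded digit by digit, so it is harder to read in a debugger than fixed bit fields.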
These overload-tagged IDs can be passed to
`OP_XOP_OVR{_,_SCALAR,_BYSCALAR}X_{X,X_X,X_X_X}`. The return type of the
intrinsic that generates these mini ops is used to derive the overload tag to
find the corresponding LLVM intrinsic function declaration.
`MonoLLVMModule::intrins_by_id` is removed, because LLVM intrinsic lookup keys
are no longer small contiguous integers. It only seemed to serve as a lookup
table for data already contained in a hash table.
The corresponding instructions for some of these .NET-level intrinsics take
immediate parameters. For some of these instructions, the LLVM IR code that
selects these immediate-argument instructions can emit a fallback for
non-constant parameters, either by using an equivalent instruction with a
register operand or by using a longer and less-efficient instruction sequence.
For the rest, a branching code sequence is emitted. Helper functions
(`immediate_unroll_begin` etc.) are added to make this a little less
repetitious.
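
The general shape of such a branching sequence can be sketched in plain C. This shows only the idea, not the LLVM IR that the helpers in this change build: `EX_SHL_IMM` is a hypothetical stand-in for an instruction whose shift amount must be a compile-time constant, and `ex_shl_dynamic` is a hypothetical wrapper that accepts a run-time amount.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for an immediate-only instruction: imm must be a literal. */
#define EX_SHL_IMM(x, imm) ((uint8_t) ((x) << (imm)))

/* Dispatch a run-time shift amount to one constant-operand case each. */
static uint8_t
ex_shl_dynamic (uint8_t x, unsigned amount)
{
	assert (amount < 8);
	switch (amount) {
	case 0: return EX_SHL_IMM (x, 0);
	case 1: return EX_SHL_IMM (x, 1);
	case 2: return EX_SHL_IMM (x, 2);
	case 3: return EX_SHL_IMM (x, 3);
	case 4: return EX_SHL_IMM (x, 4);
	case 5: return EX_SHL_IMM (x, 5);
	case 6: return EX_SHL_IMM (x, 6);
	case 7: return EX_SHL_IMM (x, 7);
	}
	return 0; /* not reached when the precondition holds */
}
```

When the amount turns out to be constant after optimization, a compiler can fold the switch down to the single matching case, which is the outcome an unrolled sequence like this aims for.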
Some operations take an immediate operand denoting a lane to select in a vector
before proceeding with another generic vector or scalar operation. These are
decomposed into a sequence of
`OP_ARM64_SELECT_SCALAR` followed by the
non-lane-specific operation. LLVM can still optimize this to the lane-selecting
instruction when possible, and can generate fallback code for non-immediate
lane selection.
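
The decomposition can be modelled in scalar C as two steps: pick a lane, then run the ordinary by-scalar operation. The vector type and function names below are hypothetical and only model the data flow; they are not the mini IR opcodes or any Mono API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 4-lane vector of 16-bit integers. */
typedef struct { int16_t lane [4]; } ex_vec4i16;

/* Step 1: select one lane (models the scalar-selection step). */
static int16_t
ex_select_scalar (ex_vec4i16 v, unsigned index)
{
	assert (index < 4);
	return v.lane [index];
}

/* Step 2: the non-lane-specific operation, here multiply by a scalar. */
static ex_vec4i16
ex_mul_by_scalar (ex_vec4i16 v, int16_t s)
{
	ex_vec4i16 r;
	for (int i = 0; i < 4; ++i)
		r.lane [i] = (int16_t) (v.lane [i] * s);
	return r;
}

/* The lane-selecting operation is the composition of the two steps. */
static ex_vec4i16
ex_mul_by_selected_scalar (ex_vec4i16 a, ex_vec4i16 b, unsigned index)
{
	return ex_mul_by_scalar (a, ex_select_scalar (b, index));
}
```

Because the index is an ordinary argument to the first step, the same composition works whether or not the index is a compile-time constant.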
The tables describing the intrinsics supported by the runtime are extended to
support intrinsics with different target instructions for signed, unsigned and
floating point parameters. Whenever possible, .NET-level intrinsics that
correspond to a single LLVM intrinsic function are stored as a single entry in
these tables. Unfortunately many intrinsics need to be translated into a
sequence of LLVM IR operations; for these, new mini IR opcodes are added to
select the LLVM IR builder code that should run.
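
A table entry with separate targets per element kind might look roughly like the sketch below. The struct layout, field names and lookup function are hypothetical and do not reflect the actual table format in the runtime; the instruction names in the two sample rows are the usual aarch64 signed, unsigned and floating-point variants and are included only as examples.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { EX_ELEM_SIGNED, EX_ELEM_UNSIGNED, EX_ELEM_FLOAT } ex_elem_kind;

/* Hypothetical entry: one .NET-level method, up to three targets. */
typedef struct {
	const char *method;
	const char *signed_op;
	const char *unsigned_op;
	const char *float_op;    /* NULL when no such variant exists */
} ex_intrinsic_info;

static const ex_intrinsic_info ex_table [] = {
	{ "AbsoluteDifference",       "sabd",  "uabd",  "fabd" },
	{ "ShiftLeftLogicalSaturate", "sqshl", "uqshl", NULL   },
};

/* Return the target for a method and element kind, or NULL if absent. */
static const char *
ex_lookup (const char *method, ex_elem_kind kind)
{
	assert (method != NULL);
	for (size_t i = 0; i < sizeof (ex_table) / sizeof (ex_table [0]); ++i) {
		if (strcmp (ex_table [i].method, method) != 0)
			continue;
		switch (kind) {
		case EX_ELEM_SIGNED:   return ex_table [i].signed_op;
		case EX_ELEM_UNSIGNED: return ex_table [i].unsigned_op;
		case EX_ELEM_FLOAT:    return ex_table [i].float_op;
		}
	}
	return NULL;
}
```

Keeping all three variants in one row means a method with a one-to-one mapping needs a single entry, while methods that need a longer IR sequence would instead point at a dedicated opcode.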