[mono] Implement AdvSimd #49260


Merged
imhameed merged 58 commits into dotnet:main from imhameed:mono-arm64-advsimd
Mar 12, 2021

imhameed commented Mar 6, 2021

This change adds AdvSimd and AdvSimd.Arm64 support to LLVM-enabled Mono.

Most aarch64 LLVM intrinsic functions are overloaded and have names determined
by an invariant base string prepended to a string representation of one or two
type parameters. Intrinsic functions used by an LLVM module must have a
declaration somewhere in memory when JITting or somewhere in the output bitcode
file when AOTing. Currently Mono maintains a hash table that maps internal
intrinsic IDs to LLVM intrinsic declarations. These IDs have been extended: a
simplified type representation is added to the key's upper bits. This
representation is not especially compact, and currently uses 9 bits to label 18
states, but it's easy to look at in a debugger. (A simple base-18 encoding
could encode three parameters in 13 bits.)
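
As a rough illustration of the two encodings mentioned above, here is a minimal sketch. The field widths other than the 9-bit tag, the position of the tag fields, and all names (`key_fixed_width`, `tags_base18`, `ID_BITS`) are assumptions made for the example, not Mono's actual layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: the low 16 bits hold the base intrinsic ID and each
 * overload parameter gets its own 9-bit tag above that. Only the "9 bits,
 * 18 states" figures come from the description; the rest is assumed. */
enum {
	ID_BITS = 16,
	TAG_BITS = 9,
	TAG_STATES = 18,
	MAX_TAGS = 3
};

/* Fixed-width packing: one field per parameter, easy to read in a debugger. */
static uint64_t
key_fixed_width (uint32_t base_id, const unsigned *tags, int ntags)
{
	uint64_t key = base_id;
	assert (base_id < (1u << ID_BITS));
	assert (ntags >= 0 && ntags <= MAX_TAGS);
	for (int i = 0; i < ntags; ++i) {
		assert (tags [i] < TAG_STATES);
		key |= (uint64_t) tags [i] << (ID_BITS + i * TAG_BITS);
	}
	return key;
}

/* Denser alternative: treat three tags as the digits of a base-18 number.
 * 18^3 = 5832 <= 2^13 = 8192, so three tags fit in 13 bits. */
static unsigned
tags_base18 (unsigned t0, unsigned t1, unsigned t2)
{
	assert (t0 < TAG_STATES && t1 < TAG_STATES && t2 < TAG_STATES);
	return t0 + TAG_STATES * (t1 + TAG_STATES * t2);
}
```

The fixed-width form spends 27 bits on three tags where base-18 needs 13, but each field can be read directly from a hex dump of the key.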

These overload-tagged IDs can be passed to
OP_XOP_OVR{_,_SCALAR,_BYSCALAR}X_{X,X_X,X_X_X}. The return type of the
intrinsic that generates these mini ops is used to derive the overload tag to
find the corresponding LLVM intrinsic function declaration.
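
To make the naming scheme concrete, the sketch below builds an overloaded intrinsic name from a base string and a vector return type. The helper name and its parameters are hypothetical; only the resulting name format (for example `llvm.aarch64.neon.sqadd.v4i32`) follows LLVM's convention for vector-typed overloads.

```c
#include <assert.h>
#include <stdio.h>

/* Writes "<base>.v<lanes><kind><bits>" into buf and returns its length.
 * elem_kind is 'i' for integer elements or 'f' for floating point ones;
 * signedness is not part of the suffix, it is chosen by the base name
 * (sqadd vs. uqadd). This helper is illustrative, not part of Mono. */
static int
overloaded_intrinsic_name (char *buf, size_t len, const char *base,
	int lanes, char elem_kind, int elem_bits)
{
	assert (elem_kind == 'i' || elem_kind == 'f');
	int n = snprintf (buf, len, "%s.v%d%c%d", base, lanes, elem_kind, elem_bits);
	assert (n > 0 && (size_t) n < len);
	return n;
}
```

Called with base `llvm.aarch64.neon.sqadd`, 4 lanes, `'i'` and 32 bits, this produces `llvm.aarch64.neon.sqadd.v4i32`.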

MonoLLVMModule::intrins_by_id is removed, because LLVM intrinsic lookup keys
are no longer small contiguous integers. It only seemed to serve as a lookup
table for data already contained in a hash table.

The corresponding instructions for some of these .NET-level intrinsics take
immediate parameters. For some of these instructions, the LLVM IR code that
selects these immediate-argument instructions can emit a fallback for
non-constant parameters, either by using an equivalent instruction with a
register operand or by using a longer and less-efficient instruction sequence.
For the rest, a branching code sequence is emitted. Helper functions
(immediate_unroll_begin etc.) are added to make this a little less
repetitious.
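
The shape of such a branching sequence can be pictured with a C-level analogy. This is only a sketch: `SHIFT_RIGHT_IMM`, `UNROLL_CASE` and `shift_right_dynamic` are made-up names, and the real helpers operate on LLVM IR rather than on C values.

```c
#include <stdint.h>

/* Stand-in for an instruction whose shift count is encoded as an immediate.
 * Every use below passes a literal constant. */
#define SHIFT_RIGHT_IMM(x, imm) ((uint8_t) ((x) >> (imm)))

/* One case per legal immediate, so each arm sees a compile-time constant. */
#define UNROLL_CASE(n) case n: return SHIFT_RIGHT_IMM (x, n)

/* Fallback for a shift count that is only known at run time. The range 1..8
 * matches a right shift on 8-bit elements. */
static uint8_t
shift_right_dynamic (uint8_t x, unsigned count)
{
	switch (count) {
	UNROLL_CASE (1);
	UNROLL_CASE (2);
	UNROLL_CASE (3);
	UNROLL_CASE (4);
	UNROLL_CASE (5);
	UNROLL_CASE (6);
	UNROLL_CASE (7);
	UNROLL_CASE (8);
	default:
		/* Out-of-range handling is an assumption of this sketch. */
		return 0;
	}
}
```

The macro plays the same role as the helper functions: the per-immediate arms are identical apart from the constant, so they are generated rather than written out by hand.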

Some operations take an immediate operand denoting a lane to select in a vector
before proceeding with another generic vector or scalar operation. These are
decomposed into a sequence of OP_ARM64_SELECT_SCALAR followed by the
non-lane-specific operation. LLVM can still optimize this to the lane-selecting
instruction when possible, and can generate fallback code for non-immediate
lane selection.
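
The decomposition can be sketched in plain C as follows. The vector type and function names are illustrative stand-ins (a 4 x i16 vector modelled as a struct), not the runtime's IR.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a 64-bit vector of four 16-bit lanes. */
typedef struct { int16_t e [4]; } vec4_i16;

/* Step 1: pick one lane out of a vector, the role played by
 * OP_ARM64_SELECT_SCALAR. The lane may be a run-time value. */
static int16_t
select_scalar (vec4_i16 v, unsigned lane)
{
	assert (lane < 4);
	return v.e [lane];
}

/* Step 2: the non-lane-specific operation, here multiply by a scalar. */
static vec4_i16
multiply_by_scalar (vec4_i16 a, int16_t s)
{
	vec4_i16 r;
	for (int i = 0; i < 4; ++i)
		r.e [i] = (int16_t) (uint16_t) ((uint16_t) a.e [i] * (uint16_t) s);
	return r;
}

/* The lane-selecting operation is just the composition of the two steps. */
static vec4_i16
multiply_by_selected_scalar (vec4_i16 a, vec4_i16 b, unsigned lane)
{
	return multiply_by_scalar (a, select_scalar (b, lane));
}
```

When `lane` is a constant, an optimizer can fold the two steps back into a single by-element instruction; when it is not, the same two steps still produce correct code.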

The tables describing the intrinsics supported by the runtime are extended to
support intrinsics with different target instructions for signed, unsigned and
floating point parameters. Whenever possible, .NET-level intrinsics that
correspond to a single LLVM intrinsic function are stored as a single entry in
these tables. Unfortunately, many intrinsics need to be translated into a
sequence of LLVM IR operations; for these, new mini IR opcodes are added to
select the LLVM IR builder code that should run.
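
One way to picture a table entry that carries different targets per element class is the sketch below. The struct layout, enum and lookup function are hypothetical and much simpler than the runtime's real tables; the LLVM base names shown are the standard NEON ones for saturating add and max.

```c
#include <stddef.h>
#include <string.h>

typedef enum { ELEM_SIGNED, ELEM_UNSIGNED, ELEM_FLOAT, ELEM_CLASS_COUNT } ElemClass;

/* One row per .NET-level intrinsic, one LLVM base name per element class.
 * NULL marks a class the intrinsic does not support. */
typedef struct {
	const char *name;
	const char *by_class [ELEM_CLASS_COUNT];
} IntrinsicEntry;

static const IntrinsicEntry table [] = {
	{ "AddSaturate", { "llvm.aarch64.neon.sqadd", "llvm.aarch64.neon.uqadd", NULL } },
	{ "Max", { "llvm.aarch64.neon.smax", "llvm.aarch64.neon.umax", "llvm.aarch64.neon.fmax" } },
};

/* Returns the LLVM base name for an intrinsic and element class, or NULL if
 * the intrinsic is unknown or has no variant for that class. */
static const char *
llvm_base_name (const char *name, ElemClass cls)
{
	for (size_t i = 0; i < sizeof (table) / sizeof (table [0]); ++i)
		if (strcmp (table [i].name, name) == 0)
			return table [i].by_class [cls];
	return NULL;
}
```

Intrinsics that need a sequence of IR operations would not fit a row like this, which is where a dedicated opcode selecting hand-written builder code comes in.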

(Insert meaningful description here)
Remove `MonoLLVMModule::intrins_by_id`, which doesn't do anything other than serve as a lookup table for data contained in `intrins_id_to_intrins`
Don't emit table-driven intrinsics when the corresponding intrinsic group isn't fully supported.
… ShiftArithmeticSaturateScalar, ShiftLeftLogicalSaturate and ShiftLeftLogicalSaturateScalar
Fix ShiftLeftLogicalSaturate and ShiftLeftLogicalSaturateScalar: decompose it into a promotion of the second argument into a vector followed by an overloaded invocation of @llvm.aarch64.neon.uqshl or @llvm.aarch64.neon.sqshl
MultiplyDoublingSaturateHighScalar
MultiplyDoublingScalarBySelectedScalarSaturateHigh
MultiplyDoublingWideningSaturateScalarBySelectedScalar
MultiplyDoublingWideningScalarBySelectedScalarAndAddSaturate
MultiplyDoublingWideningScalarBySelectedScalarAndSubtractSaturate
MultiplyRoundedDoublingByScalarSaturateHigh
MultiplyRoundedDoublingBySelectedScalarSaturateHigh
MultiplyRoundedDoublingSaturateHighScalar
MultiplyRoundedDoublingScalarBySelectedScalarSaturateHigh
    - remove unnecessary special cases
MultiplyDoublingWideningSaturateScalar
    - add support for the special-case scalar LLVM intrinsic for sqdmull
fanyang-mono left a comment

Thanks for this massive change!

static void set_nonnull_load_flag (LLVMValueRef v);

enum {
	INTRIN_scalar = 1 << 0,
Contributor

Is there any particular reason we are defining some of these with constant bit shifts, some with decimal literals, and some with hex literals?

Contributor, Author

They're hints to the reader: the enumeration constants given values by constant bit shifts are meant to be used as bit selectors in a bit set, the enumeration constants given values by decimal literals are meant to be used to bound loop ranges, and the enumeration constants given values by hex literals are meant to be used as logical masks.
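
A small sketch of that convention, with made-up constant names (only `INTRIN_scalar = 1 << 0` appears in the diff itself):

```c
enum {
	/* Bit selectors in a bit set: written as constant shifts. */
	EX_scalar = 1 << 0,
	EX_widen = 1 << 1,
	EX_saturate = 1 << 2,
	/* Loop bound: written as a decimal literal. */
	EX_selector_count = 3,
	/* Logical mask: written as a hex literal. */
	EX_selector_mask = 0x7
};

/* Counts how many selectors are set, touching all three kinds of constant:
 * the mask trims stray bits, the decimal bounds the loop, and the shift
 * picks each bit in turn. */
static int
count_selectors (unsigned set)
{
	int n = 0;
	set &= EX_selector_mask;
	for (int i = 0; i < EX_selector_count; ++i)
		if (set & (1u << i))
			++n;
	return n;
}
```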

…calar or scalar-in-vector return value in a Vector64
Remove OP_ARM64_ZERO_UPPER, which is unused
… ops
undef can apparently pass through intrinsic functions during optimization, so bias towards slightly worse but correct codegen for now

Reviewers

fanyang-mono approved these changes
vargaz approved these changes
naricc approved these changes
Awaiting requested review from CoffeeFlux, EgorBo, lambdageek, SamMonoRT, kunalspathak, echesakov, and 2 more reviewers


Milestone

6.0.0


6 participants

@imhameed @SamMonoRT @vargaz @naricc @fanyang-mono @karelz
