[arm64] Add RCPC ISA (8.3+) and use ldap for volatile reads #67384

Merged
EgorBo merged 12 commits into dotnet:main from EgorBo:arm64-rcpc
Apr 12, 2022

EgorBo (Member) commented Mar 31, 2022 (edited)

This PR adds a new ISA, RCPC (Release Consistent Processor Consistent), for arm64 v8.3+ (optionally available on arm64 v8.2) in order to rely on ldapr[b/h] for volatile reads with acquire/release semantics; see #67374.

Apple M1 seems to support it, but most likely it is just an alias for ldar there, so there is no boost.

Closes #67374

    static volatile int a;
    static void Test() => a++;

codegen diff

    ; Assembly listing for method Test()
    G_M16289_IG01:
            A9BF7BFD          stp     fp, lr, [sp,#-16]!
            910003FD          mov     fp, sp
    G_M16289_IG02:
            D287BA80          movz    x0, #0x3dd4
            F2B00C00          movk    x0, #0x8060 LSL #16
            F2C00040          movk    x0, #2 LSL #32
    -       88DFFC01          ldar    w1, [x0]
    +       B8BFC001          ldapr   w1, [x0]
            11000421          add     w1, w1, #1
            889FFC01          stlr    w1, [x0]
    G_M16289_IG03:
            A8C17BFD          ldp     fp, lr, [sp],#16
            D65F03C0          ret     lr
    ; Total bytes of code 40
    ; ============================================================
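To make the listing easier to read: the `a++` on a volatile field is a separate acquire load, an add, and a release store, not an atomic increment. The following is a minimal sketch of the same sequence written with explicit `Volatile.Read`/`Volatile.Write`, next to `Interlocked.Increment` for contrast. The class and method names are hypothetical, and which instruction the JIT picks for each call depends on the runtime version and hardware.

```csharp
using System.Threading;

// Hypothetical illustration type; not part of the PR.
static class VolatileIncrement
{
    static int a;

    // Mirrors what `a++` does on a `static volatile int a`:
    // an acquire load, an add, then a release store.
    // Another thread can write `a` between the load and the store.
    public static int IncrementLikeVolatileField()
    {
        int value = Volatile.Read(ref a);   // acquire load (the ldar/ldapr in the listing)
        Volatile.Write(ref a, value + 1);   // release store (the stlr in the listing)
        return value + 1;
    }

    // A genuinely atomic read-modify-write, shown only for contrast.
    public static int IncrementAtomically() => Interlocked.Increment(ref a);
}
```

Only the load side of the first method is affected by the change in this PR; the store remains `stlr`.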

ghost added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) Mar 31, 2022
ghost assigned EgorBo Mar 31, 2022
ghost commented:

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.


EgorBo changed the title from "Arm64 rcpc" to "[arm64] Add RCPC ISA (8.3+) and use ldap for volatile reads" Mar 31, 2022
EgorBo closed this Mar 31, 2022
EgorBo reopened this Mar 31, 2022
EgorBo marked this pull request as ready for review March 31, 2022 17:31
Co-authored-by: Adeel Mujahid <3840695+am11@users.noreply.github.com>
EgorBo (Member, Author) commented:

@VSadov do you have a benchmark/scenario in mind to see improvements from this change?

VSadov (Member) commented:

@EgorBo This can have an effect on scenarios that mix volatile writes and reads, such as writing to and reading from a ConcurrentQueue in a loop.

The new instruction is not necessarily cheaper by itself, so scenarios that only do lots of volatile reads may not be affected. There need to be some volatile writes (or reference writes to heap locations) in the mix, because LDAR has to respect ordering with a preceding STLR, while LDAPR does not.

Also note that in order to see gains, the hardware has to take advantage of the relaxed semantics. Some early implementations of LDAPR could be just aliases of LDAR.
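The kind of scenario described above could be sketched as a simple micro-benchmark that alternates enqueues and dequeues on a `ConcurrentQueue<int>`. This is only an illustrative sketch under stated assumptions: the type and method names are hypothetical, the timing uses a plain `Stopwatch` rather than a proper benchmarking harness, and nothing here was measured in the PR.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;

// Hypothetical micro-benchmark; names are illustrative only.
static class MixedVolatileBench
{
    // Alternates Enqueue and TryDequeue so that the queue's internal
    // volatile writes and volatile reads are interleaved in a tight loop.
    public static long QueueRoundTrips(int iterations)
    {
        var queue = new ConcurrentQueue<int>();
        long sum = 0;
        for (int i = 0; i < iterations; i++)
        {
            queue.Enqueue(i);
            if (queue.TryDequeue(out int value))
            {
                sum += value; // keep the result live so the loop is not optimized away
            }
        }
        return sum;
    }

    // Runs a short warm-up pass first so the timed pass is less affected by JIT compilation.
    public static TimeSpan Measure(int iterations)
    {
        QueueRoundTrips(Math.Max(1, iterations / 10));
        var stopwatch = Stopwatch.StartNew();
        QueueRoundTrips(iterations);
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

Calling `MixedVolatileBench.Measure(10_000_000)` on builds with and without the change, on hardware that actually relaxes LDAPR, would be one way to look for a difference.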


EgorBo (Member, Author) commented:

I wasn't able to reproduce improvements on M1, so it probably is indeed just a renamed ldar there, but I guess it still makes sense to have. I've attached a codegen diff example in the description.

@dotnet/jit-contrib PTAL, no diffs, but in fact all volatile loads were changed from ldar[b/h] to ldapr[b/h].

ghost locked as resolved and limited conversation to collaborators May 12, 2022
EgorBo deleted the arm64-rcpc branch October 5, 2022 02:16
EgorBo (Member, Author) commented:

NOTE: Unfortunately, this PR doesn't detect RCPC feature set on Windows as there is no official API for that yet.

kunalspathak (Contributor) commented:

> NOTE: Unfortunately, this PR doesn't detect RCPC feature set on Windows as there is no official API for that yet.

This was originally added for Mac, right? Were you trying it on Ampere for Windows?

EgorBo (Member, Author) commented:

>> NOTE: Unfortunately, this PR doesn't detect RCPC feature set on Windows as there is no official API for that yet.
>
> This was originally added for Mac, right? Were you trying it on Ampere for Windows?

Yep, and Linux. Unfortunately, it happened before we added Ampere+Linux to our perf infra. For Windows, we are waiting until the official API, IsProcessorFeaturePresent, is updated.
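For readers who want to check whether a Linux arm64 machine reports the feature at all, here is a minimal sketch that looks for the `lrcpc` flag in the `Features` line of `/proc/cpuinfo`, which is how Linux kernels expose the RCPC hardware capability in text form. This is an assumption-laden illustration and not the detection code used by the runtime or by this PR; the `RcpcProbe` type and its method names are hypothetical.

```csharp
using System;
using System.IO;

// Hypothetical helper; not the runtime's detection logic.
static class RcpcProbe
{
    // Returns true when /proc/cpuinfo-style text lists "lrcpc" as a
    // whole token on a "Features" line. "ilrcpc" (a later extension)
    // is a different token and is deliberately not matched.
    public static bool CpuInfoListsRcpc(string cpuInfoText)
    {
        foreach (string line in cpuInfoText.Split('\n'))
        {
            int colon = line.IndexOf(':');
            if (colon < 0)
            {
                continue;
            }

            string key = line.Substring(0, colon).Trim();
            if (!key.Equals("Features", StringComparison.OrdinalIgnoreCase))
            {
                continue;
            }

            string[] flags = line.Substring(colon + 1)
                .Split(new[] { ' ', '\t', '\r' }, StringSplitOptions.RemoveEmptyEntries);
            if (Array.IndexOf(flags, "lrcpc") >= 0)
            {
                return true;
            }
        }
        return false;
    }

    // Reads the real file when it exists; returns false elsewhere (for example on Windows or macOS).
    public static bool IsRcpcReportedByLinux()
    {
        const string path = "/proc/cpuinfo";
        return File.Exists(path) && CpuInfoListsRcpc(File.ReadAllText(path));
    }
}
```

A `false` result only means the flag was not found in that file; it says nothing about Windows or macOS, where the feature has to be queried through other means.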

Reviewers
am11 left review comments
MichalStrehovsky left review comments
jakobbotsch approved these changes

Assignees
EgorBo

Labels
area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI)

Linked issue
[ARM64] Consider using LDAPR to implement volatile reads when instruction is available.

6 participants
@EgorBo @VSadov @kunalspathak @am11 @jakobbotsch @MichalStrehovsky
