RFC: First approach to add target specific intrinsics for gfx90a targets #1796


Draft
sbalint98 wants to merge 1 commit into AdaptiveCpp:develop from sbalint98:bsoproni-add-unsfate-atomic-add-gfx90a

Conversation

@sbalint98
Collaborator

This MR is an initial, experimental approach to adding target-specific SSCP builtins. In particular, the HIP unsafe atomic add is exposed through the hipsycl::sycl::detail::__acpp_unsafe_atomic_fetch_add function. It can be used by calling into AdaptiveCpp details in the following way:

q.parallel_for(a.size(), [=](sycl::id<1> idx){
    constexpr auto global_adress_space = hipsycl::sycl::access::address_space::global_space;
    constexpr auto global_memory_scope = hipsycl::sycl::memory_scope::device;
    hipsycl::sycl::detail::__acpp_unsafe_atomic_fetch_add<global_adress_space>(&dev_a[0], 4.5f, relaxed_memory_order, global_memory_scope);
  });
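For context, here is a rough host-side sketch of what a native floating-point atomic add replaces. Without a hardware instruction, a float fetch-add is typically emulated with a compare-and-swap loop. As far as I understand, the "unsafe" atomics let the compiler emit the native instruction instead of such a loop. The helper name cas_fetch_add is hypothetical and is not part of AdaptiveCpp or this MR. It uses only standard C++ and is meant purely as an illustration.

```cpp
#include <atomic>

// Hypothetical helper (assumption, not AdaptiveCpp code): emulates an atomic
// floating-point fetch-add with a compare-and-swap loop.
inline float cas_fetch_add(std::atomic<float>& target, float value) {
  float expected = target.load(std::memory_order_relaxed);
  // On failure, compare_exchange_weak reloads `expected` with the current
  // value, so the loop retries until the swap succeeds.
  while (!target.compare_exchange_weak(expected, expected + value,
                                       std::memory_order_relaxed,
                                       std::memory_order_relaxed)) {
  }
  return expected;  // the value held before the addition
}
```

Like the fetch-add above, the helper returns the previous value. Under contention the loop may retry many times, which is the cost a native atomic add instruction would avoid.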

@illuhad
Collaborator

Why do we need a new bitcode file? Could we not just implement the unsafe atomic add in the existing one with some JIT reflection?

@sbalint98
Collaborator, Author

Unfortunately, clang will choke on these builtins if no appropriate -mcpu is specified.

Compiling it without specifying the gfx90a target arch results in the following error:

/home/soproni/Projects/AdaptiveCpp/src/libkernel/sscp/amdgpu/atomic_gfx90a.cpp:18:10: error: '__builtin_amdgcn_global_atomic_fadd_f64' needs target feature gfx90a-insts
   18 |   return __builtin_amdgcn_global_atomic_fadd_f64(ptr, x);
      |

As far as I can tell, this is due to the frontend checking the sub-target compatibility of the builtin here:
https://github.com/intel/llvm/blob/sycl/clang/lib/CodeGen/CodeGenFunction.cpp#L3190

There are some exceptions made when compiling with --hipstdpar; however, passing this flag when compiling the bitcode library results in an error about amdgcn not being a valid target for host compilation. Even after adding -Xclang -fcuda-is-device to the compilation arguments, the same issue persists. I am not sure what --hipstdpar changes that results in this new behavior. Do you think it makes sense to dig further into the LLVM source to figure out whether we could trick the frontend into emitting IR for these builtins?

Additionally, the --hipstdpar flag was only merged in LLVM 18.1.


2 participants

@sbalint98 @illuhad
