This MR is an initial, experimental approach to adding target-specific SSCP builtins. In particular, the HIP unsafe atomics are exposed through the hipsycl::sycl::detail::__acpp_unsafe_atomic_fetch_add function. It can be used by calling into AdaptiveCpp internals as follows:

    q.parallel_for(a.size(), [=](sycl::id<1> idx){
      constexpr auto global_adress_space = hipsycl::sycl::access::address_space::global_space;
      constexpr auto global_memory_scope = hipsycl::sycl::memory_scope::device;
      hipsycl::sycl::detail::__acpp_unsafe_atomic_fetch_add<global_adress_space>(
          &dev_a[0], 4.5f, relaxed_memory_order, global_memory_scope);
    });
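For reference, the arithmetic that a relaxed floating-point fetch-add performs can be modelled on the host with standard C++ only. This is a minimal sketch, not the AdaptiveCpp implementation: `fetch_add_relaxed` is a hypothetical helper introduced here for illustration, and it uses a compare-exchange loop rather than any hardware floating-point atomic instruction.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical host-side helper (not part of AdaptiveCpp): emulates a
// relaxed atomic fetch-add on a float with a compare-exchange loop.
// Returns the value stored in `target` before the addition.
inline float fetch_add_relaxed(std::atomic<float>& target, float value) {
  float expected = target.load(std::memory_order_relaxed);
  // On failure, compare_exchange_weak reloads `expected` with the current
  // value, so the sum is recomputed on every retry.
  while (!target.compare_exchange_weak(expected, expected + value,
                                       std::memory_order_relaxed,
                                       std::memory_order_relaxed)) {
  }
  return expected;
}

// Applies the same increment `count` times and returns the final value.
inline float accumulate(float start, float value, int count) {
  assert(count >= 0);
  std::atomic<float> sum{start};
  for (int i = 0; i < count; ++i) {
    fetch_add_relaxed(sum, value);
  }
  return sum.load(std::memory_order_relaxed);
}
```

The sketch only captures the return-old-value and update semantics; it says nothing about the performance or memory-type behavior of the gfx90a instruction that the builtin maps to.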

RFC: First version to add target specific intriniscs for gfx90a targets

91723ae

illuhad (Collaborator) commented May 5, 2025

Why do we need a new bitcode file? Could we not just implement the unsafe atomic add in the existing one with some JIT reflection?

sbalint98 (Collaborator, Author) commented May 6, 2025

Unfortunately, clang will choke on these builtins if no appropriate -mcpu is specified.

Compiling it without specifying the gfx90a target arch results in the following error:

    /home/soproni/Projects/AdaptiveCpp/src/libkernel/sscp/amdgpu/atomic_gfx90a.cpp:18:10: error: '__builtin_amdgcn_global_atomic_fadd_f64' needs target feature gfx90a-insts
       18 |   return __builtin_amdgcn_global_atomic_fadd_f64(ptr, x);
          |

As far as I can tell, this is due to the frontend checking the sub-target compatibility of the builtin here:
https://github.com/intel/llvm/blob/sycl/clang/lib/CodeGen/CodeGenFunction.cpp#L3190
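To illustrate the kind of check involved, here is a host-side analogue on x86-64 rather than amdgcn. It is a sketch under stated assumptions: the function names `add8_generic`, `add8_avx` and `add8` are hypothetical, and the example relies on the GCC/clang `target` function attribute, which enables a feature for a single function so that its intrinsics pass the frontend's feature check without a global -m flag. Whether the same attribute is accepted for gfx90a-insts when compiling the SSCP bitcode library has not been verified here.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__x86_64__) && (defined(__clang__) || defined(__GNUC__))
#include <immintrin.h>
#define HAVE_X86_TARGET_ATTR 1
#else
#define HAVE_X86_TARGET_ATTR 0
#endif

// Portable fallback: needs no special target feature.
inline void add8_generic(const float* a, const float* b, float* out) {
  for (std::size_t i = 0; i < 8; ++i) {
    out[i] = a[i] + b[i];
  }
}

#if HAVE_X86_TARGET_ATTR
// The attribute enables AVX for this function only. Without it (and
// without -mavx on the command line) the compiler rejects the intrinsics
// below because the required target feature is missing.
__attribute__((target("avx")))
inline void add8_avx(const float* a, const float* b, float* out) {
  __m256 va = _mm256_loadu_ps(a);
  __m256 vb = _mm256_loadu_ps(b);
  _mm256_storeu_ps(out, _mm256_add_ps(va, vb));
}
#endif

// Dispatcher: picks the feature-specific variant only when the running
// CPU reports support for it, otherwise uses the portable fallback.
inline void add8(const float* a, const float* b, float* out) {
  assert(a != nullptr && b != nullptr && out != nullptr);
#if HAVE_X86_TARGET_ATTR
  if (__builtin_cpu_supports("avx")) {
    add8_avx(a, b, out);
    return;
  }
#endif
  add8_generic(a, b, out);
}
```

The point of the analogue is that the feature requirement is evaluated per function, so a feature-specific implementation and a generic one can live in the same translation unit on targets where the attribute is supported.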

There are some exceptions made when compiling with --hipstdpar; however, passing this flag when compiling the bitcode library results in an error about amdgcn not being a valid target for host compilation. Even after adding -Xclang -fcuda-is-device to the compilation arguments, the same issue persists. I am not sure what --hipstdpar changes that results in this new behavior. Do you think it makes sense to dig further into the LLVM source to figure out whether we could trick the frontend into emitting IR for these builtins?

Additionally, the --hipstdpar flag was only merged in LLVM 18.1.


RFC: First approach to add target specific intriniscs for gfx90a targets #1796


sbalint98 commented May 2, 2025
