[ROCm] cpp_extension allow user to override default flags #152432


Closed

Conversation

@jithunnair-amd (Collaborator) commented on Apr 29, 2025 (edited):

We need -fgpu-rdc for projects such as DeepEP + rocSHMEM. The default of -fno-gpu-rdc doesn't work for such cases.

As per #152432 (comment):
"rocSHMEM shares the same global variable across different files. Because DeepEP uses CUDAExtension to build the project (https://github.com/deepseek-ai/DeepEP/blob/65e2a700f0330f3fb1c26f49a0250d1f9d0ac1e3/setup.py#L51) and depends on rocSHMEM, -fgpu-rdc is needed. The current logic in PyTorch prevents users from overriding this flag."
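The requested behaviour can be pictured with a small sketch. This is not the code from this PR: `resolve_rdc_flags`, `DEFAULT_RDC_FLAG` and `RDC_FLAGS` are hypothetical names, and the sketch only assumes that the build appends a default `-fno-gpu-rdc` to the device-compiler flags unless the user has already chosen an RDC mode.

```python
# Hypothetical sketch; not PyTorch's actual cpp_extension code.
DEFAULT_RDC_FLAG = "-fno-gpu-rdc"
RDC_FLAGS = ("-fgpu-rdc", "-fno-gpu-rdc")


def resolve_rdc_flags(user_flags):
    """Return a copy of user_flags, adding the default RDC flag only if the
    user did not pass an RDC flag of their own."""
    flags = list(user_flags)
    if not any(flag in RDC_FLAGS for flag in flags):
        flags.append(DEFAULT_RDC_FLAG)
    return flags


# No RDC flag given: the default is appended.
print(resolve_rdc_flags(["-O3"]))
# User asked for relocatable device code: the default is not forced on top.
print(resolve_rdc_flags(["-O3", "-fgpu-rdc"]))
```

The point of the sketch is the ordering of decisions: the user's explicit choice is checked first, and the default is applied only as a fallback.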

cc @jeffdaily @sunway513 @pruthvistony @ROCmSupport @dllehr-amd @jataylo @hongxiayang @naromero77amd

@pytorch-bot (bot) commented on Apr 29, 2025 (edited):

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/152432

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 1 Unrelated Failure

As of commit 0bed116 with merge base e06a080:

NEW FAILURE - The following job has failed:

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot (bot) added the labels ciflow/rocm (Trigger "default" config CI on ROCm) and module: rocm (AMD GPU support for Pytorch) on Apr 29, 2025
@liligwu commented:

rocSHMEM shares the same global variable across different files. Because DeepEP uses CUDAExtension to build the project (https://github.com/deepseek-ai/DeepEP/blob/65e2a700f0330f3fb1c26f49a0250d1f9d0ac1e3/setup.py#L51) and depends on rocSHMEM, -fgpu-rdc is needed. The current logic in PyTorch prevents users from overriding this flag.
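From the extension author's side, the flag would typically be supplied through `extra_compile_args`. The following is a minimal sketch under stated assumptions: `rdc_extension_kwargs`, the extension name and the source paths are hypothetical placeholders, and the snippet only builds the keyword arguments so that it runs without PyTorch installed.

```python
# Hypothetical helper; the extension name and source paths are placeholders.
def rdc_extension_kwargs(name, sources, enable_rdc=True):
    """Build keyword arguments for an extension that needs relocatable
    device code (RDC) on ROCm."""
    device_flags = ["-O3"]
    if enable_rdc:
        device_flags.append("-fgpu-rdc")
    return {
        "name": name,
        "sources": list(sources),
        # cpp_extension accepts per-compiler flags as a dict; the "nvcc"
        # entry holds the device-compiler flags, which go to hipcc on ROCm.
        "extra_compile_args": {"cxx": ["-O3"], "nvcc": device_flags},
    }


kwargs = rdc_extension_kwargs("my_ext", ["csrc/a.cu", "csrc/b.cu"])
print(kwargs["extra_compile_args"]["nvcc"])
```

In a real `setup.py` the dictionary would be passed as `CUDAExtension(**kwargs)` from `torch.utils.cpp_extension`, with `BuildExtension` as the `build_ext` command. Additional link options may be needed for an RDC build; they are left out here because the thread does not specify them.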

@jithunnair-amd (Collaborator, Author) commented:

Attempting to rebase so that we can get a clean run without a test_cpp_extensions* failure

@pytorchbot rebase

@pytorchmergebot (Collaborator) commented:

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot (Collaborator) commented:

Successfully rebased hip_extension_gpu_rdc onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout hip_extension_gpu_rdc && git pull --rebase)

@jithunnair-amd marked this pull request as ready for review on May 8, 2025 17:39
@jithunnair-amd (Collaborator, Author) commented:

@malfet Can you please review?

@jeffdaily (Collaborator) commented:

@pytorchbot rebase

@pytorchmergebot (Collaborator) commented:

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot (Collaborator) commented:

Successfully rebased hip_extension_gpu_rdc onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout hip_extension_gpu_rdc && git pull --rebase)

@jithunnair-amd (Collaborator, Author) commented:

@pytorchbot merge -f "unrelated CI failures"

@pytorchmergebot (Collaborator) commented:

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

jerrymannil pushed a commit to ROCm/pytorch that referenced this pull request on Jul 15, 2025:

…2432)

We need -fgpu-rdc for projects such as DeepEP + rocSHMEM. The default of -fno-gpu-rdc doesn't work for such cases.

As per pytorch#152432 (comment):
"rocSHMEM shares the same global variable across different files. Because DeepEP uses CUDAExtension to build the project (https://github.com/deepseek-ai/DeepEP/blob/65e2a700f0330f3fb1c26f49a0250d1f9d0ac1e3/setup.py#L51) and depends on rocSHMEM, -fgpu-rdc is needed. The current logic in PyTorch prevents users from overriding this flag."

Pull Request resolved: pytorch#152432
Approved by: https://github.com/jeffdaily
Co-authored-by: Jeff Daily <jeff.daily@amd.com>

Five further commits to ROCm/pytorch referenced this pull request on Jul 16, 2025. Each is titled "…2432) (#2374)", is a cherry-pick of pytorch@e4adf5d, and carries the same message as above with the additional trailer "Co-authored-by: Jithun Nair <37884920+jithunnair-amd@users.noreply.github.com>":

jithunnair-amd added a commit
okakarpa pushed a commit
okakarpa pushed a commit
jerrymannil added a commit
jerrymannil added a commit

Reviewers

@jeffdaily approved these changes

@fmassa: awaiting requested review (code owner)

@soumith: awaiting requested review

@ezyang: awaiting requested review (code owner)

@malfet: awaiting requested review (code owner)

Assignees

No one assigned

Labels

ciflow/rocm (Trigger "default" config CI on ROCm), Merged, module: rocm (AMD GPU support for Pytorch), open source, release notes: rocm (mandatorylabel)

Projects

None yet

Milestone

No milestone


5 participants

@jithunnair-amd @liligwu @pytorchmergebot @jeffdaily @pytorchbot
