[c10d] init_process_group supports index-only device id #156214


Closed
kwen2501 wants to merge 4 commits into gh/kwen2501/174/base from gh/kwen2501/174/head

kwen2501 (Collaborator) commented Jun 17, 2025 (edited)

Stack from ghstack (oldest at bottom):

Before:

    acc = torch.accelerator.current_accelerator()
    if acc:
        local_idx = ...
        dist.init_process_group(
            device_id=torch.device(acc.type, local_idx)
        )

After:

dist.init_process_group(device_id=local_idx)

That is, init_process_group checks torch.accelerator.current_accelerator() internally.
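The index-to-device resolution described above can be pictured with a small sketch. It is deliberately torch-free so it runs anywhere: `resolve_device_spec` is a hypothetical helper, not a function from this PR, and its `accelerator_type` argument stands in for `torch.accelerator.current_accelerator().type` (or `None` when no accelerator is present). Raising an error when a bare index arrives without an accelerator is an assumption; the actual handling in torch/distributed/distributed_c10d.py may differ.

```python
from typing import Optional, Union


def resolve_device_spec(
    device_id: Optional[Union[str, int]],
    accelerator_type: Optional[str],
) -> Optional[str]:
    """Normalize a device_id into a "type:index" string (hypothetical helper).

    accelerator_type models torch.accelerator.current_accelerator().type,
    or None when no accelerator is available.
    """
    if device_id is None or isinstance(device_id, str):
        # Nothing to resolve: no device given, or it is already a full spec.
        return device_id
    if isinstance(device_id, bool) or not isinstance(device_id, int):
        raise TypeError(f"unsupported device_id: {device_id!r}")
    if device_id < 0:
        raise ValueError(f"device index must be non-negative, got {device_id}")
    if accelerator_type is None:
        # Assumption: an index alone is ambiguous without an accelerator.
        raise ValueError("index-only device_id requires an accelerator")
    # Pair the current accelerator type with the caller's local index.
    return f"{accelerator_type}:{device_id}"


if __name__ == "__main__":
    print(resolve_device_spec(1, "cuda"))
    print(resolve_device_spec("xpu:0", None))
```

A string of the form "type:index" is a spec that `torch.device(...)` accepts, so the sketch mirrors the "Before" snippet's `torch.device(acc.type, local_idx)` without importing torch.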

cc @H-Huang @awgu @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k

guangyey reacted with thumbs up emoji

pytorch-bot (bot) commented Jun 17, 2025 (edited)

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/156214

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 33 Pending

As of commit c0ac973 with merge base fbbab79:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot (bot) added the oncall: distributed (Add this issue/PR to distributed oncall triage queue) and release notes: distributed (c10d) (release notes category) labels Jun 17, 2025

kwen2501 mentioned this pull request Jun 17, 2025

kwen2501 added the suppress-bc-linter (Suppresses the failures of API backward-compatibility linter (Lint/bc_linter)) label Jun 17, 2025

kwen2501 (Collaborator, Author) commented:

cc: @guangyey @albanD @newtdms

kwen2501 added a commit that referenced this pull request Jun 17, 2025

kwen2501 added the ciflow/trunk (Trigger trunk jobs on your pull request) label Jun 18, 2025

albanD (Collaborator) left a comment:

nice!

      group_name: str = "",
      pg_options: Optional[Any] = None,
    - device_id: Optional[torch.device] = None,
    + device_id: Optional[Union[torch.device, int]] = None,

nit: you can use torch.types.Device

kwen2501 reacted with thumbs up emoji

kwen2501 (Collaborator, Author) commented:

@pytorchbot merge -f "Minor lint; all tests passed"

pytorch-bot[bot] reacted with thumbs up emoji

pytorchmergebot (Collaborator) commented:

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

pytorchmergebot (Collaborator) commented:

Merge failed

Reason: Command git -C /home/runner/work/pytorch/pytorch cherry-pick -x cb3f3e9747ad838e9ceb5e9a432c79d2eb9f8a73 returned non-zero exit code 1

    Auto-merging test/distributed/test_c10d_nccl.py
    CONFLICT (content): Merge conflict in test/distributed/test_c10d_nccl.py
    Auto-merging torch/distributed/distributed_c10d.py
    error: could not apply cb3f3e9747a... [c10d] init_process_group supports index-only device id
    hint: After resolving the conflicts, mark them with
    hint: "git add/rm <pathspec>", then run
    hint: "git cherry-pick --continue".
    hint: You can instead skip this commit with "git cherry-pick --skip".
    hint: To abort and get back to the state before "git cherry-pick",
    hint: run "git cherry-pick --abort".
    hint: Disable this message with "git config set advice.mergeConflict false"

Details for Dev Infra team: raised by workflow job

kwen2501 added a commit that referenced this pull request Jun 20, 2025

kwen2501 (Collaborator, Author) commented:

@pytorchbot merge -f "Minor rebase; all tests passed"

pytorch-bot[bot] reacted with thumbs up emoji

pytorchmergebot (Collaborator) commented:

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).

github-actions (bot) deleted the gh/kwen2501/174/head branch July 21, 2025 02:21

Reviewers

albanD approved these changes
guangyey approved these changes
fduwjj: awaiting requested review
d4l3k: awaiting requested review
wconstab: awaiting requested review

Assignees

No one assigned

Labels

ciflow/trunk: Trigger trunk jobs on your pull request
Merged
oncall: distributed: Add this issue/PR to distributed oncall triage queue
release notes: distributed (c10d): release notes category
suppress-bc-linter: Suppresses the failures of API backward-compatibility linter (Lint/bc_linter)


5 participants

@kwen2501 @pytorchmergebot @albanD @guangyey
