[Intel GPU] Enable safe softmax for XPU SDPA #151999

Closed
LuFinch wants to merge 4 commits into pytorch:main from LuFinch:lfq/safe_softmax

Conversation

@LuFinch (Contributor) commented Apr 23, 2025 (edited by pytorch-bot)

Fixes intel/torch-xpu-ops#1432 (comment)

When an entire row of the Q*K attention scores is masked with `-inf`, `softmax(score)` outputs `NaN` for the whole row, which can corrupt the model's output.

With this new flag, softmax outputs `0` for the whole row instead, which matches PyTorch's CPU/CUDA behavior.
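
The difference can be illustrated with a minimal pure-Python sketch. This is not the oneDNN or XPU kernel from this PR; the helper names `naive_softmax` and `safe_softmax` are hypothetical and only show why a fully masked row yields `NaN` and what the "safe" variant returns instead.

```python
import math

NEG_INF = float("-inf")


def naive_softmax(row):
    """Max-subtracted softmax; a fully masked row still produces NaN."""
    m = max(row)
    # If every entry is -inf, x - m is (-inf) - (-inf), which is NaN.
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def safe_softmax(row):
    """Same computation, but a fully masked row maps to all zeros."""
    m = max(row)
    if m == NEG_INF:
        # Every score is masked: return zeros instead of NaN.
        return [0.0] * len(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


masked_row = [NEG_INF, NEG_INF, NEG_INF]   # whole row masked
partial_row = [0.0, NEG_INF, 0.0]          # only one entry masked

naive_out = naive_softmax(masked_row)      # every element is NaN
safe_out = safe_softmax(masked_row)        # every element is 0.0
partial_out = safe_softmax(partial_row)    # unaffected by the safe path
```

Rows that keep at least one unmasked score go through the ordinary path, so the safe variant only changes the result for fully masked rows.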

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @jerryzh168 @voznesenskym @penguinwu @EikanWang @Guobing-Chen @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @gujinghui @fengyuan14 @guangyey

@pytorch-bot (bot) commented Apr 23, 2025 (edited)

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/151999

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 3 Pending

As of commit 7ba3ce8 with merge base 9f5153b:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot added the module: cpu label Apr 23, 2025
@guangyey added the ciflow/xpu, release notes: xpu, and module: xpu labels Apr 24, 2025
@pytorch-bot

To add the ciflow label ciflow/xpu, please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

@pytorch-bot removed the ciflow/xpu label Apr 24, 2025
@guangyey (Collaborator)

Could you elaborate in the PR description on the issues we would encounter without this PR? Please also add a test case if possible.

@guangyey moved this to Pre-Review Required in PyTorch Intel Apr 24, 2025
@guangyey added the ciflow/xpu label Apr 24, 2025
@pytorch-bot removed the ciflow/xpu label Apr 24, 2025
@LuFinch (Contributor, Author)

@guangyey I have updated the PR description and added a unit test (UT).

guangyey reacted with thumbs up emoji

@etaf added the ciflow/xpu label Apr 25, 2025
@pytorch-bot removed the ciflow/xpu label Apr 25, 2025
@guangyey (Collaborator)

Thanks for the update.

@LuFinch marked this pull request as ready for review June 4, 2025 06:19
@LuFinch (Contributor, Author) commented Jun 4, 2025 (edited)

@guangyey OneDNN has been upgraded to v3.8. This PR is ready to merge. Could you help review and trigger CI?

guangyey reacted with thumbs up emoji

@guangyey added the ciflow/xpu label Jun 4, 2025
@guangyey moved this from Pre-Review Required to Review Required in PyTorch Intel Jun 4, 2025
@guangyey requested a review from drisspg June 4, 2025 06:38
@pytorch-bot removed the ciflow/xpu label Jun 4, 2025
@guangyey added the ciflow/xpu label Jun 5, 2025
@pytorch-bot

To add the ciflow label ciflow/inductor, please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

@ZhiweiYan-96 added the ciflow/trunk and ciflow/inductor labels Jun 13, 2025
@pytorch-bot

To add the ciflow label ciflow/trunk, please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

@pytorch-bot removed the ciflow/trunk and ciflow/inductor labels Jun 13, 2025
@guangyey (Collaborator)

@pytorchbot merge

pytorch-bot[bot] reacted with thumbs up emoji

@pytorch-bot added the ciflow/trunk label Jun 13, 2025
@pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

@pytorchmergebot (Collaborator)

Merge failed

Reason: 1 mandatory check(s) failed. The first few are:

Dig deeper by viewing the failures on hud

Details for Dev Infra team: raised by workflow job

Failing merge rule: Core Maintainers

@guangyey (Collaborator)

@pytorchbot merge

pytorch-bot[bot] reacted with thumbs up emoji

@pytorchmergebot (Collaborator)

Merge started

@pytorchmergebot (Collaborator)

Merge failed

Reason: 3 mandatory check(s) failed. The first few are:

Dig deeper by viewing the failures on hud

Details for Dev Infra team: raised by workflow job

Failing merge rule: Core Maintainers

@guangyey (Collaborator)

@pytorchbot merge -f "lint is green, XPU CI pass, ignore unrelated failure and queuing rocm CI"

pytorch-bot[bot] reacted with thumbs up emoji

@pytorchmergebot (Collaborator)

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

@github-project-automation (bot) moved this from Approved to Done in PyTorch Intel Jun 13, 2025
@LuFinch deleted the lfq/safe_softmax branch October 13, 2025 08:42

Reviewers

@drisspg approved these changes

@EikanWang approved these changes

@guangyey approved these changes

@gujinghui: awaiting requested review (gujinghui is a code owner)

Assignees

No one assigned

Labels

ciflow/trunk (Trigger trunk jobs on your pull request)
ciflow/xpu (Run XPU CI tasks)
Merged
module: cpu (CPU specific problem (e.g., perf, algorithm))
module: inductor
module: xpu (Intel XPU related issues)
open source
release notes: xpu (release notes category)

Projects

Status: Done

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

SDPA cases failed after XPU enabled in stock pytorch

8 participants

@LuFinch @guangyey @pytorchmergebot @drisspg @EikanWang @etaf @ZhiweiYan-96 @pytorchbot
