Disable KAI int4mm kernels when no bf16 HW support. #170788


Open

robert-hardwick wants to merge 1 commit into gh/robert-hardwick/12/base from gh/robert-hardwick/12/head

Conversation

@robert-hardwick (Collaborator) commented Dec 18, 2025 (edited)

Stack from ghstack (oldest at bottom):

Ideally, we would have disabled this only for bf16 and kept float32 going through the Kleidi kernels. However, because Kleidi packs the weights differently from the fallback kernel, and the input dtype is not known at weight-packing time, we had to disable Kleidi entirely when there is no bf16 hardware support.

Fixes failing unit test on AArch64 CPU without bf16

Fixes #170787

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @jerryzh168 @aditew01

[ghstack-poisoned]
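The packing-time constraint described above can be sketched as follows. This is an illustrative Python sketch, not PyTorch's actual dispatch code; the function name, the `has_bf16_hw` parameter, and the layout names are assumptions:

```python
def choose_int4mm_packing(has_bf16_hw: bool) -> str:
    """Pick a weight-packing layout for the int4 matmul at pack time.

    KleidiAI and the fallback kernel pack weights differently, and the
    activation dtype (float32 vs. bfloat16) is not yet known when the
    weights are packed. So on hardware without bf16 support the
    KleidiAI layout must be avoided entirely, even though float32
    inputs alone could have used it.
    """
    return "kleidiai" if has_bf16_hw else "fallback"
```

Because the layout decision is irreversible once the weights are packed, gating on the hardware capability (rather than the eventual input dtype) is the only safe choice here.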
@pytorch-bot (bot) commented Dec 18, 2025 (edited)

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/170788

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit ba0a9fe with merge base 3854d69:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot added the module: cpu (CPU specific problem (e.g., perf, algorithm)) label on Dec 18, 2025

robert-hardwick added a commit that referenced this pull request on Dec 18, 2025:
Fixes failing unit test on AArch64 CPU without bf16 (ghstack-source-id: 78e112b, Pull-Request: #170788)
@github-actions (Contributor)

This PR needs a release notes: label

If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example:
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@robert-hardwick (Collaborator, Author)

@pytorchbot label "topic: not user facing"

pytorch-bot[bot] reacted with thumbs up emoji

@pytorch-bot added the topic: not user facing (topic category) label on Dec 18, 2025
@robert-hardwick added the ciflow/linux-aarch64 (linux aarch64 CI workflow) label and removed the topic: not user facing (topic category) label on Dec 18, 2025
@robert-hardwick (Collaborator, Author)

This should be added to the 2.10 patch release, as it's a regression. But we will wait for CI tests to pass first.

if platform.machine().lower() not in ("arm64", "aarch64"):
    return False
try:
    with open("/proc/cpuinfo") as f:
Collaborator

clean way to expose an API: https://github.com/pytorch/pytorch/blob/main/torch/_C/_cpu.pyi and use cpuinfo_has_arm_bf16()?

Collaborator (Author)

Ah ok, yeah, I didn't know about that file. I was looking for direct exposure of cpuinfo functions but didn't realise we had some existing wrappers. Yeah, we should add a wrapper function here
https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/cpu/Utils.cpp and in https://github.com/pytorch/pytorch/blob/main/torch/_C/_cpu.pyi; I will make that change tomorrow.
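For context, the /proc/cpuinfo-based detection visible in the truncated snippet above can be fleshed out into a minimal sketch. The function name and the exact feature-flag parsing are assumptions, not the PR's actual code:

```python
import platform


def has_arm_bf16() -> bool:
    # Only AArch64/arm64 machines can have the ARM bf16 extension.
    if platform.machine().lower() not in ("arm64", "aarch64"):
        return False
    try:
        # On Linux, the kernel lists CPU feature flags in /proc/cpuinfo;
        # the bf16 extension appears as a "bf16" feature flag.
        with open("/proc/cpuinfo") as f:
            return "bf16" in f.read()
    except OSError:
        # /proc/cpuinfo unavailable (e.g. non-Linux): assume no support.
        return False
```

A wrapper exposed through torch/_C/_cpu.pyi, as suggested above, would replace this file parsing with the cpuinfo library's own capability query.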

Contributor

Can we perhaps separate the eager and compile regressions and fix them in separate PRs?

Also, it feels a bit weird that for Meta registrations one needs to check runtime rather than compile-time capabilities. Maybe we need to do the rerouting at a lower level.

Contributor

@malfet left a comment (edited)

The eager part looks good to me; let's move the compiler part out and discuss it in a different PR(s), i.e. maybe I should finally write a long-overdue PR that exposes CPU capabilities as a dictionary.



Reviewers

@malfet requested changes

@aditew01 requested changes

Awaiting requested review from @nikhil-arm

Assignees

No one assigned

Labels

ciflow/linux-aarch64 (linux aarch64 CI workflow), module: cpu (CPU specific problem (e.g., perf, algorithm)), open source

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

5 participants

@robert-hardwick @malfet @aditew01 @pytorchbot
