[WIP] Reduce the number of fa rows for Intel #18138

Draft

mmerecki wants to merge 1 commit into ggml-org:master from mmerecki:reduce-fa-num-rows

+19 −16

Conversation

mmerecki commented Dec 17, 2025

Reduce the number of flash attention (fa) rows for Intel to reduce register usage.

Reduce the number of fa rows for Intel

4e35585

loci-dev mentioned this pull request

Dec 17, 2025

UPSTREAM PR #18138: Reduce the number of fa rows for Intel (auroralabs-loci/llama.cpp#606)

Open

mmerecki changed the title from "Reduce the number of fa rows for Intel" to "[WIP] Reduce the number of fa rows for Intel"

Dec 17, 2025

Collaborator

jeffbolznv commented Dec 17, 2025

Should this depend on head size? Some models have small head sizes like 64 or even 40, and 2 rows seems pretty small for that. But if 2 is best, I don't object.

github-actions bot added the Vulkan (issues specific to the Vulkan backend) and ggml (changes relating to the ggml tensor library for machine learning) labels

Dec 17, 2025

Author

mmerecki commented Dec 18, 2025

Thanks Jeff. I will verify this change with more models and potentially update the value for small head sizes.
I will also add information about the test results before I make the PR ready for review.
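
For reference, the head-size question raised above could be sketched roughly as follows. This is only an illustration, not the code in this PR: `GpuVendor`, `pick_fa_rows`, the default of 8 rows, the 64-element threshold, and the 4-row value for small heads are all hypothetical placeholders. Only the 2-row Intel value comes from the discussion.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical vendor tag for illustration; not the type used in ggml-vulkan.
enum class GpuVendor { Intel, Other };

// All constants below are assumptions except kIntelRows, which is the
// value mentioned in the review comment.
constexpr uint32_t kDefaultRows        = 8;  // assumed non-Intel default
constexpr uint32_t kIntelRows          = 2;  // value discussed in this thread
constexpr uint32_t kIntelSmallHeadRows = 4;  // assumed value for small heads
constexpr uint32_t kSmallHeadSize      = 64; // assumed "small head" threshold

// Pick the number of flash attention rows per workgroup.
// Small heads need fewer registers per row, so on Intel they might
// afford more rows than large heads (an assumption to be benchmarked).
uint32_t pick_fa_rows(GpuVendor vendor, uint32_t head_size) {
    assert(head_size > 0);
    if (vendor != GpuVendor::Intel) {
        return kDefaultRows;
    }
    return head_size <= kSmallHeadSize ? kIntelSmallHeadRows : kIntelRows;
}
```

Whether a split like this helps at all, and where the threshold should sit, would have to come from the test results mentioned above.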