Pull requests: ggml-org/llama.cpp
Pull requests list
Feature/kimi linear support (labels: ggml, model, Nvidia GPU, python)
#17592 opened Nov 29, 2025 by cacaview
update LLAMA_ARG_KV_SPLIT --> LLAMA_ARG_KV_UNIFIED to match CLI argument
#17588 opened Nov 29, 2025 by ddh0
Override SSM_A op for Qwen3 Next to reduce splits (labels: model)
#17587 opened Nov 29, 2025 by pwilkin
Add support for CUMSUM and TRI for CUDA. (labels: ggml, Nvidia GPU, testing)
#17584 opened Nov 28, 2025 by pwilkin
cmake: fix macOS build with -DGGML_BACKEND_DL=ON (labels: ggml)
#17581 opened Nov 28, 2025 by giladgd
Add PagedAttention support (experimental, CUDA only) (labels: ggml, Nvidia GPU)
#17579 opened Nov 28, 2025 by ericcurtin
model: LFM2-VL fixes (labels: examples, ggml, Nvidia GPU, testing)
#17577 opened Nov 28, 2025 by tdakhran
HIP: enable WMMA-MMQ INT kernels for RDNA 3 (labels: ggml, Nvidia GPU)
#17576 opened Nov 28, 2025 by jiachengjason • Draft
[SYCL] enhance argsort for UT (labels: ggml, SYCL)
#17573 opened Nov 28, 2025 by NeoZhangJianyu
Server: Change Invalid Schema from Server Error (500) to User Error (400) (labels: examples, python, server, testing)
#17572 opened Nov 28, 2025 by chadvoegele
ggml-hexagon: fix rope failure at test-backend-ops (labels: ggml)
#17565 opened Nov 28, 2025 by chraac
CANN: The Ger operator of OUT_PROD is not supported on the 310p device (labels: Ascend NPU, ggml)
#17563 opened Nov 28, 2025 by TianHao324
Fix unreadable user markdown colors and truncate long texts in deletion dialogs (labels: examples, server)
#17555 opened Nov 27, 2025 by ServeurpersoCom
ggml-cpu: Add operator-level execution time profiling (labels: ggml)
#17544 opened Nov 27, 2025 by kimminsu38oo
CANN: add support for partial RoPE and Vision mode (labels: Ascend NPU, ggml)
#17543 opened Nov 27, 2025 by noemotiovon
server: explicitly set the function name in lambda (labels: examples, server)
#17538 opened Nov 27, 2025 by haiyuewa
llama.cpp with sentencepiece (labels: testing)
#17529 opened Nov 26, 2025 by awenzel67
ggml-cpu: BMI2 is only available on amd64 (labels: ggml)
#17528 opened Nov 26, 2025 by candrews