Pull requests: ggml-org/llama.cpp
gguf: llama: use `= default` for trivial constructors and destructors
#17649 opened Dec 1, 2025 by GermanAizek
sgemm: reuse loaded vector in AVX dot product calculation
#17648 opened Dec 1, 2025 by GermanAizek
llama-vocab: replace postfix with prefix increment for iterators
#17646 opened Dec 1, 2025 by GermanAizek
vec: optimize AVX2/FMA sum-of-squares with loop unrolling and FMA (label: ggml)
#17642 opened Dec 1, 2025 by GermanAizek
ggml-quants: use _mm256_testz_si256 for mask checks in AVX2 (label: ggml)
#17641 opened Dec 1, 2025 by GermanAizek
ggml-alloc: optimize free block shifting with memmove (label: ggml)
#17640 opened Dec 1, 2025 by GermanAizek
ggml-cuda: reorder only relevant nodes (labels: ggml, Nvidia GPU)
#17639 opened Dec 1, 2025 by am17an
vulkan: Replace deprecated VK_EXT_validation_features (labels: ggml, Vulkan)
#17637 opened Dec 1, 2025 by rillomas
common : compute average token length from vocabulary
#17632 opened Dec 1, 2025 by yifant-code • Draft
llama-router, the C++ "llama-swap" for llama.cpp (labels: examples, need feedback, testing)
#17629 opened Nov 30, 2025 by ServeurpersoCom • Draft
vulkan: set all memory allocations to high priority (labels: ggml, Vulkan)
#17624 opened Nov 30, 2025 by jeffbolznv • Draft
vulkan: Reduce temporary memory usage for TOP_K (labels: ggml, Vulkan)
#17623 opened Nov 30, 2025 by jeffbolznv
model : Fix marker placement for LFM2-VL in single turn llama-mtmd-cli (label: examples)
#17616 opened Nov 30, 2025 by tdakhran
ggml : remove redundant n_copies check when setting input/output (label: ggml)
#17612 opened Nov 30, 2025 by danbev
Feature/kimi linear support (labels: ggml, model, Nvidia GPU, python)
#17592 opened Nov 29, 2025 by cacaview
Override SSM_A op for Qwen3 Next to reduce splits (label: model)
#17587 opened Nov 29, 2025 by pwilkin
Add support for CUMSUM and TRI for CUDA (labels: ggml, Nvidia GPU, testing)
#17584 opened Nov 28, 2025 by pwilkin