Pull requests: ggml-org/llama.cpp
Pull requests list
feat: Add extended sampling API with candidate token lists #14612
#14765 opened Jul 19, 2025 by baonudesifeizhai
webui: add missing messages in export (#13552)
#14764 opened Jul 18, 2025 by srogmann • Labels: examples, server
cuda : implement bf16 cpy ops and enable bf16 cont
#14763 opened Jul 18, 2025 by CISC • Labels: ggml, Nvidia GPU
tests : add non-cont K,V FA tests
#14756 opened Jul 18, 2025 by ggerganov • Labels: testing
Fix MinicpmV model converter and clip to avoid using hardcode.
#14750 opened Jul 18, 2025 by gryffindor-rr • Labels: examples, python
[ROCm] Fix HIP version check for HIPBLAS V2 API compatibility
#14744 opened Jul 17, 2025 by danielholanda • Labels: ggml, Nvidia GPU
metal: SSM_SCAN performance
#14743 opened Jul 17, 2025 by gabe-l-hart • Labels: Apple Metal, ggml
examples : predicted output for text generation
#14739 opened Jul 17, 2025 by iamlemec • Labels: examples
Improve Mistral models integration with llama.cpp
#14737 opened Jul 17, 2025 by juliendenize • Draft • Labels: python
Documentation: Update build.md's Vulkan section
#14736 opened Jul 17, 2025 by rspOverflow • Labels: documentation
CUDA: skip masked out KQ slices in mma FA kernel
#14735 opened Jul 17, 2025 by JohannesGaessler • Labels: ggml, Nvidia GPU, python
feat: Add optional prompt processing progress streaming
#14731 opened Jul 17, 2025 by baonudesifeizhai • Labels: examples, server
mtmd : Support jinja in libmtmd (Only for QwenVL and Qwen Omni)
#14730 opened Jul 17, 2025 by alielmorsy • Labels: examples
server: add prompt processing progress streaming for /completion endpoint #14685
#14728 opened Jul 16, 2025 by baonudesifeizhai • Labels: examples, server
vulkan: Add logging for bf16 features to ggml_vk_print_gpu_info (#13274)
#14707 opened Jul 16, 2025 by Peter0x44 • Labels: ggml, Vulkan
Fix KleidiAI compilation errors with -DGGML_NATIVE=OFF (issue #14464)
#14700 opened Jul 15, 2025 by baonudesifeizhai • Labels: ggml
Adding a simple-function-call example - hopefully not doing anything wrong
#14682 opened Jul 14, 2025 by klogdotwebsitenotdotcom • Labels: examples
kleidiai: add support for get_rows
#14676 opened Jul 14, 2025 by chaxu01 • Labels: ggml
bug fix: handle saving/loading null layers in recurrent memory
#14675 opened Jul 14, 2025 by l3utterfly
Add Pad Reflect 1D CUDA support
#14659 opened Jul 13, 2025 by YavorGIvanov • Labels: ggml, Nvidia GPU
webui : add a preset feature to the settings
#14649 opened Jul 12, 2025 by gabriellarson • Labels: examples, server
Add CUDA non-contiguous Unary Ops support
#14639 opened Jul 11, 2025 by YavorGIvanov • Labels: build, documentation, ggml, Nvidia GPU, testing
OpenCL: add mul_mat_f16_f32_image kernel
#14635 opened Jul 11, 2025 by rmatif • Labels: ggml, OpenCL
HIP: Enable Matrix cores for MMQ Kernels, Enable stream-K for CDNA 3
#14624 opened Jul 10, 2025 by deepsek • Labels: devops, ggml, Nvidia GPU