Releases: ggml-org/llama.cpp

b4988

28 Mar 21:56 · 3714c3e
This commit was created on GitHub.com and signed with GitHub's verified signature (GPG key ID: B5690EEEBB952194).

llama : fix incorrect Qwen2Moe ffn_moe_out graph callback (#12631)

Assets: 25

b4987

28 Mar 19:06 · b4ae508

metal : improve FA + improve MoE (#12612)
* ggml : FA with different K, V head sizes (CPU)
* metal : add FA with HS=192
* metal : extend FA to support different K and V head sizes
* metal : add FA vector kernels for heads K 192 and V 128
* ggml : restrict op on other backends to equal head sizes
* metal : optimize FA-vec kernel
* metal : FA remove mq registers
* metal : improve MoE mul_mat_id condition
* metal : fix comments + remove unnecessary addition
* metal : avoid too much shared memory usage with mul_mat_id

b4986

28 Mar 18:42 · b86f600

vulkan: fix coopmat shader generation when cross-compiling (#12272)

Previously, the status of coopmat{,2} support wasn't passed to the vulkan-shaders-gen project building on the host, which led to build failures because the cross-compiling code expected coopmat{,2} shaders that never got generated. Fix this by passing the coopmat{,2} support status to the vulkan-shaders subproject.

* Only call coop-mat shaders once
* Fix whitespace

Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Co-authored-by: bandoti <141645996+bandoti@users.noreply.github.com>

b4985

28 Mar 18:02 · dd373dd

llama: fix error on bad grammar (#12628)

b4984

28 Mar 08:59 · 5d01670

server : include speculative decoding stats when timings_per_token is…

b4982

28 Mar 08:31 · 1373176

llamafile : ppc64le GEMV forwarding for FP32 (#12594)

This patch enables usage of MMA when one of the dimensions of the matrix (i.e. either M or N) is 1, which is useful for token generation, where N < 2. The concept of "GEMV forwarding" is used: when one of the matrices has a single row or column, its elements are broadcast instead of being prepacked by the packing routine. This change results in a 5% - 15% improvement in total speed (i.e. all tokens / total time) across various batch sizes, compared with the corresponding dot-product implementation. The patch was tested with FP32 models of Meta-Llama-3-8B, Mistral-7B, and Llama-2-7B-chat-hf on an IBM POWER10 machine.

Signed-off-by: Amrita H S <amritahs@linux.vnet.ibm.com>

b4981

28 Mar 07:03 · ab6ab8f

rpc : send hash when tensor data is above some fixed threshold (#12496)
* rpc : send hash when tensor data is above some fixed threshold (ref #10095)
* rpc : put cache under $HOME/.cache/llama.cpp
* try to fix win32 build
* another try to fix win32 build
* remove llama as dependency

b4980

27 Mar 23:32 · 2099a9d

server : Support listening on a unix socket (#12613)
* server : Bump cpp-httplib to include AF_UNIX windows support
* server : Allow running the server example on a unix socket

Signed-off-by: Piotr Stankiewicz <piotr.stankiewicz@docker.com>

b4978

27 Mar 16:00 · 5dec47d

opencl: add multi and vision rope, `gelu_quick` and `im2col` (#12600)
* opencl: add `im2col`
* opencl: add `gelu_quick`
* opencl: add mrope
* opencl: add vision rope

b4977

27 Mar 11:54 · f125b8d

llama : add PLM GGUF Conversion & Inference Support (#12457)
* add edgellm model arch [conversation feature doesn't work]
* remove output.weight layer for edgellm arch
* [Model] update the name of the model
* update the name of model arch in convert gguf
* [Model] Refactor the model arch into llama-model
* [Bug] Fix the bug in create attn kv
* [Code] Fix editorconfig errors
* [Code] Remove trailing whitespace
* [Code] Remove trailing whitespace
* [Code] Change the order of model arch in list
* [Code] Fix flake8 lint errors
* Remove trailing white space
* [Code] Remove  call in model arch
