Hugging Face Inference Endpoints now supports GGUF out of the box! #9669

Pinned

ngxson started this conversation in Show and tell

ngxson · Sep 27, 2024 · Collaborator

You can now deploy any GGUF model on your own endpoint, in just a few clicks!

Simply select GGUF, select the hardware configuration, and done! An endpoint powered by llama-server (built from the master branch) will be deployed automatically. It works with all llama.cpp-compatible models of all sizes, from 0.1B up to 405B parameters.

Try it now --> https://ui.endpoints.huggingface.co/
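Since the deployed endpoint is served by llama-server, it exposes llama.cpp's OpenAI-compatible chat completions route. Here's a minimal sketch of querying one from Python; the endpoint URL and token are placeholders (the real values come from your endpoint's dashboard and your HF account):

```python
import requests

# Placeholders: substitute your own endpoint URL and HF access token.
ENDPOINT_URL = "https://YOUR-ENDPOINT.endpoints.huggingface.cloud"
HF_TOKEN = "hf_..."

resp = requests.post(
    f"{ENDPOINT_URL}/v1/chat/completions",
    headers={"Authorization": f"Bearer {HF_TOKEN}"},
    json={
        "messages": [{"role": "user", "content": "Hello! Who are you?"}],
        "max_tokens": 128,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

Because the route follows the OpenAI schema, any OpenAI-compatible client should also work by pointing its base URL at the endpoint.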

And the best part is:

@ggerganov: ggml.ai will be receiving a revenue share from all llama.cpp-powered endpoints used on HF. So for anyone who wants to support us, make sure to give those endpoints a try ♥️

A huge thanks to @ggerganov, @slaren, and the @huggingface team for making this possible!

(Video attachment: llama.hfe.ok.mp4)

Replies: 1 comment

ngxson · Sep 27, 2024 · Collaborator, Author

The Hermes 405B model can be deployed on 2x A100. The generation speed is around 8 t/s, which is not bad!

(Screenshot: 2024-09-27 at 14:26:50)
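For reference, a rough way to reproduce a tokens-per-second number like this: time one request and divide the completion token count (reported in the standard OpenAI-style `usage` field) by the wall-clock time. Same placeholder URL and token as in the sketch above:

```python
import time
import requests

# Same placeholders as the earlier sketch.
ENDPOINT_URL = "https://YOUR-ENDPOINT.endpoints.huggingface.cloud"
HF_TOKEN = "hf_..."

start = time.time()
resp = requests.post(
    f"{ENDPOINT_URL}/v1/chat/completions",
    headers={"Authorization": f"Bearer {HF_TOKEN}"},
    json={
        "messages": [{"role": "user", "content": "Write a short story."}],
        "max_tokens": 256,
    },
    timeout=600,
)
resp.raise_for_status()
elapsed = time.time() - start

# Wall-clock t/s; this includes prompt processing, so it slightly
# understates pure generation speed.
tokens = resp.json()["usage"]["completion_tokens"]
print(f"{tokens} tokens in {elapsed:.1f}s -> {tokens / elapsed:.1f} t/s")
```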
