# llama-cpp

Here are 189 public repositories matching this topic...

A self-hosted, offline, ChatGPT-like chatbot. Powered by Llama 2. 100% private, with no data leaving your device. New: Code Llama support!

  • Updated Apr 23, 2024
  • TypeScript

A C#/.NET library to run LLMs (🦙 LLaMA/LLaVA) on your local device efficiently.

  • Updated Nov 1, 2025
  • C#
maid

Maid is a cross-platform Flutter app for interfacing with GGUF / llama.cpp models locally, and with Ollama and OpenAI models remotely.

  • Updated Jul 28, 2025
  • Dart
node-llama-cpp

Run AI models locally on your machine with Node.js bindings for llama.cpp. Enforce a JSON schema on the model output at the generation level.

  • Updated Oct 26, 2025
  • TypeScript
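
To illustrate the schema-enforcement feature, here is a minimal sketch against node-llama-cpp's documented v3 API (the model path and the schema contents are placeholder assumptions; check the project's docs for your version):

```ts
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({modelPath: "path/to/model.gguf"}); // placeholder path
const context = await model.createContext();
const session = new LlamaChatSession({contextSequence: context.getSequence()});

// Build a grammar from a JSON schema; token sampling is then constrained
// so the output always parses as a conforming object.
const grammar = await llama.createGrammarForJsonSchema({
    type: "object",
    properties: {
        title: {type: "string"},
        rating: {type: "number"}
    }
} as const);

const answer = await session.prompt("Review this movie: ...", {grammar});
console.log(grammar.parse(answer)); // typed, schema-conforming result
```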

llama.go is like llama.cpp in pure Golang!

  • Updated Sep 20, 2024
  • Go

React Native binding of llama.cpp

  • Updated Nov 6, 2025
  • C
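
A minimal usage sketch, assuming the initLlama/completion surface shown in llama.rn's README (the option fields and model path here are from memory and may differ by version):

```ts
import {initLlama} from "llama.rn";

async function run() {
    // Load a local GGUF model (path and options are placeholders).
    const context = await initLlama({
        model: "file://models/model.gguf",
        n_ctx: 2048,
        n_gpu_layers: 99, // offload layers to Metal/GPU where the device supports it
    });

    // Stream a completion; the second argument receives partial tokens.
    const {text} = await context.completion(
        {prompt: "Explain KV caching in one sentence.", n_predict: 64},
        (data) => console.log(data.token),
    );
    console.log("Full result:", text);
}

run();
```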

Build and run AI agents using Docker Compose. A collection of ready-to-use examples for orchestrating open-source LLMs, tools, and agent runtimes.

  • Updated Oct 24, 2025
  • TypeScript

Self-evaluating interview for AI coders

  • Updated Jun 21, 2025
  • Python

llama.cpp Rust bindings

  • Updated Jun 27, 2024
  • Rust

Run larger LLMs with longer contexts on Apple Silicon by using differentiated precision for KV cache quantization. KVSplit enables 8-bit keys & 4-bit values, reducing memory by 59% with <1% quality loss. Includes benchmarking, visualization, and one-command setup. Optimized for M1/M2/M3 Macs with Metal support.

  • Updated May 21, 2025
  • Python
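
A back-of-envelope check on that 59% figure (my arithmetic, assuming a standard FP16 KV-cache baseline; not taken from the repo):

```latex
% FP16 baseline: 16-bit keys + 16-bit values = 32 bits per cached element.
% K8V4:           8-bit keys +  4-bit values = 12 bits per cached element.
\text{raw saving} = \frac{32 - 12}{32} = 62.5\%
% Per-block scale metadata for the quantized tensors costs a few points,
% which lands near the ~59% measured reduction quoted above.
```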

This repo showcases how you can run a model locally and offline, free of OpenAI dependencies.

  • Updated Jul 12, 2024
  • Python

Review/Check GGUF files and estimate the memory usage and maximum tokens per second.

  • Updated Aug 18, 2025
  • Go
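
Memory estimates of this kind are dominated by the model weights plus the KV cache. As a rough sketch (the standard formula, not this tool's code), the FP16 KV cache grows linearly in layer count, context length, and KV-head width:

```ts
// KV cache bytes: 2 tensors (K and V) per layer, each ctx × n_kv_heads × head_dim
// elements, at bytesPerElement for the cache type (2 for FP16).
function kvCacheBytes(
    nLayers: number,
    ctx: number,
    nKvHeads: number,
    headDim: number,
    bytesPerElement = 2, // FP16
): number {
    return 2 * nLayers * ctx * nKvHeads * headDim * bytesPerElement;
}

// Example: Llama-2-7B geometry (32 layers, 32 KV heads, head dim 128)
// at a 4096-token context comes to exactly 2 GiB of FP16 KV cache.
console.log(kvCacheBytes(32, 4096, 32, 128) / 2 ** 30, "GiB");
```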

A pure-Rust LLM inference engine (including LLM-based multimodal models such as Spark-TTS), powered by the Candle framework.

  • Updated Oct 25, 2025
  • Rust

Run LLMs locally. A Clojure wrapper for llama.cpp.

  • Updated Mar 29, 2025
  • Clojure

Booster: an open accelerator for LLMs. Better inference and debugging for AI hackers.

  • Updated Aug 15, 2024
  • C++

LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI.

  • Updated Jun 10, 2023
  • Python


