Replace Copilot with a more powerful and local AI
Llama Coder is a self-hosted GitHub Copilot replacement for Visual Studio Code. Llama Coder uses Ollama and Code Llama to provide autocomplete that runs on your own hardware. It works best on a Mac with an M1/M2/M3 chip or on a machine with an RTX 4090.

VS Code Plugin

Features

  • 🚀 As good as Copilot
  • ⚡️ Fast. Works well on consumer GPUs; an RTX 4090 is recommended for best performance.
  • 🔐 No telemetry or tracking
  • 🔬 Works with any language, programming or human

Recommended hardware

  • Minimum required RAM: 16GB, and more is better, since even the smallest model takes 5GB of RAM.
  • The best option: a dedicated machine with an RTX 4090. Install Ollama on that machine and set its endpoint in the extension settings to offload inference to it.
  • Second best: a MacBook with an M1/M2/M3 chip and enough RAM (more is better, but about 10GB of headroom is enough).
  • Windows notebooks: it runs well with a decent GPU, but a dedicated machine with a good GPU is recommended. Perfect if you have a dedicated gaming PC.

Local Installation

Install Ollama on your local machine, then launch the extension in VS Code; everything should work out of the box.
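The local setup can be sketched like this on Linux (the install script is Ollama's official one; on macOS, install the Ollama app instead). Pre-pulling the default model is optional but makes the first completion faster:

```shell
# Install Ollama (Linux; on macOS download the app from ollama.com)
curl -fsSL https://ollama.com/install.sh | sh

# Pre-pull the extension's default model so the first completion is fast
ollama pull codellama:7b-code-q4_K_M

# Verify the server is up (it listens on 127.0.0.1:11434 by default)
curl http://127.0.0.1:11434/
```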

Remote Installation

Install Ollama on the dedicated machine and set the endpoint to it in the extension settings. Ollama usually uses port 11434 and binds to 127.0.0.1; to make it reachable from other machines, set OLLAMA_HOST to 0.0.0.0.
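A minimal sketch of that remote setup, assuming you run the server in the foreground (for a systemd-managed install, set the same variable in the service environment instead). The IP address below is a hypothetical example:

```shell
# On the dedicated machine: bind Ollama to all interfaces, not just loopback
OLLAMA_HOST=0.0.0.0 ollama serve

# From the machine running VS Code: confirm the server is reachable,
# replacing 192.168.1.100 with your server's actual address
curl http://192.168.1.100:11434/
```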

Models

Currently Llama Coder supports only Code Llama. The model is quantized in different ways, but our tests show that q4 quantization is the optimal way to run the network. Bigger models perform better, so always pick the largest model and the highest quantization level your machine can handle. The default, codellama:7b-code-q4_K_M, should work everywhere; codellama:34b-code-q4_K_M is the best possible one.

Name                       RAM/VRAM  Notes
codellama:7b-code-q4_K_M   5GB
codellama:7b-code-q6_K     6GB       m
codellama:7b-code-fp16     14GB      g
codellama:13b-code-q4_K_M  10GB
codellama:13b-code-q6_K    14GB      m
codellama:34b-code-q4_K_M  24GB
codellama:34b-code-q6_K    32GB      m
  • m - slow on macOS
  • g - slow on older NVIDIA cards (pre-30xx)
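Before wiring a pulled model into the editor, you can check that it actually completes code by calling Ollama's generate API directly. A sketch using the default model; the temperature and top_p options mirror the tunables the extension exposes:

```shell
# Ask the local Ollama server for a short, non-streamed completion
curl http://127.0.0.1:11434/api/generate -d '{
  "model": "codellama:7b-code-q4_K_M",
  "prompt": "def fibonacci(n):",
  "stream": false,
  "options": { "temperature": 0.2, "top_p": 0.9 }
}'
```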

Changelog

[0.0.8]

  • Improved DeepSeek support and language detection

[0.0.7]

  • Added DeepSeek support
  • Ability to change temperature and top p
  • Fixed some bugs

[0.0.6]

  • Fix ollama links
  • Added more models

[0.0.4]

  • Initial release of Llama Coder
