Llama Coder

Llama Coder is a better, self-hosted GitHub Copilot replacement for VS Code. Llama Coder uses Ollama and codellama to provide autocomplete that runs on your own hardware. It works best on a Mac M1/M2/M3 or with an RTX 4090.

VS Code Plugin

Features

🚀 As good as Copilot
⚡️ Fast. Works well on consumer GPUs. Apple Silicon or RTX 4090 is recommended for best performance.
🔐 No telemetry or tracking
🔬 Works with any language, programming or human.

Recommended hardware

Minimum required RAM: 16GB is the minimum; more is better, since even the smallest model takes 5GB of RAM.
The best way: a dedicated machine with an RTX 4090. Install Ollama on this machine and configure the endpoint in the extension settings to offload to it.
Second best way: run on a MacBook M1/M2/M3 with enough RAM (more is better, but 10GB extra would be enough).
For Windows notebooks: it runs well with a decent GPU, but a dedicated machine with a good GPU is recommended. Perfect if you have a dedicated gaming PC.

Local Installation

Install Ollama on your local machine and then launch the extension in VS Code. Everything should work as is.

Remote Installation

Install Ollama on a dedicated machine and point the endpoint in the extension settings to it. Ollama usually uses port 11434 and binds to 127.0.0.1; to make it reachable from other machines, set OLLAMA_HOST to 0.0.0.0.
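As a rough illustration of the endpoint setting, the sketch below turns a host name or address into a base URL, filling in the default port 11434 when none is given. The function name `ollamaBaseUrl` is hypothetical and is not taken from the extension's source; it also assumes plain HTTP and does not handle IPv6 literals.

```typescript
// Hypothetical helper, not part of Llama Coder: builds a base URL for a
// remote Ollama server from whatever the user typed into the settings.
function ollamaBaseUrl(host: string, port: number = 11434): string {
  // Drop surrounding whitespace and any trailing slashes.
  const trimmed = host.trim().replace(/\/+$/, "");
  // Assume plain HTTP when no scheme was given.
  const withScheme = /^https?:\/\//.test(trimmed) ? trimmed : `http://${trimmed}`;
  // Append the default port only if the value does not already end in one.
  const hasPort = /:\d+$/.test(withScheme);
  return hasPort ? withScheme : `${withScheme}:${port}`;
}

console.log(ollamaBaseUrl("192.168.1.50"));
console.log(ollamaBaseUrl("http://gpu-box:8080/"));
```

A quick way to confirm the remote machine is reachable is to open the resulting URL followed by `/api/tags` in a browser; Ollama answers with the list of models installed on that machine.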

Models

Currently Llama Coder supports only Codellama. The model is quantized in different ways, but our tests show that q4 is the optimal way to run the network. When selecting a model, bigger models perform better. Always pick the biggest model and the biggest quantization your machine can handle. The default is stable-code:3b-code-q4_0, which should work everywhere and outperforms most other models.

Name	RAM/VRAM	Notes
stable-code:3b-code-q4_0	3GB
codellama:7b-code-q4_K_M	5GB
codellama:7b-code-q6_K	6GB	m
codellama:7b-code-fp16	14GB	g
codellama:13b-code-q4_K_M	10GB
codellama:13b-code-q6_K	14GB	m
codellama:34b-code-q4_K_M	24GB
codellama:34b-code-q6_K	32GB	m
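The selection rule above can be sketched as a small lookup over the table. This is only an illustration: `pickModel` and the `ModelOption` shape are hypothetical names, the memory figures are copied from the table, and the sketch assumes that "biggest" means the highest parameter count first, with ties broken by the variant that needs more memory.

```typescript
// Hypothetical sketch, not part of Llama Coder.
interface ModelOption {
  name: string;
  paramsB: number;  // parameter count in billions, read off the model tag
  memoryGb: number; // RAM/VRAM figure from the table above
}

const MODELS: ModelOption[] = [
  { name: "stable-code:3b-code-q4_0", paramsB: 3, memoryGb: 3 },
  { name: "codellama:7b-code-q4_K_M", paramsB: 7, memoryGb: 5 },
  { name: "codellama:7b-code-q6_K", paramsB: 7, memoryGb: 6 },
  { name: "codellama:7b-code-fp16", paramsB: 7, memoryGb: 14 },
  { name: "codellama:13b-code-q4_K_M", paramsB: 13, memoryGb: 10 },
  { name: "codellama:13b-code-q6_K", paramsB: 13, memoryGb: 14 },
  { name: "codellama:34b-code-q4_K_M", paramsB: 34, memoryGb: 24 },
  { name: "codellama:34b-code-q6_K", paramsB: 34, memoryGb: 32 },
];

// Returns the largest model that fits in the given memory budget,
// or undefined when even the smallest one does not fit.
function pickModel(availableGb: number): string | undefined {
  const fitting = MODELS.filter((m) => m.memoryGb <= availableGb);
  // Prefer more parameters; among equals, prefer the larger quantization.
  fitting.sort((a, b) => b.paramsB - a.paramsB || b.memoryGb - a.memoryGb);
  return fitting.length > 0 ? fitting[0].name : undefined;
}

console.log(pickModel(16));
```

The budget passed in should be the memory actually free for the model, not the machine's total, since the editor and other applications need room as well.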