gptq
Here are 26 public repositories matching this topic...
SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime
Updated Dec 18, 2025 - Python
LLM model quantization (compression) toolkit with hw acceleration support for Nvidia CUDA, AMD ROCm, Intel XPU and Intel/AMD/Apple CPU via HF, vLLM, and SGLang.
Updated Dec 18, 2025 - Python
Large Language Models for All, 🦙 Cult and More. Stay in touch!
Updated Jun 1, 2023 - HTML
🦖 X—LLM: Cutting Edge & Easy LLM Finetuning
Updated Jan 17, 2024 - Python
Run any Large Language Model behind a unified API
Updated Nov 13, 2023 - Python
🪶 Lightweight OpenAI drop-in replacement for Kubernetes
Updated Feb 5, 2024 - Python
A guide to using GPTQ models with LangChain
Updated Aug 19, 2023 - Jupyter Notebook
Run GGUF LLM models in the latest versions of TextGen-webui and koboldcpp
Updated Aug 6, 2025 - Jupyter Notebook
Private self-improvement coaching with open-source LLMs
Updated Mar 7, 2024 - Python
ChatSakura: Open-source multilingual conversational large language model.
Updated Apr 2, 2023 - Python
This repository is for profiling, extracting, visualizing, and reusing generative AI weights, with the aim of building more accurate AI models and of auditing or scanning weights at rest to identify knowledge domains that carry risk.
Updated Dec 18, 2023 - Python
A.L.I.C.E (Artificial Labile Intelligence Cybernated Existence). A REST API for an AI companion, intended as a building block for more complex systems
Updated Feb 6, 2025 - Python
🎯 Fine-tune large language models and use them for text-related tasks. This repository provides a straightforward approach to fine-tuning models like Gemma, Llama 🦙, and Mistral 🌪️ for various NLP tasks. 🔧 It includes training 📚, fine-tuning 🛠️, and inference pipelines ⚙️. 🚀
Updated Nov 28, 2025 - Jupyter Notebook
Conversation AI model for open domain dialogs
Updated Nov 15, 2023 - Python
Code for the NAACL paper "When Quantization Affects Confidence of Large Language Models?"
Updated Dec 30, 2024 - Jupyter Notebook
This project will develop a NEPSE chatbot using an open-source LLM, incorporating sentence transformers, a vector database, and reranking.
Updated Dec 31, 2023 - Jupyter Notebook
Optimized Qwen2.5-3B using GPTQ, reducing size from 5.75GB → 1.93GB and improving inference speed. Ideal for efficient edge AI deployments.
Updated May 24, 2025 - Python