#gptq

Here are 26 public repositories matching this topic...

SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime

  • Updated Dec 18, 2025
  • Python

LLM model quantization (compression) toolkit with hw acceleration support for Nvidia CUDA, AMD ROCm, Intel XPU and Intel/AMD/Apple CPU via HF, vLLM, and SGLang.

  • Updated Dec 18, 2025
  • Python
LLaMA-Cult-and-More

Large Language Models for All, 🦙 Cult and More, Stay in touch!

  • Updated Jun 1, 2023
  • HTML

Run any Large Language Model behind a unified API

  • Updated Nov 13, 2023
  • Python

🪶 Lightweight OpenAI drop-in replacement for Kubernetes

  • Updated Feb 5, 2024
  • Python

Notes summarizing LLM quantization.

  • Updated Dec 16, 2025
  • Python

A guide about how to use GPTQ models with langchain

  • Updated Aug 19, 2023
  • Jupyter Notebook

Run GGUF LLM models in the latest versions of TextGen-webui and koboldcpp

  • Updated Aug 6, 2025
  • Jupyter Notebook

An OpenAI-compatible API that integrates LLM, Embedding, and Reranker models.

  • Updated Aug 21, 2025
  • Python

Private self-improvement coaching with open-source LLMs

  • Updated Mar 7, 2024
  • Python

ChatSakura: Open-source multilingual conversational model.

  • Updated Apr 2, 2023
  • Python

This repository is for profiling, extracting, visualizing and reusing generative AI weights to hopefully build more accurate AI models and audit/scan weights at rest to identify knowledge domains for risk(s).

  • Updated Dec 18, 2023
  • Python

A.L.I.C.E (Artificial Labile Intelligence Cybernated Existence). A REST API for an AI companion, intended for building more complex systems

  • Updated Feb 6, 2025
  • Python

🎯 Fine-tune large language models and use them for text-related tasks. This repository provides a straightforward approach to fine-tuning models like Gemma, Llama 🦙, and Mistral 🌪️ for various NLP tasks. 🔧 It includes training 📚, fine-tuning 🛠️, and inference pipelines ⚙️. 🚀

  • Updated Nov 28, 2025
  • Jupyter Notebook

Code for the NAACL paper "When Quantization Affects Confidence of Large Language Models?"

  • Updated Dec 30, 2024
  • Jupyter Notebook

This project will develop a NEPSE chatbot using an open-source LLM, incorporating sentence transformers, a vector database, and reranking.

  • Updated Dec 31, 2023
  • Jupyter Notebook

Optimized Qwen2.5-3B using GPTQ, reducing size from 5.75 GB to 1.93 GB and improving inference speed. Ideal for efficient edge AI deployments.

  • Updated May 24, 2025
  • Python

LLM quantization techniques: absmax, zero-point, GPTQ and GGUF

  • Updated Aug 2, 2024
  • Jupyter Notebook
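
For readers unfamiliar with the first two techniques named in the last entry, the following is a minimal pure-Python sketch of absmax (symmetric) and zero-point (asymmetric) INT8 quantization. It is not taken from any repository listed here; the function names are hypothetical, and real toolkits operate on tensors, often per channel or per group, rather than on flat Python lists.

```python
def absmax_quantize(values):
    """Symmetric INT8: scale so the largest magnitude maps to 127."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        # All zeros: any scale works, so use 1.0 to avoid division by zero.
        return [0] * len(values), 1.0
    scale = 127 / max_abs
    quantized = [max(-127, min(127, round(v * scale))) for v in values]
    return quantized, scale


def absmax_dequantize(quantized, scale):
    """Map INT8 codes back to approximate float values."""
    return [q / scale for q in quantized]


def zeropoint_quantize(values):
    """Asymmetric INT8: map [min, max] onto [-128, 127] using an offset."""
    lo, hi = min(values), max(values)
    value_range = hi - lo
    if value_range == 0:
        # Constant input: fall back to a unit range to avoid division by zero.
        value_range = 1.0
    scale = 255 / value_range
    zero_point = round(-scale * lo) - 128
    quantized = [
        max(-128, min(127, round(v * scale + zero_point))) for v in values
    ]
    return quantized, scale, zero_point


def zeropoint_dequantize(quantized, scale, zero_point):
    """Undo the offset and scale to recover approximate float values."""
    return [(q - zero_point) / scale for q in quantized]


if __name__ == "__main__":
    weights = [-1.0, 0.0, 0.5, 2.0]  # illustrative values only
    q_abs, s_abs = absmax_quantize(weights)
    print("absmax:", q_abs, absmax_dequantize(q_abs, s_abs))
    q_zp, s_zp, zp = zeropoint_quantize(weights)
    print("zero-point:", q_zp, zeropoint_dequantize(q_zp, s_zp, zp))
```

Absmax keeps zero exactly representable but leaves part of the integer range unused when the values are skewed. Zero-point uses the full range at the cost of storing an extra offset. GPTQ differs from both: roughly speaking, it quantizes weights layer by layer and adjusts the remaining weights to compensate for the error introduced, which this sketch does not attempt.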
