int8-quantization
Here are 18 public repositories matching this topic...
CVT, a Computer Vision Toolkit.
Updated Aug 24, 2022 - C
Winning solution of the Mobile AI challenge (CVPRW 2021).
Updated May 14, 2022 - Python
A resource-conscious neural network library for microcontrollers
Updated Dec 17, 2025 - C++
FrostNet: Towards Quantization-Aware Network Architecture Search
Updated May 3, 2024 - Python
Quantization Aware Training
Updated Jan 13, 2024 - Python
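For readers unfamiliar with the technique named in the entry above, here is a minimal quantization-aware training sketch using PyTorch's eager-mode API. The TinyNet model, layer sizes, and random training batches are illustrative placeholders, not the repository's actual code.

```python
# A minimal quantization-aware training (QAT) sketch in PyTorch eager mode.
# Model, layer sizes, and training data are illustrative placeholders.
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        # QuantStub/DeQuantStub mark where tensors enter/leave the int8 domain.
        self.quant = torch.quantization.QuantStub()
        self.fc1 = nn.Linear(16, 32)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(32, 4)
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.fc1(x))
        x = self.fc2(x)
        return self.dequant(x)

model = TinyNet()
model.train()
# Attach fake-quantization observers for QAT ('fbgemm' targets x86 int8 kernels).
model.qconfig = torch.quantization.get_default_qat_qconfig("fbgemm")
torch.quantization.prepare_qat(model, inplace=True)

# Ordinary training loop: fake-quant ops simulate int8 rounding while staying in fp32.
opt = torch.optim.SGD(model.parameters(), lr=0.01)
for _ in range(10):
    x = torch.randn(8, 16)          # placeholder batch
    y = torch.randint(0, 4, (8,))   # placeholder labels
    loss = nn.functional.cross_entropy(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Convert to a real int8 model for inference.
model.eval()
int8_model = torch.quantization.convert(model)
print(int8_model(torch.randn(1, 16)))
```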
Notes and a summary of common problems (and their solutions) encountered when deploying models on-device, in the hope of being of some help to others.
Updated Aug 17, 2022 - Python
Generating a TensorRT model using ONNX.
Updated Jun 22, 2023 - C++
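As an illustration of the ONNX-to-TensorRT workflow named above, here is a sketch using the TensorRT 8.x Python API. The file names are placeholders, and a real INT8 build would additionally need a calibrator or explicit per-tensor dynamic ranges, which are omitted here.

```python
# Minimal sketch: build a TensorRT engine from an ONNX file and request int8.
# "model.onnx" and "model.plan" are placeholder paths.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("Failed to parse ONNX model")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.INT8)   # request int8 kernels
# config.int8_calibrator = ...          # a real build supplies calibration data here

engine_bytes = builder.build_serialized_network(network, config)
with open("model.plan", "wb") as f:
    f.write(engine_bytes)
```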
VB.NET API wrapper for LLM inference with chatllm.cpp
Updated Nov 26, 2024 - Visual Basic .NET
C# API wrapper for LLM inference with chatllm.cpp
Updated Nov 20, 2024 - C#
Corrects your grammar in 5 languages directly in your browser. Powered by an open-source AI model.
Updated Jul 12, 2025 - JavaScript
TinyML project. This system monitors your room or surroundings with the onboard microphone of an Arduino Nano BLE Sense. Still under development.
Updated Oct 18, 2021 - Jupyter Notebook
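A common way to get int8 models onto microcontroller boards like the ones mentioned in the entries above is full-integer post-training quantization with the TensorFlow Lite converter. The sketch below assumes a hypothetical saved_model_dir and a random representative dataset; it is not taken from either repository.

```python
# Minimal sketch of full-integer (int8) post-training quantization with the
# TensorFlow Lite converter, a usual route for microcontroller targets.
# "saved_model_dir" and the representative dataset are placeholders.
import numpy as np
import tensorflow as tf

def representative_dataset():
    # Yield a few batches shaped like the real sensor input so the converter
    # can estimate activation ranges (placeholder 1x96x64 spectrograms here).
    for _ in range(100):
        yield [np.random.rand(1, 96, 64).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Force every op to int8 so the model runs on integer-only hardware.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
with open("model_int8.tflite", "wb") as f:
    f.write(tflite_model)
```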
gemma-2-2b-it int8 cpu inference in one file of pure C#
Updated Jun 14, 2025 - C#
ZoneBurst is an efficient PRNG algorithm for 8-bit output.
Updated Dec 7, 2025 - C
High-performance LLM inference platform built on vLLM continuous batching: 12.3K+ req/sec, 42 ms P50 / 178 ms P99 latency, INT8/INT4 quantization (70% memory savings), tensor parallelism across 4 GPUs, and comprehensive monitoring, serving 1500+ concurrent users.
Updated Oct 3, 2025 - Python
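As a rough illustration of the serving stack described above, here is a minimal offline-inference sketch with vLLM's Python API. The model id, parallelism degree, and prompts are placeholders, and the platform's actual quantization and serving configuration is not reproduced here.

```python
# Minimal sketch: batched generation with vLLM; continuous batching is handled
# internally by the engine. Model name and settings are placeholders.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model id
    tensor_parallel_size=4,                    # shard weights across 4 GPUs
)
params = SamplingParams(temperature=0.7, max_tokens=128)

prompts = ["Explain int8 quantization in one paragraph."] * 32
for output in llm.generate(prompts, params):
    print(output.outputs[0].text)
```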
Post-training quantization performed on a model trained with the CLIC dataset.
Updated Sep 1, 2025 - Jupyter Notebook
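For context, the core arithmetic of post-training int8 quantization fits in a few lines. The NumPy sketch below uses random weights as a stand-in for a trained model and a simple symmetric per-tensor scheme; per-channel scales and asymmetric zero-points refine this, but the round-and-clip step is the same.

```python
# Minimal NumPy sketch of symmetric per-tensor int8 post-training quantization:
# pick a scale from the observed range, round to int8, then dequantize back.
import numpy as np

def quantize_int8(x: np.ndarray):
    # Symmetric scheme: map [-max|x|, +max|x|] onto [-127, 127].
    scale = np.max(np.abs(x)) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(256, 256).astype(np.float32)   # placeholder weights
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

print("scale:", scale)
print("mean abs error:", np.mean(np.abs(w - w_hat)))
print("memory: fp32", w.nbytes, "bytes -> int8", q.nbytes, "bytes")
```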
Translation API using Meta's NLLB-200 model, supporting 200+ languages
Updated Dec 5, 2025 - Python
Yandex LLM Scaling Week 2025
Updated Dec 8, 2025 - Jupyter Notebook