model-acceleration
Here are 27 public repositories matching this topic...
A curated list of neural network pruning resources.
- Updated Apr 4, 2024
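As a concrete illustration of the pruning techniques such resources cover, here is a minimal magnitude-pruning sketch in NumPy. This is an illustrative example under simple assumptions, not any listed repo's method: it zeroes out the smallest-magnitude weights until a target sparsity is reached.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude entries
    set to zero. `sparsity` is the fraction of weights to remove."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)       # number of weights to drop
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold.
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold  # keep strictly larger entries
    return weights * mask

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))
Wp = magnitude_prune(W, sparsity=0.5)
achieved = 1.0 - np.count_nonzero(Wp) / Wp.size  # ≈ 0.5 for continuous weights
```

In practice, structured variants prune whole channels or filters (as in several repos below) so the resulting network is smaller on real hardware, not just sparser.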
A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research and is continuously improved. PRs adding works (papers, repositories) missing from the repo are welcome.
- Updated Jan 29, 2026
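To make the quantization theme concrete, here is a minimal post-training symmetric int8 quantization sketch in NumPy. It is an illustrative example, not the implementation of any listed repo: one scale per tensor, round-to-nearest, and a check that the reconstruction error stays within half a quantization step.

```python
import numpy as np

def quantize_symmetric_int8(x):
    """Map a float tensor to int8 with a single symmetric scale."""
    max_abs = float(np.abs(x).max())
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.round(x / scale).astype(np.int8)  # values lie in [-127, 127]
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
x = rng.normal(size=1000).astype(np.float32)
q, s = quantize_symmetric_int8(x)
err = np.abs(dequantize(q, s) - x).max()  # bounded by 0.5 * scale
```

Real quantization pipelines add per-channel scales, asymmetric zero-points, and calibration data; the papers collected above survey those refinements.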
A list of high-quality, recent AutoML works and lightweight models, including 1) Neural Architecture Search; 2) Lightweight Structures; 3) Model Compression, Quantization, and Acceleration; 4) Hyperparameter Optimization; 5) Automated Feature Engineering.
- Updated Jun 19, 2021
Papers for deep neural network compression and acceleration
- Updated Jun 21, 2021
📚 Collection of awesome generation acceleration resources.
- Updated Jul 7, 2025
[TMLR 2026] Survey: https://arxiv.org/pdf/2507.20198
- Updated Feb 10, 2026
📚 Collection of token-level model compression resources.
- Updated Sep 3, 2025
[CVPR 2025] CoDe: Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient
- Updated Sep 27, 2025 - Python
Resources of our survey paper "Optimizing Edge AI: A Comprehensive Survey on Data, Model, and System Strategies"
- Updated Sep 11, 2025
[NeurIPS'24] Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy
- Updated Jan 22, 2025 - Python
MUSCO: MUlti-Stage COmpression of neural networks
- Updated Feb 16, 2021 - Jupyter Notebook
[NeurIPS 2025] ScaleKV: Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression
- Updated Nov 4, 2025 - Python
A list of papers, docs, and code about diffusion distillation. This repo collects various distillation methods for diffusion models. PRs adding works (papers, repositories) missed by the repo are welcome.
- Updated Dec 10, 2023
Deep Learning Compression and Acceleration SDK -- deep model compression for Edge and IoT embedded systems, and deep model acceleration for clouds and private servers
- Updated Mar 17, 2018
(NeurIPS-2019 MicroNet Challenge - 3rd Winner) Open source code for "SIPA: A simple framework for efficient networks"
- Updated Dec 18, 2022 - Python
Learn the ins and outs of efficiently serving Large Language Models (LLMs). Dive into optimization techniques, including KV caching and Low Rank Adapters (LoRA), and gain hands-on experience with Predibase’s LoRAX framework inference server.
- Updated Apr 12, 2024 - Jupyter Notebook
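The KV caching mentioned in this course can be sketched in a few lines of NumPy. This is a toy single-head attention example under stated assumptions (random projection matrices, no softmax masking subtleties), not the LoRAX or Predibase API: with a cache, each decoding step projects only the newest token and appends its key/value row, instead of reprojecting the whole prefix.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # head dimension (toy size)

def attend(q, K, V):
    """Scaled dot-product attention for a single query vector."""
    scores = K @ q / np.sqrt(d)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V

# Random projections standing in for trained weights.
Wk, Wv, Wq = (rng.normal(size=(d, d)) for _ in range(3))

def decode_with_cache(tokens):
    K_cache, V_cache, outputs = [], [], []
    for x in tokens:                 # x: embedding of the newest token
        K_cache.append(Wk @ x)       # cache grows by one row per step
        V_cache.append(Wv @ x)
        outputs.append(attend(Wq @ x, np.array(K_cache), np.array(V_cache)))
    return np.array(outputs)

def decode_without_cache(tokens):
    outputs = []
    for t in range(1, len(tokens) + 1):
        prefix = tokens[:t]
        K = prefix @ Wk.T            # recomputed from scratch: O(t) work per step
        V = prefix @ Wv.T
        outputs.append(attend(Wq @ prefix[-1], K, V))
    return np.array(outputs)

tokens = rng.normal(size=(6, d))
# Both paths produce identical outputs; the cached path avoids the
# quadratic reprojection cost, at the price of storing K and V.
assert np.allclose(decode_with_cache(tokens), decode_without_cache(tokens))
```

The memory cost of that stored cache is exactly what KV-cache compression work (such as the ScaleKV entry above) targets.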
Bayesian Optimization-Based Global Optimal Rank Selection for Compression of Convolutional Neural Networks, IEEE Access
- Updated Mar 21, 2021 - Python
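The low-rank compression underlying this line of work can be sketched with a truncated SVD. This is an illustrative NumPy example only; the cited paper's contribution is selecting the rank via Bayesian optimization, which is not reproduced here.

```python
import numpy as np

def low_rank_factors(W, rank):
    """Factor W ≈ A @ B with thin factors via truncated SVD."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]   # shape (m, rank)
    B = Vt[:rank]                # shape (rank, n)
    return A, B

rng = np.random.default_rng(0)
# A nearly rank-4 weight matrix plus small noise.
W = rng.normal(size=(64, 4)) @ rng.normal(size=(4, 64))
W += 0.01 * rng.normal(size=W.shape)

A, B = low_rank_factors(W, rank=4)
params_full = W.size              # 4096 parameters
params_low = A.size + B.size      # 512 parameters: an 8x reduction
rel_err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
```

Replacing a dense layer by two thin layers in this way trades a small approximation error for a large parameter and FLOP reduction; picking the right rank per layer is the hard part.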
[IJCNN'19, IEEE JSTSP'19] Caffe code for our paper "Structured Pruning for Efficient ConvNets via Incremental Regularization"; [BMVC'18] "Structured Probabilistic Pruning for Convolutional Neural Network Acceleration"
- Updated Feb 14, 2020 - Makefile
A list of papers, docs, and code about diffusion quantization. This repo collects various quantization methods for diffusion models. PRs adding works (papers, repositories) missed by the repo are welcome.
- Updated Feb 2, 2026
On Efficient Variants of Segment Anything Model
- Updated Jul 2, 2025