Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings
#

model-acceleration

Here are 27 public repositories matching this topic...

A curated list of neural network pruning resources.

  • UpdatedApr 4, 2024

A list of papers, docs, codes about model quantization. This repo is aimed to provide the info for model quantization research, we are continuously improving the project. Welcome to PR the works (papers, repositories) that are missed by the repo.

  • UpdatedJan 29, 2026

A list of high-quality (newest) AutoML works and lightweight models including 1.) Neural Architecture Search, 2.) Lightweight Structures, 3.) Model Compression, Quantization and Acceleration, 4.) Hyperparameter Optimization, 5.) Automated Feature Engineering.

  • UpdatedJun 19, 2021

Papers for deep neural network compression and acceleration

  • UpdatedJun 21, 2021

📚 Collection of token-level model compression resources.

  • UpdatedSep 3, 2025

[CVPR 2025] CoDe: Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient

  • UpdatedSep 27, 2025
  • Python

Resources of our survey paper "Optimizing Edge AI: A Comprehensive Survey on Data, Model, and System Strategies"

  • UpdatedSep 11, 2025

[NeurIPS'24] Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy

  • UpdatedJan 22, 2025
  • Python

[NeurIPS 2025] ScaleKV: Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression

  • UpdatedNov 4, 2025
  • Python

A list of papers, docs, codes about diffusion distillation.This repo collects various distillation methods for the Diffusion model. Welcome to PR the works (papers, repositories) missed by the repo.

  • UpdatedDec 10, 2023

Deep Learning Compression and Acceleration SDK -- deep model compression for Edge and IoT embedded systems, and deep model acceleration for clouds and private servers

  • UpdatedMar 17, 2018

(NeurIPS-2019 MicroNet Challenge - 3rd Winner) Open source code for "SIPA: A simple framework for efficient networks"

  • UpdatedDec 18, 2022
  • Python

Learn the ins and outs of efficiently serving Large Language Models (LLMs). Dive into optimization techniques, including KV caching and Low Rank Adapters (LoRA), and gain hands-on experience with Predibase’s LoRAX framework inference server.

  • UpdatedApr 12, 2024
  • Jupyter Notebook

[IJCNN'19, IEEE JSTSP'19] Caffe code for our paper "Structured Pruning for Efficient ConvNets via Incremental Regularization"; [BMVC'18] "Structured Probabilistic Pruning for Convolutional Neural Network Acceleration"

  • UpdatedFeb 14, 2020
  • Makefile

A list of papers, docs, codes about diffusion quantization.This repo collects various quantization methods for the Diffusion Models. Welcome to PR the works (papers, repositories) missed by the repo.

  • UpdatedFeb 2, 2026

Improve this page

Add a description, image, and links to themodel-acceleration topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with themodel-acceleration topic, visit your repo's landing page and select "manage topics."

Learn more


[8]ページ先頭

©2009-2026 Movatter.jp