A Unified and Flexible Inference Engine with Hybrid Cache Acceleration and Parallelism for 🤗Diffusers.
vipshop/cache-dit
We are excited to announce that the first API-stable version (v1.0.0) of cache-dit has finally been released! cache-dit is a Unified and Flexible Inference Engine for 🤗Diffusers, enabling acceleration with just one line of code.
You can install the stable release of cache-dit from PyPI, or the latest development version from GitHub:

```bash
pip3 install -U cache-dit
# pip3 install git+https://github.com/vipshop/cache-dit.git
```

Then try:
```python
>>> import cache_dit
>>> from diffusers import DiffusionPipeline
>>> pipe = DiffusionPipeline.from_pretrained("Qwen/Qwen-Image")  # Can be any diffusion pipeline
>>> cache_dit.enable_cache(pipe)  # One-line code with default cache options.
>>> output = pipe(...)  # Just call the pipe as normal.
>>> stats = cache_dit.summary(pipe)  # Then, get the summary of cache acceleration stats.
>>> cache_dit.disable_cache(pipe)  # Disable cache and run original pipe.
```
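The enable / run / disable sequence above can be wrapped so that the cache is always switched off again, even if the pipeline call raises. The following is only a sketch: `cache_enabled` and `StubCacheAPI` are hypothetical names introduced here, not part of cache-dit. The helper assumes nothing beyond the `enable_cache(pipe)` and `disable_cache(pipe)` calls shown in the quick start, and the stub stands in for the real module so the example runs without model weights.

```python
from contextlib import contextmanager
from typing import Any, Iterator, List


@contextmanager
def cache_enabled(pipe: Any, cache_api: Any) -> Iterator[Any]:
    """Hypothetical helper: enable caching on `pipe` for the span of a `with` block.

    `cache_api` only needs enable_cache(pipe) and disable_cache(pipe),
    the two calls used in the quick start.
    """
    cache_api.enable_cache(pipe)
    try:
        yield pipe
    finally:
        # Runs on normal exit and on exceptions alike.
        cache_api.disable_cache(pipe)


class StubCacheAPI:
    """Stand-in for the `cache_dit` module; it only records the calls it receives."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def enable_cache(self, pipe: Any) -> None:
        self.calls.append("enable")

    def disable_cache(self, pipe: Any) -> None:
        self.calls.append("disable")


stub = StubCacheAPI()
with cache_enabled("any-pipeline-object", stub) as p:
    pass  # with a real pipeline: output = p(...)
print(stub.calls)
```

With the real library installed, passing the imported `cache_dit` module as `cache_api` should behave the same way. Following the order used in the quick start, `cache_dit.summary(pipe)` would be called inside the `with` block, before the cache is disabled.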
- 🎉Full 🤗Diffusers Support: Notably, cache-dit now supports nearly all of Diffusers' DiT-based pipelines, including 30+ series and nearly 100+ pipelines, such as FLUX.1, Qwen-Image, Qwen-Image-Lightning, Wan 2.1/2.2, HunyuanImage-2.1, HunyuanVideo, HiDream, AuraFlow, CogView3Plus, CogView4, CogVideoX, LTXVideo, ConsisID, SkyReelsV2, VisualCloze, PixArt, Chroma, Mochi, SD 3.5, DiT-XL, etc.
- 🎉Extremely Easy to Use: In most cases, you only need one line of code: `cache_dit.enable_cache(...)`. After calling this API, just use the pipeline as normal.
- 🎉Easy New Model Integration: Features like Unified Cache APIs, Forward Pattern Matching, Automatic Block Adapter, Hybrid Forward Pattern, and Patch Functor make it highly functional and flexible. For example, we achieved 🎉 Day 1 support for HunyuanImage-2.1 with a 1.7x speedup without precision loss, even before it was available in the Diffusers library.
- 🎉State-of-the-Art Performance: Compared with algorithms including Δ-DiT, Chipmunk, FORA, DuCa, TaylorSeer and FoCa, cache-dit achieves SOTA performance with a 7.4x↑🎉 speedup on ClipScore!
- 🎉Support for 4/8-Steps Distilled Models: Surprisingly, cache-dit's DBCache works for extremely few-step distilled models, something many other methods fail to do.
- 🎉Compatibility with Other Optimizations: Designed to work seamlessly with torch.compile, Quantization (torchao, 🔥nunchaku), CPU or Sequential Offloading, 🔥Context Parallelism, 🔥Tensor Parallelism, etc.
- 🎉Hybrid Cache Acceleration: Now supports hybrid Block-wise Cache + Calibrator schemes (e.g., DBCache or DBPrune + TaylorSeerCalibrator). DBCache or DBPrune acts as the Indicator to decide when to cache, while the Calibrator decides how to cache. More mainstream cache acceleration algorithms (e.g., FoCa) will be supported in the future, along with additional benchmarks. Stay tuned for updates!
- 🤗Diffusers Ecosystem Integration: 🔥cache-dit has joined the Diffusers community ecosystem as the first DiT-specific cache acceleration framework! Check out the documentation here:
Tip
One Model Series may contain many pipelines. cache-dit applies optimizations at the Transformer level; thus, any pipeline that includes a supported transformer is already supported by cache-dit. ✅: known to work and officially supported; ✖️: not officially supported yet, but may be supported in the future; 4-bits: w/ nunchaku + svdq int4. CP: Context Parallelism; TP: Tensor Parallelism.
| 📚Model | Cache | CP | TP | 📚Model | Cache | CP | TP |
|---|---|---|---|---|---|---|---|
| 🎉FLUX.1 | ✅ | ✅ | ✅ | 🎉FLUX.1 4-bits | ✅ | ✅ | ✖️ |
| 🎉Qwen-Image | ✅ | ✅ | ✅ | 🎉Qwen-Image 4-bits | ✅ | ✅ | ✖️ |
| 🎉Qwen...Lightning | ✅ | ✅ | ✅ | 🎉Qwen...Lightning 4-bits | ✅ | ✅ | ✖️ |
| 🎉CogVideoX | ✅ | ✅ | ✖️ | 🎉OmniGen | ✅ | ✖️ | ✖️ |
| 🎉Wan 2.1 | ✅ | ✅ | ✅ | 🎉PixArt Sigma | ✅ | ✅ | ✖️ |
| 🎉Wan 2.1 VACE | ✅ | ✅ | ✅ | 🎉PixArt Alpha | ✅ | ✅ | ✖️ |
| 🎉Wan 2.2 | ✅ | ✅ | ✅ | 🎉CogVideoX 1.5 | ✅ | ✅ | ✖️ |
| 🎉HunyuanVideo | ✅ | ✅ | ✅ | 🎉Sana | ✅ | ✖️ | ✖️ |
| 🎉LTXVideo | ✅ | ✅ | ✖️ | 🎉VisualCloze | ✅ | ✅ | ✅ |
| 🎉Allegro | ✅ | ✖️ | ✖️ | 🎉AuraFlow | ✅ | ✖️ | ✖️ |
| 🎉CogView4 | ✅ | ✅ | ✖️ | 🎉ShapE | ✅ | ✖️ | ✖️ |
| 🎉CogView3Plus | ✅ | ✅ | ✖️ | 🎉Chroma | ✅ | ✅ | ✅ |
| 🎉Cosmos | ✅ | ✖️ | ✖️ | 🎉HiDream | ✅ | ✖️ | ✖️ |
| 🎉EasyAnimate | ✅ | ✖️ | ✖️ | 🎉HunyuanDiT | ✅ | ✖️ | ✅ |
| 🎉SkyReelsV2 | ✅ | ✖️ | ✖️ | 🎉HunyuanDiTPAG | ✅ | ✖️ | ✖️ |
| 🎉StableDiffusion3 | ✅ | ✖️ | ✖️ | 🎉Kandinsky5 | ✅ | ✖️ | ✅ |
| 🎉ConsisID | ✅ | ✅ | ✖️ | 🎉PRX | ✅ | ✖️ | ✖️ |
| 🎉DiT | ✅ | ✅ | ✖️ | 🎉HunyuanImage | ✅ | ✅ | ✅ |
| 🎉Amused | ✅ | ✖️ | ✖️ | 🎉LongCatVideo | ✅ | ✖️ | ✖️ |
| 🎉StableAudio | ✅ | ✖️ | ✖️ | 🎉Bria | ✅ | ✖️ | ✖️ |
| 🎉Mochi | ✅ | ✖️ | ✅ | 🎉Lumina | ✅ | ✖️ | ✖️ |
🎉Now, cache-dit covers almost all of Diffusers' DiT pipelines🎉
🔥Qwen-Image | Qwen-Image-Edit | Qwen-Image-Edit-Plus🔥
🔥FLUX.1 | Qwen-Image-Lightning 4/8 Steps | Wan 2.1 | Wan 2.2🔥
🔥HunyuanImage-2.1 | HunyuanVideo | HunyuanDiT | HiDream | AuraFlow🔥
🔥CogView3Plus | CogView4 | LTXVideo | CogVideoX | CogVideoX 1.5 | ConsisID🔥
🔥Cosmos | SkyReelsV2 | VisualCloze | OmniGen 1/2 | Lumina 1/2 | PixArt🔥
🔥Chroma | Sana | Allegro | Mochi | SD 3/3.5 | Amused | ... | DiT-XL🔥




- 🔥Wan2.2 MoE | +cache-dit: 2.0x↑🎉 | HunyuanVideo | +cache-dit: 2.1x↑🎉
- 🔥Qwen-Image | +cache-dit: 1.8x↑🎉 | FLUX.1-dev | +cache-dit: 2.1x↑🎉
- 🔥Qwen...Lightning | +cache-dit: 1.14x↑🎉 | HunyuanImage | +cache-dit: 1.7x↑🎉
- 🔥Qwen-Image-Edit | Input w/o Edit | Baseline | +cache-dit: 1.6x↑🎉 | 1.9x↑🎉
- 🔥FLUX-Kontext-dev | Baseline | +cache-dit: 1.3x↑🎉 | 1.7x↑🎉 | 2.0x↑🎉
- 🔥HiDream-I1 | +cache-dit: 1.9x↑🎉 | CogView4 | +cache-dit: 1.4x↑🎉 | 1.7x↑🎉
- 🔥CogView3 | +cache-dit: 1.5x↑🎉 | 2.0x↑🎉 | Chroma1-HD | +cache-dit: 1.9x↑🎉
- 🔥Mochi-1-preview | +cache-dit: 1.8x↑🎉 | SkyReelsV2 | +cache-dit: 1.6x↑🎉
- 🔥VisualCloze-512 | Model | Cloth | Baseline | +cache-dit: 1.4x↑🎉 | 1.7x↑🎉
- 🔥LTX-Video-0.9.7 | +cache-dit: 1.7x↑🎉 | CogVideoX1.5 | +cache-dit: 2.0x↑🎉
- 🔥OmniGen-v1 | +cache-dit: 1.5x↑🎉 | 3.3x↑🎉 | Lumina2 | +cache-dit: 1.9x↑🎉
- 🔥Allegro | +cache-dit: 1.36x↑🎉 | AuraFlow-v0.3 | +cache-dit: 2.27x↑🎉
- 🔥Sana | +cache-dit: 1.3x↑🎉 | 1.6x↑🎉 | PixArt-Sigma | +cache-dit: 2.3x↑🎉
- 🔥PixArt-Alpha | +cache-dit: 1.6x↑🎉 | 1.8x↑🎉 | SD 3.5 | +cache-dit: 2.5x↑🎉
- 🔥Amused | +cache-dit: 1.1x↑🎉 | 1.2x↑🎉 | DiT-XL-256 | +cache-dit: 1.8x↑🎉

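
To reproduce figures of this kind on your own hardware, one option is to time the same call with and without caching and take the ratio. This is a minimal sketch and rests on an assumption the source does not spell out: that an "Nx↑" figure means baseline latency divided by accelerated latency. `mean_latency` and `speedup` are hypothetical helper names, not cache-dit APIs.

```python
import time
from typing import Callable


def mean_latency(fn: Callable[[], object], repeats: int = 3, warmup: int = 1) -> float:
    """Average wall-clock seconds per call of `fn`, after `warmup` untimed calls."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


def speedup(baseline_seconds: float, accelerated_seconds: float) -> float:
    """Speedup as baseline latency over accelerated latency (assumed definition)."""
    if baseline_seconds < 0 or accelerated_seconds <= 0:
        raise ValueError("latencies must be non-negative, and accelerated must be > 0")
    return baseline_seconds / accelerated_seconds


# Illustrative numbers only, not measurements.
example = speedup(baseline_seconds=21.0, accelerated_seconds=10.0)
print(f"{example:.1f}x")
```

For a real pipeline, `fn` would be something like `lambda: pipe(...)`, measured once before `cache_dit.enable_cache(pipe)` and once after. GPU work is typically asynchronous, so the device usually needs to be synchronized before the clock is read.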
For more advanced features such as Unified Cache APIs, Forward Pattern Matching, Automatic Block Adapter, Hybrid Forward Pattern, Patch Functor, DBCache, DBPrune, TaylorSeer Calibrator, Hybrid Cache CFG, Context Parallelism and Tensor Parallelism, please refer to the 🎉User_Guide.md for details.
- ⚙️Installation
- 🔥Supported DiTs
- 🔥Benchmarks
- 🎉Unified Cache APIs
- ⚡️DBCache: Dual Block Cache
- ⚡️DBPrune: Dynamic Block Prune
- ⚡️Hybrid Cache CFG
- 🔥Hybrid TaylorSeer Calibrator
- ⚡️Hybrid Context Parallelism
- ⚡️Hybrid Tensor Parallelism
- 🤖Low-bits Quantization
- 🛠Metrics Command Line
- ⚙️Torch Compile
- 📚API Documents
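
As rough intuition for the block-wise cache and calibrator items listed above, the toy below separates the two roles described earlier: an indicator decides when a step may skip computation, and a calibrator decides how the skipped result is approximated. It is a conceptual sketch on plain floats, not cache-dit's implementation. Every class name, the threshold rule, and the linear extrapolation are assumptions made for illustration.

```python
from typing import Callable, List, Optional

Block = Callable[[float], float]


class ReuseCalibrator:
    """Simplest 'how': reuse the most recently computed residual unchanged."""

    def __init__(self) -> None:
        self.last: Optional[float] = None

    def update(self, residual: float) -> None:
        self.last = residual

    def predict(self) -> float:
        if self.last is None:
            raise RuntimeError("nothing cached yet")
        return self.last


class LinearCalibrator(ReuseCalibrator):
    """First-order extrapolation from the last two computed residuals."""

    def __init__(self) -> None:
        super().__init__()
        self.prev: Optional[float] = None

    def update(self, residual: float) -> None:
        self.prev, self.last = self.last, residual

    def predict(self) -> float:
        base = super().predict()
        return base if self.prev is None else base + (base - self.prev)


class ToyHybridCache:
    """Always runs the first block; may skip the rest when its residual barely moved."""

    def __init__(self, blocks: List[Block], threshold: float, calibrator: ReuseCalibrator) -> None:
        self.first, self.rest = blocks[0], blocks[1:]
        self.threshold = threshold
        self.calibrator = calibrator
        self.ref_residual: Optional[float] = None  # first-block residual at last full step
        self.computed_steps = 0
        self.cached_steps = 0

    def _can_cache(self, residual: float) -> bool:
        # Indicator ("when"): relative change versus the last fully computed step.
        if self.ref_residual is None:
            return False
        scale = abs(self.ref_residual) or 1e-12
        return abs(residual - self.ref_residual) / scale < self.threshold

    def step(self, x: float) -> float:
        hidden = self.first(x)
        residual = hidden - x
        if self._can_cache(residual):
            self.cached_steps += 1
            return hidden + self.calibrator.predict()  # Calibrator ("how")
        out = hidden
        for block in self.rest:
            out = block(out)
        self.calibrator.update(out - hidden)
        self.ref_residual = residual
        self.computed_steps += 1
        return out


blocks: List[Block] = [lambda v: v * 1.5, lambda v: v * 2.0, lambda v: v - 3.0]
cache = ToyHybridCache(blocks, threshold=0.05, calibrator=ReuseCalibrator())
outputs = [cache.step(x) for x in (1.0, 1.01, 2.0)]
print(cache.computed_steps, cache.cached_steps)
```

The second input is close enough to the first that the remaining blocks are skipped and the cached residual is reused. The third input differs too much, so the full stack runs again. Passing `LinearCalibrator()` instead changes only how a skipped step is approximated, not when skipping happens.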
How to contribute? Star ⭐️ this repo to support us or check CONTRIBUTE.md.
Here is a curated list of open-source projects integrating CacheDiT, including popular repositories like jetson-containers, flux-fast, and sdnext. 🎉CacheDiT has been recommended by: Wan 2.2, Qwen-Image-Lightning, Qwen-Image, LongCat-Video, Kandinsky-5, 🤗diffusers and HelloGitHub, among others.
Special thanks to vipshop's Computer Vision AI Team for supporting the documentation, testing, and production-level deployment of this project.
```BibTeX
@misc{cache-dit@2025,
  title={cache-dit: A Unified and Flexible Inference Engine with Hybrid Cache Acceleration and Parallelism for Diffusers.},
  url={https://github.com/vipshop/cache-dit.git},
  note={Open-source software available at https://github.com/vipshop/cache-dit.git},
  author={DefTruth, vipshop.com},
  year={2025}
}
```