xlite-dev
🛠 Repositories: lite.ai.toolkit | 📚Awesome-LLM-Inference | 📚LeetCUDA 🎧
🤖 ffpa-attn | 📈HGEMM | 🤗flux-faster | 📚Awesome-DiT-Inference 🖱
⚙️ RVM-Inference | lihang-notes (📚PDF, 200 pages) | 💎torchlm 🔥
🤖 Contact: qyjdef@163.com | GitHub: DefTruth | Zhihu: DefTruth
Pinned
- lite.ai.toolkit (Public): 🛠A lite C++ AI toolkit: 100+ models with MNN, ORT and TRT, including Det, Seg, Stable-Diffusion, Face-Fusion, etc.🎉
- Awesome-LLM-Inference (Public): 📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉
- Awesome-DiT-Inference (Public): 📚A curated list of Awesome Diffusion Inference Papers with Codes: Sampling, Cache, Quantization, Parallelism, etc.🎉
Repositories
- lite.ai.toolkit (Public): 🛠A lite C++ AI toolkit: 100+ models with MNN, ORT and TRT, including Det, Seg, Stable-Diffusion, Face-Fusion, etc.🎉
- diffusers (Public, forked from huggingface/diffusers): 🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
- sglang (Public, forked from sgl-project/sglang): SGLang is a fast serving framework for large language models and vision language models.
- vllm-omni (Public, forked from vllm-project/vllm-omni): A framework for efficient model inference with omni-modality models.
- LeetCUDA (Public): 📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
- SageAttention (Public, forked from thu-ml/SageAttention): Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.
- Z-Image (Public)
- Awesome-LLM-Inference (Public): 📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉
- Awesome-DiT-Inference (Public): 📚A curated list of Awesome Diffusion Inference Papers with Codes: Sampling, Cache, Quantization, Parallelism, etc.🎉
