xlite-dev

Develop ML/AI toolkits and ML/AI/CUDA Learning resources.

Pinned

  1. LeetCUDA (Public)

    📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

    Cuda 9k 876

  2. lite.ai.toolkit (Public)

    🛠A lite C++ AI toolkit: 100+ models with MNN, ORT and TRT, including Det, Seg, Stable-Diffusion, Face-Fusion, etc.🎉

    C++ 4.3k 768

  3. Awesome-LLM-Inference (Public)

    📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

    Python 4.8k 327

  4. Awesome-DiT-Inference (Public)

    📚A curated list of Awesome Diffusion Inference Papers with Codes: Sampling, Cache, Quantization, Parallelism, etc.🎉

    Python 469 24

  5. torchlm (Public)

    💎An easy-to-use PyTorch library for face landmarks detection: training, evaluation, inference, and 100+ data augmentations.🎉

    Python 268 27

  6. ffpa-attn (Public)

    🤖FFPA: Extend FlashAttention-2 with Split-D, ~O(1) SRAM complexity for large headdim, 1.8x~3x↑🎉 vs SDPA EA.

    Cuda 241 12

Repositories

Showing 10 of 52 repositories
  • lite.ai.toolkit Public

    🛠A lite C++ AI toolkit: 100+ models with MNN, ORT and TRT, including Det, Seg, Stable-Diffusion, Face-Fusion, etc.🎉

    C++ 4,323 GPL-3.0 768 0 0 Updated Dec 12, 2025
  • diffusers Public Forked from huggingface/diffusers

    🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.

    Python 0 Apache-2.0 6,681 0 0 Updated Dec 12, 2025
  • sglang Public Forked from sgl-project/sglang

    SGLang is a fast serving framework for large language models and vision language models.

    Python 0 Apache-2.0 3,801 0 0 Updated Dec 12, 2025
  • vllm-omni Public Forked from vllm-project/vllm-omni

    A framework for efficient inference with omni-modality models.

    Python 0 Apache-2.0 132 0 0 Updated Dec 11, 2025
  • LeetCUDA Public

    📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

    Cuda 8,958 GPL-3.0 876 3 1 Updated Dec 4, 2025
  • SageAttention Public Forked from thu-ml/SageAttention

    Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.

    Cuda 0 Apache-2.0 287 0 0 Updated Dec 3, 2025
  • Z-Image Public Forked from Tongyi-MAI/Z-Image
    1 Apache-2.0 422 0 0 Updated Nov 28, 2025
  • Awesome-LLM-Inference Public

    📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

    Python 4,837 GPL-3.0 327 1 0 Updated Nov 28, 2025
  • Awesome-DiT-Inference Public

    📚A curated list of Awesome Diffusion Inference Papers with Codes: Sampling, Cache, Quantization, Parallelism, etc.🎉

    Python 469 GPL-3.0 24 0 0 Updated Nov 28, 2025
  • .github Public
    10 0 0 Updated Nov 25, 2025