Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings
xlite-dev

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings
@xlite-dev

xlite-dev

Develop ML/AI toolkits and ML/AI/CUDA Learning resources.

PinnedLoading

  1. LeetCUDALeetCUDAPublic

    📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

    Cuda 8.3k 823

  2. lite.ai.toolkitlite.ai.toolkitPublic

    🛠A lite C++ AI toolkit: 100+ models with MNN, ORT and TRT, including Det, Seg, Stable-Diffusion, Face-Fusion, etc.🎉

    C++ 4.3k 763

  3. Awesome-LLM-InferenceAwesome-LLM-InferencePublic

    📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

    Python 4.7k 319

  4. Awesome-DiT-InferenceAwesome-DiT-InferencePublic

    📚A curated list of Awesome Diffusion Inference Papers with Codes: Sampling, Cache, Quantization, Parallelism, etc.🎉

    Python 435 21

  5. torchlmtorchlmPublic

    💎An easy-to-use PyTorch library for face landmarks detection: training, evaluation, inference, and 100+ data augmentations.🎉

    Python 266 27

  6. ffpa-attnffpa-attnPublic

    🤖FFPA: Extend FlashAttention-2 with Split-D, ~O(1) SRAM complexity for large headdim, 1.8x~3x↑🎉 vs SDPA EA.

    Cuda 226 10

Repositories

Loading
Type
Select type
Language
Select language
Sort
Select order
Showing 10 of 49 repositories

[8]ページ先頭

©2009-2025 Movatter.jp