xlite-dev

Develop ML/AI toolkits and ML/AI/CUDA Learning resources.

Pinned

  1. LeetCUDA (Public)

    📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

    Cuda | 9.6k stars | 948 forks

  2. lite.ai.toolkit (Public)

    🛠A lite C++ AI toolkit: 100+ models with MNN, ORT and TRT, including Det, Seg, Stable-Diffusion, Face-Fusion, etc.🎉

    C++ | 4.4k stars | 773 forks

  3. Awesome-LLM-Inference (Public)

    📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

    Python | 5k stars | 338 forks

  4. Awesome-DiT-Inference (Public)

    📚A curated list of Awesome Diffusion Inference Papers with Codes: Sampling, Cache, Quantization, Parallelism, etc.🎉

    Python | 518 stars | 25 forks

  5. torchlm (Public)

    💎An easy-to-use PyTorch library for face landmarks detection: training, evaluation, inference, and 100+ data augmentations.🎉

    Python | 267 stars | 27 forks

  6. ffpa-attn (Public)

    🤖FFPA: Extend FlashAttention-2 with Split-D, ~O(1) SRAM complexity for large headdim, 1.8x~3x↑🎉 vs SDPA EA.

    Cuda | 250 stars | 13 forks

Repositories

Showing 10 of 55 repositories
