cuda-kernels
Here are 277 public repositories matching this topic...
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
- Updated Dec 4, 2025 - Cuda
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
- Updated Sep 5, 2025 - C
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
- Updated Dec 18, 2025 - Python
Ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust.
- Updated Dec 16, 2025 - Rust
Deep learning in Rust, with shape checked tensors and neural networks
- Updated Jul 23, 2024 - Rust
Safe Rust wrapper around the CUDA toolkit
- Updated Dec 11, 2025 - Rust
CUDA Kernel Benchmarking Library
- Updated Dec 10, 2025 - Cuda
Kernel Tuner
- Updated Dec 16, 2025 - Python
Simple utilities to enable code reuse and portability between CUDA C/C++ and standard C/C++.
- Updated Apr 14, 2022 - C++
CUDA-L2: Surpassing cuBLAS Performance for Matrix Multiplication through Reinforcement Learning
- Updated Dec 15, 2025 - Cuda
This is an archive of materials produced for an introductory class on CUDA programming at Stanford University in 2010
- Updated Jun 24, 2022 - C++
CUDA tutorials for maths and ML with examples, covering multi-GPU programming, fused attention, Winograd convolution, and reinforcement learning.
- Updated Jun 11, 2025 - Cuda
Amplifier allows .NET developers to easily run complex applications with intensive mathematical computation on Intel CPU/GPU, NVIDIA, AMD without writing any additional C kernel code. Write your function in .NET and Amplifier will take care of running it on your favorite hardware.
- Updated Apr 9, 2025 - C#
Some CUDA design patterns and a bit of template magic for CUDA
- Updated Jun 3, 2023 - C++
Triton implementation of FlashAttention2 that adds Custom Masks.
- Updated Aug 14, 2024 - Python
Spiking Neural Networks in C++ with strong GPU acceleration through CUDA
- Updated Jul 3, 2020 - Cuda
High-speed GEMV kernels, with up to 2.7x speedup compared to the PyTorch baseline.
- Updated Jul 13, 2024 - Cuda
CUDA kernel author's tools
- Updated Apr 24, 2022 - Cuda