HROlive/Fundamentals-of-Accelerated-Computing-with-CUDA-Python
This course explores how to use Numba—the just-in-time, type-specializing Python function compiler—to accelerate Python programs to run on massively parallel NVIDIA GPUs.
You’ll learn how to:
- Use Numba to compile CUDA kernels from NumPy universal functions (ufuncs);
- Use Numba to create and launch custom CUDA kernels;
- Apply key GPU memory management techniques.

Upon completion, you’ll be able to use Numba to compile and launch CUDA kernels to accelerate your Python applications on NVIDIA GPUs.
At the conclusion of the workshop, you’ll understand the fundamental tools and techniques for GPU-accelerated Python applications with CUDA and Numba, and be able to:
- GPU-accelerate NumPy ufuncs with a few lines of code.
- Configure code parallelization using the CUDA thread hierarchy.
- Write custom CUDA device kernels for maximum performance and flexibility.
- Use memory coalescing and on-device shared memory to increase CUDA kernel bandwidth.
More detailed information and links for the course can be found on the course website.
The certificate for the course can be found below:
- "Fundamentals of Accelerated Computing with CUDA Python" - NVIDIA Deep Learning Institute (Issued On: January 2025)