RidgeRun/getting-started-with-cuda-opencv
This repository contains the code presented in the GTC2021 S31701 talk.
This project presents a series of programs that guide you through the process of optimizing a CUDA accelerated OpenCV algorithm. This optimization is done through a series of well-defined steps without getting into low-level CUDA programming.
The algorithm chosen to illustrate the optimization process is the calculation of the magnitude of the Sobel Derivatives. While not very interesting on its own, this algorithm is a foundational step in many algorithms such as edge detection, image segmentation, feature extraction, computer vision and more. While many optimizations can be achieved by approximating the underlying math, the original definition is kept for didactic purposes. The purpose is to focus the study on the appropriate OpenCV+CUDA handling.
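For reference, the magnitude of the Sobel derivatives at a pixel is sqrt(Gx^2 + Gy^2), where Gx and Gy are the responses of the horizontal and vertical 3x3 Sobel kernels. The following is a minimal plain C++ sketch of that definition, not the code used in this repository (the demos rely on OpenCV). The function name `sobel_magnitude` and the choice to leave border pixels at zero are assumptions made for brevity.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Magnitude of the Sobel derivatives for a row-major, single-channel image.
// Border pixels are left at 0 here; OpenCV extrapolates the border instead.
std::vector<float> sobel_magnitude(const std::vector<float>& src,
                                   int width, int height) {
  assert(width > 0 && height > 0);
  assert(src.size() == static_cast<std::size_t>(width) * height);

  std::vector<float> dst(src.size(), 0.0f);
  auto at = [&](int x, int y) {
    return src[static_cast<std::size_t>(y) * width + x];
  };

  for (int y = 1; y + 1 < height; ++y) {
    for (int x = 1; x + 1 < width; ++x) {
      // Horizontal derivative: right column minus left column, weights 1-2-1.
      const float gx =
          (at(x + 1, y - 1) + 2 * at(x + 1, y) + at(x + 1, y + 1)) -
          (at(x - 1, y - 1) + 2 * at(x - 1, y) + at(x - 1, y + 1));
      // Vertical derivative: bottom row minus top row, weights 1-2-1.
      const float gy =
          (at(x - 1, y + 1) + 2 * at(x, y + 1) + at(x + 1, y + 1)) -
          (at(x - 1, y - 1) + 2 * at(x, y - 1) + at(x + 1, y - 1));
      dst[static_cast<std::size_t>(y) * width + x] =
          std::sqrt(gx * gx + gy * gy);
    }
  }
  return dst;
}
```

Keeping the square root, rather than approximating the magnitude with |Gx| + |Gy|, matches the "original definition" mentioned above.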
As usual with OpenCV projects, the chosen build system was CMake. Start by making sure you have these dependencies installed:
- CMake
- OpenCV (with CUDA enabled)
Then proceed normally as follows:

    # Clone the project
    git clone https://github.com/RidgeRun/getting-started-with-cuda-opencv.git
    cd getting-started-with-cuda-opencv
    # Configure the project
    mkdir build
    cd build
    cmake ..
    # Build the project
    make

If everything went okay, you should be able to run the demos. You may specify the input and output images as the first and second parameters respectively. Otherwise, "dog.jpg" and "dog_gradient_XXX.jpg" will be used by default.

    # Run from the build directory
    ./sobel_cpu ../dog.jpg
    # Specify an alternative output
    ./sobel_cpu ../dog.jpg alternative_output.jpg
    # Run from top-level with default parameters
    cd ..
    ./build/sobel_cpu

The idea of the project is to use the CPU implementation as a baseline and then apply each optimization step incrementally.
- sobel_cpu: CPU baseline implementation
- sobel_gpu_1_naive: Literal port to GPU
- sobel_gpu_2_single_alloc: Allocate the GPU memory only once and recycle it through all the iterations.
- sobel_gpu_3_pinned_mem: Allocate host memory as non-pageable/pinned so that the transfer is highly optimized.
- sobel_gpu_4_shared_mem: Allocate memory shared between the GPU and CPU (if possible) to eliminate the memory transfer.
- sobel_gpu_5_shared_mem_streams: Use CUDA streams to process certain parts of the pipeline in parallel.
- sobel_gpu_5_pinned_mem_streams: Use CUDA streams to process certain parts of the pipeline in parallel (alternative implementation for pinned memory instead of shared memory).
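
The steps above can be sketched with the OpenCV CUDA API roughly as follows. This is an illustrative sketch written for this README, not the source of any of the listed programs. It assumes OpenCV was built with CUDA and the `cudaarithm` and `cudafilters` modules, and it combines single allocation, pinned host memory and two streams in one file. The output file name, the iteration count and the way work is split across the two streams are assumptions.

```cpp
// Sketch only: requires OpenCV built with CUDA (cudaarithm, cudafilters).
#include <opencv2/core.hpp>
#include <opencv2/core/cuda.hpp>
#include <opencv2/cudaarithm.hpp>
#include <opencv2/cudafilters.hpp>
#include <opencv2/imgcodecs.hpp>

int main(int argc, char** argv) {
  const char* in_path = argc > 1 ? argv[1] : "dog.jpg";
  const char* out_path = argc > 2 ? argv[2] : "sketch_output.jpg";  // hypothetical name
  const int iterations = 10;                                        // hypothetical count

  cv::Mat gray = cv::imread(in_path, cv::IMREAD_GRAYSCALE);
  if (gray.empty()) return 1;

  // Pinned (page-locked) host buffers, so transfers can run asynchronously.
  // HostMem::SHARED plus createGpuMatHeader() would be the zero-copy variant
  // on platforms that support it.
  cv::cuda::HostMem h_in(gray.size(), CV_32FC1, cv::cuda::HostMem::PAGE_LOCKED);
  cv::cuda::HostMem h_out(gray.size(), CV_32FC1, cv::cuda::HostMem::PAGE_LOCKED);
  cv::Mat in_view = h_in.createMatHeader();  // Mat header over the pinned buffer
  gray.convertTo(in_view, CV_32FC1);         // cv::cuda::magnitude needs CV_32FC1

  // Device buffers and filters are created once and reused in every iteration.
  cv::cuda::GpuMat d_in(gray.size(), CV_32FC1);
  cv::cuda::GpuMat d_gx(gray.size(), CV_32FC1);
  cv::cuda::GpuMat d_gy(gray.size(), CV_32FC1);
  cv::cuda::GpuMat d_mag(gray.size(), CV_32FC1);
  cv::Ptr<cv::cuda::Filter> sobel_x =
      cv::cuda::createSobelFilter(CV_32FC1, CV_32FC1, 1, 0);
  cv::Ptr<cv::cuda::Filter> sobel_y =
      cv::cuda::createSobelFilter(CV_32FC1, CV_32FC1, 0, 1);

  // Two streams so the X and Y derivatives may overlap on the GPU.
  cv::cuda::Stream stream_x, stream_y;
  cv::cuda::Event uploaded, gy_done;

  for (int i = 0; i < iterations; ++i) {
    d_in.upload(h_in, stream_x);
    uploaded.record(stream_x);
    stream_y.waitEvent(uploaded);  // Y branch starts once the input is on the GPU

    sobel_x->apply(d_in, d_gx, stream_x);
    sobel_y->apply(d_in, d_gy, stream_y);

    gy_done.record(stream_y);
    stream_x.waitEvent(gy_done);   // join both branches before the magnitude
    cv::cuda::magnitude(d_gx, d_gy, d_mag, stream_x);
    d_mag.download(h_out, stream_x);
  }
  stream_x.waitForCompletion();

  // Convert to 8-bit (saturating) before writing the result.
  cv::Mat out8;
  h_out.createMatHeader().convertTo(out8, CV_8UC1);
  return cv::imwrite(out_path, out8) ? 0 : 1;
}
```

Moving the `GpuMat` and filter construction inside the loop, and replacing the `HostMem` buffers with plain `cv::Mat` objects, would roughly correspond to the naive port that the later steps improve on.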
About
GTC2021: [S31701] Getting Started with CUDA Accelerated OpenCV