[ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl
NVIDIA/thrust
Examples | Godbolt | Documentation
Thrust is the C++ parallel algorithms library which inspired the introduction of parallel algorithms to the C++ Standard Library. Thrust's high-level interface greatly enhances programmer productivity while enabling performance portability between GPUs and multicore CPUs. It builds on top of established parallel programming frameworks (such as CUDA, TBB, and OpenMP). It also provides a number of general-purpose facilities similar to those found in the C++ Standard Library.
Thrust is an open source project; it is available on GitHub and included in the NVIDIA HPC SDK and the CUDA Toolkit. If you have one of those SDKs installed, no additional installation or compiler flags are needed to use Thrust.
Thrust is best learned through examples.
The following example generates random numbers serially and then transfers them to a parallel device where they are sorted.
```cpp
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/generate.h>
#include <thrust/sort.h>
#include <thrust/copy.h>
#include <thrust/random.h>

int main() {
  // Generate 32M random numbers serially.
  thrust::default_random_engine rng(1337);
  thrust::uniform_int_distribution<int> dist;
  thrust::host_vector<int> h_vec(32 << 20);
  thrust::generate(h_vec.begin(), h_vec.end(), [&] { return dist(rng); });

  // Transfer data to the device.
  thrust::device_vector<int> d_vec = h_vec;

  // Sort data on the device.
  thrust::sort(d_vec.begin(), d_vec.end());

  // Transfer data back to host.
  thrust::copy(d_vec.begin(), d_vec.end(), h_vec.begin());
}
```
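In the example above, `thrust::sort` is dispatched to the device because it operates on `thrust::device_vector` iterators. Algorithms can also be dispatched explicitly with an execution policy, which is how Thrust targets the different host and device backends mentioned in the introduction. A minimal sketch, not part of the original examples:

```cpp
#include <thrust/execution_policy.h>
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/sort.h>

int main() {
  thrust::host_vector<int>   h_vec(1 << 20);
  thrust::device_vector<int> d_vec(1 << 20);

  // Explicitly dispatch to the host backend (serial, OpenMP, or TBB,
  // depending on how THRUST_HOST_SYSTEM was configured).
  thrust::sort(thrust::host, h_vec.begin(), h_vec.end());

  // Explicitly dispatch to the device backend (CUDA by default).
  thrust::sort(thrust::device, d_vec.begin(), d_vec.end());
}
```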
This example demonstrates computing the sum of some random numbers in parallel:
```cpp
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/generate.h>
#include <thrust/reduce.h>
#include <thrust/functional.h>
#include <thrust/random.h>

int main() {
  // Generate random data serially.
  thrust::default_random_engine rng(1337);
  thrust::uniform_real_distribution<double> dist(-50.0, 50.0);
  thrust::host_vector<double> h_vec(32 << 20);
  thrust::generate(h_vec.begin(), h_vec.end(), [&] { return dist(rng); });

  // Transfer to device and compute the sum.
  thrust::device_vector<double> d_vec = h_vec;
  double x = thrust::reduce(d_vec.begin(), d_vec.end(), 0.0, thrust::plus<double>());
}
```
This example shows how to perform such a reduction asynchronously:
```cpp
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/generate.h>
#include <thrust/async/copy.h>
#include <thrust/async/reduce.h>
#include <thrust/functional.h>
#include <thrust/random.h>
#include <numeric>

int main() {
  // Generate 32M random numbers serially.
  thrust::default_random_engine rng(123456);
  thrust::uniform_real_distribution<double> dist(-50.0, 50.0);
  thrust::host_vector<double> h_vec(32 << 20);
  thrust::generate(h_vec.begin(), h_vec.end(), [&] { return dist(rng); });

  // Asynchronously transfer to the device.
  thrust::device_vector<double> d_vec(h_vec.size());
  thrust::device_event e = thrust::async::copy(h_vec.begin(), h_vec.end(),
                                               d_vec.begin());

  // After the transfer completes, asynchronously compute the sum on the device.
  thrust::device_future<double> f0 =
      thrust::async::reduce(thrust::device.after(e),
                            d_vec.begin(), d_vec.end(),
                            0.0, thrust::plus<double>());

  // While the sum is being computed on the device, compute the sum serially on
  // the host.
  double f1 = std::accumulate(h_vec.begin(), h_vec.end(), 0.0, thrust::plus<double>());
}
```
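Asynchronous algorithms return futures; the result is retrieved with the future's `get()` member, which blocks until the device work has finished. A minimal self-contained sketch, not from the original README:

```cpp
#include <thrust/device_vector.h>
#include <thrust/async/reduce.h>
#include <thrust/functional.h>

int main() {
  thrust::device_vector<double> d_vec(1 << 20, 1.0);

  // Launch the reduction asynchronously; other work can be done here.
  thrust::device_future<double> f =
      thrust::async::reduce(d_vec.begin(), d_vec.end(), 0.0, thrust::plus<double>());

  // get() blocks until the reduction has completed and returns its value.
  double sum = f.get();
}
```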
Thrust is a header-only library; there is no need to build or install the project unless you want to run the Thrust unit tests.
The CUDA Toolkit provides a recent release of the Thrust source code in `include/thrust`. This will be suitable for most users.
Users that wish to contribute to Thrust or try out newer features should recursively clone the Thrust GitHub repository:

```bash
git clone --recursive https://github.com/NVIDIA/thrust.git
```
For CMake-based projects, we provide a CMake package for use with `find_package`. See the CMake README for more information. Thrust can also be added via `add_subdirectory` or tools like the CMake Package Manager.
For non-CMake projects, compile with:
- The Thrust include path (`-I<thrust repo root>`)
- The libcu++ include path (`-I<thrust repo root>/dependencies/libcudacxx/`)
- The CUB include path, if using the CUDA device system (`-I<thrust repo root>/dependencies/cub/`)
- By default, the CPP host system and CUDA device system are used. These can be changed using compiler definitions (see the sketch after this list):
  - `-DTHRUST_HOST_SYSTEM=THRUST_HOST_SYSTEM_XXX`, where `XXX` is `CPP` (serial, default), `OMP` (OpenMP), or `TBB` (Intel TBB)
  - `-DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_XXX`, where `XXX` is `CPP`, `OMP`, `TBB`, or `CUDA` (default).
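Which systems a translation unit was actually built with can be checked through the configuration macros controlled by the definitions above. A small sketch, not from the original README, assuming any Thrust header has been included:

```cpp
#include <thrust/device_vector.h>  // any Thrust header pulls in the configuration macros
#include <thrust/version.h>
#include <cstdio>

int main() {
  std::printf("Thrust %d.%d.%d\n",
              THRUST_MAJOR_VERSION, THRUST_MINOR_VERSION, THRUST_SUBMINOR_VERSION);

  // THRUST_DEVICE_SYSTEM reflects the -DTHRUST_DEVICE_SYSTEM=... definition above.
#if THRUST_DEVICE_SYSTEM == THRUST_DEVICE_SYSTEM_CUDA
  std::printf("Device system: CUDA\n");
#elif THRUST_DEVICE_SYSTEM == THRUST_DEVICE_SYSTEM_OMP
  std::printf("Device system: OpenMP\n");
#elif THRUST_DEVICE_SYSTEM == THRUST_DEVICE_SYSTEM_TBB
  std::printf("Device system: TBB\n");
#else
  std::printf("Device system: CPP (serial)\n");
#endif
}
```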
Thrust uses the CMake build system to build unit tests, examples, and header tests. To build Thrust as a developer, it is recommended that you use our containerized development system:
```bash
# Clone Thrust and CUB repos recursively:
git clone --recursive https://github.com/NVIDIA/thrust.git
cd thrust

# Build and run tests and examples:
ci/local/build.bash
```
That does the equivalent of the following, but in a clean containerized environment which has all dependencies installed:
```bash
# Clone Thrust and CUB repos recursively:
git clone --recursive https://github.com/NVIDIA/thrust.git
cd thrust

# Create build directory:
mkdir build
cd build

# Configure -- use one of the following:
cmake ..   # Command line interface.
ccmake ..  # ncurses GUI (Linux only).
cmake-gui  # Graphical UI, set source/build directories in the app.

# Build:
cmake --build . -j ${NUM_JOBS}  # Invokes make (or ninja, etc).

# Run tests and examples:
ctest
```
By default, a serial `CPP` host system, `CUDA` accelerated device system, and C++14 standard are used. This can be changed in CMake and via flags to `ci/local/build.bash`.
More information on configuring your Thrust build and creating a pull request can be found in the contributing section.
Thrust is an open source project developed on GitHub. Thrust is distributed under the Apache License v2.0 with LLVM Exceptions; some parts are distributed under the Apache License v2.0 and the Boost License v1.0.