- Notifications
You must be signed in to change notification settings - Fork3
Light and self-contained implementation of C++17 parallel algorithms.
License
BSD-2-Clause and 2 other licenses found
Licenses found
alugowski/poolSTL
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
Light, self-contained, thread pool-based implementation ofC++17 parallel standard library algorithms.
C++17 introduced parallel overloads of standard library algorithms that accept anExecution Policy as the first argument.Policies specify limits on how the implementation may parallelize the algorithm, enabling methods like threads, vectorization, or even GPU.Policies can be supplied by the compiler or by libraries like this one.
std::sort(std::execution::par, vec.begin(), vec.end());// ^^^^^^^^^^^^^^^^^^^ native C++17 parallel Execution Policy
Unfortunately compiler supportvaries. Quick summary of compilers' default standard libraries:
Linux | macOS | Windows | |
---|---|---|---|
GCC 9+ | TBB Required | TBB Required | TBB Required |
GCC 8- | ❌ | ❌ | ❌ |
Clang (libc++) | ❌ | ❌ | ❌ |
Clang (libstdc++) | TBB Required | TBB Required | TBB Required |
Apple Clang | ❌ | ||
MSVC 15.7+ (2017) | ✅ | ||
Parallel STL | TBB Required | TBB Required | TBB Required |
poolSTL | ✅* | ✅* | ✅* |
PoolSTL is asupplement to fill in the support gaps. It is not a full implementation; only the basics are covered.However, it is small, easy to integrate, and has no external dependencies. A good backup to the other options.
Use poolSTL exclusively, or only on platforms lacking native support,or only ifTBB is not present.
Supports C++11 and higher. Algorithms introduced in C++17 require C++17 or higher.
Tested in CI on GCC 7+, Clang/LLVM 5+, Apple Clang, MSVC, MinGW, and Emscripten.
Algorithms are added on an as-needed basis. If you need oneopen an issue or contribute a PR.
Limitations: All iterators must be random access. No nested parallel calls.
all_of
,any_of
,none_of
copy
,copy_n
count
,count_if
fill
,fill_n
find
,find_if
,find_if_not
for_each
,for_each_n
partition
sort
,stable_sort
transform
exclusive_scan
(C++17 only)reduce
transform_reduce
(C++17 only)
All instd::
namespace.
poolstl::iota_iter
- Iterate over integers. Same as iterating over output ofstd::iota
but without materializing anything. Iterator version ofstd::ranges::iota_view
.poolstl::for_each_chunk
- Likestd::for_each
, but explicitly splits the input range into chunks then exposes the chunked parallelism. A user-specified chunk constructor is called for each parallel chunk then its output is passed to each loop iteration. Useful for workloads that need an expensive workspace that can be reused between iterations, but not simultaneously by all iterations in parallel.poolstl::pluggable_sort
- Likestd::sort
, but allows specification of sequential sort method. To parallelizepdqsort:pluggable_sort(par, v.begin(), v.end(), pdqsort)
.
PoolSTL provides:
poolstl::par
: Substitute forstd::execution::par
. Parallelized using athread pool.poolstl::seq
: Substitute forstd::execution::seq
. Simply calls the regular (non-policy) overload.poolstl::par_if()
: Choose parallel or sequential at runtime. See below.
In short, usepoolstl::par
to make your code parallel. Complete example:
#include<iostream>#include<poolstl/poolstl.hpp>intmain() { std::vector<int> v = {0,1,2,3,4,5};auto sum =std::reduce(poolstl::par, vec.cbegin(), vec.cend());// ^^^^^^^^^^^^// Add this to make your code parallel. std::cout <<"Sum=" << sum << std::endl;return0;}
The thread pool used bypoolstl::par
is managed internally by poolSTL. It is started on first use.
Use your ownthread poolwithpoolstl::par.on(pool)
for control over thread count, startup/shutdown, etc.:
task_thread_pool::task_thread_pool pool{4};// 4 threadsstd::reduce(poolstl::par.on(pool), vec.begin(), vec.end());
Sometimes the choice whether to parallelize or not should be made at runtime. For example, small datasets may not amortizethe cost of starting threads, while large datasets do and should be parallelized.
Usepoolstl::par_if
to select betweenpar
andseq
at runtime:
bool is_parallel = vec.size() >10000;std::reduce(poolstl::par_if(is_parallel), vec.begin(), vec.end());
Usepoolstl::par_if(is_parallel, pool)
to control the thread pool used bypar
, if selected.
std::vector<int> vec = {0,1,2,3,4,5};// Parallel for-eachstd::for_each(poolstl::par, vec.begin(), vec.end(), [](auto& value) { std::cout << value;// loop body});
using poolstl::iota_iter;// parallel for loopstd::for_each(poolstl::par, iota_iter<int>(0), iota_iter<int>(100), [](auto i) { std::cout << i;// loop body});
std::vector<int> vec = {5,2,1,3,0,4};std::sort(poolstl::par, vec.begin(), vec.end());
Eachrelease publishes a single-file amalgamatedpoolstl.hpp
. Simply copy this into your project.
Build requirements:
- Clang and GCC 8 or older: require
-lpthread
to use C++11 threads. - Emscripten: compile and link with
-pthread
to use C++11 threads.See docs.
include(FetchContent)FetchContent_Declare(poolSTLGIT_REPOSITORYhttps://github.com/alugowski/poolSTLGIT_TAGmainGIT_SHALLOWTRUE)FetchContent_MakeAvailable(poolSTL)target_link_libraries(YOUR_TARGETpoolSTL::poolSTL)
Alternatively copy or checkout the repo into your project and:
add_subdirectory(poolSTL)
Seebenchmark/ to compare poolSTL against the standard sequential implementation, and (if available) thenativestd::execution::par
implementation.
Results on an M1 Pro (6 power, 2 efficiency cores), with GCC 13:
-------------------------------------------------------------------------------------------------------Benchmark Time CPU Iterations-------------------------------------------------------------------------------------------------------all_of()/real_time 19.9 ms 19.9 ms 35all_of(poolstl::par)/real_time 3.47 ms 0.119 ms 198all_of(std::execution::par)/real_time 3.45 ms 3.25 ms 213find_if()/needle_percentile:5/real_time 0.988 ms 0.987 ms 712find_if()/needle_percentile:50/real_time 9.87 ms 9.86 ms 71find_if()/needle_percentile:100/real_time 19.7 ms 19.7 ms 36find_if(poolstl::par)/needle_percentile:5/real_time 0.405 ms 0.050 ms 1730find_if(poolstl::par)/needle_percentile:50/real_time 1.85 ms 0.096 ms 393find_if(poolstl::par)/needle_percentile:100/real_time 3.64 ms 0.102 ms 193find_if(std::execution::par)/needle_percentile:5/real_time 0.230 ms 0.220 ms 3103find_if(std::execution::par)/needle_percentile:50/real_time 1.75 ms 1.60 ms 410find_if(std::execution::par)/needle_percentile:100/real_time 3.51 ms 3.24 ms 204for_each()/real_time 94.6 ms 94.6 ms 7for_each(poolstl::par)/real_time 18.7 ms 0.044 ms 36for_each(std::execution::par)/real_time 15.3 ms 12.9 ms 46sort()/real_time 603 ms 602 ms 1sort(poolstl::par)/real_time 112 ms 6.64 ms 6sort(std::execution::par)/real_time 113 ms 102 ms 6pluggable_sort(poolstl::par, ..., pdqsort)/real_time 71.7 ms 6.67 ms 10transform()/real_time 95.0 ms 94.9 ms 7transform(poolstl::par)/real_time 17.4 ms 0.037 ms 38transform(std::execution::par)/real_time 15.3 ms 13.2 ms 45exclusive_scan()/real_time 33.7 ms 33.7 ms 21exclusive_scan(poolstl::par)/real_time 11.6 ms 0.095 ms 55exclusive_scan(std::execution::par)/real_time 19.8 ms 15.3 ms 32reduce()/real_time 15.2 ms 15.2 ms 46reduce(poolstl::par)/real_time 4.06 ms 0.044 ms 169reduce(std::execution::par)/real_time 3.38 ms 3.16 ms 214
USE AT YOUR OWN RISK! THIS IS A HACK!
Two-line hack for missing compiler support. A no-op on compilers with support.
IfPOOLSTL_STD_SUPPLEMENT
is defined then poolSTL will check for native compiler support.If not found then poolSTL will alias itspoolstl::par
asstd::execution::par
:
#definePOOLSTL_STD_SUPPLEMENT#include<poolstl/poolstl.hpp>
Now just usestd::execution::par
as normal, and poolSTL will fill in as necessary. Seesupplement_test.cpp.
Example use case: Youcan link against TBB, so you'll use native support on GCC 9+, Clang, MSVC, etc.PoolSTL will fill in automatically on GCC <9 and Apple Clang.
Example use case 2: You'dprefer to use the TBB version, but don't want to fail on systems that don't have it.Simply use the supplement as above, but have your build system (CMake, meson, etc.) check for TBB.If not found, definePOOLSTL_STD_SUPPLEMENT_NO_INCLUDE
and the supplement will not#include <execution>
(and neither should your code!),thus dropping the TBB link requirement. The poolSTL supplement fills in.
See the supplement section oftests/CMakeLists.txt for an example.
About
Light and self-contained implementation of C++17 parallel algorithms.
Topics
Resources
License
BSD-2-Clause and 2 other licenses found
Licenses found
Uh oh!
There was an error while loading.Please reload this page.
Stars
Watchers
Forks
Packages0
Uh oh!
There was an error while loading.Please reload this page.
Contributors3
Uh oh!
There was an error while loading.Please reload this page.