A General-purpose Task-parallel Programming System using Modern C++
Taskflow helps you quickly write parallel and heterogeneous task programs in modern C++
Taskflow is faster, more expressive, and easier for drop-in integration than many existing task programming frameworks in handling complex parallel workloads.
Taskflow lets you quickly implement task decomposition strategies that incorporate both regular and irregular compute patterns, together with an efficient work-stealing scheduler to optimize your multithreaded performance.
| Static Tasking | Subflow Tasking |
|---|---|
Taskflow supports conditional tasking for you to make rapid control-flow decisions across dependent tasks to implement cycles and conditions that were otherwise difficult to do with existing tools.
| Conditional Tasking |
|---|
Taskflow is composable. You can create large parallel graphs through composition of modular and reusable blocks that are easier to optimize at an individual scope.
| Taskflow Composition |
|---|
Taskflow supports heterogeneous tasking for you to accelerate a wide range of scientific computing applications by harnessing the power of CPU-GPU collaborative computing.
| Concurrent CPU-GPU Tasking |
|---|
Taskflow provides visualization and tooling needed for profiling Taskflow programs.
| Taskflow Profiler |
|---|
We are committed to supporting trustworthy developments for both academic and industrial research projects in parallel computing. Check out Who is Using Taskflow and what our users say:
- "Taskflow is the cleanest Task API I've ever seen." Damien Hocking @Corelium Inc
- "Taskflow has a very simple and elegant tasking interface. The performance also scales very well." Glen Fraser
- "Taskflow lets me handle parallel processing in a smart way." Hayabusa @Learning
- "Taskflow improves the throughput of our graph engine in just a few hours of coding." Jean-Michaël @KDAB
- "Best poster award for open-source parallel programming library." Cpp Conference 2018
- "Second Prize of Open-source Software Competition." ACM Multimedia Conference 2019
See a quick poster presentation below and visit the documentation to learn more about Taskflow. For technical details, refer to our IEEE TPDS paper.
The following program (simple.cpp) creates a taskflow of four tasks A, B, C, and D, where A runs before B and C, and D runs after B and C. When A finishes, B and C can run in parallel. Try it live on Compiler Explorer (godbolt)!

    #include <taskflow/taskflow.hpp>  // Taskflow is header-only

    int main() {

      tf::Executor executor;
      tf::Taskflow taskflow;

      auto [A, B, C, D] = taskflow.emplace(  // create four tasks
        [] () { std::cout << "TaskA\n"; },
        [] () { std::cout << "TaskB\n"; },
        [] () { std::cout << "TaskC\n"; },
        [] () { std::cout << "TaskD\n"; }
      );

      A.precede(B, C);  // A runs before B and C
      D.succeed(B, C);  // D runs after  B and C

      executor.run(taskflow).wait();

      return 0;
    }

Taskflow is header-only, so there is no installation to wrangle. To compile the program, clone the Taskflow project and tell the compiler to include the headers.

    ~$ git clone https://github.com/taskflow/taskflow.git  # clone it only once
    ~$ g++ -std=c++20 examples/simple.cpp -I. -O2 -pthread -o simple
    ~$ ./simple
    TaskA
    TaskC
    TaskB
    TaskD

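To make the meaning of `precede` and `succeed` in the four-task graph above concrete, here is a minimal sequential sketch in plain C++. It is not Taskflow's implementation or data structure: `MiniGraph`, its members, and `run_diamond` are hypothetical names introduced only for illustration. The sketch counts unfinished predecessors per task and releases a task once that count drops to zero, which is the dependency rule a task scheduler has to respect.

```cpp
#include <cstddef>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical mini graph: NOT Taskflow's data structure, just enough to
// show how "precede" edges constrain execution order.
struct MiniGraph {
  std::vector<std::string> names;
  std::vector<std::vector<std::size_t>> successors;
  std::vector<std::size_t> pending;  // number of unfinished predecessors

  // Adds a task and returns its index.
  std::size_t emplace(std::string name) {
    names.push_back(std::move(name));
    successors.emplace_back();
    pending.push_back(0);
    return names.size() - 1;
  }

  // `from` must finish before `to` may start.
  void precede(std::size_t from, std::size_t to) {
    successors[from].push_back(to);
    ++pending[to];
  }

  // Sequentially "runs" tasks in dependency order (Kahn's algorithm) and
  // returns the task names in the order they ran.
  std::vector<std::string> run() const {
    std::vector<std::size_t> remaining = pending;
    std::queue<std::size_t> ready;
    for (std::size_t t = 0; t < names.size(); ++t) {
      if (remaining[t] == 0) ready.push(t);
    }

    std::vector<std::string> order;
    while (!ready.empty()) {
      std::size_t t = ready.front();
      ready.pop();
      order.push_back(names[t]);
      for (std::size_t s : successors[t]) {
        if (--remaining[s] == 0) ready.push(s);  // last predecessor finished
      }
    }
    return order;
  }
};

// Builds the A -> {B, C} -> D diamond and returns one valid execution order.
std::vector<std::string> run_diamond() {
  MiniGraph g;
  std::size_t A = g.emplace("TaskA");
  std::size_t B = g.emplace("TaskB");
  std::size_t C = g.emplace("TaskC");
  std::size_t D = g.emplace("TaskD");
  g.precede(A, B);
  g.precede(A, C);
  g.precede(B, D);
  g.precede(C, D);
  return g.run();
}
```

Because this sketch is single-threaded, B always runs before C. A parallel executor may run them in either order or at the same time, which is why the sample output above lists TaskC before TaskB.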
Taskflow comes with a built-in profiler, TFProf, for you to profile and visualize taskflow programs in an easy-to-use web-based interface.

    # run the program with the environment variable TF_ENABLE_PROFILER enabled
    ~$ TF_ENABLE_PROFILER=simple.json ./simple
    ~$ cat simple.json
    [
    {"executor":"0","data":[{"worker":0,"level":0,"data":[{"span":[172,186],"name":"0_0","type":"static"},{"span":[187,189],"name":"0_1","type":"static"}]},{"worker":2,"level":0,"data":[{"span":[93,164],"name":"2_0","type":"static"},{"span":[170,179],"name":"2_1","type":"static"}]}]}
    ]
    # paste the profiling json data to https://taskflow.github.io/tfprof/

In addition to the execution diagram, you can dump the graph to a DOT format and visualize it using a number of free GraphViz tools.

    // dump the taskflow graph to a DOT format through std::cout
    taskflow.dump(std::cout);

Taskflow empowers users with both static and dynamic task graph constructions to express end-to-end parallelism in a task graph that embeds in-graph control flow.
- Create a Subflow Graph
- Integrate Control Flow to a Task Graph
- Offload a Task to a GPU
- Compose Task Graphs
- Launch Asynchronous Tasks
- Execute a Taskflow
- Leverage Standard Parallel Algorithms
Taskflow supports dynamic tasking for you to create a subflow graph from the execution of a task to perform dynamic parallelism. The following program spawns a task dependency graph parented at task B.

    tf::Task A = taskflow.emplace([](){}).name("A");
    tf::Task C = taskflow.emplace([](){}).name("C");
    tf::Task D = taskflow.emplace([](){}).name("D");

    tf::Task B = taskflow.emplace([] (tf::Subflow& subflow) {
      tf::Task B1 = subflow.emplace([](){}).name("B1");
      tf::Task B2 = subflow.emplace([](){}).name("B2");
      tf::Task B3 = subflow.emplace([](){}).name("B3");
      B3.succeed(B1, B2);  // B3 runs after B1 and B2
    }).name("B");

    A.precede(B, C);  // A runs before B and C
    D.succeed(B, C);  // D runs after  B and C

Taskflow supports conditional tasking for you to make rapid control-flow decisions across dependent tasks to implement cycles and conditions in an end-to-end task graph.

    tf::Task init = taskflow.emplace([](){}).name("init");
    tf::Task stop = taskflow.emplace([](){}).name("stop");

    // creates a condition task that returns a random binary
    tf::Task cond = taskflow.emplace(
      [](){ return std::rand() % 2; }
    ).name("cond");

    init.precede(cond);

    // creates a feedback loop {0: cond, 1: stop}
    cond.precede(cond, stop);

Taskflow supports GPU tasking for you to accelerate a wide range of scientific computing applications by harnessing the power of CPU-GPU collaborative computing using Nvidia CUDA Graph.

    __global__ void saxpy(size_t N, float alpha, float* dx, float* dy) {
      size_t i = blockIdx.x*blockDim.x + threadIdx.x;
      if (i < N) {
        dy[i] = alpha*dx[i] + dy[i];
      }
    }

    // create a CUDA Graph task
    tf::Task cudaflow = taskflow.emplace([&]() {
      tf::cudaGraph cg;
      tf::cudaTask h2d_x = cg.copy(dx, hx.data(), N);
      tf::cudaTask h2d_y = cg.copy(dy, hy.data(), N);
      tf::cudaTask d2h_x = cg.copy(hx.data(), dx, N);
      tf::cudaTask d2h_y = cg.copy(hy.data(), dy, N);
      // the task gets its own name so it does not shadow the saxpy kernel
      tf::cudaTask kernel = cg.kernel((N+255)/256, 256, 0, saxpy, N, 2.0f, dx, dy);
      kernel.succeed(h2d_x, h2d_y)
            .precede(d2h_x, d2h_y);

      // instantiate an executable CUDA graph and run it through a stream
      tf::cudaGraphExec exec(cg);
      tf::cudaStream stream;
      stream.run(exec).synchronize();
    }).name("CUDA Graph Task");

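As a sanity check for the GPU example, the sketch below gives a sequential CPU reference for what the saxpy kernel computes, plus the rounding-up arithmetic behind the `(N+255)/256` grid size. It is plain C++ with no CUDA or Taskflow dependency; `num_blocks` and `saxpy_reference` are hypothetical helper names, not part of either library.

```cpp
#include <cstddef>
#include <vector>

// Number of thread blocks needed to cover n elements, rounding up.
// With block_size = 256 this is the (N+255)/256 launch expression.
constexpr std::size_t num_blocks(std::size_t n, std::size_t block_size) {
  return (n + block_size - 1) / block_size;
}

// Sequential reference for the kernel body: y[i] = alpha * x[i] + y[i].
// Only the elements present in both vectors are updated.
void saxpy_reference(float alpha, const std::vector<float>& x,
                     std::vector<float>& y) {
  for (std::size_t i = 0; i < x.size() && i < y.size(); ++i) {
    y[i] = alpha * x[i] + y[i];
  }
}
```

Rounding up means the last block can contain threads whose index is past the end of the data, which is why the kernel guards its body with `i < N`. Comparing the copied-back `hy` against `saxpy_reference` on the original inputs is one simple way to validate a GPU run.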
Taskflow is composable. You can create large parallel graphs through composition of modular and reusable blocks that are easier to optimize at an individual scope.

    tf::Taskflow f1, f2;

    // create taskflow f1 of two tasks
    tf::Task f1A = f1.emplace([]() { std::cout << "Task f1A\n"; })
                     .name("f1A");
    tf::Task f1B = f1.emplace([]() { std::cout << "Task f1B\n"; })
                     .name("f1B");

    // create taskflow f2 with one module task composed of f1
    tf::Task f2A = f2.emplace([]() { std::cout << "Task f2A\n"; })
                     .name("f2A");
    tf::Task f2B = f2.emplace([]() { std::cout << "Task f2B\n"; })
                     .name("f2B");
    tf::Task f2C = f2.emplace([]() { std::cout << "Task f2C\n"; })
                     .name("f2C");

    tf::Task f1_module_task = f2.composed_of(f1)
                                .name("module");

    f1_module_task.succeed(f2A, f2B)
                  .precede(f2C);

Taskflow supports asynchronous tasking. You can launch tasks asynchronously to dynamically explore task graph parallelism.

    tf::Executor executor;

    // create asynchronous tasks directly from an executor
    std::future<int> future = executor.async([](){
      std::cout << "async task returns 1\n";
      return 1;
    });
    executor.silent_async([](){ std::cout << "async task does not return\n"; });

    // create asynchronous tasks with dynamic dependencies
    tf::AsyncTask A = executor.silent_dependent_async([](){ printf("A\n"); });
    tf::AsyncTask B = executor.silent_dependent_async([](){ printf("B\n"); }, A);
    tf::AsyncTask C = executor.silent_dependent_async([](){ printf("C\n"); }, A);
    tf::AsyncTask D = executor.silent_dependent_async([](){ printf("D\n"); }, B, C);

    executor.wait_for_all();

The executor provides several thread-safe methods to run a taskflow. You can run a taskflow once, multiple times, or until a stopping criterion is met. These methods are non-blocking and return a tf::Future<void> that lets you query the execution status.

    // runs the taskflow once
    tf::Future<void> run_once = executor.run(taskflow);

    // wait on this run to finish
    run_once.get();

    // run the taskflow four times
    executor.run_n(taskflow, 4);

    // runs the taskflow five times
    // (mutable lets the lambda modify its by-value counter)
    executor.run_until(taskflow, [counter=5]() mutable { return --counter == 0; });

    // block until all submitted taskflows complete
    executor.wait_for_all();

Taskflow defines algorithms for you to quickly express common parallel patterns using standard C++ syntax, such as parallel iterations, parallel reductions, and parallel sort.

    tf::Task task1 = taskflow.for_each(  // assign each element to 100 in parallel
      first, last, [] (auto& i) { i = 100; }
    );
    tf::Task task2 = taskflow.reduce(  // reduce a range of items in parallel
      first, last, init, [] (auto a, auto b) { return a + b; }
    );
    tf::Task task3 = taskflow.sort(  // sort a range of items in parallel
      first, last, [] (auto a, auto b) { return a < b; }
    );

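For comparison, the sketch below spells out what each of the three tasks above computes, using the sequential standard-library algorithms with the same lambdas. It makes no claim about how Taskflow partitions or schedules the work; the wrapper names (`assign_100`, `sum_into`, `sort_ascending`) are hypothetical and exist only to keep the example self-contained.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Sequential counterpart of the for_each task: set every element to 100.
void assign_100(std::vector<int>& v) {
  std::for_each(v.begin(), v.end(), [](int& i) { i = 100; });
}

// Sequential counterpart of the reduce task: fold the range onto `init`
// with the same binary operator.
int sum_into(const std::vector<int>& v, int init) {
  return std::accumulate(v.begin(), v.end(), init,
                         [](int a, int b) { return a + b; });
}

// Sequential counterpart of the sort task: ascending order via a < b.
void sort_ascending(std::vector<int>& v) {
  std::sort(v.begin(), v.end(), [](int a, int b) { return a < b; });
}
```

A sequential version like this is a convenient baseline: running it on the same input gives a reference result to check the parallel tasks against.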
Additionally, Taskflow provides composable graph building blocks for you to efficiently implement common parallel algorithms, such as parallel pipeline.

    // create a pipeline to propagate five tokens through three serial stages
    tf::Pipeline pl(num_parallel_lines,
      tf::Pipe{tf::PipeType::SERIAL, [](tf::Pipeflow& pf) {
        if(pf.token() == 5) {
          pf.stop();
        }
      }},
      tf::Pipe{tf::PipeType::SERIAL, [](tf::Pipeflow& pf) {
        printf("stage 2: input buffer[%zu] = %d\n", pf.line(), buffer[pf.line()]);
      }},
      tf::Pipe{tf::PipeType::SERIAL, [](tf::Pipeflow& pf) {
        printf("stage 3: input buffer[%zu] = %d\n", pf.line(), buffer[pf.line()]);
      }}
    );
    taskflow.composed_of(pl);
    executor.run(taskflow).wait();

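The token and line vocabulary in the pipeline example can be illustrated with a small sequential simulation. This is only a sketch under stated assumptions: tokens are numbered from 0, the first stage stops the pipeline when it sees `stop_token`, and token `t` is carried by line `t % num_lines` (round-robin assignment is assumed here, not taken from the text above). `StageEvent` and `simulate_pipeline` are hypothetical names, and the real pipeline overlaps stages across lines rather than visiting them one token at a time.

```cpp
#include <cstddef>
#include <vector>

// One visit of a token to a stage, on a given parallel line.
struct StageEvent {
  std::size_t stage;
  std::size_t token;
  std::size_t line;
};

// Sequentially simulates tokens flowing through `num_stages` serial stages.
// Assumption: token t is carried by line t % num_lines.
std::vector<StageEvent> simulate_pipeline(std::size_t num_lines,
                                          std::size_t num_stages,
                                          std::size_t stop_token) {
  std::vector<StageEvent> events;
  if (num_lines == 0) return events;  // no line can carry a token

  // The first stage stops the pipeline when it reaches stop_token,
  // so only tokens 0 .. stop_token-1 travel through the stages.
  for (std::size_t token = 0; token != stop_token; ++token) {
    std::size_t line = token % num_lines;
    for (std::size_t stage = 0; stage < num_stages; ++stage) {
      events.push_back({stage, token, line});
    }
  }
  return events;
}
```

With four lines, three stages, and a stop at token 5, the simulation records fifteen stage visits, and under the round-robin assumption token 4 reuses line 0. That reuse is why a per-line buffer indexed by `pf.line()` is enough to pass data between stages.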
To use Taskflow v4.0.0, you need a compiler that supports C++20:
- GNU C++ Compiler at least v11.0 with -std=c++20
- Clang C++ Compiler at least v12.0 with -std=c++20
- Microsoft Visual Studio at least v19.29 (VS 2019) with /std:c++20
- Apple Clang (Xcode) at least v13.0 with -std=c++20
- NVIDIA CUDA Toolkit and Compiler (nvcc) at least v12.0 with host compiler supporting C++20
- Intel oneAPI DPC++/C++ Compiler at least v2022.0 with -std=c++20
Taskflow works on Linux, Windows, and Mac OS X.
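One way to catch a missing `-std=c++20` (or `/std:c++20`) flag early is a compile-time check on the language-standard macro. The sketch below is a generic C++ technique, not something Taskflow ships; `reported_standard` and `has_cpp20` are hypothetical helper names. It assumes the compiler reports the standard through `__cplusplus`, or through `_MSVC_LANG` on MSVC, where `__cplusplus` is only updated when `/Zc:__cplusplus` is passed.

```cpp
// Hypothetical helper: the language-standard value the compiler reports.
// MSVC exposes it through _MSVC_LANG; other compilers use __cplusplus.
constexpr long reported_standard() {
#if defined(_MSVC_LANG)
  return _MSVC_LANG;
#else
  return __cplusplus;
#endif
}

// True when the translation unit is compiled as C++20 or newer.
constexpr bool has_cpp20() {
  return reported_standard() >= 202002L;
}
```

Adding `static_assert(has_cpp20(), "C++20 is required");` to a source file then turns a wrong compiler flag into an immediate, readable compile error.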
Visit our project website and documentation to learn more about Taskflow. To get involved:
- See release notes to stay up-to-date with newest versions
- Read the step-by-step tutorial at cookbook
- Submit an issue at GitHub issues
- Find out our technical details at references
- Watch our technical talks at YouTube
We are committed to supporting trustworthy developments for both academic and industrial research projects in parallel and heterogeneous computing. If you are using Taskflow, please cite the following paper we published in 2021 IEEE TPDS:
- Tsung-Wei Huang, Dian-Lun Lin, Chun-Xun Lin, and Yibo Lin, "Taskflow: A Lightweight Parallel and Heterogeneous Task Graph Computing System," IEEE Transactions on Parallel and Distributed Systems (TPDS), vol. 33, no. 6, pp. 1303-1320, June 2022
More importantly, we appreciate all Taskflow contributors and the following organizations for sponsoring the Taskflow project!
The Taskflow project is also supported by ADS.FUND.
Taskflow is licensed under the MIT License. You are completely free to re-distribute your work derived from Taskflow.