A General-purpose Task-parallel Programming System


Taskflow helps you quickly write high-performance task-parallel programs with high programming productivity. It is faster, more expressive, requires fewer lines of code, and is easier to integrate as a drop-in than many existing task programming libraries. The source code is available in our Project GitHub.

Start Your First Taskflow Program

The following program (simple.cpp) creates a taskflow of four tasks A, B, C, and D, where A runs before B and C, and D runs after B and C. When A finishes, B and C can run in parallel.

    #include <taskflow/taskflow.hpp>  // Taskflow is header-only

    int main(){

      tf::Executor executor;
      tf::Taskflow taskflow;

      auto [A, B, C, D] = taskflow.emplace(  // create four tasks
        [] () { std::cout << "TaskA\n"; },
        [] () { std::cout << "TaskB\n"; },
        [] () { std::cout << "TaskC\n"; },
        [] () { std::cout << "TaskD\n"; }
      );

      A.precede(B, C);  // A runs before B and C
      D.succeed(B, C);  // D runs after B and C

      executor.run(taskflow).wait();

      return 0;
    }

[Figure: task graph with edges A->B, A->C, B->D, C->D]

Taskflow is header-only, so there is nothing to install. To compile the program, clone the Taskflow project and tell the compiler to include the headers under taskflow/.

    ~$ git clone https://github.com/taskflow/taskflow.git  # clone it only once
    ~$ g++ -std=c++20 simple.cpp -I taskflow/ -O2 -pthread -o simple
    ~$ ./simple
    TaskA
    TaskC
    TaskB
    TaskD

Taskflow comes with a built-in profiler, Taskflow Profiler, for you to profile and visualize taskflow programs in an easy-to-use web-based interface.

    # run the program with the environment variable TF_ENABLE_PROFILER enabled
    ~$ TF_ENABLE_PROFILER=simple.json ./simple
    ~$ cat simple.json
    [
    {"executor":"0","data":[{"worker":0,"level":0,"data":[{"span":[172,186],"name":"0_0","type":"static"},{"span":[187,189],"name":"0_1","type":"static"}]},{"worker":2,"level":0,"data":[{"span":[93,164],"name":"2_0","type":"static"},{"span":[170,179],"name":"2_1","type":"static"}]}]}
    ]
    # paste the profiling json data to https://taskflow.github.io/tfprof/

Create a Subflow Graph

Taskflow supports recursive tasking for you to create a subflow graph from the execution of a task to perform recursive parallelism. The following program spawns a task dependency graph parented at task B.

    tf::Task A = taskflow.emplace([](){}).name("A");
    tf::Task C = taskflow.emplace([](){}).name("C");
    tf::Task D = taskflow.emplace([](){}).name("D");

    tf::Task B = taskflow.emplace([] (tf::Subflow& subflow) {  // subflow task B
      tf::Task B1 = subflow.emplace([](){}).name("B1");
      tf::Task B2 = subflow.emplace([](){}).name("B2");
      tf::Task B3 = subflow.emplace([](){}).name("B3");
      B3.succeed(B1, B2);  // B3 runs after B1 and B2
    }).name("B");

    A.precede(B, C);  // A runs before B and C
    D.succeed(B, C);  // D runs after B and C

[Figure: task graph with edges A->B, A->C, B->D, C->D; the subflow spawned by B contains B1->B3 and B2->B3, and B3 joins back to B]

Integrate Control Flow into a Task Graph

Taskflow supports conditional tasking for you to make rapid control-flow decisions across dependent tasks to implement cycles and conditions in an end-to-end task graph.

    tf::Task init = taskflow.emplace([](){}).name("init");
    tf::Task stop = taskflow.emplace([](){}).name("stop");

    // creates a condition task that returns a random binary
    tf::Task cond = taskflow.emplace([](){ return std::rand() % 2; }).name("cond");

    // creates a feedback loop {0: cond, 1: stop}
    init.precede(cond);
    cond.precede(cond, stop);  // moves on to 'cond' on returning 0, or 'stop' on 1

[Figure: task graph with edges init->cond, cond->cond (on 0), cond->stop (on 1)]

Compose Task Graphs

Taskflow is composable. You can create large parallel graphs through composition of modular and reusable blocks that are easier to optimize at an individual scope.

    tf::Taskflow f1, f2;

    // create taskflow f1 of two tasks
    tf::Task f1A = f1.emplace([]() { std::cout << "Task f1A\n"; }).name("f1A");
    tf::Task f1B = f1.emplace([]() { std::cout << "Task f1B\n"; }).name("f1B");

    // create taskflow f2 with one module task composed of f1
    tf::Task f2A = f2.emplace([]() { std::cout << "Task f2A\n"; }).name("f2A");
    tf::Task f2B = f2.emplace([]() { std::cout << "Task f2B\n"; }).name("f2B");
    tf::Task f2C = f2.emplace([]() { std::cout << "Task f2C\n"; }).name("f2C");
    tf::Task f1_module_task = f2.composed_of(f1).name("module");

    f1_module_task.succeed(f2A, f2B)
                  .precede(f2C);

[Figure: taskflow f2 with edges f2A->module, f2B->module, module->f2C, where module wraps taskflow f1 containing the independent tasks f1A and f1B]

Launch Asynchronous Tasks

Taskflow supports asynchronous tasking. You can launch tasks asynchronously to dynamically explore task graph parallelism.

    tf::Executor executor;

    // create asynchronous tasks directly from an executor
    std::future<int> future = executor.async([](){
      std::cout << "async task returns 1\n";
      return 1;
    });
    executor.silent_async([](){ std::cout << "async task does not return\n"; });

    // create asynchronous tasks with dynamic dependencies
    tf::AsyncTask A = executor.silent_dependent_async([](){ printf("A\n"); });
    tf::AsyncTask B = executor.silent_dependent_async([](){ printf("B\n"); }, A);
    tf::AsyncTask C = executor.silent_dependent_async([](){ printf("C\n"); }, A);
    tf::AsyncTask D = executor.silent_dependent_async([](){ printf("D\n"); }, B, C);

    executor.wait_for_all();

Leverage Standard Parallel Algorithms

Taskflow defines algorithms for you to quickly express common parallel patterns using standard C++ syntax, such as parallel iterations, parallel reductions, and parallel sort.

    // standard parallel CPU algorithms
    tf::Task task1 = taskflow.for_each(  // assign each element to 100 in parallel
      first, last, [] (auto& i) { i = 100; }
    );
    tf::Task task2 = taskflow.reduce(  // reduce a range of items in parallel
      first, last, init, [] (auto a, auto b) { return a + b; }
    );
    tf::Task task3 = taskflow.sort(  // sort a range of items in parallel
      first, last, [] (auto a, auto b) { return a < b; }
    );

Additionally, Taskflow provides composable graph building blocks for you to efficiently implement common parallel algorithms, such as parallel pipelines.

    // create a pipeline to propagate five tokens through three serial stages
    tf::Pipeline pl(num_lines,
      tf::Pipe{tf::PipeType::SERIAL, [](tf::Pipeflow& pf) {
        if(pf.token() == 5) {
          pf.stop();
        }
      }},
      tf::Pipe{tf::PipeType::SERIAL, [](tf::Pipeflow& pf) {
        printf("stage 2: input buffer[%zu] = %d\n", pf.line(), buffer[pf.line()]);
      }},
      tf::Pipe{tf::PipeType::SERIAL, [](tf::Pipeflow& pf) {
        printf("stage 3: input buffer[%zu] = %d\n", pf.line(), buffer[pf.line()]);
      }}
    );
    taskflow.composed_of(pl);
    executor.run(taskflow).wait();

Run a Taskflow through an Executor

The executor provides several thread-safe methods to run a taskflow. You can run a taskflow once, multiple times, or until a stopping criterion is met. These methods are non-blocking and return a tf::Future<void> that lets you query the execution status.

    // runs the taskflow once
    tf::Future<void> run_once = executor.run(taskflow);

    // wait on this run to finish
    run_once.get();

    // runs the taskflow four times
    executor.run_n(taskflow, 4);

    // runs the taskflow repeatedly until the predicate returns true
    executor.run_until(taskflow, [counter=5]() mutable { return --counter == 0; });

    // blocks until all submitted taskflows complete
    executor.wait_for_all();

Offload Tasks to a GPU

Taskflow supports GPU tasking for you to accelerate a wide range of scientific computing applications by harnessing the power of CPU-GPU collaborative computing using Nvidia CUDA Graph.

    __global__ void saxpy(int n, float a, float *x, float *y) {
      int i = blockIdx.x*blockDim.x + threadIdx.x;
      if (i < n) {
        y[i] = a*x[i] + y[i];
      }
    }

    // create a CUDA Graph task
    tf::Task cudaflow = taskflow.emplace([&]() {
      tf::cudaGraph cg;
      tf::cudaTask h2d_x = cg.copy(dx, hx.data(), N);
      tf::cudaTask h2d_y = cg.copy(dy, hy.data(), N);
      tf::cudaTask d2h_x = cg.copy(hx.data(), dx, N);
      tf::cudaTask d2h_y = cg.copy(hy.data(), dy, N);
      // the task is named 'kernel' so that 'saxpy' refers to the kernel function
      tf::cudaTask kernel = cg.kernel((N+255)/256, 256, 0, saxpy, N, 2.0f, dx, dy);
      kernel.succeed(h2d_x, h2d_y)
            .precede(d2h_x, d2h_y);

      // instantiate an executable CUDA graph and run it through a stream
      tf::cudaGraphExec exec(cg);
      tf::cudaStream stream;
      stream.run(exec).synchronize();
    }).name("CUDA Graph Task");

[Figure: CUDA graph with edges h2d_x->saxpy, h2d_y->saxpy, saxpy->d2h_x, saxpy->d2h_y]

Visualize Taskflow Graphs

You can dump a taskflow graph in DOT format and visualize it using any of a number of free GraphViz tools, such as GraphViz Online.

    tf::Taskflow taskflow;

    tf::Task A = taskflow.emplace([](){}).name("A");
    tf::Task B = taskflow.emplace([](){}).name("B");
    tf::Task C = taskflow.emplace([](){}).name("C");
    tf::Task D = taskflow.emplace([](){}).name("D");
    tf::Task E = taskflow.emplace([](){}).name("E");

    A.precede(B, C, E);
    C.precede(D);
    B.precede(D, E);

    // dump the graph to a DOT file through std::cout
    taskflow.dump(std::cout);

[Figure: task graph with edges A->B, A->C, A->E, B->D, B->E, C->D]

Supported Compilers

To use Taskflow v4.0.0, you need a compiler that supports C++20:

  • GNU C++ Compiler at least v11.0 with -std=c++20
  • Clang C++ Compiler at least v12.0 with -std=c++20
  • Microsoft Visual Studio at least v19.29 (VS 2019) with /std:c++20
  • Apple Clang (Xcode) at least v13.0 with -std=c++20
  • NVIDIA CUDA Toolkit and Compiler (nvcc) at least v12.0 with host compiler supporting C++20
  • Intel oneAPI DPC++/C++ Compiler at least v2022.0 with -std=c++20

Taskflow works on Linux, Windows, and macOS.

Get Involved

Visit our Project Website and showcase presentation to learn more about Taskflow. To get involved:

We are committed to supporting trustworthy development for both academic and industrial research projects in parallel and heterogeneous computing. If you are using Taskflow, please cite the following paper we published in IEEE TPDS in 2022:

More importantly, we appreciate all Taskflow Contributors and the following organizations for sponsoring the Taskflow project!

License

Taskflow is open-source under the permissive MIT license. You are completely free to use, modify, and redistribute any work on top of Taskflow. The source code is available in Project GitHub and is actively maintained by Dr. Tsung-Wei Huang and his research group at the University of Wisconsin at Madison.