Taskflow: A General - Taskflow

| | Taskflow: A General-purpose Task-parallel Programming System |

Searching...

No Matches

Public Member Functions | Friends | List of all members

tf::FlowBuilder Class Reference

class to build a task dependency graph More...

#include <taskflow/core/flow_builder.hpp>

Inheritance diagram for tf::FlowBuilder:

[Embedded content](classtf_1_1FlowBuilder inherit graph.svg)

[legend]

Public Member Functions

| | | FlowBuilder (Graph &graph) | | | constructs a flow builder with a graph
| | | | template<StaticTaskLike C> | | Task | emplace (C &&callable) | | | creates a static task
| | | | template<RuntimeTaskLike C> | | Task | emplace (C &&callable) | | | creates a runtime task
| | | | template<SubflowTaskLike C> | | Task | emplace (C &&callable) | | | creates a dynamic task
| | | | template<ConditionTaskLike C> | | Task | emplace (C &&callable) | | | creates a condition task
| | | | template<MultiConditionTaskLike C> | | Task | emplace (C &&callable) | | | creates a multi-condition task
| | | | template<typename... C>
requires (sizeof...(C) > 1) | | auto | emplace (C &&... callables) | | | creates multiple tasks from a list of callable objects
| | | | void | erase (Task task) | | | removes a task from a taskflow
| | | | template<GraphLike T> | | Task | composed_of (T &object) | | | creates a module task for the target object
| | | | Task | adopt (Graph &&graph) | | | creates a module task from a graph by taking over its ownership
| | | | Task | placeholder () | | | creates a placeholder task
| | | | void | linearize (std::vector< Task > &tasks) | | | adds adjacent dependency links to a linear list of tasks
| | | | void | linearize (std::initializer_list< Task > tasks) | | | adds adjacent dependency links to a linear list of tasks
| | | | template<typename B, typename E, typename C, PartitionerLike P = DefaultPartitioner> | | Task | for_each (B first, E last, C callable, P part=P()) | | | constructs an STL-styled parallel-for task
| | | | template<typename B, typename E, typename S, typename C, PartitionerLike P = DefaultPartitioner> | | Task | for_each_index (B first, E last, S step, C callable, P part=P()) | | | constructs an index-based parallel-for task
| | | | template<IndexRangesLike R, typename C, PartitionerLike P = DefaultPartitioner> | | Task | for_each_by_index (R range, C callable, P part=P()) | | | constructs a parallel-for task over a one- or multi-dimensional index range
| | | | template<typename B, typename E, typename O, typename C, PartitionerLike P = DefaultPartitioner> | | Task | transform (B first1, E last1, O d_first, C c, P part=P()) | | | constructs a parallel-transform task
| | | | template<typename B1, typename E1, typename B2, typename O, typename C, PartitionerLike P = DefaultPartitioner>
requires (!PartitionerLike<std::decay_t<C>>) | | Task | transform (B1 first1, E1 last1, B2 first2, O d_first, C c, P part=P()) | | | constructs a parallel-transform task
| | | | template<typename B, typename E, typename T, typename O, PartitionerLike P = DefaultPartitioner> | | Task | reduce (B first, E last, T &init, O bop, P part=P()) | | | constructs an STL-styled parallel-reduction task
| | | | template<IndexRanges1DLike R, typename T, typename L, typename G, PartitionerLike P = DefaultPartitioner> | | Task | reduce_by_index (R range, T &init, L lop, G gop, P part=P()) | | | constructs an index range-based parallel-reduction task
| | | | template<typename B, typename E, typename T, typename BOP, typename UOP, PartitionerLike P = DefaultPartitioner> | | Task | transform_reduce (B first, E last, T &init, BOP bop, UOP uop, P part=P()) | | | constructs an STL-styled parallel transform-reduce task
| | | | template<typename B1, typename E1, typename B2, typename T, typename BOP_R, typename BOP_T, PartitionerLike P = DefaultPartitioner>
requires (!PartitionerLike<std::decay_t<BOP_T>>) | | Task | transform_reduce (B1 first1, E1 last1, B2 first2, T &init, BOP_R bop_r, BOP_T bop_t, P part=P()) | | | constructs an STL-styled parallel transform-reduce task
| | | | template<typename B, typename E, typename D, typename BOP> | | Task | inclusive_scan (B first, E last, D d_first, BOP bop) | | | creates an STL-styled parallel inclusive-scan task
| | | | template<typename B, typename E, typename D, typename BOP, typename T> | | Task | inclusive_scan (B first, E last, D d_first, BOP bop, T init) | | | creates an STL-styled parallel inclusive-scan task with an initial value
| | | | template<typename B, typename E, typename D, typename T, typename BOP> | | Task | exclusive_scan (B first, E last, D d_first, T init, BOP bop) | | | creates an STL-styled parallel exclusive-scan task
| | | | template<typename B, typename E, typename D, typename BOP, typename UOP> | | Task | transform_inclusive_scan (B first, E last, D d_first, BOP bop, UOP uop) | | | creates an STL-styled parallel transform-inclusive scan task
| | | | template<typename B, typename E, typename D, typename BOP, typename UOP, typename T> | | Task | transform_inclusive_scan (B first, E last, D d_first, BOP bop, UOP uop, T init) | | | creates an STL-styled parallel transform-inclusive scan task
| | | | template<typename B, typename E, typename D, typename T, typename BOP, typename UOP> | | Task | transform_exclusive_scan (B first, E last, D d_first, T init, BOP bop, UOP uop) | | | creates an STL-styled parallel transform-exclusive scan task
| | | | template<typename B, typename E, typename T, typename UOP, PartitionerLike P = DefaultPartitioner> | | Task | find_if (B first, E last, T &result, UOP predicate, P part=P()) | | | constructs a task to perform STL-styled find-if algorithm
| | | | template<typename B, typename E, typename T, typename UOP, PartitionerLike P = DefaultPartitioner> | | Task | find_if_not (B first, E last, T &result, UOP predicate, P part=P()) | | | constructs a task to perform STL-styled find-if-not algorithm
| | | | template<typename B, typename E, typename T, typename C, PartitionerLike P> | | Task | min_element (B first, E last, T &result, C comp, P part) | | | constructs a task to perform STL-styled min-element algorithm
| | | | template<typename B, typename E, typename T, typename C, PartitionerLike P> | | Task | max_element (B first, E last, T &result, C comp, P part) | | | constructs a task to perform STL-styled max-element algorithm
| | | | template<typename B, typename E, typename C> | | Task | sort (B first, E last, C cmp) | | | constructs a dynamic task to perform STL-styled parallel sort
| | | | template<typename B, typename E> | | Task | sort (B first, E last) | | | constructs a dynamic task to perform STL-styled parallel sort using the std::less<T> comparator, where T is the element type
| | | | template<typename B1, typename E1, typename B2, typename E2, typename O> | | Task | merge (B1 first1, E1 last1, B2 first2, E2 last2, O d_first) | | | merges two sorted ranges into a single sorted output using the std::less comparator
| | | | template<typename B1, typename E1, typename B2, typename E2, typename O, typename C> | | Task | merge (B1 first1, E1 last1, B2 first2, E2 last2, O d_first, C cmp) | | | merges two sorted ranges into a single sorted output using a custom comparator
| | | | template<typename B, typename E, typename V, PartitionerLike P = DefaultPartitioner> | | Task | fill (B first, E last, V value, P part=P()) | | | fills a range with a given value in parallel
| | | | template<typename B, std::integral C, typename V, PartitionerLike P = DefaultPartitioner> | | Task | fill_n (B first, C count, V value, P part=P()) | | | fills N elements with a given value in parallel
| | | | template<typename B, typename E, typename G, PartitionerLike P = DefaultPartitioner> | | Task | generate (B first, E last, G gen, P part=P()) | | | generates values into a range in parallel using a callable
| | | | template<typename B, std::integral C, typename G, PartitionerLike P = DefaultPartitioner> | | Task | generate_n (B first, C count, G gen, P part=P()) | | | generates N values into a range in parallel using a callable
| | |

Friends

Detailed Description

class to build a task dependency graph

The class provides essential methods to construct a task dependency graph from which tf::Taskflow and tf::Subflow are derived.

Member Function Documentation

◆adopt()

| inline |

creates a module task from a graph by taking over its ownership

Parameters

| graph | the graph to adopt (moved into the task) |

Returnsa Task handle to the adopted module task

Unlike tf::FlowBuilder::composed_of, which references an externally-owned tf::Taskflow, adopt transfers ownership of the given tf::Graph into the task. The graph's lifetime is managed by the executor once adopted, and the caller has no access to the moved-from graph afterward.

tf::Taskflow taskflow;

tf::Graph g;

tf::FlowBuilder{g}.emplace([]{ std::cout << "task in adopted graph\n"; });

taskflow.adopt(std::move(g)).name("adopted");

tf::FlowBuilder

class to build a task dependency graph

Definition flow_builder.hpp:22

tf::FlowBuilder::adopt

Task adopt(Graph &&graph)

creates a module task from a graph by taking over its ownership

Definition flow_builder.hpp:1628

tf::FlowBuilder::emplace

Task emplace(C &&callable)

creates a static task

Definition flow_builder.hpp:1571

tf::Graph

class to create a graph object

Definition graph.hpp:47

tf::Task::name

const std::string & name() const

queries the name of the task

Definition task.hpp:1388

tf::Taskflow

class to create a taskflow object

Definition taskflow.hpp:64

NotePlease refer to Composable Tasking for details.

◆composed_of()

template<GraphLike T>

| Task tf::FlowBuilder::composed_of | ( | T & | object | ) | |

creates a module task for the target object

Template Parameters

| T | type satisfying tf::GraphLike |

Parameters

| object | a custom object that defines the method T::graph() |

Returnsa tf::Task handle

The example below demonstrates a taskflow composition using the composed_of method.

tf::Taskflow t1, t2;

t1.emplace({ std::cout << "t1"; });

// t2 is partially composed of t1

tf::Task comp = t2.composed_of(t1);

tf::Task init = t2.emplace({ std::cout << "t2"; });

init.precede(comp);

tf::FlowBuilder::composed_of

Task composed_of(T &object)

creates a module task for the target object

Definition flow_builder.hpp:1621

tf::Task

class to create a task handle over a taskflow node

Definition task.hpp:569

tf::Task::precede

Task & precede(Ts &&... tasks)

adds precedence links from this to other tasks

Definition task.hpp:1258

The taskflow object t2 is composed of another taskflow object t1, preceded by another static task init. When taskflow t2 is submitted to an executor, init will run first and then comp which spawns its definition in taskflow t1.

The target object being composed must define the method T::graph() that returns a reference to a graph object of type tf::Graph such that it can interact with the executor. For example:

// custom struct

struct MyObj {

tf::Graph graph;

MyObj() {

tf::FlowBuilder builder(graph);

tf::Task task = builder.emplace({

std::cout << "a task\n"; // static task

});

}

Graph& graph() { return graph; }

};

MyObj obj;

tf::Task comp = taskflow.composed_of(obj);

tf::Task::composed_of

Task & composed_of(T &object)

creates a module task from a taskflow

Definition task.hpp:1290

Or, simply expose the graph object and pass it to composed_of:

tf::Graph graph;

tf::FlowBuilder builder(graph);

tf::Task task = builder.emplace({

std::cout << "a task\n"; // static task

});

tf::Task comp = taskflow.composed_of(graph);

NotePlease refer to Composable Tasking for details.

◆emplace() [1/6]

template<typename... C>
requires (sizeof...(C) > 1)

| auto tf::FlowBuilder::emplace | ( | C &&... | callables | ) | |

creates multiple tasks from a list of callable objects

Template Parameters

| C | callable types |

Parameters

| callables | one or multiple callable objects constructible from each task category |

Returnsa tf::Task handle

The method returns a tuple of tasks each corresponding to the given callable target. You can use structured binding to get the return tasks one by one. The following example creates four static tasks and assign them to A, B, C, and D using structured binding.

auto [A, B, C, D] = taskflow.emplace(

[] () { std::cout << "A"; },

[] () { std::cout << "B"; },

[] () { std::cout << "C"; },

[] () { std::cout << "D"; }

);

◆emplace() [2/6]

template<MultiConditionTaskLike C>

| Task tf::FlowBuilder::emplace | ( | C && | callable | ) | |

creates a static task

Template Parameters

| C | callable type satisfying tf::StaticTaskLike |

Parameters

| callable | callable to construct a static task |

Returnsa tf::Task handle

The following example creates a static task.

tf::Task static_task = taskflow.emplace({});

NotePlease refer to Static Tasking for details.

◆emplace() [3/6]

template<RuntimeTaskLike C>

| Task tf::FlowBuilder::emplace | ( | C && | callable | ) | |

creates a runtime task

Template Parameters

| C | callable type satisfying tf::RuntimeTaskLike |

Parameters

| callable | callable to construct a runtime task |

Returnsa tf::Task handle

The following example creates a runtime task.

tf::Task static_task = taskflow.emplace({});

tf::Runtime

class to create a runtime task

Definition runtime.hpp:47

NotePlease refer to Runtime Tasking for details.

◆emplace() [4/6]

template<SubflowTaskLike C>

| Task tf::FlowBuilder::emplace | ( | C && | callable | ) | |

creates a dynamic task

Template Parameters

| C | callable type satisfying tf::SubflowTaskLike |

Parameters

| callable | callable to construct a dynamic task |

Returnsa tf::Task handle

The following example creates a dynamic task (tf::Subflow) that spawns two static tasks.

tf::Task dynamic_task = taskflow.emplace([](tf::Subflow& sf){

tf::Task static_task1 = sf.emplace({});

tf::Task static_task2 = sf.emplace({});

});

tf::Subflow

class to construct a subflow graph from the execution of a dynamic task

Definition flow_builder.hpp:1735

NotePlease refer to Subflow Tasking for details.

◆emplace() [5/6]

template<ConditionTaskLike C>

| Task tf::FlowBuilder::emplace | ( | C && | callable | ) | |

creates a condition task

Template Parameters

| C | callable type satisfying tf::ConditionTaskLike |

Parameters

| callable | callable to construct a condition task |

Returnsa tf::Task handle

The following example creates an if-else block using one condition task and three static tasks.

tf::Taskflow taskflow;

auto [init, cond, yes, no] = taskflow.emplace(

[] () { },

[] () { return 0; },

[] () { std::cout << "yes\n"; },

[] () { std::cout << "no\n"; }

);

// executes yes if cond returns 0, or no if cond returns 1

cond.precede(yes, no);

cond.succeed(init);

tf::Task::succeed

Task & succeed(Ts &&... tasks)

adds precedence links from other tasks to this

Definition task.hpp:1266

NotePlease refer to Conditional Tasking for details.

◆emplace() [6/6]

template<MultiConditionTaskLike C>

| Task tf::FlowBuilder::emplace | ( | C && | callable | ) | |

creates a multi-condition task

Template Parameters

| C | callable type satisfying tf::MultiConditionTaskLike |

Parameters

| callable | callable to construct a multi-condition task |

Returnsa tf::Task handle

The following example creates a multi-condition task that selectively jumps to two successor tasks.

tf::Taskflow taskflow;

auto [init, cond, branch1, branch2, branch3] = taskflow.emplace(

[] () { },

[] () { return tf::SmallVector{0, 2}; },

[] () { std::cout << "branch1\n"; },

[] () { std::cout << "branch2\n"; },

[] () { std::cout << "branch3\n"; }

);

// executes branch1 and branch3 when cond returns 0 and 2

cond.precede(branch1, branch2, branch3);

cond.succeed(init);

tf::SmallVector

class to define a vector optimized for small array

Definition small_vector.hpp:931

NotePlease refer to Conditional Tasking for details.

◆erase()

| inline |

removes a task from a taskflow

Parameters

| task | task to remove |

Removes a task and its input and output dependencies from the graph associated with the flow builder. If the task does not belong to the graph, nothing will happen.

tf::Task A = taskflow.emplace({ std::cout << "A"; });

tf::Task B = taskflow.emplace({ std::cout << "B"; });

tf::Task C = taskflow.emplace({ std::cout << "C"; });

tf::Task D = taskflow.emplace({ std::cout << "D"; });

A.precede(B, C, D);

// erase A from the taskflow and its dependencies to B, C, and D

taskflow.erase(A);

◆exclusive_scan()

template<typename B, typename E, typename D, typename T, typename BOP>

| Task tf::FlowBuilder::exclusive_scan | ( | B | first, | | | | E | last, | | | | D | d_first, | | | | T | init, | | | | BOP | bop ) |

creates an STL-styled parallel exclusive-scan task

Template Parameters

| B | beginning iterator type | | E | ending iterator type | | D | destination iterator type | | T | initial value type | | BOP | summation operator type |

Parameters

Performs the cumulative sum (aka prefix sum, aka scan) of the input range and writes the result to the output range. Each element of the output range contains the running total of all earlier elements (and the initial value) using the given binary operator for summation.

This function generates an exclusive scan, meaning the N-th element of the output range is the sum of the first N-1 input elements, so the N-th input element is not included.

std::vector<int> input = {1, 2, 3, 4, 5};

taskflow.exclusive_scan(

input.begin(), input.end(), input.begin(), -1, std::plus<int>{}

);

executor.run(taskflow).wait();

// input is {-1, 0, 2, 5, 9}

Iterators can be made stateful by using std::reference_wrapper

NotePlease refer to Parallel Scan for details.

◆fill()

template<typename B, typename E, typename V, PartitionerLike P = DefaultPartitioner>

| Task tf::FlowBuilder::fill | ( | B | first, | | | | E | last, | | | | V | value, | | | | P | part = P() ) |

fills a range with a given value in parallel

Template Parameters

| B | iterator type | | E | iterator type | | V | value type | | P | type satisfying tf::PartitionerLike |

Parameters

Returnsa tf::Task handle

The task spawns asynchronous tasks to fill the given range [first, last) with the given value in parallel. This is equivalent to calling std::fill(first, last, value) but in parallel.

std::vector<int> vec(1000);

tf::Task task = taskflow.fill(vec.begin(), vec.end(), 42);

◆fill_n()

template<typename B, std::integral C, typename V, PartitionerLike P = DefaultPartitioner>

| Task tf::FlowBuilder::fill_n | ( | B | first, | | | | C | count, | | | | V | value, | | | | P | part = P() ) |

fills N elements with a given value in parallel

Template Parameters

| B | iterator type | | C | count type (integral) | | V | value type | | P | type satisfying tf::PartitionerLike |

Parameters

Returnsa tf::Task handle

The task spawns asynchronous tasks to fill N elements starting from first with the given value in parallel. This is equivalent to calling std::fill_n(first, count, value) but in parallel.

std::vector<int> vec(1000);

tf::Task task = taskflow.fill_n(vec.begin(), 500, 42);

◆find_if()

template<typename B, typename E, typename T, typename UOP, PartitionerLike P = DefaultPartitioner>

| Task tf::FlowBuilder::find_if | ( | B | first, | | | | E | last, | | | | T & | result, | | | | UOP | predicate, | | | | P | part = P() ) |

constructs a task to perform STL-styled find-if algorithm

Template Parameters

| B | beginning iterator type | | E | ending iterator type | | T | resulting iterator type | | UOP | unary predicate type | | P | partitioner type |

Parameters

Returns an iterator to the first element in the range [first, last) that satisfies the given criteria (or last if there is no such iterator). This method is equivalent to the parallel execution of the following loop:

auto find_if(InputIt first, InputIt last, UnaryPredicate p) {

for (; first != last; ++first) {

if (predicate(*first)){

return first;

}

return last;

}

tf::FlowBuilder::find_if

Task find_if(B first, E last, T &result, UOP predicate, P part=P())

constructs a task to perform STL-styled find-if algorithm

For example, the code below find the element that satisfies the given criteria (value plus one is equal to 23) from an input range of 10 elements:

std::vector<int> input = {1, 6, 9, 10, 22, 5, 7, 8, 9, 11};

std::vector<int>::iterator result;

taskflow.find_if(

input.begin(), input.end(), [](int i){ return i+1 = 23; }, result

);

executor.run(taskflow).wait();

assert(*result == 22);

Iterators can be made stateful by using std::reference_wrapper

◆find_if_not()

template<typename B, typename E, typename T, typename UOP, PartitionerLike P = DefaultPartitioner>

| Task tf::FlowBuilder::find_if_not | ( | B | first, | | | | E | last, | | | | T & | result, | | | | UOP | predicate, | | | | P | part = P() ) |

constructs a task to perform STL-styled find-if-not algorithm

Template Parameters

| B | beginning iterator type | | E | ending iterator type | | T | resulting iterator type | | UOP | unary predicate type | | P | partitioner type |

Parameters

auto find_if(InputIt first, InputIt last, UnaryPredicate p) {

for (; first != last; ++first) {

if (!predicate(*first)){

return first;

}

return last;

}

For example, the code below find the element that satisfies the given criteria (value is not equal to 1) from an input range of 10 elements:

std::vector<int> input = {1, 1, 1, 1, 22, 1, 1, 1, 1, 1};

std::vector<int>::iterator result;

taskflow.find_if_not(

input.begin(), input.end(), [](int i){ return i == 1; }, result

);

executor.run(taskflow).wait();

assert(*result == 22);

Iterators can be made stateful by using std::reference_wrapper

◆for_each()

template<typename B, typename E, typename C, PartitionerLike P = DefaultPartitioner>

| Task tf::FlowBuilder::for_each | ( | B | first, | | | | E | last, | | | | C | callable, | | | | P | part = P() ) |

constructs an STL-styled parallel-for task

Template Parameters

| B | beginning iterator type | | E | ending iterator type | | C | callable type | | P | type satisfying tf::PartitionerLike |

Parameters

Returnsa tf::Task handle

The task spawns asynchronous tasks that applies the callable object to each object obtained by dereferencing every iterator in the range [first, last). This method is equivalent to the parallel execution of the following loop:

for(auto itr=first; itr!=last; itr++) {

callable(*itr);

}

Iterators can be made stateful by using std::reference_wrapper The callable needs to take a single argument of the dereferenced iterator type.

NotePlease refer to Parallel Iterations for details.

◆for_each_by_index()

template<IndexRangesLike R, typename C, PartitionerLike P = DefaultPartitioner>

| Task tf::FlowBuilder::for_each_by_index | ( | R | range, | | | | C | callable, | | | | P | part = P() ) |

constructs a parallel-for task over a one- or multi-dimensional index range

Template Parameters

| R | type satisfying tf::IndexRangesLike (i.e., tf::IndexRanges<T, N>); for N == 1 (equivalently, R is tf::IndexRange<T>) the engine uses the 1D unraveling fast path described below, and for N > 1 it partitions the Cartesian product as described further down | | C | callable type that is invocable with a single argument of type R | | P | type satisfying tf::PartitionerLike |

Parameters

Returnsa tf::Task handle

The task spawns asynchronous tasks that partition range and invoke callable once per partition, where each partition is itself a (sub-)range of the same type R.

One-dimensional range (N == 1)

For a 1D range tf::IndexRange<T>, the task applies callable to each index subrange of [first, last) with the given step size. This is equivalent to the parallel execution of the following loop:

// case 1: step size is positive

for(auto i=first; i<last; i+=step) {

callable(i);

}

// case 2: step size is negative

for(auto i=first; i>last; i+=step) {

callable(i);

}

// [0, 17) with a step size of 2 using tf::IndexRange

tf::IndexRange<int> range(0, 17, 2);

// parallelize the sequence [0, 2, 4, 6, 8, 10, 12, 14, 16]

taskflow.for_each_by_index(range, [](tf::IndexRange<int> subrange) {

// iterate each index in the subrange

for(int i=subrange.begin(); i<subrange.end(); i+=subrange.step_size()) {

printf("iterate %d\n", i);

}

});

executor.run(taskflow).wait();

tf::IndexRanges::end

T end() const

queries the ending index of the range (only available when N == 1)

Definition iterator.hpp:358

tf::IndexRanges::begin

T begin() const

queries the starting index of the range (only available when N == 1)

Definition iterator.hpp:346

tf::IndexRanges::step_size

T step_size() const

queries the step size of the range (only available when N == 1)

Definition iterator.hpp:370

tf::IndexRange

IndexRanges< T, 1 > IndexRange

alias for the common 1D case of tf::IndexRanges

Definition iterator.hpp:971

Multi-dimensional ranges (N > 1)

For N > 1, the function parallelises iteration over the Cartesian product of N independent 1D ranges. The total iteration space is linearized in row-major order (last dimension varies fastest) and divided among workers according to part. Each worker receives one or more orthogonal sub-boxes and invokes callable once per sub-box.

Each sub-box is guaranteed to be a valid hyper-rectangle: every dimension of the sub-box lies entirely within the corresponding dimension of range and preserves its original step size, including negative strides. Each dimension of a tf::IndexRanges is a std::tuple<T, T, T> of (begin, end, step) accessible through dim(d), so the callable typically destructures it via structured bindings and must iterate the sub-box using the step sizes reported by each dimension:

// 3D range: depth x height x width

tf::IndexRanges<int, 3> range(

tf::IndexRange<int>(0, D, 1),

tf::IndexRange<int>(0, H, 1),

tf::IndexRange<int>(0, W, 1)

);

taskflow.for_each_by_index(range, [](const tf::IndexRanges<int, 3>& sub) {

auto [d0, d1, ds] = sub.dim(0);

auto [h0, h1, hs] = sub.dim(1);

auto [w0, w1, ws] = sub.dim(2);

for(auto d = d0; d < d1; d += ds) {

for(auto h = h0; h < h1; h += hs) {

for(auto w = w0; w < w1; w += ws) {

// process element (d, h, w)

}

});

tf::IndexRanges

class to create an N-dimensional index range of integral indices

Definition iterator.hpp:188

tf::IndexRanges::dim

const std::tuple< T, T, T > & dim(size_t d) const

returns the (begin, end, step) tuple for dimension d (read-only)

Definition iterator.hpp:297

Stateful ranges

Ranges of any rank can be made stateful by passing them through std::reference_wrapper (via std::ref). This is useful when the range bounds are not known at task-graph construction time. An upstream task must set the bounds before this task runs:

tf::IndexRanges<int, 2> range;

auto init = taskflow.emplace(&{

range.dim(0) = {0, rows, 1};

range.dim(1) = {0, cols, 1};

});

auto loop = taskflow.for_each_by_index(std::ref(range), callable);

init.precede(loop);

The loop condition inside the callable must respect the sign of each dimension's step size: use < for positive steps and > for negative steps.

NotePlease refer to Parallel Iterations for details.

◆for_each_index()

template<typename B, typename E, typename S, typename C, PartitionerLike P = DefaultPartitioner>

| Task tf::FlowBuilder::for_each_index | ( | B | first, | | | | E | last, | | | | S | step, | | | | C | callable, | | | | P | part = P() ) |

constructs an index-based parallel-for task

Template Parameters

| B | beginning index type (must be integral) | | E | ending index type (must be integral) | | S | step type (must be integral) | | C | callable type | | P | type satisfying tf::PartitionerLike |

Parameters

Returnsa tf::Task handle

The task spawns asynchronous tasks that applies the callable object to each index in the range [first, last) with the step size. This method is equivalent to the parallel execution of the following loop:

// case 1: step size is positive

for(auto i=first; i<last; i+=step) {

callable(i);

}

// case 2: step size is negative

for(auto i=first, i>last; i+=step) {

callable(i);

}

Iterators can be made stateful by using std::reference_wrapper The callable needs to take a single argument of the integral index type.

NotePlease refer to Parallel Iterations for details.

◆generate()

template<typename B, typename E, typename G, PartitionerLike P = DefaultPartitioner>

| Task tf::FlowBuilder::generate | ( | B | first, | | | | E | last, | | | | G | gen, | | | | P | part = P() ) |

generates values into a range in parallel using a callable

Template Parameters

| B | iterator type | | E | iterator type | | G | generator callable type | | P | type satisfying tf::PartitionerLike |

Parameters

Returnsa tf::Task handle

The task spawns asynchronous tasks to generate and fill the range [first, last) with values produced by calling the generator gen in parallel. This is equivalent to calling std::generate(first, last, gen) but in parallel.

std::vector<int> vec(1000);

tf::Task task = taskflow.generate(vec.begin(), vec.end(),

&counter { return 42; });

◆generate_n()

template<typename B, std::integral C, typename G, PartitionerLike P = DefaultPartitioner>

| Task tf::FlowBuilder::generate_n | ( | B | first, | | | | C | count, | | | | G | gen, | | | | P | part = P() ) |

generates N values into a range in parallel using a callable

Template Parameters

| B | iterator type | | C | count type (integral) | | G | generator callable type | | P | type satisfying tf::PartitionerLike |

Parameters

Returnsa tf::Task handle

The task spawns asynchronous tasks to generate and fill N elements starting from first with values produced by calling the generator gen in parallel. This is equivalent to calling std::generate_n(first, count, gen) but in parallel.

std::vector<int> vec(1000);

tf::Task task = taskflow.generate_n(vec.begin(), 500,

&counter { return 42; });

◆inclusive_scan() [1/2]

template<typename B, typename E, typename D, typename BOP>

| Task tf::FlowBuilder::inclusive_scan | ( | B | first, | | | | E | last, | | | | D | d_first, | | | | BOP | bop ) |

creates an STL-styled parallel inclusive-scan task

Template Parameters

| B | beginning iterator type | | E | ending iterator type | | D | destination iterator type | | BOP | summation operator type |

Parameters

This function generates an inclusive scan, meaning that the N-th element of the output range is the sum of the first N input elements, so the N-th input element is included.

std::vector<int> input = {1, 2, 3, 4, 5};

taskflow.inclusive_scan(

input.begin(), input.end(), input.begin(), std::plus<int>{}

);

executor.run(taskflow).wait();

// input is {1, 3, 6, 10, 15}

Iterators can be made stateful by using std::reference_wrapper

NotePlease refer to Parallel Scan for details.

◆inclusive_scan() [2/2]

template<typename B, typename E, typename D, typename BOP, typename T>

| Task tf::FlowBuilder::inclusive_scan | ( | B | first, | | | | E | last, | | | | D | d_first, | | | | BOP | bop, | | | | T | init ) |

creates an STL-styled parallel inclusive-scan task with an initial value

Template Parameters

| B | beginning iterator type | | E | ending iterator type | | D | destination iterator type | | BOP | summation operator type | | T | initial value type |

Parameters

This function generates an inclusive scan, meaning the N-th element of the output range is the sum of the first N input elements, so the N-th input element is included.

std::vector<int> input = {1, 2, 3, 4, 5};

taskflow.inclusive_scan(

input.begin(), input.end(), input.begin(), std::plus<int>{}, -1

);

executor.run(taskflow).wait();

// input is {0, 2, 5, 9, 14}

Iterators can be made stateful by using std::reference_wrapper

NotePlease refer to Parallel Scan for details.

◆linearize() [1/2]

| inline |

adds adjacent dependency links to a linear list of tasks

Parameters

| tasks | an initializer list of tasks |

This member function creates linear dependencies over a list of tasks.

tf::Task A = taskflow.emplace({ std::cout << "A"; });

tf::Task B = taskflow.emplace({ std::cout << "B"; });

tf::Task C = taskflow.emplace({ std::cout << "C"; });

tf::Task D = taskflow.emplace({ std::cout << "D"; });

taskflow.linearize({A, B, C, D}); // A->B->C->D

◆linearize() [2/2]

| inline |

adds adjacent dependency links to a linear list of tasks

Parameters

| tasks | a vector of tasks |

This member function creates linear dependencies over a vector of tasks.

tf::Task A = taskflow.emplace({ std::cout << "A"; });

tf::Task B = taskflow.emplace({ std::cout << "B"; });

tf::Task C = taskflow.emplace({ std::cout << "C"; });

tf::Task D = taskflow.emplace({ std::cout << "D"; });

std::vector<tf::Task> tasks {A, B, C, D}

taskflow.linearize(tasks); // A->B->C->D

◆max_element()

template<typename B, typename E, typename T, typename C, PartitionerLike P>

| Task tf::FlowBuilder::max_element | ( | B | first, | | | | E | last, | | | | T & | result, | | | | C | comp, | | | | P | part ) |

constructs a task to perform STL-styled max-element algorithm

Template Parameters

| B | beginning iterator type | | E | ending iterator type | | T | resulting iterator type | | C | comparator type | | P | partitioner type |

Parameters

Finds the largest element in the [first, last) using the given comparison function object. The iterator to that largest element is stored in result. This method is equivalent to the parallel execution of the following loop:

if (first == last){

return last;

}

auto largest = first;

++first;

for (; first != last; ++first) {

if (comp(*largest, *first)) {

largest = first;

}

return largest;

For example, the code below find the largest element from an input range of 10 elements.

std::vector<int> input = {1, 1, 1, 1, 1, 2, 1, 1, 1, 1};

std::vector<int>::iterator result;

taskflow.max_element(

input.begin(), input.end(), std::less<int>(), result

);

executor.run(taskflow).wait();

assert(*result == 2);

Iterators can be made stateful by using std::reference_wrapper

◆merge() [1/2]

template<typename B1, typename E1, typename B2, typename E2, typename O>

| Task tf::FlowBuilder::merge | ( | B1 | first1, | | | | E1 | last1, | | | | B2 | first2, | | | | E2 | last2, | | | | O | d_first ) |

merges two sorted ranges into a single sorted output using the std::less comparator

Template Parameters

| B1 | beginning iterator type of the first range | | E1 | ending iterator type of the first range | | B2 | beginning iterator type of the second range | | E2 | ending iterator type of the second range | | O | destination iterator type |

Parameters

Creates a task that merges two sorted ranges [first1, last1) and [first2, last2) into a single sorted output range beginning at d_first, using std::less as the comparator.

The algorithm partitions the output range into W equal chunks (one per worker thread) and uses the co-rank technique to independently identify each worker's corresponding sub-ranges in seq1 and seq2, then merges them in parallel with no synchronization.

Unlike for_each or find, parallel merge does not benefit from dynamic or guided partitioning because std::merge always costs O(K) for a chunk of size K regardless of data — there is no load imbalance to adapt to. The algorithm therefore always uses static equal partitioning.

NoteUndefined behavior if either input range is not sorted with respect to std::less.

◆merge() [2/2]

template<typename B1, typename E1, typename B2, typename E2, typename O, typename C>

| Task tf::FlowBuilder::merge | ( | B1 | first1, | | | | E1 | last1, | | | | B2 | first2, | | | | E2 | last2, | | | | O | d_first, | | | | C | cmp ) |

merges two sorted ranges into a single sorted output using a custom comparator

Template Parameters

Parameters

Creates a task that merges two sorted ranges [first1, last1) and [first2, last2) into a single sorted output range beginning at d_first, using cmp as the comparator.

NoteUndefined behavior if either input range is not sorted with respect to cmp.

◆min_element()

template<typename B, typename E, typename T, typename C, PartitionerLike P>

| Task tf::FlowBuilder::min_element | ( | B | first, | | | | E | last, | | | | T & | result, | | | | C | comp, | | | | P | part ) |

constructs a task to perform STL-styled min-element algorithm

Template Parameters

| B | beginning iterator type | | E | ending iterator type | | T | resulting iterator type | | C | comparator type | | P | partitioner type |

Parameters

Finds the smallest element in the [first, last) using the given comparison function object. The iterator to that smallest element is stored in result. This method is equivalent to the parallel execution of the following loop:

if (first == last) {

return last;

}

auto smallest = first;

++first;

for (; first != last; ++first) {

if (comp(*first, *smallest)) {

smallest = first;

}

return smallest;

For example, the code below find the smallest element from an input range of 10 elements.

std::vector<int> input = {1, 1, 1, 1, 1, -1, 1, 1, 1, 1};

std::vector<int>::iterator result;

taskflow.min_element(

input.begin(), input.end(), std::less<int>(), result

);

executor.run(taskflow).wait();

assert(*result == -1);

Iterators can be made stateful by using std::reference_wrapper

◆placeholder()

| Task tf::FlowBuilder::placeholder | ( | | ) | |

| inline |

creates a placeholder task

Returnsa tf::Task handle

A placeholder task maps to a node in the taskflow graph, but it does not have any callable work assigned yet. A placeholder task is different from an empty task handle that does not point to any node in a graph.

// create a placeholder task with no callable target assigned

tf::Task placeholder = taskflow.placeholder();

assert(placeholder.empty() == false && placeholder.has_work() == false);

// create an empty task handle

tf::Task task;

assert(task.empty() == true);

// assign the task handle to the placeholder task

task = placeholder;

assert(task.empty() == false && task.has_work() == false);

tf::FlowBuilder::placeholder

Task placeholder()

creates a placeholder task

Definition flow_builder.hpp:1635

tf::Task::empty

bool empty() const

queries if the task handle is associated with a taskflow node

Definition task.hpp:1413

tf::Task::has_work

bool has_work() const

queries if the task has a work assigned

Definition task.hpp:1418

◆reduce()

template<typename B, typename E, typename T, typename O, PartitionerLike P = DefaultPartitioner>

| Task tf::FlowBuilder::reduce | ( | B | first, | | | | E | last, | | | | T & | init, | | | | O | bop, | | | | P | part = P() ) |

constructs an STL-styled parallel-reduction task

Template Parameters

| B | beginning iterator type | | E | ending iterator type | | T | result type | | O | binary reducer type | | P | type satisfying tf::PartitionerLike |

Parameters

Returnsa tf::Task handle

The task spawns asynchronous tasks to perform parallel reduction over init and the elements in the range [first, last). The reduced result is store in init. This method is equivalent to the parallel execution of the following loop:

for(auto itr=first; itr!=last; itr++) {

init = bop(init, *itr);

}

Iterators can be made stateful by using std::reference_wrapper

NotePlease refer to Parallel Reduction for details.

◆reduce_by_index()

template<IndexRanges1DLike R, typename T, typename L, typename G, PartitionerLike P = DefaultPartitioner>

| Task tf::FlowBuilder::reduce_by_index | ( | R | range, | | | | T & | init, | | | | L | lop, | | | | G | gop, | | | | P | part = P() ) |

constructs an index range-based parallel-reduction task

Template Parameters

| R | type satisfying tf::IndexRanges1DLike | | T | result type | | L | local reducer type | | G | global reducer type | | P | type satisfying tf::PartitionerLike |

Parameters

Returnsa tf::Task handle

The task spawns asynchronous tasks to perform parallel reduction over a range with init. The reduced result is store in init. Unlike the iterator-based reduction, index range-based reduction is particularly useful for applications that benefit from SIMD optimizations or other range-based processing strategies.

const size_t N = 1000000;

std::vector<int> data(N); // uninitialized data vector

int res = 1; // res will participate in the reduction

taskflow.reduce_by_index(

tf::IndexRange<size_t>(0, N, 1),

// final result

res,

// local reducer

[&](tf::IndexRange<size_t> subrange, std::optional<int> running_total) -> int {

int residual = running_total ? *running_total : 0.0;

for(size_t i=subrange.begin(); i<subrange.end(); i+=subrange.step_size()) {

data[i] = 1.0;

residual += data[i];

}

printf("partial sum = %lf\n", residual);

return residual;

// global reducer

std::plus<int>()

);

executor.run(taskflow).wait();

assert(res = N + 1);

Range can be made stateful by using std::reference_wrapper.

NotePlease refer to Parallel Reduction for details.

◆sort() [1/2]

template<typename B, typename E>

| Task tf::FlowBuilder::sort | ( | B | first, | | | | E | last ) |

constructs a dynamic task to perform STL-styled parallel sort using the std::less<T> comparator, where T is the element type

Template Parameters

| B | beginning iterator type (random-accessible) | | E | ending iterator type (random-accessible) |

Parameters

The task spawns asynchronous tasks to parallel sort elements in the range [first, last) using the std::less<T> comparator, where T is the dereferenced iterator type.

Iterators can be made stateful by using std::reference_wrapper

NotePlease refer to Parallel Sort for details.

◆sort() [2/2]

template<typename B, typename E, typename C>

| Task tf::FlowBuilder::sort | ( | B | first, | | | | E | last, | | | | C | cmp ) |

constructs a dynamic task to perform STL-styled parallel sort

Template Parameters

| B | beginning iterator type (random-accessible) | | E | ending iterator type (random-accessible) | | C | comparator type |

Parameters

The task spawns asynchronous tasks to sort elements in the range [first, last) in parallel.

Iterators can be made stateful by using std::reference_wrapper

NotePlease refer to Parallel Sort for details.

◆transform() [1/2]

template<typename B, typename E, typename O, typename C, PartitionerLike P = DefaultPartitioner>

| Task tf::FlowBuilder::transform | ( | B | first1, | | | | E | last1, | | | | O | d_first, | | | | C | c, | | | | P | part = P() ) |

constructs a parallel-transform task

Template Parameters

| B | beginning input iterator type | | E | ending input iterator type | | O | output iterator type | | C | callable type | | P | type satisfying tf::PartitionerLike |

Parameters

Returnsa tf::Task handle

The task spawns asynchronous tasks that applies the callable object to an input range and stores the result in another output range. This method is equivalent to the parallel execution of the following loop:

while (first1 != last1) {

*d_first++ = c(*first1++);

}

Iterators can be made stateful by using std::reference_wrapper The callable needs to take a single argument of the dereferenced iterator type.

NotePlease refer to Parallel Transforms for details.

◆transform() [2/2]

template<typename B1, typename E1, typename B2, typename O, typename C, PartitionerLike P = DefaultPartitioner>
requires (!PartitionerLike<std::decay_t<C>>)

| Task tf::FlowBuilder::transform | ( | B1 | first1, | | | | E1 | last1, | | | | B2 | first2, | | | | O | d_first, | | | | C | c, | | | | P | part = P() ) |

constructs a parallel-transform task

Template Parameters

| B1 | beginning input iterator type for the first input range | | E1 | ending input iterator type for the first input range | | B2 | beginning input iterator type for the first second range | | O | output iterator type | | C | callable type | | P | type satisfying tf::PartitionerLike |

Parameters

Returnsa tf::Task handle

The task spawns asynchronous tasks that applies the callable object to two input ranges and stores the result in another output range. This method is equivalent to the parallel execution of the following loop:

while (first1 != last1) {

*d_first++ = c(*first1++, *first2++);

}

Iterators can be made stateful by using std::reference_wrapper The callable needs to take two arguments of dereferenced elements from the two input ranges.

NotePlease refer to Parallel Transforms for details.

◆transform_exclusive_scan()

template<typename B, typename E, typename D, typename T, typename BOP, typename UOP>

| Task tf::FlowBuilder::transform_exclusive_scan | ( | B | first, | | | | E | last, | | | | D | d_first, | | | | T | init, | | | | BOP | bop, | | | | UOP | uop ) |

creates an STL-styled parallel transform-exclusive scan task

Template Parameters

| B | beginning iterator type | | E | ending iterator type | | D | destination iterator type | | BOP | summation operator type | | UOP | transform operator type | | T | initial value type |

Parameters

Write the cumulative sum (aka prefix sum, aka scan) of the input range to the output range. Each element of the output range contains the running total of all earlier elements (including an initial value) using uop to transform the input elements and using bop for summation.

This function generates an exclusive scan, meaning the Nth element of the output range is the sum of the first N-1 input elements, so the Nth input element is not included.

std::vector<int> input = {1, 2, 3, 4, 5};

taskflow.transform_exclusive_scan(

input.begin(), input.end(), input.begin(), -1, std::plus<int>{},

[](int item) { return -item; }

);

executor.run(taskflow).wait();

// input is {-1, -2, -4, -7, -11}

Iterators can be made stateful by using std::reference_wrapper

NotePlease refer to Parallel Scan for details.

◆transform_inclusive_scan() [1/2]

template<typename B, typename E, typename D, typename BOP, typename UOP>

| Task tf::FlowBuilder::transform_inclusive_scan | ( | B | first, | | | | E | last, | | | | D | d_first, | | | | BOP | bop, | | | | UOP | uop ) |

creates an STL-styled parallel transform-inclusive scan task

Template Parameters

| B | beginning iterator type | | E | ending iterator type | | D | destination iterator type | | BOP | summation operator type | | UOP | transform operator type |

Parameters

Write the cumulative sum (aka prefix sum, aka scan) of the input range to the output range. Each element of the output range contains the running total of all earlier elements using uop to transform the input elements and using bop for summation.

This function generates an inclusive scan, meaning the Nth element of the output range is the sum of the first N input elements, so the Nth input element is included.

std::vector<int> input = {1, 2, 3, 4, 5};

taskflow.transform_inclusive_scan(

input.begin(), input.end(), input.begin(), std::plus<int>{},

[] (int item) { return -item; }

);

executor.run(taskflow).wait();

// input is {-1, -3, -6, -10, -15}

Iterators can be made stateful by using std::reference_wrapper

NotePlease refer to Parallel Scan for details.

◆transform_inclusive_scan() [2/2]

template<typename B, typename E, typename D, typename BOP, typename UOP, typename T>

| Task tf::FlowBuilder::transform_inclusive_scan | ( | B | first, | | | | E | last, | | | | D | d_first, | | | | BOP | bop, | | | | UOP | uop, | | | | T | init ) |

creates an STL-styled parallel transform-inclusive scan task

Template Parameters

| B | beginning iterator type | | E | ending iterator type | | D | destination iterator type | | BOP | summation operator type | | UOP | transform operator type | | T | initial value type |

Parameters

This function generates an inclusive scan, meaning the Nth element of the output range is the sum of the first N input elements, so the Nth input element is included.

std::vector<int> input = {1, 2, 3, 4, 5};

taskflow.transform_inclusive_scan(

input.begin(), input.end(), input.begin(), std::plus<int>{},

[] (int item) { return -item; },

-1

);

executor.run(taskflow).wait();

// input is {-2, -4, -7, -11, -16}

Iterators can be made stateful by using std::reference_wrapper

NotePlease refer to Parallel Scan for details.

◆transform_reduce() [1/2]

template<typename B, typename E, typename T, typename BOP, typename UOP, PartitionerLike P = DefaultPartitioner>

| Task tf::FlowBuilder::transform_reduce | ( | B | first, | | | | E | last, | | | | T & | init, | | | | BOP | bop, | | | | UOP | uop, | | | | P | part = P() ) |

constructs an STL-styled parallel transform-reduce task

Template Parameters

| B | beginning iterator type | | E | ending iterator type | | T | result type | | BOP | binary reducer type | | UOP | unary transformation type | | P | type satisfying tf::PartitionerLike |

Parameters

Returnsa tf::Task handle

The task spawns asynchronous tasks to perform parallel reduction over init and the transformed elements in the range [first, last). The reduced result is store in init. This method is equivalent to the parallel execution of the following loop:

for(auto itr=first; itr!=last; itr++) {

init = bop(init, uop(*itr));

}

Iterators can be made stateful by using std::reference_wrapper

NotePlease refer to Parallel Reduction for details.

◆transform_reduce() [2/2]

template<typename B1, typename E1, typename B2, typename T, typename BOP_R, typename BOP_T, PartitionerLike P = DefaultPartitioner>
requires (!PartitionerLike<std::decay_t<BOP_T>>)

constructs an STL-styled parallel transform-reduce task

Template Parameters

Parameters

Returnsa tf::Task handle

The task spawns asynchronous tasks to perform parallel reduction over init and transformed elements in the range [first, last). The reduced result is store in init. This method is equivalent to the parallel execution of the following loop:

for(auto itr1=first1, itr2=first2; itr1!=last1; itr1++, itr2++) {

init = bop_r(init, bop_t(*itr1, *itr2));

}

Iterators can be made stateful by using std::reference_wrapper

NotePlease refer to Parallel Reduction for details.

The documentation for this class was generated from the following file:

taskflow/core/flow_builder.hpp
tf
FlowBuilder
Maintained by Dr. Tsung-Wei Huang — Generated by 1.13.1