docs/ParallelFind.html
| | Taskflow: A General-purpose Task-parallel Programming System |
Loading...
Searching...
No Matches
Parallel Find
Taskflow provides template functions for constructing tasks to perform parallel find over a range of items.
You need to include the header file, taskflow/algorithm/find.hpp, for using parallel-find algorithms.
#include <taskflow/algorithm/find.hpp>
A find algorithm searches a range [first, last) for an element that satisfies a given criterion and returns an iterator to the first such element, or last if no such element exists.
What distinguishes a parallel find from a parallel for-each is its short-circuit semantics: as soon as any worker finds a qualifying element, the remaining workers abandon their portions of the range. This means the algorithm does not always process every element — it stops as early as possible once a result is found.
Taskflow provides the following parallel-find algorithms:
truefalseAll four algorithms store the result iterator in a variable passed by reference, so the result is available after the task completes.
tf::Taskflow::find_if(B first, E last, T& result, UOP predicate, P part) performs a parallel search over the range [first, last) and stores in result an iterator to the first element for which predicate returns true, or last if no such element exists. It resembles a parallel implementation of the following loop:
template <typename InputIt, typename UnaryPredicate>
InputIt find_if(InputIt first, InputIt last, UnaryPredicate predicate) {
for (; first != last; ++first) {
if (predicate(*first)) {
return first;
}
}
return last;
}
The example below creates a task to find the element equal to 22 in a range of 10 integers:
std::vector<int> input = {1, 9, 22, 3, -6, 13, 12, 0, 9, 11};
std::vector<int>::iterator result;
taskflow.find_if(
input.begin(), input.end(), result, [](int i) { return i == 22; }
);
executor.run(taskflow).wait();
assert(*result == 22);
You can pass iterators by reference using std::ref to marshal parameter updates between dependent tasks. This is useful when the range is not known at task-graph construction time but is initialized by an upstream task.
std::vector<int> input;
std::vector<int>::iterator result, first, last;
tf::Task init = taskflow.emplace(& {
input = {1, 9, 22, 3, -6, 13, 12, 0, 9, 11};
first = input.begin();
last = input.end();
});
tf::Task task = taskflow.find_if(
std::ref(first), std::ref(last), result, [](int i) { return i == 22; }
);
// wrong! first and last are captured by copy at construction time
// tf::Task task = taskflow.find_if(first, last, result, [](int i){ return i == 22; });
init.precede(task);
executor.run(taskflow).wait();
assert(*result == 22);
class to create a task handle over a taskflow node
Definition task.hpp:569
Task & precede(Ts &&... tasks)
adds precedence links from this to other tasks
Definition task.hpp:1258
When init finishes, first and last point to the initialized data range of input, and the find-if task searches over all 10 elements.
tf::Taskflow::find_if_not(B first, E last, T& result, UOP predicate, P part) performs a parallel search over the range [first, last) and stores in result an iterator to the first element for which predicate returns false, or last if no such element exists. It resembles a parallel implementation of the following loop:
template <typename InputIt, typename UnaryPredicate>
InputIt find_if_not(InputIt first, InputIt last, UnaryPredicate predicate) {
for (; first != last; ++first) {
if (!predicate(*first)) {
return first;
}
}
return last;
}
The example below creates a task to find the first element that is not equal to 1 in a range where only one element differs:
std::vector<int> input = {1, 1, 22, 1, 1, 1, 1, 1, 1, 1};
std::vector<int>::iterator result;
taskflow.find_if_not(
input.begin(), input.end(), result, [](int i) { return i == 1; }
);
executor.run(taskflow).wait();
assert(*result == 22);
As with tf::Taskflow::find_if, iterators can be passed by reference using std::ref so that an upstream task can set up the range before the parallel find-if-not runs.
std::vector<int> input;
std::vector<int>::iterator result, first, last;
tf::Task init = taskflow.emplace(& {
input = {1, 1, 22, 1, 1, 1, 1, 1, 1, 1};
first = input.begin();
last = input.end();
});
tf::Task task = taskflow.find_if_not(
std::ref(first), std::ref(last), result, [](int i) { return i == 1; }
);
// wrong! first and last are captured by copy at construction time
// tf::Task task = taskflow.find_if_not(first, last, result, [](int i){ return i == 1; });
init.precede(task);
executor.run(taskflow).wait();
assert(*result == 22);
tf::Taskflow::min_element(B first, E last, T& result, C comp, P part) performs a parallel search over the range [first, last) and stores in result an iterator to the element with the smallest value under the comparator comp. It resembles a parallel implementation of the following loop:
template <typename InputIt, typename Compare>
InputIt min_element(InputIt first, InputIt last, Compare comp) {
if (first == last) return last;
InputIt smallest = first;
for (++first; first != last; ++first) {
if (comp(*first, *smallest)) {
smallest = first;
}
}
return smallest;
}
The example below finds the smallest element in a range of 10 integers:
std::vector<int> input = {1, 1, 1, 1, 1, -1, 1, 1, 1, 1};
std::vector<int>::iterator result;
taskflow.min_element(
input.begin(), input.end(), std::less<int>(), result
);
executor.run(taskflow).wait();
assert(*result == -1);
You can pass iterators by reference using std::ref so that an upstream task can set up the range before the parallel min-element search runs.
std::vector<int> input;
std::vector<int>::iterator result, first, last;
tf::Task init = taskflow.emplace(& {
input = {1, 1, 1, 1, 1, -1, 1, 1, 1, 1};
first = input.begin();
last = input.end();
});
tf::Task task = taskflow.min_element(
std::ref(first), std::ref(last), std::less<int>(), result
);
// wrong! first and last are captured by copy at construction time
// tf::Task task = taskflow.min_element(first, last, std::less<int>(), result);
init.precede(task);
executor.run(taskflow).wait();
assert(*result == -1);
tf::Taskflow::max_element(B first, E last, T& result, C comp, P part) performs a parallel search over the range [first, last) and stores in result an iterator to the element with the largest value under the comparator comp. It resembles a parallel implementation of the following loop:
template <typename InputIt, typename Compare>
InputIt max_element(InputIt first, InputIt last, Compare comp) {
if (first == last) return last;
InputIt largest = first;
for (++first; first != last; ++first) {
if (comp(*largest, *first)) {
largest = first;
}
}
return largest;
}
The example below finds the largest element in a range of 10 integers:
std::vector<int> input = {1, 1, 1, 1, 1, 2, 1, 1, 1, 1};
std::vector<int>::iterator result;
taskflow.max_element(
input.begin(), input.end(), std::less<int>(), result
);
executor.run(taskflow).wait();
assert(*result == 2);
Notetf::Taskflow::max_element uses std::less as its comparator, which may seem counterintuitive. The comparator comp defines an ordering such that comp(a, b) returns true if a comes before b. For finding the maximum, we want the element that nothing else comes after, so std::less naturally puts the largest element last — and max_element returns it. Refer to std::max_element for further details.
You can pass iterators by reference using std::ref so that an upstream task can set up the range before the parallel max-element search runs.
std::vector<int> input;
std::vector<int>::iterator result, first, last;
tf::Task init = taskflow.emplace(& {
input = {1, 1, 1, 1, 1, 2, 1, 1, 1, 1};
first = input.begin();
last = input.end();
});
tf::Task task = taskflow.max_element(
std::ref(first), std::ref(last), std::less<int>(), result
);
// wrong! first and last are captured by copy at construction time
// tf::Task task = taskflow.max_element(first, last, std::less<int>(), result);
init.precede(task);
executor.run(taskflow).wait();
assert(*result == 2);
A partitioner controls how the iteration space is divided among workers. Taskflow provides four partitioners, each suited to different workload characteristics:
The following example creates two parallel find-if tasks using different partitioners:
std::vector<int> input(1024, -1);
std::vector<int>::iterator result;
tf::StaticPartitioner static_partitioner(0); // chunk size auto-determined
tf::GuidedPartitioner guided_partitioner(0); // minimum chunk size auto-determined
// parallel find-if with static partitioner
taskflow.find_if(
input.begin(), input.end(), result,
[](int i) { return i == -1; },
static_partitioner
);
// parallel find-if with guided partitioner
taskflow.find_if(
input.begin(), input.end(), result,
[](int i) { return i == -1; },
guided_partitioner
);
class to create a guided partitioner for scheduling parallel algorithms
Definition partitioner.hpp:417
class to construct a static partitioner for scheduling parallel algorithms
Definition partitioner.hpp:262
As a rule of thumb, prefer tf::StaticPartitioner when the predicate or comparator costs the same for every element and tf::GuidedPartitioner for irregular workloads where evaluation cost varies across elements. tf::DynamicPartitioner is a good choice when chunks must be kept small and strictly equal in size.
NoteBy default, parallel-find tasks use tf::DefaultPartitioner (currently tf::GuidedPartitioner) if no partitioner is specified.