Asynchronous Tasking

Taskflow provides mechanisms to launch tasks asynchronously, enabling dynamic parallelism that goes beyond static task graphs.

What is an Async Task?

An async task is a callable object submitted for execution without being embedded in a pre-defined task graph. Unlike regular taskflow tasks whose dependencies are declared upfront, async tasks are created and dispatched on the fly, making them suitable for dynamic, recursive, or data-dependent parallelism that cannot be fully determined at graph construction time.

The C++ standard library provides std::async for this purpose. However, std::async has fundamental limitations that make it ill-suited for high-performance parallel programs:

// std::async typically spawns a new OS thread for each call
std::future<int> f1 = std::async(std::launch::async, [](){ return 1; });
std::future<int> f2 = std::async(std::launch::async, [](){ return 2; });
std::future<int> f3 = std::async(std::launch::async, [](){ return 3; });
// ... spawning N tasks creates N threads, each with its own stack and
// OS overhead: expensive to create, destroy, and context-switch

The three core problems with std::async are:

  • No thread pool: each call to std::async typically creates a brand-new OS thread, incurring significant creation and destruction overhead. Spawning hundreds of async tasks means hundreds of threads competing for CPU time.
  • No scheduler: there is no work-stealing or load balancing between std::async tasks. When one task finishes early, its thread cannot pick up work from threads that are still busy.
  • No task graph integration: std::async tasks are isolated from one another. You cannot express dependencies between them, embed them in a larger task graph, or coordinate them with other parallel work.
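The first point can be observed with the standard library alone. The sketch below is illustrative only: the function name count_threads_used_by_std_async is hypothetical, and the result assumes an implementation that honours std::launch::async by running each task as if on its own thread. Every task waits until all of them have started, so no thread can be reused between tasks.

```cpp
#include <atomic>
#include <cstddef>
#include <future>
#include <set>
#include <thread>
#include <vector>

// Hypothetical helper: launches n tasks with std::launch::async and returns
// how many distinct threads executed them. Each task spins until all n tasks
// have started, so the tasks overlap in time and cannot share a thread.
std::size_t count_threads_used_by_std_async(int n) {
  std::atomic<int> arrived{0};
  std::vector<std::future<std::thread::id>> futures;
  futures.reserve(static_cast<std::size_t>(n));

  for (int i = 0; i < n; ++i) {
    futures.push_back(std::async(std::launch::async, [&arrived, n]() {
      ++arrived;
      while (arrived.load() < n) {
        std::this_thread::yield();  // wait for the remaining tasks to start
      }
      return std::this_thread::get_id();
    }));
  }

  std::set<std::thread::id> ids;
  for (auto& f : futures) {
    ids.insert(f.get());
  }
  return ids.size();
}
```

Calling count_threads_used_by_std_async(8) should report eight distinct threads for eight tasks, which is the per-task thread cost described above.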

Taskflow's async tasking addresses all three problems. Async tasks run on the executor's existing thread pool under the same work-stealing scheduler, integrate naturally with taskflows and runtimes, and can be launched from any thread without creating new threads.
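To make the thread-pool idea concrete, here is a minimal sketch of a fixed-size pool built from the standard library. It is not Taskflow's executor: ToyPool and count_threads_used_by_pool are hypothetical names, and the pool uses one shared FIFO queue with no work-stealing. It only shows that any number of tasks can run on a bounded set of threads that are created once.

```cpp
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <set>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical toy pool: a fixed set of workers draining one shared queue.
class ToyPool {
 public:
  explicit ToyPool(std::size_t workers) {
    for (std::size_t i = 0; i < workers; ++i) {
      threads_.emplace_back([this]() { work(); });
    }
  }

  // Drains the remaining tasks, then joins every worker.
  ~ToyPool() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stopping_ = true;
    }
    ready_.notify_all();
    for (auto& t : threads_) t.join();
  }

  void submit(std::function<void()> task) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      tasks_.push(std::move(task));
    }
    ready_.notify_one();
  }

 private:
  void work() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this]() { return stopping_ || !tasks_.empty(); });
        if (tasks_.empty()) return;  // stopping and nothing left to run
        task = std::move(tasks_.front());
        tasks_.pop();
      }
      task();  // run outside the lock so other workers can dequeue
    }
  }

  std::mutex mutex_;
  std::condition_variable ready_;
  std::queue<std::function<void()>> tasks_;
  bool stopping_ = false;
  std::vector<std::thread> threads_;  // declared last: starts after the rest
};

// Runs `tasks` small jobs on `workers` threads and returns how many distinct
// threads executed at least one job.
std::size_t count_threads_used_by_pool(std::size_t workers, int tasks) {
  std::mutex ids_mutex;
  std::set<std::thread::id> ids;
  {
    ToyPool pool(workers);
    for (int i = 0; i < tasks; ++i) {
      pool.submit([&]() {
        std::lock_guard<std::mutex> lock(ids_mutex);
        ids.insert(std::this_thread::get_id());
      });
    }
  }  // the destructor drains the queue and joins the workers
  return ids.size();
}
```

With four workers and one hundred tasks, the count never exceeds four. A production scheduler such as Taskflow's replaces the single locked queue with per-worker queues and work-stealing, which this sketch deliberately leaves out.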

Launch Async Tasks from an Executor

tf::Executor::async runs a callable asynchronously on the thread pool and returns a std::future that will eventually hold the result:

std::future<int> future = executor.async([](){ return 1; });
assert(future.get() == 1);

If you do not need the return value or do not require a std::future for synchronisation, use tf::Executor::silent_async instead. It returns nothing and incurs less overhead than tf::Executor::async, as it avoids the cost of managing a shared state:

executor.silent_async([](){});

Both tf::Executor::async and tf::Executor::silent_async are thread-safe and can be called from any thread — including worker threads already running inside the executor and external threads outside of it. The scheduler automatically detects the submission source and applies work-stealing to distribute the task efficiently across workers:

tf::Task my_task = taskflow.emplace([&](){
  // launch an async task from a worker thread inside the executor
  executor.async([&](){
    // launch another async task from yet another worker thread
    executor.async([&](){});
  });
});
executor.run(taskflow);
executor.wait_for_all();


Note: Async tasks created from an executor do not belong to any taskflow. Their lifetime is automatically managed by the executor.

Launch Async Tasks from a Runtime

tf::Runtime::async and tf::Runtime::silent_async let you launch async tasks from within a running task that has access to a tf::Runtime object. Like their executor counterparts, both methods are thread-safe and can be called from any context within the runtime's scope.

Unlike executor-level async tasks, tasks created from a runtime belong to that runtime and are implicitly joined at the end of its scope — meaning all async tasks spawned inside a runtime are guaranteed to finish before the runtime completes and control returns to the next task in the graph.

The example below spawns 100 async tasks from a runtime. Because of the implicit join, task B is guaranteed to see counter == 100:

tf::Taskflow taskflow;
tf::Executor executor;
std::atomic<int> counter{0};

tf::Task A = taskflow.emplace([&](tf::Runtime& rt) {
  for(int i = 0; i < 100; i++) {
    rt.silent_async([&](){ ++counter; });
  }
}); // implicit join: all 100 tasks finish before A completes

tf::Task B = taskflow.emplace([&](){
  assert(counter == 100);
});

A.precede(B);
executor.run(taskflow).wait();


Launching async tasks from a runtime is the key enabler for dynamic parallel algorithms — parallel reduction, divide-and-conquer, and recursive patterns — that need to create work at runtime rather than at graph construction time.

Launch Async Tasks Recursively from a Runtime

Async tasks spawned from a runtime can themselves accept a tf::Runtime reference, allowing them to recursively spawn further async tasks. Combined with tf::Runtime::corun, this enables fork-join style divide-and-conquer parallelism where each level of recursion fans out work to available workers without blocking any thread.

The example below implements parallel Fibonacci using recursive async tasking:

#include <iostream>
#include <taskflow/taskflow.hpp>

size_t fibonacci(size_t N, tf::Runtime& rt) {
  if(N < 2) return N;
  size_t res1, res2;

  // spawn the left child asynchronously
  rt.silent_async([N, &res1](tf::Runtime& rt1) {
    res1 = fibonacci(N-1, rt1);
  });

  // compute the right child inline (tail optimisation)
  res2 = fibonacci(N-2, rt);

  // wait for all async children without blocking the worker thread
  rt.corun();

  return res1 + res2;
}

int main() {
  tf::Executor executor;
  size_t N = 5, res;

  executor.silent_async([N, &res](tf::Runtime& rt) {
    res = fibonacci(N, rt);
  });
  executor.wait_for_all();

  std::cout << N << "-th Fibonacci number is " << res << '\n';
  return 0;
}


Note: rt.corun() without arguments waits for all async tasks spawned within the current runtime scope to complete, without blocking the underlying worker thread from executing other work in the meantime. This is what allows the recursive pattern to scale efficiently: a waiting worker helps execute the spawned children instead of idling.
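The idea of waiting by doing work can be sketched without Taskflow. The toy below is an illustration under stated assumptions, not how tf::Runtime::corun is implemented: ToyQueue, toy_fibonacci, and toy_fibonacci_single_thread are hypothetical names, and a single locked queue stands in for the scheduler. The waiting caller keeps running queued tasks until its own child has finished, so the recursion completes even when only one thread is available.

```cpp
#include <atomic>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical toy task queue; it stands in for a real scheduler.
class ToyQueue {
 public:
  void push(std::function<void()> task) {
    std::lock_guard<std::mutex> lock(mutex_);
    tasks_.push_back(std::move(task));
  }

  // Runs the most recently queued task, if any.
  // Returns false when the queue was empty.
  bool run_one() {
    std::function<void()> task;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      if (tasks_.empty()) return false;
      task = std::move(tasks_.back());
      tasks_.pop_back();
    }
    task();  // run outside the lock; the task may push more work
    return true;
  }

 private:
  std::mutex mutex_;
  std::deque<std::function<void()>> tasks_;
};

std::size_t toy_fibonacci(std::size_t n, ToyQueue& queue) {
  if (n < 2) return n;

  std::size_t left = 0;
  std::atomic<bool> left_done{false};

  // "Spawn" the left child by queueing it instead of running it now.
  queue.push([&left, &left_done, &queue, n]() {
    left = toy_fibonacci(n - 1, queue);
    left_done.store(true);
  });

  // Compute the right child inline.
  std::size_t right = toy_fibonacci(n - 2, queue);

  // Cooperative wait: run queued tasks until the left child is done.
  while (!left_done.load()) {
    queue.run_one();
  }
  return left + right;
}

// Convenience wrapper that drives the recursion from a single thread.
std::size_t toy_fibonacci_single_thread(std::size_t n) {
  ToyQueue queue;
  return toy_fibonacci(n, queue);
}
```

If the wait loop blocked instead of calling run_one, a single thread would wait forever on a child that nobody runs. That is the situation a cooperative wait such as rt.corun() avoids on a fixed-size pool.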

The figure below shows the execution diagram for fibonacci(4). The suffix _1 denotes the left child spawned by its parent runtime:

[Figure: execution diagram for fibonacci(4)]