.. _use_nested_algorithms:

Use Nested Algorithms to Increase Scalability
=============================================

One powerful way to increase the scalability of a flow graph is to nest other parallel algorithms inside node bodies. The flow graph then acts as a coordination language: the graph expresses the coarsest-grained parallelism, and the algorithms nested within the nodes express finer-grained parallelism.

The example below creates five nodes. An ``input_node``, ``matrix_source``, reads a sequence of matrices from a file. Two ``function_node`` objects, ``n1`` and ``n2``, receive each matrix and generate a new matrix by applying a function to each element. Two final ``function_node`` objects, ``n1_sink`` and ``n2_sink``, process the resulting matrices. ``matrix_source`` is connected to both ``n1`` and ``n2``, ``n1`` is connected to ``n1_sink``, and ``n2`` is connected to ``n2_sink``. In the lambda expressions for ``n1`` and ``n2``, a ``parallel_for`` applies the function to the elements of the matrix in parallel. The functions ``read_next_matrix``, ``f1``, ``f2``, ``consume_f1``, and ``consume_f2`` are not provided below.

::

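   // Fragment: assumes <oneapi/tbb/flow_graph.h> and <oneapi/tbb/parallel_for.h>
   // are included, with using-directives for oneapi::tbb and oneapi::tbb::flow.
   // N is the number of elements per matrix, declared as an int.
   graph g;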
   input_node< double * > matrix_source( g, [&]( oneapi::tbb::flow_control &fc ) -> double* {
     double *a = read_next_matrix();
     if ( a ) {
       return a;
     } else {
       fc.stop();
       return nullptr;
     }
   } );
   function_node< double *, double * > n1( g, unlimited, [&]( double *a ) -> double * {
     double *b = new double[N];
     parallel_for( 0, N, [&](int i) {
       b[i] = f1(a[i]);
     } );
     return b;
   } );
   function_node< double *, double * > n2( g, unlimited, [&]( double *a ) -> double * {
     double *b = new double[N];
     parallel_for( 0, N, [&](int i) {
       b[i] = f2(a[i]);
     } );
     return b;
   } );
   function_node< double *, double * > n1_sink( g, unlimited, 
     []( double *b ) -> double * {
       return consume_f1(b);
   } );
   function_node< double *, double * > n2_sink( g, unlimited, 
     []( double *b ) -> double * {
       return consume_f2(b);
   } );
   make_edge( matrix_source, n1 );
   make_edge( matrix_source, n2 );
   make_edge( n1, n1_sink );
   make_edge( n2, n2_sink );
   matrix_source.activate();
   g.wait_for_all();
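
Because the helper functions are not provided, the example cannot be built as shown. The following is a minimal sketch of stand-in definitions, written only to make experimentation possible. Everything in it is an assumption rather than part of oneTBB or of the original example: the values of ``N`` and ``kMatrixCount``, the generated matrix contents, the bodies of ``f1`` and ``f2``, and the ``total_f1`` and ``total_f2`` accumulators. The consumers take a lock because the sink nodes are declared ``unlimited`` and may therefore run concurrently.

```cpp
#include <mutex>

// Hypothetical stand-ins for the helpers the example leaves undefined.
constexpr int N = 8;             // elements per matrix (assumed value)
constexpr int kMatrixCount = 3;  // matrices the fake input provides (assumed value)

// Hands out kMatrixCount heap-allocated matrices, then nullptr.
// Matrix k holds the values k*N, k*N + 1, ..., k*N + N - 1.
double *read_next_matrix() {
    static int produced = 0;
    if (produced >= kMatrixCount) return nullptr;
    double *a = new double[N];
    for (int i = 0; i < N; ++i) a[i] = static_cast<double>(produced * N + i);
    ++produced;
    return a;
}

// Arbitrary element-wise functions chosen for illustration.
double f1(double x) { return 2.0 * x; }
double f2(double x) { return x + 1.0; }

// Running totals of everything consumed, guarded by totals_mutex.
std::mutex totals_mutex;
double total_f1 = 0.0;
double total_f2 = 0.0;

// Adds the N elements of b to total, frees b, and returns nullptr.
static double *consume(double *b, double &total) {
    double sum = 0.0;
    for (int i = 0; i < N; ++i) sum += b[i];
    delete[] b;
    std::lock_guard<std::mutex> lock(totals_mutex);
    total += sum;
    return nullptr;
}

double *consume_f1(double *b) { return consume(b, total_f1); }
double *consume_f2(double *b) { return consume(b, total_f2); }
```

With these definitions placed ahead of the graph code, ``total_f1`` and ``total_f2`` can be read once ``g.wait_for_all()`` returns; for the assumed inputs they should come to 552 and 300. The sketch frees only the result matrices. The input matrices are read by both ``n1`` and ``n2``, so neither node can safely delete them, and the example as written never releases them.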