examples/parallel_for_each/parallel_preorder/README.md
Example that uses parallel_for_each to do parallel preorder traversal of a sparse graph.
Each vertex in the graph is called a "cell". Each cell has a value. The value is a matrix. Some of the cells have operators that compute the cell's value, using other cell's values as input. A cell that uses the value of cell x is called a successor of x.
The algorithm works as follows.
root_set.ref_count that is an atomic integer. Initialize ref_count to the number of inputs for the Cell.root_set, by applying a parallel_for_each to a root_set.feeder::add.The times printed are for the traversal and update, and do not include time for computing the root_set.
The example is using custom synchronization via ref_count atomic variable. Correctness checking tools might not take this into account, and report data races between different tasks that are actually synchronized.
Note: It is important to understand that this example is unlikely to show speedup if the cell values are changed to type "float". The reason is twofold.
Cell to be significantly smaller than a cache line, which leads to false sharing conflicts.parallel_for_each swamps the useful work.cmake <path_to_example>
cmake --build .
make run_parallel_preorder - executes the example with predefined parametersmake perf_run_parallel_preorder - executes the example with suggested parameters to measure the oneTBB performancemake light_test_parallel_preorder - executes the example with suggested parameters to reduce execution time.Usage:
parallel_preorder [n-of-threads=value] [n-of-nodes=value] [n-of-traversals=value] [silent] [-h] [n-of-threads [n-of-nodes [n-of-traversals]]]
-h - prints the help for command line options.n-of-threads - the number of threads to use; a range of the form low[:high], where low and optional high are non-negative integers or auto for a platform-specific default number.n-of-nodes - the number of nodes in the graph. Default value is 1000.n-of-traversals - the number of times to evaluate the graph. Default value is 500.silent - no output except elapsed time.