Anti-pattern: Over-parallelizing with too fine-grained tasks harms speedup

doc/source/ray-core/patterns/too-fine-grained-tasks.rst


TLDR: Avoid over-parallelizing. Running work as a parallel task has higher overhead than calling an ordinary function.

Parallelizing or distributing tasks usually comes with higher overhead than an ordinary function call. Therefore, if you parallelize a function that executes very quickly, the overhead could take longer than the actual function call!

To avoid this problem, be careful not to parallelize too much. If a function or task is too small, use a technique called batching: group many small units of work into one task so that each call does enough meaningful work to outweigh its overhead.
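To make batching concrete, here is a minimal sketch. It is not the code from the included example file; the helper names (``double``, ``batch_ranges``, ``double_batch``, ``run_batched``) are hypothetical and chosen only for illustration. The sketch assumes Ray is installed and uses only ``ray.init``, ``ray.remote``, and ``ray.get``.

```python
def double(x):
    """A tiny unit of work, far cheaper than scheduling a remote task."""
    return 2 * x


def batch_ranges(num_items, batch_size):
    """Split range(num_items) into (start, end) pairs of at most batch_size items."""
    return [
        (start, min(start + batch_size, num_items))
        for start in range(0, num_items, batch_size)
    ]


def double_batch(start, end):
    """Process a whole slice of inputs inside a single call."""
    return [double(x) for x in range(start, end)]


def run_batched(num_items, batch_size):
    """Submit one Ray task per batch instead of one task per item."""
    # Imported lazily so the pure helpers above work without Ray installed.
    import ray

    ray.init(ignore_reinit_error=True)
    remote_double_batch = ray.remote(double_batch)
    refs = [
        remote_double_batch.remote(start, end)
        for start, end in batch_ranges(num_items, batch_size)
    ]
    # ray.get returns one list per batch; flatten them in submission order.
    return [y for batch in ray.get(refs) for y in batch]
```

Calling ``run_batched(100_000, 1_000)`` would submit 100 tasks rather than 100,000, so the per-task overhead is paid 100 times instead of once per item. A suitable batch size depends on the workload and cluster and is best found by measuring.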

Code example

Anti-pattern:

.. literalinclude:: ../doc_code/anti_pattern_too_fine_grained_tasks.py
    :language: python
    :start-after: anti_pattern_start
    :end-before: anti_pattern_end

Better approach: Use batching.

.. literalinclude:: ../doc_code/anti_pattern_too_fine_grained_tasks.py
    :language: python
    :start-after: batching_start
    :end-before: batching_end

As we can see from the example above, over-parallelizing has higher overhead and the program runs slower than the serial version. Through batching with a proper batch size, we are able to amortize the overhead and achieve the expected speedup.
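The amortization argument can be checked with a rough back-of-the-envelope model. The sketch below is an assumption-laden illustration, not a measurement: it assumes a fixed overhead per task, identical work per item, and tasks that run in waves across a fixed number of workers. The function name ``estimated_runtime_us`` and all numbers are hypothetical and do not describe Ray's actual overhead.

```python
import math


def estimated_runtime_us(num_items, batch_size, work_per_item_us,
                         overhead_per_task_us, num_workers):
    """Rough wall-clock estimate in microseconds under a simple wave model."""
    num_tasks = math.ceil(num_items / batch_size)
    # Workers run tasks in waves of at most num_workers tasks each.
    waves = math.ceil(num_tasks / num_workers)
    # Each task pays the fixed overhead once, plus the work for its batch.
    time_per_task_us = overhead_per_task_us + batch_size * work_per_item_us
    return waves * time_per_task_us


# Hypothetical workload: 100,000 items at 1 us each, 500 us overhead per task,
# 4 workers. The serial baseline is 100,000 us.
for batch_size in (1, 1_000, 100_000):
    estimate = estimated_runtime_us(100_000, batch_size, 1, 500, 4)
    print(f"batch_size={batch_size}: about {estimate} us")
```

Under these made-up numbers, one item per task is estimated at 12,525,000 us, far slower than the 100,000 us serial baseline. Batches of 1,000 items come to 37,500 us. A single batch of all 100,000 items comes to 100,500 us, because it leaves no work to run in parallel. This is why the batch size has to be chosen with care: large enough to amortize the overhead, yet small enough to keep every worker busy.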