doc/source/ray-more-libs/multiprocessing.rst
.. _ray-multiprocessing:
.. _issue on GitHub: https://github.com/ray-project/ray/issues
Ray supports running distributed Python programs with the multiprocessing.Pool API_
using Ray Actors <actors.html>__ instead of local processes. This makes it easy
to scale existing applications that use multiprocessing.Pool from a single node
to a cluster.
.. _multiprocessing.Pool API: https://docs.python.org/3/library/multiprocessing.html#module-multiprocessing.pool
To get started, first install Ray <installation.html>__, then use
ray.util.multiprocessing.Pool in place of multiprocessing.Pool.
This will start a local Ray cluster the first time you create a Pool and
distribute your tasks across it. See the Run on a Cluster_ section below for
instructions to run on a multi-node Ray cluster instead.
.. code-block:: python
from ray.util.multiprocessing import Pool
def f(index): return index
pool = Pool() for result in pool.map(f, range(100)): print(result)
The full multiprocessing.Pool API is currently supported. Please see the
multiprocessing documentation_ for details.
.. warning::
The context argument in the Pool constructor is ignored when using Ray.
.. _multiprocessing documentation: https://docs.python.org/3/library/multiprocessing.html#module-multiprocessing.pool
This section assumes that you have a running Ray cluster. To start a Ray cluster,
see the :ref:cluster setup <cluster-index> instructions.
To connect a Pool to a running Ray cluster, you can specify the address of the
head node in one of two ways:
RAY_ADDRESS environment variable.ray_address keyword argument to the Pool constructor... code-block:: python
from ray.util.multiprocessing import Pool
pool = Pool()
pool = Pool(ray_address="auto")
pool = Pool(ray_address="<ip_address>:<port>")
You can also start Ray manually by calling ray.init() (with any of its supported
configuration options) before creating a Pool.