.. _ray-compiled-graph:

.. warning::

    Ray Compiled Graph is currently in beta (since Ray 2.44). The APIs are subject to change and expected to evolve.
    The API is available from Ray 2.32, but it's recommended to use a version after 2.44.

As large language models (LLMs) become common, programming distributed systems with multiple GPUs is essential.
:ref:`Ray Core APIs <core-key-concepts>` facilitate using multiple GPUs but have limitations such as:

`NCCL <https://developer.nvidia.com/nccl>`_).

Ray Compiled Graph gives you a Ray Core-like API but with:

For example, consider the following Ray Core code, which sends data to an actor and gets the result:

.. testcode::
    :skipif: True

    import ray

    # `receiver` is an actor handle and `data` is the payload; both are created elsewhere.
    # Ray Core API for remote execution.
    # ~1ms overhead to invoke `recv`.
    ref = receiver.recv.remote(data)
    ray.get(ref)

This code shows how to compile and execute the same example as a Compiled Graph.

.. testcode::
    :skipif: True

    import ray
    from ray.dag import InputNode

    # Compiled Graph for remote execution.
    # Less than 50us overhead to invoke `recv` (during `graph.execute(data)`).
    with InputNode() as inp:
        graph = receiver.recv.bind(inp)

    # Compile once, then execute the compiled graph.
    graph = graph.experimental_compile()
    ref = graph.execute(data)
    ray.get(ref)

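
To put the two overhead figures in perspective, the following back-of-the-envelope sketch compares them for a repeated workload. It uses only the numbers quoted on this page (~1 ms per Ray Core call, and 50 us as an upper bound per Compiled Graph execution). The call count and task duration are hypothetical values chosen for illustration, and the helper names are not part of any Ray API.

```python
# Per-invocation system overhead figures quoted on this page.
RAY_CORE_OVERHEAD_S = 1e-3         # ~1 ms per `recv.remote(...)` call
COMPILED_GRAPH_OVERHEAD_S = 50e-6  # upper bound: less than 50 us per `graph.execute(...)`

# Hypothetical workload (assumption for illustration only).
NUM_CALLS = 10_000
TASK_DURATION_S = 2e-3  # 2 ms of useful work per call


def total_overhead_s(per_call_overhead_s: float, num_calls: int) -> float:
    """Return the cumulative system overhead over `num_calls` invocations."""
    return per_call_overhead_s * num_calls


def overhead_fraction(per_call_overhead_s: float, task_duration_s: float) -> float:
    """Return the share of one call's wall time that is system overhead."""
    return per_call_overhead_s / (per_call_overhead_s + task_duration_s)


for name, overhead in [
    ("Ray Core", RAY_CORE_OVERHEAD_S),
    ("Compiled Graph", COMPILED_GRAPH_OVERHEAD_S),
]:
    total = total_overhead_s(overhead, NUM_CALLS)
    share = overhead_fraction(overhead, TASK_DURATION_S)
    print(f"{name}: {total:.2f} s total overhead, {share:.1%} of each call")
```

Under these assumed numbers, the per-call overhead drops from about a third of each call to a few percent. The shorter the task, the more the fixed per-call overhead dominates, which is why the difference matters most for workloads that repeatedly execute the same small task graph.
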
Ray Compiled Graph has a static execution model. It's different from classic Ray APIs, which are eager. Because of the static nature, Ray Compiled Graph can perform various optimizations such as:

Ray Compiled Graph APIs simplify development of high-performance multi-GPU workloads such as LLM inference or distributed training that require:

- `Heterogeneous <https://www.youtube.com/watch?v=Mg08QTBILWU>`_ or MPMD (Multiple Program Multiple Data) execution.

More resources:

- `Ray Compiled Graph blog <https://www.anyscale.com/blog/announcing-compiled-graphs>`_
- `Ray Compiled Graph talk at Ray Summit <https://www.youtube.com/watch?v=jv58Cpr6SAs>`_
- `Heterogeneous training with Ray Compiled Graph <https://www.youtube.com/watch?v=Mg08QTBILWU>`_
- `Distributed LLM inference with Ray Compiled Graph <https://www.youtube.com/watch?v=oMb_WiUwf5o>`_

Learn more details about Ray Compiled Graph from the following links.


.. toctree::
    :maxdepth: 1

    quickstart
    profiling
    overlap
    troubleshooting
    compiled-graph-api