doc/source/ray-contribute/docs.md
There are many ways to contribute to the Ray documentation, and we're always looking for new contributors. Even if you just want to fix a typo or expand on a section, please feel free to do so!
This document walks you through everything you need to do to get started.
We follow the Google developer documentation style guide. Here are some highlights:
The editorial style is enforced in CI by Vale. For more information, see How to use Vale.
If you want to contribute to the Ray documentation, you need a way to build it. Don't install Ray in the environment you plan to use to build documentation. The requirements for the docs build system are generally not compatible with those you need to run Ray itself.
Follow these instructions to build the documentation:
Next, change into the ray/doc directory:
cd ray/doc
If you haven't done so already, create a Python environment separate from the one you use to build and run Ray, preferably using the latest version of Python. For example, if you're using conda:
conda create -n docs python=3.12
Next, activate the Python environment you are using (e.g., venv, conda, etc.). With conda this would be:
conda activate docs
Install the documentation dependencies with the following command:
pip install -r requirements-doc.lock.txt
Don't use -U in this step; requirements-doc.lock.txt is a lock file that pins the exact versions of all the required dependencies.
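What "pins the exact versions" means can be checked mechanically. The following is a minimal sketch and not part of the Ray tooling: `unpinned_requirements` is a hypothetical helper, and the sample lines (including the version numbers) are made up for illustration. It flags every requirement line that doesn't pin an exact version with `==`.

```python
def unpinned_requirements(lines):
    """Return requirement lines that don't pin an exact version with '=='."""
    unpinned = []
    for raw in lines:
        # Drop inline comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as "--hash=...".
        if not line or line.startswith("-"):
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned


# Hypothetical sample content, not copied from requirements-doc.lock.txt.
sample = [
    "# generated lock file",
    "sphinx==7.3.7",
    "myst-parser==2.0.0  # pinned",
    "jupytext>=1.13",
]
print(unpinned_requirements(sample))
```

In a real lock file you would expect this list to be empty, which is why upgrading with -U would defeat its purpose.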
Before building, clean your environment first by running:
make clean
Choose from the following 2 options to build documentation locally:
To use this option, you can run:
make local
This option is recommended if you need to make frequent, small, uncomplicated changes, such as editing text or adding content to existing files.
In this approach, Sphinx only builds the changes you made in your branch compared to your last pull from upstream master. The rest of the docs come from a cache of pre-built doc pages from your last commit from upstream (for every new commit pushed to Ray, CI builds all the documentation pages from that commit and stores them on S3 as a cache).
The build first traces your commit tree to find the latest commit that CI already cached on S3.
Once the build finds the commit, it fetches the corresponding cache from S3 and extracts it into the doc/ directory. Simultaneously, CI tracks all the files that have changed from that commit to current HEAD, including any un-staged changes.
Sphinx then rebuilds only the pages that your changes affect, leaving the rest untouched from the cache.
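The three steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual build code: `latest_cached_commit` and `changed_doc_sources` are hypothetical helpers, the commit hashes and file paths are invented, and the real build talks to git and S3 rather than to in-memory lists.

```python
def latest_cached_commit(history, cached):
    """Return the newest commit in `history` that has a cached build, or None.

    `history` lists commit hashes from newest (HEAD) to oldest.
    `cached` is the set of commit hashes for which a doc cache exists.
    """
    for commit in history:
        if commit in cached:
            return commit
    return None


def changed_doc_sources(changed_files, suffixes=(".md", ".rst", ".ipynb")):
    """Keep only changed files under doc/ that look like documentation sources."""
    return sorted(
        path
        for path in changed_files
        if path.startswith("doc/") and path.endswith(suffixes)
    )


# Invented example data.
history = ["c4", "c3", "c2", "c1"]  # HEAD first
cached = {"c2", "c1"}               # commits that CI already cached
changed = [
    "doc/source/tune/index.rst",
    "python/ray/tune/tuner.py",
    "doc/source/data/overview.rst",
]

base = latest_cached_commit(history, cached)
to_rebuild = changed_doc_sources(changed)
print(base, to_rebuild)
```

The sketch treats every changed doc source as a page to rebuild. Sphinx's own dependency tracking is more fine-grained than this.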
When the build finishes, the doc page automatically opens in your browser. If you make any change in the doc/ directory, Sphinx automatically rebuilds and reloads your doc page. You can stop it by interrupting with Ctrl+C.
For more complicated changes that involve adding or removing files, always run make develop first. Afterwards, you can use make local to iterate on the cache that make develop produces.
In the full build option, Sphinx rebuilds all files in the doc/ directory, ignoring all cache and saved environment.
Because of this behavior, you get a really clean build but it's much slower.
make develop
Find the documentation build in the _build directory.
After the build finishes, you can simply open the _build/html/index.html file in your browser.
It's considered good practice to check the output of your build to make sure everything is working as expected.
Before committing any changes, run the linter with pre-commit run from the doc folder to make sure your changes are formatted correctly.
If you find yourself working with documentation often, you might find the esbonio language server to be useful. Esbonio provides context-aware syntax completion, definitions, diagnostics, document links, and other information for RST documents. If you're unfamiliar with language servers, they are important pieces of a modern developer's toolkit; if you've used pylance or python-lsp-server before, you'll know how useful these tools can be.
Esbonio also provides a vscode extension which includes a live preview. Simply install the esbonio vscode extension to start using the tool:
As an example of Esbonio's autocompletion capabilities, you can type .. to pull up an autocomplete menu for all RST directives:
You can also use Esbonio with Neovim. See the lspconfig repository for installation instructions.
The Ray documentation is built using the Sphinx build system.
We're using the PyData Sphinx Theme for the documentation.
We use myst-parser to allow you to write Ray documentation in either Sphinx's native
reStructuredText (rST) or in
Markedly Structured Text (MyST).
The two formats can be converted to each other, so the choice is up to you.
Having said that, it's important to know that MyST is
CommonMark compliant. Past experience has shown that most developers are familiar with Markdown syntax, so
if you intend to add a new document, we recommend starting from an .md file.
The Ray documentation also fully supports executable formats like Jupyter Notebooks. Many of our examples are notebooks with MyST markdown cells.
If you take Ray Tune as an example, you can see that our documentation is made up of several types of documentation, all of which you can contribute to:
This structure is reflected in the Ray documentation source code as well, so you should have no problem finding what you're looking for. All other Ray projects share a similar structure, but depending on the project there might be minor differences.
Each type of documentation listed above has its own purpose, but at the end our documentation comes down to two types of documents:
.ipynb format. All Tune examples are written as notebooks. These notebooks render in
the browser like .md or .rst files, but have the added benefit that users can easily run the code themselves.

If you spot a typo in any document, or think that an explanation is not clear enough, please consider opening a pull request. In this scenario, just run the linter as described above and submit your pull request.
We use Sphinx's autodoc extension to generate our API documentation from our source code. In case we're missing a reference to a function or class, please consider adding it to the respective document in question.
For example, here's how you can add a function or class reference using autofunction and autoclass:
.. autofunction:: ray.tune.integration.docker.DockerSyncer
.. autoclass:: ray.tune.integration.keras.TuneReportCallback
The above snippet was taken from the Tune API documentation, which you can look at for reference.
If you want to change the content of the API documentation, you will have to edit the respective function or class
signatures directly in the source code.
For example, in the above autofunction call, to change the API reference for ray.tune.integration.docker.DockerSyncer,
you would have to change the following source file.
To show the usage of APIs, it is important to have small usage examples embedded in the API documentation. These should be self-contained and run out of the box, so a user can copy and paste them into a Python interpreter and play around with them (e.g., if applicable, they should point to example data). Users often rely on these examples to build their applications. To learn more about writing examples, read How to write code snippets.
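As a rough illustration of a small, self-contained usage example embedded in a docstring, the sketch below uses a hypothetical `scale` function (not a Ray API) and the standard-library `doctest` module to confirm that the embedded examples actually run. How Ray's own docs test such examples is covered in How to write code snippets; this sketch only shows the general pattern.

```python
import doctest


def scale(values, factor=2):
    """Multiply every value by ``factor``.

    Example:
        >>> scale([1, 2, 3])
        [2, 4, 6]
        >>> scale([1, 2, 3], factor=10)
        [10, 20, 30]
    """
    return [v * factor for v in values]


# Collect and run the examples embedded in the docstring.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(scale, "scale", globs={"scale": scale}):
    runner.run(test)

results = runner.summarize(verbose=False)
print(results)
```

Because the example needs nothing beyond the function itself, a reader can paste it into a Python interpreter and get the same results.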
.rST or .md file

Modifying text in an existing documentation file is easy, but you need to be careful when it comes to adding code. The reason is that we want to ensure every code snippet on our documentation is tested. This requires us to have a process for including and testing code snippets in documents. To learn how to write testable code snippets, read How to write code snippets.
from ray import train


def objective(x, a, b):  # Define an objective function.
    return a * (x ** 0.5) + b


def trainable(config):  # Pass a "config" dictionary into your trainable.
    for x in range(20):  # "Train" for 20 iterations and compute intermediate scores.
        score = objective(x, config["a"], config["b"])
        train.report({"score": score})  # Send the score to Tune.
This code is imported by literalinclude from a file called doc_code/key_concepts.py.
Every Python file in the doc_code directory will automatically get tested by our CI system,
but make sure to run scripts that you change (or new scripts) locally first.
You do not need to run the testing framework locally.
In rare situations, when you're adding obvious pseudo-code to demonstrate a concept, it is ok to add it
literally into your .rST or .md file, e.g. using a .. code-cell:: python directive.
But if your code is supposed to run, it needs to be tested.
Sometimes you might want to add a completely new document to the Ray documentation, like adding a new user guide or a new example.
For this to work, you need to make sure to add the new document explicitly to a parent document's toctree, which determines the structure of the Ray documentation. See the sphinx documentation for more information.
Depending on the type of document you're adding, you might also have to make changes to an existing overview
page that curates the list of documents in question.
For instance, for Ray Tune each user guide is added to the
user guide overview page as a panel, and the same
goes for all Tune examples.
Always check the structure of the Ray sub-project whose documentation you're working on to see how to integrate
it within the existing structure.
In some cases you may be required to choose an image for the panel. Images are located in
doc/source/images.
To add a new executable example to the Ray documentation, you can start from our
MyST notebook template or
Jupyter notebook template.
You could also simply download the document you're reading right now (click on the respective download button at the
top of this page to get the .ipynb file) and start modifying it.
All the example notebooks in Ray Tune get automatically tested by our CI system, provided you place them in the
examples folder.
If you have questions about how to test your notebook when contributing to other Ray sub-projects, please make
sure to ask a question in the Ray community Slack or directly on GitHub,
when opening your pull request.
To work off of an existing example, you could also have a look at the
Ray Tune Hyperopt example (.ipynb)
or the Ray Serve guide for text classification (.md).
We recommend that you start with an .md file and convert your file to an .ipynb notebook at the end of the process.
We'll walk you through this process below.
What makes these notebooks different from other documents is that they combine code and text in one document, and can be launched in the browser. We also make sure they are tested by our CI system, before we add them to our documentation. To make this work, notebooks need to define a kernel specification to tell a notebook server how to interpret and run the code. For instance, here's the kernel specification of a Python notebook:
---
jupytext:
    text_representation:
        extension: .md
        format_name: myst
kernelspec:
    display_name: Python 3
    language: python
    name: python3
---
If you write a notebook in .md format, you need this YAML front matter at the top of the file.
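A quick way to catch a missing front matter before CI does is a small check like the one below. This is a minimal sketch under stated assumptions: `has_kernelspec` is a hypothetical helper, it does plain string matching rather than real YAML parsing, and it only confirms that a top-level `kernelspec:` key appears between the two `---` delimiters.

```python
def has_kernelspec(text):
    """Return True if `text` starts with front matter that names a kernelspec."""
    lines = text.splitlines()
    # The front matter must open on the very first line.
    if not lines or lines[0].strip() != "---":
        return False
    # Look for the closing '---' delimiter.
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            front_matter = lines[1:i]
            return any(entry.startswith("kernelspec:") for entry in front_matter)
    # No closing delimiter found.
    return False


with_front_matter = """---
jupytext:
    text_representation:
        extension: .md
        format_name: myst
kernelspec:
    display_name: Python 3
    language: python
    name: python3
---

# My example
"""

without_front_matter = "# My example\n\nSome text.\n"

print(has_kernelspec(with_front_matter), has_kernelspec(without_front_matter))
```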
To add code to your notebook, you can use the code-cell directive.
Here's an example:
```{code-cell} python3
:tags: [hide-cell]

import ray
import ray.rllib.agents.ppo as ppo
from ray import serve


def train_ppo_model():
    trainer = ppo.PPOTrainer(
        config={"framework": "torch", "num_workers": 0},
        env="CartPole-v0",
    )
    # Train for one iteration
    trainer.train()
    trainer.save("/tmp/rllib_checkpoint")
    return "/tmp/rllib_checkpoint/checkpoint_000001/checkpoint-1"


checkpoint_path = train_ppo_model()
```
Putting this markdown block into your document will render as follows in the browser:
import ray
import ray.rllib.agents.ppo as ppo
from ray import serve


def train_ppo_model():
    trainer = ppo.PPOTrainer(
        config={"framework": "torch", "num_workers": 0},
        env="CartPole-v0",
    )
    # Train for one iteration
    trainer.train()
    trainer.save("/tmp/rllib_checkpoint")
    return "/tmp/rllib_checkpoint/checkpoint_000001/checkpoint-1"


checkpoint_path = train_ppo_model()
What makes this work is the :tags: [hide-cell] directive in the code-cell.
The reason we suggest starting with .md files is that it's much easier to add tags to them, as you've just seen.
You can also add tags to .ipynb files, but you'll need to start a notebook server for that first, which you may
not want to do to contribute a piece of documentation.
Apart from hide-cell, you also have hide-input and hide-output tags that hide the input and output of a cell.
Also, if you need code that gets executed in the notebook, but you don't want to show it in the documentation,
you can use the remove-cell, remove-input, and remove-output tags in the same way.
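For reference, in an .ipynb file these tags are stored in each cell's `metadata.tags` list in the notebook JSON. The sketch below is a hedged illustration rather than Ray tooling: `cells_by_tag` is a hypothetical helper and the notebook is a minimal hand-written example. It lists which cells carry which tag, so you can check what gets hidden or removed without starting a notebook server.

```python
import json


def cells_by_tag(notebook):
    """Map each tag to the indices of the cells that carry it."""
    tagged = {}
    for index, cell in enumerate(notebook.get("cells", [])):
        # Cells without tags simply have no "tags" entry in their metadata.
        for tag in cell.get("metadata", {}).get("tags", []):
            tagged.setdefault(tag, []).append(index)
    return tagged


# Minimal hand-written notebook, for illustration only.
notebook_json = """
{
  "nbformat": 4,
  "nbformat_minor": 5,
  "metadata": {},
  "cells": [
    {"cell_type": "code", "metadata": {"tags": ["hide-cell"]},
     "source": ["import ray"], "outputs": [], "execution_count": null},
    {"cell_type": "code", "metadata": {"tags": ["remove-cell"]},
     "source": ["num_workers = 0"], "outputs": [], "execution_count": null},
    {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]}
  ]
}
"""

tags = cells_by_tag(json.loads(notebook_json))
print(tags)
```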
Reference section labels are a way to link to specific parts of the documentation from within a notebook. Creating one inside a markdown cell is simple:
(my-label)=
# The thing to label
Then, you can link to it in MyST markdown with the following syntax (in .rst files, use the equivalent :ref: role):
See {ref}`the thing that I labeled <my-label>` for more information.
Removing cells can be particularly interesting for compute-intensive notebooks. We want you to contribute notebooks that use realistic values, not just toy examples. At the same time we want our notebooks to be tested by our CI system, and running them should not take too long. What you can do to address this is to have notebook cells with the parameters you want the users to see first:
```{code-cell} python3
num_workers = 8
num_gpus = 2
```
which will render as follows in the browser:
num_workers = 8
num_gpus = 2
But then in your notebook you follow that up with a removed cell that won't get rendered, but has much smaller values and makes the notebook run faster:
```{code-cell} python3
:tags: [remove-cell]
num_workers = 0
num_gpus = 0
```
Once you're finished writing your example, you can convert it to an .ipynb notebook using jupytext:
jupytext your-example.md --to ipynb
In the same way, you can convert .ipynb notebooks to .md notebooks with --to myst.
And if you want to convert your notebook to a Python file, e.g. to test if your whole script runs without errors,
you can use --to py instead.
(vale)=
Vale checks if your writing adheres to the Google developer documentation style guide. It's only enforced on the Ray Data documentation.
Vale catches typos and grammatical errors. It also enforces stylistic rules like “use contractions” and “use second person.” For the full list of rules, see the configuration in the Ray repository.
Install Vale. If you use macOS, use Homebrew.
brew install vale
Otherwise, use PyPI.
pip install vale
For more information on installation, see the Vale documentation.
Install the Vale VS Code extension by following these installation instructions.
VS Code should show warnings in your code editor and in the “Problems” panel.
Install Vale. If you use macOS, use Homebrew.
brew install vale
Otherwise, use PyPI.
pip install vale
For more information on installation, see the Vale documentation.
Run Vale in your terminal:
vale doc/source/data/overview.rst
Vale should show warnings in your terminal.
❯ vale doc/source/data/overview.rst
 doc/source/data/overview.rst
 18:1   warning     Try to avoid using              Google.We
                    first-person plural like 'We'.
 18:46  error       Did you really mean             Vale.Spelling
                    'distrbuted'?
 24:10  suggestion  In general, use active voice    Google.Passive
                    instead of passive voice ('is
                    built').
 28:14  warning     Use 'doesn't' instead of 'does  Google.Contractions
                    not'.
✖ 1 error, 2 warnings and 1 suggestion in 1 file.
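If you want to process these alerts in a script, one option is to tally them by severity. The sketch below rests on assumptions you should verify against the Vale documentation: that Vale is run with a JSON output option (`--output=JSON`) and that the result maps each file path to a list of alerts with `Line`, `Severity`, `Check`, and `Message` keys. The JSON string here is hand-written to mirror the terminal output above, not captured from a real run, and `severity_counts` is a hypothetical helper.

```python
import json
from collections import Counter


def severity_counts(vale_json):
    """Count alerts per severity across all files in Vale-style JSON output."""
    counts = Counter()
    for alerts in json.loads(vale_json).values():
        for alert in alerts:
            counts[alert["Severity"]] += 1
    return counts


# Hand-written sample that mirrors the alerts shown above (assumed format).
sample_output = json.dumps(
    {
        "doc/source/data/overview.rst": [
            {"Line": 18, "Severity": "warning", "Check": "Google.We",
             "Message": "Try to avoid using first-person plural like 'We'."},
            {"Line": 18, "Severity": "error", "Check": "Vale.Spelling",
             "Message": "Did you really mean 'distrbuted'?"},
            {"Line": 24, "Severity": "suggestion", "Check": "Google.Passive",
             "Message": "In general, use active voice instead of passive voice ('is built')."},
            {"Line": 28, "Severity": "warning", "Check": "Google.Contractions",
             "Message": "Use 'doesn't' instead of 'does not'."},
        ]
    }
)

counts = severity_counts(sample_output)
print(dict(counts))
```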
To add custom terminology, complete the following steps:
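1. If it doesn't already exist, create a directory for your terms in .vale/styles/Vocab. For example, .vale/styles/Vocab/Data.
2. If it doesn't already exist, create a text file named accept.txt in that directory. For example, .vale/styles/Vocab/Data/accept.txt.
3. Add your term to accept.txt. Vale accepts Regex.

For more information, see Vocabularies in the Vale documentation.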
Vale errors if you use a word that isn't on Google's word list.
 304:52  error  Use 'select' instead of  Google.WordList
                'check'.
If you want to use the word anyway, modify the appropriate field in the WordList configuration.
If you run into a problem building the docs, following these steps can help isolate or eliminate most issues:
- Run make clean to clean out docs build artifacts in the working directory. Sphinx uses caching to avoid doing work, and this sometimes causes problems. This is particularly true if you build the docs, then git pull origin master to pull in recent changes, and then try to build docs again.
- Run pip list to check the installed dependencies. Compare them to doc/requirements-doc.txt. The documentation build system doesn't have the same dependency requirements as Ray. You don't need to run ML models or execute code on distributed systems in order to build the docs. In fact, it's best to use a completely separate docs build environment from the environment you use to run Ray to avoid dependency conflicts. When installing requirements, do pip install -r doc/requirements-doc.txt. Don't use -U because you don't want to upgrade any dependencies during the installation.
- Edit SPHINXOPTS in doc/Makefile to tell Sphinx to stop when it encounters a breakpoint, and remove -j auto to disable parallel builds. Now you can put breakpoints in the modules you're trying to import, or in Sphinx code itself, which can help isolate stubborn build issues.
- make local skips rebuilding many pages, so Sphinx doesn't update the side navigation bar on those pages. To build docs with the correct side navigation bar on all pages, consider using make develop.

There are many ways to contribute to Ray other than documentation. See our contributor guide for more information.