Running Tune experiments with BOHB

In this tutorial we introduce BOHB, while running a simple Ray Tune experiment. Tune’s Search Algorithms integrate with BOHB and, as a result, allow you to seamlessly scale up a BOHB optimization process - without sacrificing performance.

Bayesian Optimization HyperBand (BOHB) combines the benefits of Bayesian optimization with bandit-based methods (e.g. HyperBand). BOHB does not rely on the gradient of the objective function; instead, it learns from samples of the search space. It is suitable for optimizing functions that are non-differentiable, have many local minima, or are unknown and can only be evaluated. This approach therefore belongs to the domains of "derivative-free optimization" and "black-box optimization".
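To make "black-box" concrete, here is a minimal sketch of derivative-free optimization using plain random search. It is not BOHB: it only illustrates that the optimizer needs nothing but objective evaluations. The names black_box and random_search and the toy objective are hypothetical and used for illustration only.

```python
import random


def black_box(x):
    # Hypothetical toy objective: a kink at x = 2 and a jump at x = 0,
    # so gradient-based methods are a poor fit.
    return abs(x - 2.0) + (1.0 if x < 0 else 0.0)


def random_search(objective, low, high, num_samples, seed=0):
    # Sample points uniformly and keep the one with the lowest loss.
    rng = random.Random(seed)
    best_x, best_y = None, float("inf")
    for _ in range(num_samples):
        x = rng.uniform(low, high)
        y = objective(x)
        if y < best_y:
            best_x, best_y = x, y
    return best_x, best_y


best_x, best_y = random_search(black_box, -10.0, 10.0, num_samples=200)
print(f"best x = {best_x:.3f}, loss = {best_y:.3f}")
```

BOHB improves on this baseline in two ways: it proposes new points from a model fitted to earlier results instead of sampling blindly, and it stops unpromising evaluations early.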

In this example we minimize a simple objective to briefly demonstrate the usage of BOHB with Ray Tune via TuneBOHB. It's useful to keep in mind that despite the emphasis on machine learning experiments, Ray Tune optimizes any implicit or explicit objective. Here we assume ConfigSpace==0.4.18 and hpbandster==0.7.4 libraries are installed. To learn more, please refer to the BOHB website.

python

!pip install ray[tune]
!pip install ConfigSpace==0.4.18
!pip install hpbandster==0.7.4

The following imports are needed for this example.

python

import tempfile
import time
from pathlib import Path

import ray
from ray import tune
from ray.tune.schedulers.hb_bohb import HyperBandForBOHB
from ray.tune.search.bohb import TuneBOHB
import ConfigSpace as CS

Let's start by defining a simple evaluation function. We artificially sleep for a bit (0.1 seconds) to simulate a long-running ML experiment. This setup assumes that we're running multiple steps of an experiment and trying to tune three hyperparameters, namely width, height, and activation.

python

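def evaluate(step, width, height, activation):
    # Simulate a long-running training step.
    time.sleep(0.1)
    # "relu" adds a larger constant to the loss than any other activation.
    activation_boost = 10 if activation == "relu" else 1
    return (0.1 + width * step / 100) ** (-1) + height * 0.1 + activation_boost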
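To get a feel for how the score behaves, the sketch below evaluates the same formula at a few points. The helper evaluate_no_sleep is a hypothetical copy of the evaluate function above with the sleep removed so that it runs instantly.

```python
import math


def evaluate_no_sleep(step, width, height, activation):
    # Same formula as evaluate, without the artificial delay.
    activation_boost = 10 if activation == "relu" else 1
    return (0.1 + width * step / 100) ** (-1) + height * 0.1 + activation_boost


# The loss shrinks as the step count grows (for a positive width).
first = evaluate_no_sleep(0, width=10, height=0, activation="tanh")
last = evaluate_no_sleep(99, width=10, height=0, activation="tanh")

# Switching to "relu" shifts the loss up by a constant.
relu_last = evaluate_no_sleep(99, width=10, height=0, activation="relu")

print(first, last, relu_last)
```

For a fixed step, the loss is lowest for a large width, a low height, and the "tanh" activation, which is what the search should converge towards.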

Next, our objective function takes a Tune config, evaluates the score of your experiment in a training loop, and uses tune.report to report the score back to Tune.

BOHB will interrupt our trials often, so we also need to save and restore checkpoints.

python

def objective(config):
    start = 0
    if tune.get_checkpoint():
        with tune.get_checkpoint().as_directory() as checkpoint_dir:
            start = int((Path(checkpoint_dir) / "data.ckpt").read_text())

    for step in range(start, config["steps"]):
        score = evaluate(step, config["width"], config["height"], config["activation"])
        with tempfile.TemporaryDirectory() as checkpoint_dir:
            (Path(checkpoint_dir) / "data.ckpt").write_text(str(step))
            tune.report(
                {"iterations": step, "mean_loss": score},
                checkpoint=tune.Checkpoint.from_directory(checkpoint_dir)
            )
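The checkpoint in the objective above is just a directory containing a data.ckpt file that stores the last reported step. The round trip can be sketched with the standard library alone. The helpers save_step and load_step are hypothetical names and not part of Ray Tune.

```python
import tempfile
from pathlib import Path


def save_step(checkpoint_dir, step):
    # Persist the step counter as plain text, as the objective does.
    (Path(checkpoint_dir) / "data.ckpt").write_text(str(step))


def load_step(checkpoint_dir, default=0):
    # Return the stored step, or the default when no checkpoint exists yet.
    path = Path(checkpoint_dir) / "data.ckpt"
    return int(path.read_text()) if path.exists() else default


with tempfile.TemporaryDirectory() as checkpoint_dir:
    fresh_start = load_step(checkpoint_dir)    # no checkpoint yet
    save_step(checkpoint_dir, 41)
    resumed_start = load_step(checkpoint_dir)  # resume from the saved step

print(fresh_start, resumed_start)
```

A paused trial that is later resumed therefore continues from its stored step instead of starting again from zero.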

python

ray.init(configure_logging=False)

Next we define a search space. The critical assumption is that the optimal hyperparameters live within this space. Yet, if the space is very large, then those hyperparameters may be difficult to find in a short amount of time.

python

search_space = {
    "steps": 100,
    "width": tune.uniform(0, 20),
    "height": tune.uniform(-100, 100),
    "activation": tune.choice(["relu", "tanh"]),
}
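As a rough illustration of what this search space describes, the sketch below draws a few configurations with Python's random module. This is only an assumption-level picture of independent random sampling. The hypothetical helper sample_config does not reflect how Tune or BOHB resolve the space internally.

```python
import random


def sample_config(rng):
    # Mirror the search space: one constant, two continuous ranges,
    # and one categorical choice.
    return {
        "steps": 100,
        "width": rng.uniform(0, 20),
        "height": rng.uniform(-100, 100),
        "activation": rng.choice(["relu", "tanh"]),
    }


rng = random.Random(0)
samples = [sample_config(rng) for _ in range(3)]
for config in samples:
    print(config)
```

Each printed dictionary has the same shape as the config argument that the objective receives.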

Next we define the search algorithm built from TuneBOHB, constrained to a maximum of 4 concurrent trials with a ConcurrencyLimiter. Below, algo takes care of the BO (Bayesian optimization) part of BOHB, while scheduler takes care of the HB (HyperBand) part.

python

algo = TuneBOHB()
algo = tune.search.ConcurrencyLimiter(algo, max_concurrent=4)
scheduler = HyperBandForBOHB(
    time_attr="training_iteration",
    max_t=100,
    reduction_factor=4,
    stop_last_trials=False,
)
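To see what max_t and reduction_factor imply, the following sketch computes the rungs of one idealized successive-halving bracket, the building block of HyperBand. The function successive_halving_rungs and the starting population of 64 trials are assumptions for illustration. HyperBandForBOHB computes its own brackets and milestones, which may differ from these numbers.

```python
def successive_halving_rungs(num_trials, max_t, reduction_factor):
    """Return (budget, trials_running) pairs for one idealized bracket."""
    # Work backwards from max_t, dividing the budget by the reduction factor.
    budgets = [float(max_t)]
    while budgets[-1] / reduction_factor >= 1:
        budgets.append(budgets[-1] / reduction_factor)
    budgets.reverse()

    rungs = []
    trials = num_trials
    for budget in budgets:
        rungs.append((int(budget), trials))
        # Only the best 1 / reduction_factor of the trials advance.
        trials = max(1, trials // reduction_factor)
    return rungs


rungs = successive_halving_rungs(num_trials=64, max_t=100, reduction_factor=4)
for budget, trials in rungs:
    print(f"{trials:>3} trials run for up to {budget} iterations")
```

A larger reduction_factor discards trials more aggressively, which saves compute but increases the risk of stopping a configuration that would have improved later.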

The number of samples is the number of hyperparameter combinations that will be tried out. This Tune run is set to 1000 samples (you can decrease this if it takes too long on your machine).

python

num_samples = 1000

python

# Override with a smaller number of samples for a quicker run.
num_samples = 10

Finally, we run the experiment to minimize the "mean_loss" of the objective by searching search_space via algo, num_samples times. The previous sentence fully characterizes the search problem we aim to solve. With this in mind, notice how efficient it is to execute tuner.fit().

python

tuner = tune.Tuner(
    objective,
    tune_config=tune.TuneConfig(
        metric="mean_loss",
        mode="min",
        search_alg=algo,
        scheduler=scheduler,
        num_samples=num_samples,
    ),
    run_config=tune.RunConfig(
        name="bohb_exp",
        stop={"training_iteration": 100},
    ),
    param_space=search_space,
)
results = tuner.fit()

Here are the hyperparameters found to minimize the mean loss of the defined objective.

python

print("Best hyperparameters found were: ", results.get_best_result().config)
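Conceptually, picking the best result means selecting the trial whose metric is lowest (mode "min") or highest (mode "max"). The sketch below shows that selection on made-up rows. The helper best_by_metric and the example values are hypothetical and not output from an actual run.

```python
def best_by_metric(rows, metric, mode):
    # Choose the row with the smallest or largest metric value.
    if mode not in ("min", "max"):
        raise ValueError(f"mode must be 'min' or 'max', got {mode!r}")
    pick = min if mode == "min" else max
    return pick(rows, key=lambda row: row[metric])


# Made-up final results for three trials, for illustration only.
rows = [
    {"config": {"width": 3.0, "activation": "relu"}, "mean_loss": 12.4},
    {"config": {"width": 18.0, "activation": "tanh"}, "mean_loss": -8.7},
    {"config": {"width": 9.5, "activation": "tanh"}, "mean_loss": 1.9},
]

best = best_by_metric(rows, metric="mean_loss", mode="min")
print("Best hyperparameters found were: ", best["config"])
```

This is why the metric and mode must be set consistently: with mode "max", the same rows would select the trial with the highest loss.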

Optional: Passing the search space via the TuneBOHB algorithm

We can define the hyperparameter search space using ConfigSpace, which is the format accepted by BOHB.

python

config_space = CS.ConfigurationSpace()
config_space.add_hyperparameter(
    CS.Constant("steps", 100)
)
config_space.add_hyperparameter(
    CS.UniformFloatHyperparameter("width", lower=0, upper=20)
)
config_space.add_hyperparameter(
    CS.UniformFloatHyperparameter("height", lower=-100, upper=100)
)
config_space.add_hyperparameter(
    CS.CategoricalHyperparameter(
        "activation", choices=["relu", "tanh"]
    )
)

python

# As we are passing config space directly to the searcher,
# we need to define metric and mode in it as well, in addition
# to Tuner()
algo = TuneBOHB(
    space=config_space,
    metric="mean_loss",
    mode="min",
)
algo = tune.search.ConcurrencyLimiter(algo, max_concurrent=4)
scheduler = HyperBandForBOHB(
    time_attr="training_iteration",
    max_t=100,
    reduction_factor=4,
    stop_last_trials=False,
)

python

tuner = tune.Tuner(
    objective,
    tune_config=tune.TuneConfig(
        metric="mean_loss",
        mode="min",
        search_alg=algo,
        scheduler=scheduler,
        num_samples=num_samples,
    ),
    run_config=tune.RunConfig(
        name="bohb_exp_2",
        stop={"training_iteration": 100},
    ),
)
results = tuner.fit()

Here again are the hyperparameters found to minimize the mean loss of the defined objective.

python

print("Best hyperparameters found were: ", results.get_best_result().config)

python

ray.shutdown()