# DreamBooth fine-tuning
| Template Specification | Description |
|---|---|
| Summary | This example shows how to do DreamBooth fine-tuning of a Stable Diffusion model using Ray Train for data-parallel training with many workers and Ray Data for data ingestion. Use one of the provided datasets, or supply your own photos. By the end of this example, you'll be able to generate images of your subject in a variety of situations, just by feeding in a text prompt! |
| Time to Run | ~10-15 minutes to generate a regularization dataset and fine-tune the model on photos of your subject. |
| Minimum Compute Requirements | At least 1 GPU with >= 24 GB of GPU memory. The default is 1 node with 4 GPUs: A10G GPUs (AWS) or L4 GPUs (GCE). |
| Cluster Environment | This template uses a Docker image built on top of the latest Anyscale-provided Ray image using Python 3.9: `anyscale/ray:latest-py39-cu118`. See the appendix below for more details. |
This README contains only minimal instructions for running this example on Anyscale. See the guide in the Ray documentation for a step-by-step walkthrough of the training code.
You can get started fine-tuning on a sample dog dataset with default settings with the following commands:

```shell
chmod +x ./dreambooth_run.sh
./dreambooth_run.sh
```
Here are a few modifications to the `dreambooth_run.sh` script that you may want to make:

- The subject photos to fine-tune on. The sample dataset contains photos of a dog. To use your own photos instead, supply them and update the `$CLASS_NAME` and `$INSTANCE_DIR` environment variables.
- The `$DATA_PREFIX` that the pre-trained model is downloaded to. This directory is also where the training dataset and the fine-tuned model checkpoint are written at the end of training.
  - Set `$DATA_PREFIX` to a shared NFS filesystem such as `/mnt/cluster_storage`. See this doc for all the options.
  - Change the `$DATA_PREFIX` environment variable on each run if you don't want to lose the models/data of previous runs.
- The `$NUM_WORKERS` variable sets the number of data-parallel workers used during fine-tuning. The default is 2 workers, each using 1 GPU; increase this number if you add more GPU worker nodes to the cluster.
- `--num_epochs` and `--max_train_steps` determine the number of fine-tuning steps to take.
- `generate.py` is used to generate Stable Diffusion images after loading the model from a checkpoint. Modify the prompt at the end to be something more interesting than just a photo of your subject.
- To rerun only the fine-tuning step, run the `python train.py ...` command directly. Running the bash script will start from the beginning (generating another regularization dataset).

To fine-tune with LoRA, pass the `--use_lora` flag to `train.py`. Note the much higher learning rate used here:

```shell
python train.py \
  --model_dir=$ORIG_MODEL_PATH \
  --output_dir=$TUNED_MODEL_DIR \
  --instance_images_dir=$IMAGES_OWN_DIR \
  --instance_prompt="photo of $UNIQUE_TOKEN $CLASS_NAME" \
  --class_images_dir=$IMAGES_REG_DIR \
  --class_prompt="photo of a $CLASS_NAME" \
  --train_batch_size=2 \
  --lr=1e-4 \
  --num_epochs=10 \
  --max_train_steps=400 \
  --num_workers $NUM_WORKERS \
  --use_lora
```
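As a rough illustration of how `--num_epochs` and `--max_train_steps` interact, here is a hedged sketch: `planned_steps` is a hypothetical helper, not code from `train.py`, and it assumes training stops at whichever limit is reached first.

```python
# Hypothetical sketch (not from train.py): how --num_epochs and
# --max_train_steps might jointly bound the number of fine-tuning steps,
# assuming training stops at whichever limit is hit first.
def planned_steps(num_images, batch_size, num_workers, num_epochs, max_train_steps):
    # Each data-parallel worker processes batch_size images per step.
    steps_per_epoch = -(-num_images // (batch_size * num_workers))  # ceil division
    return min(num_epochs * steps_per_epoch, max_train_steps)

# With 200 training images, the flags above (batch size 2, 2 workers,
# 10 epochs, 400 max steps) would cap training at 400 steps.
print(planned_steps(200, 2, 2, 10, 400))  # -> 400
```

Under these assumptions, raising `--num_epochs` past the point where `--max_train_steps` is reached has no further effect.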
Use the generate.py script to generate images with a prompt.
Replace the variables with the values that you used in the fine-tuning script.
See run_model_flags in flags.py for a full list of available command line arguments to pass to the script.
```shell
python generate.py \
  --model_dir=$TUNED_MODEL_DIR \
  --output_dir=$IMAGES_NEW_DIR \
  --prompts="photo of a $UNIQUE_TOKEN $CLASS_NAME" \
  --num_samples_per_prompt=5
```
To generate images using a LoRA fine-tuned model:

```shell
python generate.py \
  --model_dir=$ORIG_MODEL_PATH \
  --lora_weights_dir=$TUNED_MODEL_DIR \
  --output_dir=$IMAGES_NEW_DIR \
  --prompts="photo of a $UNIQUE_TOKEN $CLASS_NAME" \
  --num_samples_per_prompt=5
```
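The prompts in the commands above follow a fixed pattern built from `$UNIQUE_TOKEN` and `$CLASS_NAME`. A minimal sketch of that composition, where the fallback values `unqtkn` and `dog` are illustrative assumptions rather than values guaranteed by the script:

```python
import os

# Hypothetical illustration of how the prompts in this README are assembled
# from the template's environment variables. The fallbacks ("unqtkn", "dog")
# are stand-ins for whatever defaults your run uses.
unique_token = os.environ.get("UNIQUE_TOKEN", "unqtkn")
class_name = os.environ.get("CLASS_NAME", "dog")

instance_prompt = f"photo of {unique_token} {class_name}"  # --instance_prompt
class_prompt = f"photo of a {class_name}"                  # --class_prompt
print(instance_prompt)
print(class_prompt)
```

The rare token distinguishes your specific subject from the generic class used for the regularization images.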
See the playground.ipynb notebook for a more interactive way to generate images with the fine-tuned model.
Click on the Jupyter icon on the workspace page and open the notebook. Note: The widgets in this notebook don't work in VS Code, so please use Jupyter!
The `dreambooth/requirements.txt` file lists the requirements. Feel free to modify this file to include more requirements, then follow this guide to create a new cluster environment with the Anyscale CLI. Paste the requirements into the cluster environment YAML.
Finally, update the workspace's cluster environment to this environment after it finishes building.
Use the following `docker pull` command if you want to manually build a new Docker image based on this one:

```shell
docker pull us-docker.pkg.dev/anyscale-workspace-templates/workspace-templates/dreambooth-finetuning:latest
```
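A derived image might look like the following minimal `Dockerfile` sketch. This is an assumption about layout, not a provided file: the `dreambooth/requirements.txt` path mirrors the one mentioned above, and you would adjust it to wherever your requirements live.

```dockerfile
# Start from the template's published image.
FROM us-docker.pkg.dev/anyscale-workspace-templates/workspace-templates/dreambooth-finetuning:latest

# Layer in any extra Python dependencies added to dreambooth/requirements.txt.
COPY dreambooth/requirements.txt /tmp/requirements.txt
RUN pip install --no-cache-dir -r /tmp/requirements.txt
```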