# YOLO Data Augmentation
Data augmentation is a crucial technique in computer vision that artificially expands your training dataset by applying various transformations to existing images. When training deep learning models like Ultralytics YOLO, data augmentation helps improve model robustness, reduces overfitting, and enhances generalization to real-world scenarios.
<p align="center">
  <iframe loading="lazy" width="720" height="405" src="https://www.youtube.com/embed/e-TwqFtay90" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>
  <strong>Watch:</strong> How to use Mosaic, MixUp & more Data Augmentations to help Ultralytics YOLO Models generalize better 🚀
</p>

Data augmentation serves multiple critical purposes in training computer vision models.
Ultralytics YOLO's implementation provides a comprehensive suite of augmentation techniques, each serving specific purposes and contributing to model performance in different ways. This guide will explore each augmentation parameter in detail, helping you understand when and how to use them effectively in your projects.
You can customize each parameter using the Python API, the command line interface (CLI), or a configuration file. Below are examples of how to set up data augmentation in each method.
!!! example "Configuration Examples"

    === "Python"

        ```python
        import albumentations as A

        from ultralytics import YOLO

        # Load a model
        model = YOLO("yolo26n.pt")

        # Training with custom augmentation parameters
        model.train(data="coco.yaml", epochs=100, hsv_h=0.03, hsv_s=0.6, hsv_v=0.5)

        # Training without any augmentations (disabled values omitted for clarity)
        model.train(
            data="coco.yaml",
            epochs=100,
            hsv_h=0.0,
            hsv_s=0.0,
            hsv_v=0.0,
            translate=0.0,
            scale=0.0,
            fliplr=0.0,
            mosaic=0.0,
            erasing=0.0,
            auto_augment=None,
        )

        # Training with custom Albumentations transforms (Python API only)
        custom_transforms = [
            A.Blur(blur_limit=7, p=0.5),
            A.CLAHE(clip_limit=4.0, p=0.5),
        ]
        model.train(data="coco.yaml", epochs=100, augmentations=custom_transforms)
        ```

    === "CLI"

        ```bash
        # Training with custom augmentation parameters
        yolo detect train data=coco8.yaml model=yolo26n.pt epochs=100 hsv_h=0.03 hsv_s=0.6 hsv_v=0.5
        ```
You can define all training parameters, including augmentations, in a YAML configuration file (e.g., `train_custom.yaml`). The `mode` parameter is only required when using the CLI. This new YAML file will then override the default one located in the `ultralytics` package.

```yaml
# train_custom.yaml
# 'mode' is required only for CLI usage
mode: train
data: coco8.yaml
model: yolo26n.pt
epochs: 100
hsv_h: 0.03
hsv_s: 0.6
hsv_v: 0.5
```
Then launch the training using either the Python API or the CLI:
!!! example "Train Example"

    === "Python"

        ```python
        from ultralytics import YOLO

        # Load a COCO-pretrained YOLO26n model
        model = YOLO("yolo26n.pt")

        # Train the model with custom configuration
        model.train(cfg="train_custom.yaml")
        ```

    === "CLI"

        ```bash
        # Train the model with custom configuration
        yolo detect train model="yolo26n.pt" cfg=train_custom.yaml
        ```
### Hue Adjustment (`hsv_h`)

- **Range**: `0.0` - `1.0`
- **Default**: `{{ hsv_h }}`

The `hsv_h` hyperparameter defines the shift magnitude, with the final adjustment randomly chosen between `-hsv_h` and `hsv_h`. For example, with `hsv_h=0.3`, the shift is randomly selected within `-0.3` to `0.3`. For values above `0.5`, the hue shift wraps around the color wheel, which is why the augmentations look the same between `0.5` and `-0.5`.

| -0.5 | -0.25 | 0.0 | 0.25 | 0.5 |
|---|---|---|---|---|
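The wrap-around behavior can be sketched in a few lines of NumPy. This is a simplified illustration using OpenCV's 0-179 hue convention, not the actual Ultralytics implementation, and it applies the full shift magnitude instead of sampling it randomly:

```python
import numpy as np


def shift_hue(hue_channel: np.ndarray, hsv_h: float) -> np.ndarray:
    """Shift an OpenCV-style hue channel (values 0-179) by a fraction of the color wheel."""
    shift = int(hsv_h * 180)  # map the 0-1 fraction onto the 180-step hue wheel
    return (hue_channel.astype(np.int32) + shift) % 180


hue = np.array([0, 45, 90, 135])
print(shift_hue(hue, 0.5).tolist())  # [90, 135, 0, 45]
print(shift_hue(hue, -0.5).tolist())  # [90, 135, 0, 45] -- identical after wrapping
```

Because the modulo wraps the hue wheel, a `+0.5` turn and a `-0.5` turn land on exactly the same hues, matching the note above.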
### Saturation Adjustment (`hsv_s`)

- **Range**: `0.0` - `1.0`
- **Default**: `{{ hsv_s }}`

The `hsv_s` hyperparameter defines the shift magnitude, with the final adjustment randomly chosen between `-hsv_s` and `hsv_s`. For example, with `hsv_s=0.7`, the intensity is randomly selected within `-0.7` to `0.7`.

| -1.0 | -0.5 | 0.0 | 0.5 | 1.0 |
|---|---|---|---|---|
### Brightness Adjustment (`hsv_v`)

- **Range**: `0.0` - `1.0`
- **Default**: `{{ hsv_v }}`

The `hsv_v` hyperparameter defines the shift magnitude, with the final adjustment randomly chosen between `-hsv_v` and `hsv_v`. For example, with `hsv_v=0.4`, the intensity is randomly selected within `-0.4` to `0.4`.

| -1.0 | -0.5 | 0.0 | 0.5 | 1.0 |
|---|---|---|---|---|
### Rotation (`degrees`)

- **Range**: `0.0` to `180`
- **Default**: `{{ degrees }}`

The `degrees` hyperparameter defines the rotation angle, with the final adjustment randomly chosen between `-degrees` and `degrees`. For example, with `degrees=10.0`, the rotation is randomly selected within `-10.0` to `10.0`.

| -180 | -90 | 0.0 | 90 | 180 |
|---|---|---|---|---|
### Translation (`translate`)

- **Range**: `0.0` - `1.0`
- **Default**: `{{ translate }}`

The `translate` hyperparameter defines the shift magnitude, with the final adjustment randomly chosen twice (once for each axis) within the range `-translate` to `translate`. For example, with `translate=0.5`, the translation is randomly selected within `-0.5` to `0.5` on the x-axis, and another independent random value is selected within the same range on the y-axis. Values `-1.0` and `1.0` are not shown as they would translate the image completely out of the frame.

| -0.5 | -0.25 | 0.0 | 0.25 | 0.5 |
|---|---|---|---|---|
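The two independent draws described above can be sketched as follows. `sample_translation` is a hypothetical helper for illustration only; the real pipeline folds these offsets into a combined affine matrix together with rotation, scale, shear, and perspective:

```python
import random


def sample_translation(translate: float, width: int, height: int) -> tuple[int, int]:
    """Draw independent x/y pixel offsets, each uniform in [-translate, translate] of the image size."""
    tx = random.uniform(-translate, translate) * width  # first draw: x-axis
    ty = random.uniform(-translate, translate) * height  # second, independent draw: y-axis
    return int(tx), int(ty)


random.seed(0)
tx, ty = sample_translation(0.5, 640, 640)
print(tx, ty)  # two independent offsets, each within +/-320 pixels
```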
### Scale (`scale`)

- **Range**: `0.0` - `1.0`
- **Default**: `{{ scale }}`

The `scale` hyperparameter defines the scaling factor, with the final adjustment randomly chosen between `1-scale` and `1+scale`. For example, with `scale=0.5`, the scaling is randomly selected within `0.5` to `1.5`. The value `-1.0` is not shown as it would make the image disappear, while `1.0` simply results in a 2x zoom. The values in the table below represent the hyperparameter `scale`, not the final scale factor. If `scale` is greater than `1.0`, the image can be either very small or flipped, as the scaling factor is randomly chosen between `1-scale` and `1+scale`. For example, with `scale=3.0`, the scaling is randomly selected within `-2.0` to `4.0`. If a negative value is chosen, the image is flipped.

| -0.5 | -0.25 | 0.0 | 0.25 | 0.5 |
|---|---|---|---|---|
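The sampling rule above can be sketched directly. This is a minimal illustration of the `[1-scale, 1+scale]` interval, not the Ultralytics implementation itself:

```python
import random


def sample_scale_factor(scale: float) -> float:
    """Draw a scale factor uniformly from [1 - scale, 1 + scale]."""
    return random.uniform(1 - scale, 1 + scale)


random.seed(0)
print(round(sample_scale_factor(0.5), 3))  # always within 0.5 .. 1.5
print((1 - 3.0, 1 + 3.0))  # scale=3.0 samples from -2.0 .. 4.0; negative factors flip the image
```

With `scale > 1.0` the lower bound goes negative, which is exactly how the very-small or flipped images described above arise.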
### Shear (`shear`)

- **Range**: `-180` to `+180`
- **Default**: `{{ shear }}`

The `shear` hyperparameter defines the shear angle, with the final adjustment randomly chosen between `-shear` and `shear`. For example, with `shear=10.0`, the shear is randomly selected within `-10` to `10` on the x-axis, and another independent random value is selected within the same range on the y-axis. High `shear` values can rapidly distort the image, so it's recommended to start with small values and gradually increase them.

| -10 | -5 | 0.0 | 5 | 10 |
|---|---|---|---|---|
### Perspective (`perspective`)

- **Range**: `0.0` - `0.001`
- **Default**: `{{ perspective }}`

The `perspective` hyperparameter defines the perspective magnitude, with the final adjustment randomly chosen between `-perspective` and `perspective`. For example, with `perspective=0.001`, the perspective is randomly selected within `-0.001` to `0.001` on the x-axis, and another independent random value is selected within the same range on the y-axis.

| -0.001 | -0.0005 | 0.0 | 0.0005 | 0.001 |
|---|---|---|---|---|
### Flip Up-Down (`flipud`)

- **Range**: `0.0` - `1.0`
- **Default**: `{{ flipud }}`

The `flipud` hyperparameter defines the probability of applying the transformation, with a value of `flipud=1.0` ensuring that all images are flipped and a value of `flipud=0.0` disabling the transformation entirely. For example, with `flipud=0.5`, each image has a 50% chance of being flipped upside-down.

| `flipud` off | `flipud` on |
|---|---|
### Flip Left-Right (`fliplr`)

- **Range**: `0.0` - `1.0`
- **Default**: `{{ fliplr }}`

The `fliplr` hyperparameter defines the probability of applying the transformation, with a value of `fliplr=1.0` ensuring that all images are flipped and a value of `fliplr=0.0` disabling the transformation entirely. For example, with `fliplr=0.5`, each image has a 50% chance of being flipped left to right.

| `fliplr` off | `fliplr` on |
|---|---|
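The probability-gated flip can be sketched with NumPy. This is a simplified sketch: in the real pipeline, the box and mask labels are flipped together with the image:

```python
import random

import numpy as np


def maybe_fliplr(image: np.ndarray, fliplr: float) -> np.ndarray:
    """Flip the image left-right with probability `fliplr`; otherwise return it unchanged."""
    if random.random() < fliplr:  # random() is in [0, 1), so 1.0 always flips, 0.0 never does
        return np.fliplr(image)
    return image


img = np.arange(6).reshape(2, 3)
print(maybe_fliplr(img, 1.0).tolist())  # [[2, 1, 0], [5, 4, 3]] -- always flipped
print(maybe_fliplr(img, 0.0).tolist())  # [[0, 1, 2], [3, 4, 5]] -- never flipped
```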
### BGR Channel Swap (`bgr`)

- **Range**: `0.0` - `1.0`
- **Default**: `{{ bgr }}`

The `bgr` hyperparameter defines the probability of applying the transformation, with `bgr=1.0` ensuring all images undergo the channel swap and `bgr=0.0` disabling it. For example, with `bgr=0.5`, each image has a 50% chance of being converted from RGB to BGR.

| `bgr` off | `bgr` on |
|---|---|
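The channel swap itself is a one-line reversal of the channel axis of an HWC image. A minimal sketch with NumPy:

```python
import numpy as np


def rgb_to_bgr(image: np.ndarray) -> np.ndarray:
    """Reverse the last (channel) axis of an HWC image: RGB becomes BGR, and vice versa."""
    return image[..., ::-1]


pixel = np.array([[[255, 128, 0]]])  # one orange RGB pixel
print(rgb_to_bgr(pixel).tolist())  # channel order reversed: 0, 128, 255
```

Applying the swap twice restores the original image, since reversing the channel axis is its own inverse.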
### Mosaic (`mosaic`)

- **Range**: `0.0` - `1.0`
- **Default**: `{{ mosaic }}`

The `mosaic` hyperparameter defines the probability of applying the transformation, with `mosaic=1.0` ensuring that all images are combined and `mosaic=0.0` disabling the transformation. For example, with `mosaic=0.5`, each image has a 50% chance of being combined with three other images. While the `mosaic` augmentation makes the model more robust, it can also make the training process more challenging.

The `mosaic` augmentation can be disabled near the end of training by setting `close_mosaic` to the number of epochs before completion when it should be turned off. For example, if `epochs` is set to `200` and `close_mosaic` is set to `20`, the `mosaic` augmentation will be disabled after `180` epochs. If `close_mosaic` is set to `0`, the `mosaic` augmentation will be enabled for the entire training process.

The `mosaic` augmentation combines 4 images picked randomly from the dataset. If the dataset is small, the same image may be used multiple times in the same mosaic.

| `mosaic` off | `mosaic` on |
|---|---|
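The `close_mosaic` scheduling rule above can be sketched as a small predicate. `mosaic_enabled` is a hypothetical helper written for illustration, not a function from the Ultralytics API:

```python
def mosaic_enabled(epoch: int, epochs: int, close_mosaic: int) -> bool:
    """Return whether mosaic is still active at a given 0-indexed epoch."""
    if close_mosaic == 0:
        return True  # mosaic stays on for the whole training run
    return epoch < epochs - close_mosaic  # turn off for the last `close_mosaic` epochs


print(mosaic_enabled(179, 200, 20))  # True  -- still within the first 180 epochs
print(mosaic_enabled(180, 200, 20))  # False -- last 20 epochs run without mosaic
print(mosaic_enabled(199, 200, 0))  # True  -- close_mosaic=0 keeps mosaic enabled throughout
```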
### Mixup (`mixup`)

- **Range**: `0.0` - `1.0`
- **Default**: `{{ mixup }}`

The `mixup` hyperparameter defines the probability of applying the transformation, with `mixup=1.0` ensuring that all images are mixed and `mixup=0.0` disabling the transformation. For example, with `mixup=0.5`, each image has a 50% chance of being mixed with another image. The `mixup` ratio is a random value picked from a `np.random.beta(32.0, 32.0)` beta distribution, meaning each image contributes approximately 50%, with slight variations.

| First image, `mixup` off | Second image, `mixup` off | `mixup` on |
|---|---|---|
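The blend described above can be sketched with NumPy. Because `Beta(32, 32)` concentrates tightly around 0.5, each image contributes roughly half. This sketch blends only the pixels; the real augmentation also merges the two images' labels:

```python
import numpy as np


def mixup(img1: np.ndarray, img2: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Blend two images with a ratio drawn from a Beta(32, 32) distribution."""
    lam = rng.beta(32.0, 32.0)  # tightly clustered around 0.5
    return lam * img1 + (1 - lam) * img2


rng = np.random.default_rng(0)
a = np.zeros((2, 2))
b = np.full((2, 2), 100.0)
print(mixup(a, b, rng))  # all entries near 50, since lam is close to 0.5
```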
### CutMix (`cutmix`)

- **Range**: `0.0` - `1.0`
- **Default**: `{{ cutmix }}`

The `cutmix` hyperparameter defines the probability of applying the transformation, with `cutmix=1.0` ensuring that all images undergo this transformation and `cutmix=0.0` disabling it completely. For example, with `cutmix=0.5`, each image has a 50% chance of having a region replaced with a patch from another image. Unlike `mixup`, `cutmix` maintains the original pixel intensities within the cut regions, preserving local features. Only objects that retain at least `0.1` (10%) of their original area within the pasted region are preserved; this threshold is `0.1` by default.

| First image, `cutmix` off | Second image, `cutmix` off | `cutmix` on |
|---|---|---|
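The cut-and-paste step can be sketched with NumPy. This simplified sketch replaces one rectangle of the target with the same rectangle from the source; unlike `mixup`, the pasted pixels keep their original intensities instead of being blended (label bookkeeping is omitted):

```python
import numpy as np


def cutmix_patch(target: np.ndarray, source: np.ndarray, box: tuple[int, int, int, int]) -> np.ndarray:
    """Replace the (x1, y1, x2, y2) region of `target` with the same region of `source`."""
    x1, y1, x2, y2 = box
    out = target.copy()
    out[y1:y2, x1:x2] = source[y1:y2, x1:x2]  # pixel intensities copied unchanged
    return out


a = np.zeros((4, 4), dtype=np.uint8)
b = np.full((4, 4), 9, dtype=np.uint8)
print(cutmix_patch(a, b, (1, 1, 3, 3)).tolist())  # a 2x2 block of 9s inside a field of 0s
```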
### Copy-Paste (`copy_paste`)

- **Range**: `0.0` - `1.0`
- **Default**: `{{ copy_paste }}`

The behavior of this augmentation depends on `copy_paste_mode`. The `copy_paste` hyperparameter defines the probability of applying the transformation, with `copy_paste=1.0` ensuring that all images are copied and `copy_paste=0.0` disabling the transformation. For example, with `copy_paste=0.5`, each image has a 50% chance of having objects copied from another image. The `copy_paste` augmentation can be used to copy objects from one image to another.

Before an object is pasted with `copy_paste_mode`, its Intersection over Area (IoA) is computed with all the objects of the source image. If all the IoA values are below `0.3` (30%), the object is pasted in the target image. If even one IoA is above `0.3`, the object is not pasted in the target image. The IoA threshold is `0.3` by default.

| `copy_paste` off | `copy_paste` on with `copy_paste_mode=flip` | Visualize the `copy_paste` process |
|---|---|---|
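The IoA acceptance test described above can be sketched for axis-aligned boxes. `intersection_over_area` is an illustrative helper, not the Ultralytics internal function:

```python
def intersection_over_area(box: tuple, other: tuple) -> float:
    """IoA of `box` w.r.t. its own area: intersection(box, other) / area(box)."""
    ax1, ay1, ax2, ay2 = box
    bx1, by1, bx2, by2 = other
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))  # overlap width
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))  # overlap height
    area = (ax2 - ax1) * (ay2 - ay1)
    return (iw * ih) / area if area > 0 else 0.0


obj = (0, 0, 10, 10)
print(intersection_over_area(obj, (5, 5, 20, 20)))  # 0.25 -> below the 0.3 threshold, paste allowed
print(intersection_over_area(obj, (2, 0, 20, 20)))  # 0.8  -> above the 0.3 threshold, paste rejected
```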
### Copy-Paste Mode (`copy_paste_mode`)

- **Options**: `'flip'`, `'mixup'`
- **Default**: `'{{ copy_paste_mode }}'`

With `'flip'`, the objects come from the same image, while `'mixup'` allows objects to be copied from different images. The IoA rule applies for either `copy_paste_mode`, but the way the objects are copied is different.

| Reference image | Chosen image for `copy_paste` | `copy_paste` on with `copy_paste_mode=mixup` |
|---|---|---|
### Auto Augment (`auto_augment`)

- **Options**: `'randaugment'`, `'autoaugment'`, `'augmix'`, `None`
- **Default**: `'{{ auto_augment }}'`

The `'randaugment'` option uses RandAugment, `'autoaugment'` uses AutoAugment, and `'augmix'` uses AugMix. Setting to `None` disables automated augmentation.

### Random Erasing (`erasing`)

- **Range**: `0.0` - `0.9`
- **Default**: `{{ erasing }}`

The `erasing` hyperparameter defines the probability of applying the transformation, with `erasing=0.9` ensuring that almost all images are erased and `erasing=0.0` disabling the transformation. For example, with `erasing=0.5`, each image has a 50% chance of having a portion erased.

The `erasing` augmentation comes with `scale`, `ratio`, and `value` hyperparameters that cannot be changed with the current implementation. Their default values are `(0.02, 0.33)`, `(0.3, 3.3)`, and `0`, respectively, as stated in the PyTorch documentation. The maximum value of the `erasing` hyperparameter is set to `0.9` to avoid applying the transformation to all images.

| `erasing` off | `erasing` on (example 1) | `erasing` on (example 2) | `erasing` on (example 3) |
|---|---|---|---|
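The erasing behavior can be sketched with NumPy, following the `(scale, ratio, value)` defaults quoted above from the PyTorch `RandomErasing` documentation. This is a simplified sketch, assuming a single-channel image and a small retry loop when the sampled box does not fit:

```python
import math
import random

import numpy as np


def random_erase(image, scale=(0.02, 0.33), ratio=(0.3, 3.3), value=0, rng=None):
    """Erase one random rectangle whose area fraction lies in `scale` and aspect in `ratio`."""
    rng = rng or random.Random(0)
    h, w = image.shape[:2]
    out = image.copy()
    for _ in range(10):  # retry a few times if the sampled box does not fit
        area = rng.uniform(*scale) * h * w
        aspect = math.exp(rng.uniform(math.log(ratio[0]), math.log(ratio[1])))  # log-uniform aspect
        eh, ew = int(round(math.sqrt(area * aspect))), int(round(math.sqrt(area / aspect)))
        if 0 < eh <= h and 0 < ew <= w:
            y, x = rng.randrange(h - eh + 1), rng.randrange(w - ew + 1)
            out[y : y + eh, x : x + ew] = value  # fill the erased region with `value`
            return out
    return out  # nothing erased if no sampled box fit


img = np.ones((64, 64), dtype=np.uint8)
erased = random_erase(img, rng=random.Random(0))
print(int(erased.sum()) < 64 * 64)  # some pixels were zeroed
```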
### Custom Albumentations Transforms (`augmentations`)

- **Type**: list of Albumentations transforms
- **Default**: `None`

!!! example "Custom Albumentations Example"
    === "Python API"

        ```python
        import albumentations as A

        from ultralytics import YOLO

        # Load a model
        model = YOLO("yolo26n.pt")

        # Define custom Albumentations transforms
        custom_transforms = [
            A.Blur(blur_limit=7, p=0.5),
            A.GaussNoise(var_limit=(10.0, 50.0), p=0.3),
            A.CLAHE(clip_limit=4.0, p=0.5),
            A.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.2, p=0.5),
            A.HueSaturationValue(hue_shift_limit=20, sat_shift_limit=30, val_shift_limit=20, p=0.5),
        ]

        # Train with custom Albumentations transforms
        model.train(
            data="coco8.yaml",
            epochs=100,
            augmentations=custom_transforms,  # Pass custom transforms
            imgsz=640,
        )
        ```

    === "More Advanced Example"

        ```python
        import albumentations as A

        from ultralytics import YOLO

        # Load a model
        model = YOLO("yolo26n.pt")

        # Define advanced custom Albumentations transforms with specific parameters
        advanced_transforms = [
            A.OneOf(
                [
                    A.MotionBlur(blur_limit=7, p=1.0),
                    A.MedianBlur(blur_limit=7, p=1.0),
                    A.GaussianBlur(blur_limit=7, p=1.0),
                ],
                p=0.3,
            ),
            A.OneOf(
                [
                    A.GaussNoise(var_limit=(10.0, 50.0), p=1.0),
                    A.ISONoise(color_shift=(0.01, 0.05), intensity=(0.1, 0.5), p=1.0),
                ],
                p=0.2,
            ),
            A.CLAHE(clip_limit=4.0, tile_grid_size=(8, 8), p=0.5),
            A.RandomBrightnessContrast(brightness_limit=0.3, contrast_limit=0.3, brightness_by_max=True, p=0.5),
            A.HueSaturationValue(hue_shift_limit=20, sat_shift_limit=30, val_shift_limit=20, p=0.5),
            A.CoarseDropout(
                max_holes=8, max_height=32, max_width=32, min_holes=1, min_height=8, min_width=8, fill_value=0, p=0.2
            ),
        ]

        # Train with advanced custom transforms
        model.train(
            data="coco8.yaml",
            epochs=100,
            augmentations=advanced_transforms,
            imgsz=640,
        )
        ```
**Key Points:**

- When custom transforms are passed through the `augmentations` parameter, they completely replace the default Albumentations transforms. The default YOLO augmentations (like `mosaic`, `hsv_h`, `hsv_s`, `degrees`, etc.) remain active and are applied independently.

For more information about Albumentations and available transforms, visit the official Albumentations documentation.
Choosing the right augmentations depends on your specific use case and dataset. Here are a few general guidelines to help you decide:
- Color augmentations such as `hsv_h`, `hsv_s`, and `hsv_v` are a solid starting point.
- If the camera position is fixed during inference, you likely don't need geometric augmentations such as `rotation`, `translation`, `scale`, `shear`, or `perspective`. However, if the camera angle may vary, and you need the model to be more robust, it's better to keep these augmentations.
- Apply the `mosaic` augmentation only if having partially occluded objects or multiple objects per image is acceptable and does not change the label value. Alternatively, you can keep `mosaic` active but increase the `close_mosaic` value to disable it earlier in the training process.

In short: keep it simple. Start with a small set of augmentations and gradually add more as needed. The goal is to improve the model's generalization and robustness, not to overcomplicate the training process. Also, make sure the augmentations you apply reflect the same data distribution your model will encounter in production.
### Why do I see an `albumentations: Blur[...]` reference when starting a training? Does that mean Ultralytics YOLO runs additional augmentations like blurring?

If the `albumentations` package is installed, Ultralytics automatically applies a set of extra image augmentations using it. These augmentations are handled internally and require no additional configuration.
You can find the full list of applied transformations in our technical documentation, as well as in our Albumentations integration guide. Note that only the augmentations with a probability p greater than 0 are active. These are purposefully applied at low frequencies to mimic real-world visual artifacts, such as blur or grayscale effects.
You can also provide your own custom Albumentations transforms using the Python API. See the Advanced Augmentation Features section for more details.
### How do I enable Albumentations augmentations for my training?

Check if the `albumentations` package is installed. If not, you can install it by running `pip install albumentations`. Once installed, the package should be automatically detected and used by Ultralytics.
### Can I customize or replace the default augmentations?

You can customize augmentations by creating a custom dataset class and trainer. For example, you can replace the default Ultralytics classification augmentations with PyTorch's `torchvision.transforms.Resize` or other transforms. See the custom training example in the classification documentation for implementation details.