docs/en/datasets/obb/dota-v2.md
DOTA stands as a specialized dataset, emphasizing object detection in aerial images. Originating from the DOTA series of datasets, it offers annotated images capturing a diverse array of aerial scenes with Oriented Bounding Boxes (OBB).
<strong>Watch:</strong> How to Train Ultralytics YOLO26 on the DOTA Dataset for Oriented Bounding Boxes in Google Colab
</p>DOTA exhibits a structured layout tailored for OBB object detection challenges:
DOTA serves as a benchmark for training and evaluating models specifically tailored for aerial image analysis. With the inclusion of OBB annotations, it provides a unique challenge, enabling the development of specialized object detection models that cater to aerial imagery's nuances. The dataset is particularly valuable for applications in remote sensing, surveillance, and environmental monitoring.
A dataset YAML (Yet Another Markup Language) file specifies image/label roots, class names, and other important metadata. Ultralytics maintains official YAML files for the two most commonly used releases:
Use the YAML that matches the release you downloaded, or author a custom YAML if you are working with DOTA-v2 or another derivative.
!!! example "DOTAv1.yaml"
```yaml
--8<-- "ultralytics/cfg/datasets/DOTAv1.yaml"
```
The raw imagery routinely exceeds 10,000 pixels on a side, so tiling is required before feeding the data to YOLO. Use the helper below to slice the source imagery into overlapping 1024 × 1024 crops at multiple scales while keeping the annotations in sync.
!!! example "Split images"
=== "Python"
```python
from ultralytics.data.split_dota import split_test, split_trainval
# Split train and val set, with labels.
split_trainval(
data_root="path/to/DOTAv1.0/",
save_dir="path/to/DOTAv1.0-split/",
rates=[0.5, 1.0, 1.5], # multiscale
gap=500,
)
# Split test set, without labels.
split_test(
data_root="path/to/DOTAv1.0/",
save_dir="path/to/DOTAv1.0-split/",
rates=[0.5, 1.0, 1.5], # multiscale
gap=500,
)
```
!!! tip
Keep the output directory organized in the standard YOLO layout (`images/train`, `labels/train`, etc.) so you can reference it directly from the dataset YAML.
To train a model on the DOTA v1 dataset, you can utilize the following code snippets. Always refer to your model's documentation for a thorough list of available arguments. For those looking to experiment with a smaller subset first, consider using the DOTA8 dataset, which contains just 8 images for quick testing.
!!! warning
Please note that all images and associated annotations in the DOTAv1 dataset can be used for academic purposes, but commercial use is prohibited. Your understanding and respect for the dataset creators' wishes are greatly appreciated!
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Create a new YOLO26n-OBB model from scratch
model = YOLO("yolo26n-obb.yaml")
# Train the model on the DOTAv1 dataset
results = model.train(data="DOTAv1.yaml", epochs=100, imgsz=1024)
```
=== "CLI"
```bash
# Train a new YOLO26n-OBB model on the DOTAv1 dataset
yolo obb train data=DOTAv1.yaml model=yolo26n-obb.pt epochs=100 imgsz=1024
```
Having a glance at the dataset illustrates its depth:
The dataset's richness offers invaluable insights into object detection challenges exclusive to aerial imagery. The DOTA-v2.0 dataset has become particularly popular for remote sensing and aerial surveillance projects due to its comprehensive annotations and diverse object categories.
If you use DOTA in your work, please cite the relevant research papers:
!!! quote ""
=== "BibTeX"
```bibtex
@article{9560031,
author={Ding, Jian and Xue, Nan and Xia, Gui-Song and Bai, Xiang and Yang, Wen and Yang, Michael and Belongie, Serge and Luo, Jiebo and Datcu, Mihai and Pelillo, Marcello and Zhang, Liangpei},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
title={Object Detection in Aerial Images: A Large-Scale Benchmark and Challenges},
year={2021},
volume={},
number={},
pages={1-1},
doi={10.1109/TPAMI.2021.3117983}
}
```
A special note of gratitude to the team behind the DOTA datasets for their commendable effort in curating this dataset. For an exhaustive understanding of the dataset and its nuances, please visit the official DOTA website.
The DOTA dataset is a specialized dataset focused on object detection in aerial images. It features Oriented Bounding Boxes (OBB), providing annotated images from diverse aerial scenes. DOTA's diversity in object orientation, scale, and shape across its 1.7M annotations and 18 categories makes it ideal for developing and evaluating models tailored for aerial imagery analysis, such as those used in surveillance, environmental monitoring, and disaster management.
DOTA utilizes Oriented Bounding Boxes (OBB) for annotation, which are represented by rotated rectangles encapsulating objects regardless of their orientation. This method ensures that objects, whether small or at different angles, are accurately captured. The dataset's multiscale images, ranging from 800 × 800 to 20,000 × 20,000 pixels, further allow for the detection of both small and large objects effectively. This approach is particularly valuable for aerial imagery where objects appear at various angles and scales.
To train a model on the DOTA dataset, you can use the following example with Ultralytics YOLO:
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Create a new YOLO26n-OBB model from scratch
model = YOLO("yolo26n-obb.yaml")
# Train the model on the DOTAv1 dataset
results = model.train(data="DOTAv1.yaml", epochs=100, imgsz=1024)
```
=== "CLI"
```bash
# Train a new YOLO26n-OBB model on the DOTAv1 dataset
yolo obb train data=DOTAv1.yaml model=yolo26n-obb.pt epochs=100 imgsz=1024
```
For more details on how to split and preprocess the DOTA images, refer to the split DOTA images section.
For a detailed comparison and additional specifics, check the dataset versions section.
DOTA images, which can be very large, are split into smaller resolutions for manageable training. Here's a Python snippet to split images:
!!! example
=== "Python"
```python
from ultralytics.data.split_dota import split_test, split_trainval
# split train and val set, with labels.
split_trainval(
data_root="path/to/DOTAv1.0/",
save_dir="path/to/DOTAv1.0-split/",
rates=[0.5, 1.0, 1.5], # multiscale
gap=500,
)
# split test set, without labels.
split_test(
data_root="path/to/DOTAv1.0/",
save_dir="path/to/DOTAv1.0-split/",
rates=[0.5, 1.0, 1.5], # multiscale
gap=500,
)
```
This process facilitates better training efficiency and model performance. For detailed instructions, visit the split DOTA images section.