docs/tutorials/data_loading.md
The dataloader is the component that provides data to models. A dataloader usually (but not necessarily) takes raw information from datasets and processes it into the format needed by the model.
Detectron2 contains a builtin data loading pipeline. It's good to understand how it works, in case you need to write a custom one.
Detectron2 provides two functions
build_detection_{train,test}_loader
that create a default data loader from a given config.
Here is how build_detection_{train,test}_loader work:
The loader takes the name of a registered dataset and loads a list[dict] representing the dataset items
in a lightweight format. These dataset items are not yet ready to be used by the model (e.g., images are
not loaded into memory and random augmentations have not been applied).
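Details about the dataset format and dataset registration can be found in
datasets.
Each dict in this list is then mapped by a function (the "mapper"), which you can customize through the "mapper" argument of build_detection_{train,test}_loader. The default mapper is DatasetMapper.
The outputs of the mapper are batched, and the batched data is the output of the data loader. Typically, it is also the input of model.forward().

Using a different "mapper" with build_detection_{train,test}_loader(mapper=) works for most use cases
of custom data loading.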
For example, if you want to resize all images to a fixed size for training, use:
import detectron2.data.transforms as T
from detectron2.data import DatasetMapper  # the default mapper
from detectron2.data import build_detection_train_loader

# cfg: a Detectron2 config created elsewhere
dataloader = build_detection_train_loader(
    cfg,
    mapper=DatasetMapper(cfg, is_train=True, augmentations=[T.Resize((800, 800))]),
)
# use this dataloader instead of the default
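To make the role of the mapper concrete, the load, map and batch flow can be sketched in plain Python. This is only a toy stand-in, not Detectron2's implementation: it imports nothing from Detectron2, the image is faked as a nested list, and every name in it (`load_dataset_dicts`, `toy_mapper`, `batched`) is hypothetical.

```python
# Toy stand-in for the load -> map -> batch flow; no Detectron2 imports.
# All function names below are hypothetical and exist only for illustration.

def load_dataset_dicts():
    """Return lightweight dataset items: paths and annotations only, no pixels."""
    return [
        {"file_name": f"img_{i}.jpg", "height": 4, "width": 6, "annotations": []}
        for i in range(5)
    ]


def toy_mapper(dataset_dict):
    """Turn one lightweight item into a model-ready dict (the image is faked)."""
    d = dict(dataset_dict)  # copy, so the stored dataset item is left untouched
    d.pop("annotations")
    # A real mapper would read d["file_name"] from disk and apply augmentations.
    d["image"] = [[0] * d["width"] for _ in range(d["height"])]
    return d


def batched(items, batch_size):
    """Group mapped items into plain lists, without collating them."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly smaller, batch
        yield batch


batches = list(batched(map(toy_mapper, load_dataset_dicts()), batch_size=2))
print([len(b) for b in batches])  # [2, 2, 1]
```

Swapping `toy_mapper` for another function changes what each item looks like without touching the loading or batching steps, which is why a custom mapper covers most needs.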
If the arguments of the default DatasetMapper do not provide what you need, you may write a custom mapper function and use it instead, e.g.:
import copy

import torch

import detectron2.data.transforms as T
from detectron2.data import build_detection_train_loader
from detectron2.data import detection_utils as utils

# Show how to implement a minimal mapper, similar to the default DatasetMapper
def mapper(dataset_dict):
    dataset_dict = copy.deepcopy(dataset_dict)  # it will be modified by code below
    # can use other ways to read image
    image = utils.read_image(dataset_dict["file_name"], format="BGR")
    # See "Data Augmentation" tutorial for detailed usage
    auginput = T.AugInput(image)
    transform = T.Resize((800, 800))(auginput)  # resizes auginput in place
    image = torch.from_numpy(auginput.image.transpose(2, 0, 1))  # HWC -> CHW
    annos = [
        utils.transform_instance_annotations(annotation, [transform], image.shape[1:])
        for annotation in dataset_dict.pop("annotations")
    ]
    return {
        # create the format that the model expects
        "image": image,
        "instances": utils.annotations_to_instances(annos, image.shape[1:]),
    }

# cfg: a Detectron2 config created elsewhere
dataloader = build_detection_train_loader(cfg, mapper=mapper)
If you want to change not only the mapper (e.g., in order to implement different sampling or batching logic),
build_detection_train_loader won't work and you will need to write a different data loader.
The data loader is simply a
Python iterator that produces the format that the model accepts.
You can implement it using any tools you like.
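As a rough illustration of "any tools you like", a loader with its own sampling and batching logic can be a plain generator. The sketch below uses only the standard library. `custom_train_loader` and its parameters are hypothetical names, the weighted sampling rule is just one possible choice, and the endless `while True` loop is an assumption about how the training loop consumes the loader.

```python
import random


def custom_train_loader(dataset_dicts, mapper, batch_size, weights=None, seed=0):
    """Yield batches forever: sample items (optionally weighted), map, then group.

    Hypothetical helper for illustration; not part of Detectron2.
    """
    rng = random.Random(seed)
    while True:
        # Custom sampling logic: draw batch_size items with replacement.
        picked = rng.choices(dataset_dicts, weights=weights, k=batch_size)
        # Custom batching logic: a batch is a plain list of mapped items.
        yield [mapper(d) for d in picked]


dataset = [{"file_name": f"img_{i}.jpg"} for i in range(4)]
# Extreme weights make the effect of the sampling rule obvious:
# only the last item can ever be drawn.
loader = custom_train_loader(dataset, mapper=dict, batch_size=3, weights=[0, 0, 0, 1])
first_batch = next(loader)
print([d["file_name"] for d in first_batch])
```

Here `dict` serves as a trivial mapper that copies each item. A real one would load the image and build the fields the model expects.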
Whatever you implement, it is recommended to check the API documentation of detectron2.data to learn more about the APIs of these functions.
If you use DefaultTrainer,
you can override its build_{train,test}_loader method to use your own dataloader.
See the deeplab dataloader
for an example.
If you write your own training loop, you can plug in your data loader easily.
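A minimal sketch of such a loop follows, assuming the loader is an endless iterator of list-of-dict batches. `endless_loader` and `toy_model` are hypothetical stand-ins: a real loop would call the actual model and an optimizer instead.

```python
import itertools


def endless_loader():
    """Hypothetical loader: yields list-of-dict batches forever."""
    step = 0
    while True:
        yield [{"image": step, "index": i} for i in range(2)]
        step += 1


def toy_model(batch):
    """Stand-in for model.forward(); returns a fake scalar loss."""
    return float(len(batch))


losses = []
# islice bounds the endless loader to a fixed number of iterations.
for batch in itertools.islice(endless_loader(), 3):
    losses.append(toy_model(batch))

print(losses)  # [2.0, 2.0, 2.0]
```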