GETTING_STARTED.md
This document provides brief tutorials covering Detectron for inference and training on the COCO dataset.
For general information about Detectron, see README.md. For installation instructions, see INSTALL.md.
To run inference on a directory of image files (demo/*.jpg in this example), you can use the infer_simple.py tool. In this example, we're using an end-to-end trained Mask R-CNN model with a ResNet-101-FPN backbone from the model zoo:
python tools/infer_simple.py \
--cfg configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml \
--output-dir /tmp/detectron-visualizations \
--image-ext jpg \
--wts https://dl.fbaipublicfiles.com/detectron/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl \
demo
Detectron should automatically download the model from the URL specified by the --wts argument. The tool writes visualizations of the detections, in PDF format, to the directory specified by --output-dir. For copyright information about the demo images, see demo/NOTICE.
Notes:
If the misc_mask time reported by tools/infer_simple.py is high (e.g., much more than 20-90ms), first resize your images so that the short side is around 600-800px (the exact choice does not matter), then run inference on the resized images.
This example shows how to run an end-to-end trained Mask R-CNN model from the model zoo using a single GPU for inference. As configured, this will run inference on all images in coco_2014_minival (which must be properly installed).
python tools/test_net.py \
--cfg configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml \
TEST.WEIGHTS https://dl.fbaipublicfiles.com/detectron/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl \
NUM_GPUS 1
To run inference with the same model using $N GPUs (e.g., N=8):
python tools/test_net.py \
--cfg configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml \
--multi-gpu-testing \
TEST.WEIGHTS https://dl.fbaipublicfiles.com/detectron/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl \
NUM_GPUS $N
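
When scripting evaluation over a varying number of GPUs, note that the two commands above differ only in the --multi-gpu-testing flag and the NUM_GPUS value. The following is a minimal sketch of assembling the argument list in Python. `build_test_cmd` is a hypothetical helper (it is not part of Detectron), and the weights string is a placeholder for the URL shown above.

```python
def build_test_cmd(cfg, weights, num_gpus=1):
    """Assemble the tools/test_net.py argument list used in this guide.

    Hypothetical helper for illustration; it only rearranges the flags
    that appear in the commands above.
    """
    cmd = ["python", "tools/test_net.py", "--cfg", cfg]
    if num_gpus > 1:
        # Parallelize inference over multiple GPUs.
        cmd.append("--multi-gpu-testing")
    # Config overrides are passed as trailing KEY VALUE pairs.
    cmd += ["TEST.WEIGHTS", weights, "NUM_GPUS", str(num_gpus)]
    return cmd


cfg = "configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml"
weights = "<weights URL from the command above>"  # placeholder, not a real value
print(" ".join(build_test_cmd(cfg, weights, num_gpus=8)))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the Detectron root directory.
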
On an NVIDIA Tesla P100 GPU, inference should take about 130-140 ms per image for this example.
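
For the note above about high misc_mask times, the target size can be computed before resizing. This is a minimal sketch: `short_side_target` is a hypothetical helper, not a Detectron function, and the default of 800px is just one value from the suggested 600-800px range.

```python
def short_side_target(width, height, short_side=800):
    """Return (new_width, new_height) with the shorter side scaled to `short_side`.

    The aspect ratio is preserved. Images whose shorter side is already
    at or below the target are returned unchanged.
    """
    current_short = min(width, height)
    if current_short <= short_side:
        return width, height
    new_width = round(width * short_side / current_short)
    new_height = round(height * short_side / current_short)
    return new_width, new_height


print(short_side_target(4000, 3000))  # (1067, 800)
```

The returned size can be passed to any image library's resize call (for example, Pillow's `Image.resize`) before running tools/infer_simple.py on the resized files.
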
This is a tiny tutorial showing how to train a model on COCO. The model will be an end-to-end trained Faster R-CNN using a ResNet-50-FPN backbone. For the purpose of this tutorial, we'll use a short training schedule and a small input image size so that training and inference will be relatively fast. As a result, the box AP on COCO will be relatively low compared to our baselines. This example is provided for instructive purposes only (i.e., not for comparing against publications).
python tools/train_net.py \
--cfg configs/getting_started/tutorial_1gpu_e2e_faster_rcnn_R-50-FPN.yaml \
OUTPUT_DIR /tmp/detectron-output
Expected results:
Output will be saved under /tmp/detectron-output
Box AP on coco_2014_minival should be around 22.1% (+/- 0.1% stdev measured over 3 runs)
We've also provided configs to illustrate training with 2, 4, and 8 GPUs, using learning schedules that are approximately equivalent to the one used with 1 GPU above. The configs are located at: configs/getting_started/tutorial_{2,4,8}gpu_e2e_faster_rcnn_R-50-FPN.yaml. For example, launching a training job with 2 GPUs looks like this:
python tools/train_net.py \
--multi-gpu-testing \
--cfg configs/getting_started/tutorial_2gpu_e2e_faster_rcnn_R-50-FPN.yaml \
OUTPUT_DIR /tmp/detectron-output
Note that we've also added the --multi-gpu-testing flag to instruct Detectron to parallelize inference over multiple GPUs (2 in this example; see NUM_GPUS in the config file) after training has finished.
Expected results:
Box AP on coco_2014_minival should be around 22.1% (+/- 0.1% stdev measured over 3 runs)
To understand how learning schedules are adjusted (the "linear scaling rule"), please study these tutorial config files and read our paper Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour. Aside from this tutorial, all of our released configs make use of 8 GPUs. If you will be using fewer than 8 GPUs for training (or doing anything else that changes the minibatch size), it is essential that you understand how to adjust training schedules according to the linear scaling rule.
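
As a rough illustration of the linear scaling rule: when the minibatch size is multiplied by k, the learning rate is multiplied by k, and the iteration counts are divided by k so that the number of training epochs stays about the same. The sketch below assumes this simplified form. `scale_schedule` and the numbers in the example are illustrative and are not taken from the released configs, so check the tutorial config files for the actual values.

```python
def scale_schedule(base_lr, max_iter, steps, ref_batch, new_batch):
    """Rescale a training schedule from `ref_batch` to `new_batch` images per minibatch.

    Simplified linear scaling: the learning rate scales with the minibatch
    size, and iteration counts scale inversely to it.
    """
    k = new_batch / ref_batch
    return {
        "base_lr": base_lr * k,
        "max_iter": round(max_iter / k),
        "steps": [round(s / k) for s in steps],
    }


# Illustrative numbers only: a schedule defined for 16 images per minibatch
# (e.g., 8 GPUs x 2 images), rescaled to 4 images (e.g., 2 GPUs x 2 images).
reference = dict(base_lr=0.02, max_iter=90000, steps=[0, 60000, 80000])
print(scale_schedule(ref_batch=16, new_batch=4, **reference))
```

With fewer images per minibatch, the learning rate drops and the schedule lengthens by the same factor, which is the pattern to look for when comparing the 1, 2, 4, and 8 GPU tutorial configs.
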
Notes: