# Corruption Benchmarking

`docs/en/user_guides/robustness_benchmarking.md`

## Introduction

We provide tools to test object detection and instance segmentation models on the image corruption benchmark defined in *Benchmarking Robustness in Object Detection: Autonomous Driving when Winter is Coming*. This page provides a basic tutorial on how to use the benchmark.

```latex
@article{michaelis2019winter,
  title={Benchmarking Robustness in Object Detection:
    Autonomous Driving when Winter is Coming},
  author={Michaelis, Claudio and Mitzkus, Benjamin and
    Geirhos, Robert and Rusak, Evgenia and
    Bringmann, Oliver and Ecker, Alexander S. and
    Bethge, Matthias and Brendel, Wieland},
  journal={arXiv:1907.07484},
  year={2019}
}
```

## About the benchmark

To submit results to the benchmark, please visit the benchmark homepage.

The benchmark is modelled after the imagenet-c benchmark, which was originally published in *Benchmarking Neural Network Robustness to Common Corruptions and Perturbations* (ICLR 2019) by Dan Hendrycks and Thomas Dietterich.

The image corruption functions are included in this library but can be installed separately using:

```shell
pip install imagecorruptions
```
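Each corruption function takes an image and a severity level and returns a degraded copy. As a purely illustrative stand-in (this is *not* the library's implementation, and the noise scale is made up), a severity-scaled additive Gaussian noise corruption for a tiny greyscale image could be sketched as:

```python
import random

# Illustrative stand-in for a severity-scaled corruption (NOT the
# imagecorruptions implementation): additive Gaussian pixel noise whose
# standard deviation grows with severity 1..5.
def gaussian_noise(image, severity=1):
    sigma = 5.0 * severity  # noise strength grows with severity (made-up scale)
    return [
        [min(255, max(0, round(px + random.gauss(0, sigma)))) for px in row]
        for row in image
    ]

random.seed(0)
img = [[128] * 4 for _ in range(3)]  # tiny 3x4 greyscale image, mid-grey
noisy = gaussian_noise(img, severity=5)
```

The real library applies analogous per-corruption transforms to numpy image arrays, with severity-specific parameters tuned per corruption type.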

Compared to imagenet-c, a few changes had to be made to handle images of arbitrary size and greyscale images. We also modified the 'motion blur' and 'snow' corruptions to remove the dependency on a Linux-specific library, which would otherwise have to be installed separately. For details, please refer to the imagecorruptions repository.

## Inference with pretrained models

We provide a testing script to evaluate a model's performance on any combination of the corruptions provided in the benchmark.

### Test a dataset

- single GPU testing
- multiple GPU testing
- visualize detection results

You can use the following commands to test a model's performance under the 15 corruptions used in the benchmark.

```shell
# single-gpu testing
python tools/analysis_tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}]
```

Alternatively, a different group of corruptions can be selected.

```shell
# noise
python tools/analysis_tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] --corruptions noise

# blur
python tools/analysis_tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] --corruptions blur

# weather
python tools/analysis_tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] --corruptions weather

# digital
python tools/analysis_tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] --corruptions digital
```
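For reference, the 15 benchmark corruptions follow the imagenet-c definitions and fall into the four groups used above. A sketch of the grouping as a Python dict (the names match the identifiers accepted by `--corruptions`):

```python
# The 15 corruptions of the benchmark, grouped as in imagenet-c.
CORRUPTION_GROUPS = {
    "noise": ["gaussian_noise", "shot_noise", "impulse_noise"],
    "blur": ["defocus_blur", "glass_blur", "motion_blur", "zoom_blur"],
    "weather": ["snow", "frost", "fog", "brightness"],
    "digital": ["contrast", "elastic_transform", "pixelate", "jpeg_compression"],
}

all_corruptions = [c for group in CORRUPTION_GROUPS.values() for c in group]
print(len(all_corruptions))  # 15 corruptions in total
```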

Or a custom set of corruptions, e.g.:

```shell
# gaussian noise, zoom blur and snow
python tools/analysis_tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] --corruptions gaussian_noise zoom_blur snow
```

Finally, the corruption severities to evaluate can be chosen. Severity 0 corresponds to clean data, and the effect increases from 1 to 5.

```shell
# severity 1
python tools/analysis_tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] --severities 1

# severities 0,2,4
python tools/analysis_tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] --severities 0 2 4
```

## Results for modelzoo models

The results on COCO 2017val are shown in the table below.

| Model | Backbone | Style | Lr schd | box AP clean | box AP corr. | box % | mask AP clean | mask AP corr. | mask % |
| :---- | :------- | :---- | :------ | :----------- | :----------- | :---- | :------------ | :------------ | :----- |
| Faster R-CNN | R-50-FPN | pytorch | 1x | 36.3 | 18.2 | 50.2 | - | - | - |
| Faster R-CNN | R-101-FPN | pytorch | 1x | 38.5 | 20.9 | 54.2 | - | - | - |
| Faster R-CNN | X-101-32x4d-FPN | pytorch | 1x | 40.1 | 22.3 | 55.5 | - | - | - |
| Faster R-CNN | X-101-64x4d-FPN | pytorch | 1x | 41.3 | 23.4 | 56.6 | - | - | - |
| Faster R-CNN | R-50-FPN-DCN | pytorch | 1x | 40.0 | 22.4 | 56.1 | - | - | - |
| Faster R-CNN | X-101-32x4d-FPN-DCN | pytorch | 1x | 43.4 | 26.7 | 61.6 | - | - | - |
| Mask R-CNN | R-50-FPN | pytorch | 1x | 37.3 | 18.7 | 50.1 | 34.2 | 16.8 | 49.1 |
| Mask R-CNN | R-50-FPN-DCN | pytorch | 1x | 41.1 | 23.3 | 56.7 | 37.2 | 20.7 | 55.7 |
| Cascade R-CNN | R-50-FPN | pytorch | 1x | 40.4 | 20.1 | 49.7 | - | - | - |
| Cascade Mask R-CNN | R-50-FPN | pytorch | 1x | 41.2 | 20.7 | 50.2 | 35.7 | 17.6 | 49.3 |
| RetinaNet | R-50-FPN | pytorch | 1x | 35.6 | 17.8 | 50.1 | - | - | - |
| Hybrid Task Cascade | X-101-64x4d-FPN-DCN | pytorch | 1x | 50.6 | 32.7 | 64.7 | 43.8 | 28.1 | 64.0 |

Results may vary slightly due to the stochastic application of the corruptions.
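The `%` columns appear to report corrupted AP as a percentage of clean AP (relative performance under corruption, following the benchmark paper's terminology). A minimal sketch of that computation, using made-up AP values rather than numbers from the table:

```python
def relative_performance(clean_ap, corrupted_aps):
    """Mean corrupted AP divided by clean AP, as a percentage.

    corrupted_aps: one AP value per (corruption, severity) combination.
    """
    mean_corr = sum(corrupted_aps) / len(corrupted_aps)
    return 100.0 * mean_corr / clean_ap

# Hypothetical AP values for one model (illustrative numbers only).
clean = 36.3
corrupted = [20.0, 18.5, 16.1]
print(round(relative_performance(clean, corrupted), 1))
```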