research/object_detection/g3doc/tf1_detection_zoo.md
We provide a collection of detection models pre-trained on the COCO dataset, the Kitti dataset, the Open Images dataset, the AVA v2.1 dataset, the iNaturalist Species Detection Dataset, and the Snapshot Serengeti Dataset. These models can be useful for out-of-the-box inference if you are interested in categories already present in those datasets. They are also useful for initializing your own models when training on novel datasets.
In the table below, we list each such pre-trained model, including:

*   a model name that corresponds to a config file in the samples/configs
    directory,
*   model speed,
*   detector performance (mAP) on the relevant dataset,
*   output types (Boxes, and Masks if applicable).

You can un-tar each tar.gz file via, e.g.:

```
tar -xzvf ssd_mobilenet_v1_coco.tar.gz
```

Inside the un-tar'ed directory, you will find:

*   a graph proto (`graph.pbtxt`),
*   a checkpoint (`model.ckpt.data-00000-of-00001`, `model.ckpt.index`,
    `model.ckpt.meta`),
*   a frozen inference graph (`frozen_inference_graph.pb`) to be used for out
    of the box inference (try this out in the Jupyter notebook!),
*   a config file (`pipeline.config`) which was used to generate the graph.
    These directly correspond to a config file in the samples/configs
    directory, but often with a modified score threshold. In the case of the
    heavier Faster R-CNN models, we also provide a version of the model that
    uses a highly reduced number of proposals for speed,
*   a TFLite model file (`model.tflite`) that can be deployed on mobile
    devices, where provided.

Some remarks on frozen inference graphs:
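The extraction step above can also be scripted. Here is a minimal sketch using Python's standard-library `tarfile` module (the archive name is just an example, and `extract_model` is a hypothetical helper, not part of the Object Detection API):

```python
import tarfile


def extract_model(archive_path, dest_dir="."):
    """Extract a pre-trained model tarball and return the extracted member names."""
    with tarfile.open(archive_path, "r:gz") as tar:
        # Typical members: pipeline.config, model.ckpt.*, frozen_inference_graph.pb
        members = tar.getnames()
        tar.extractall(dest_dir)
    return members


# Example (assumes the archive has already been downloaded):
# extract_model("ssd_mobilenet_v1_coco.tar.gz")
```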
Note: The star (☆) at the end of a model name indicates that the model supports TPU training.
Note: If you download the tar.gz file of a quantized model and un-tar it, you will get a different set of files: a checkpoint, a config file, and tflite frozen graphs (txt/binary).
## Mobile models

| Model name | Pixel 1 Latency (ms) | COCO mAP<sup>1</sup> | Outputs |
|---|---|---|---|
| ssd_mobiledet_cpu_coco | 113 | 24.0 | Boxes |
| ssd_mobilenet_v2_mnasfpn_coco | 183 | 26.6 | Boxes |
| ssd_mobilenet_v3_large_coco | 119 | 22.6 | Boxes |
| ssd_mobilenet_v3_small_coco | 43 | 15.4 | Boxes |
## Pixel 4 Edge TPU models

| Model name | Pixel 4 Edge TPU Latency (ms) | COCO mAP (fp32/uint8)<sup>1</sup> | Outputs |
|---|---|---|---|
| ssd_mobiledet_edgetpu_coco | 6.9 | 25.9/25.6 | Boxes |
| ssd_mobilenet_edgetpu_coco | 6.6 | -/24.3 | Boxes |
## Pixel 4 DSP models

| Model name | Pixel 4 DSP Latency (ms) | COCO mAP (fp32/uint8)<sup>1</sup> | Outputs |
|---|---|---|---|
| ssd_mobiledet_dsp_coco | 12.3 | 28.9/28.8 | Boxes |
## Kitti-trained models

| Model name | Speed (ms) | Pascal mAP@0.5 | Outputs |
|---|---|---|---|
| faster_rcnn_resnet101_kitti | 79 | 87 | Boxes |
## Open Images-trained models

| Model name | Speed (ms) | Open Images mAP@0.5<sup>2</sup> | Outputs |
|---|---|---|---|
| faster_rcnn_inception_resnet_v2_atrous_oidv2 | 727 | 37 | Boxes |
| faster_rcnn_inception_resnet_v2_atrous_lowproposals_oidv2 | 347 | | Boxes |
| facessd_mobilenet_v2_quantized_open_image_v4<sup>3</sup> | 20 | 73 (faces) | Boxes |
## Open Images v4-trained models

| Model name | Speed (ms) | Open Images mAP@0.5<sup>4</sup> | Outputs |
|---|---|---|---|
| faster_rcnn_inception_resnet_v2_atrous_oidv4 | 425 | 54 | Boxes |
| ssd_mobilenetv2_oidv4 | 89 | 36 | Boxes |
| ssd_resnet_101_fpn_oidv4 | 237 | 38 | Boxes |
## iNaturalist Species-trained models

| Model name | Speed (ms) | Pascal mAP@0.5 | Outputs |
|---|---|---|---|
| faster_rcnn_resnet101_fgvc | 395 | 58 | Boxes |
| faster_rcnn_resnet50_fgvc | 366 | 55 | Boxes |
## AVA v2.1-trained models

| Model name | Speed (ms) | Pascal mAP@0.5 | Outputs |
|---|---|---|---|
| faster_rcnn_resnet101_ava_v2.1 | 93 | 11 | Boxes |
## Snapshot Serengeti-trained models

| Model name | COCO mAP@0.5 | Outputs |
|---|---|---|
| faster_rcnn_resnet101_snapshot_serengeti | 38 | Boxes |
| context_rcnn_resnet101_snapshot_serengeti | 56 | Boxes |
## Pixel 6 Edge TPU models

| Model name | Pixel 6 Edge TPU Speed (ms) | Pixel 6 Speed with Post-processing on CPU (ms) | COCO 2017 mAP (uint8) | Outputs |
|---|---|---|---|---|
| spaghettinet_edgetpu_s | 1.3 | 1.8 | 26.3 | Boxes |
| spaghettinet_edgetpu_m | 1.4 | 1.9 | 27.4 | Boxes |
| spaghettinet_edgetpu_l | 1.7 | 2.1 | 28.0 | Boxes |
1.  See the MSCOCO evaluation protocol. The COCO mAP numbers, with the exception of the Pixel 6 Edge TPU models, are evaluated on the COCO 14 minival set (note that our split is different from the COCO 17 val split). A full list of image ids used in our split can be found here.
2.  This is PASCAL mAP with a slightly different way of computing true positives: see the Open Images evaluation protocols, oid_V2_detection_metrics.
3.  Non-face boxes are dropped during training, and non-face groundtruth boxes are ignored when evaluating.
4.  This is the Open Images Challenge metric: see the Open Images evaluation protocols, oid_challenge_detection_metrics.
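As a rough illustration of the mAP@0.5 criterion used in several tables above: a detection is only eligible to count as a true positive if its intersection-over-union (IoU) with a groundtruth box is at least 0.5. The sketch below shows that matching test only (the box format `(xmin, ymin, xmax, ymax)` and the helper names are assumptions; this is not the actual evaluation code, which also handles class labels, confidence ranking, and duplicate matches):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def matches_at_05(detection, groundtruth, threshold=0.5):
    """At mAP@0.5, a detection can match a groundtruth box only when IoU >= 0.5."""
    return iou(detection, groundtruth) >= threshold
```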