PointRend: Image Segmentation as Rendering

Alexander Kirillov, Yuxin Wu, Kaiming He, Ross Girshick

[arXiv] [BibTeX]

In this repository, we release code for PointRend in Detectron2. PointRend can be flexibly applied to both instance and semantic segmentation tasks by building on top of existing state-of-the-art models.

Quick start and visualization

This Colab Notebook tutorial contains examples of PointRend usage and visualizations of its point sampling stages.

Training

To train a model with 8 GPUs, run:

bash

cd /path/to/detectron2/projects/PointRend
python train_net.py --config-file configs/InstanceSegmentation/pointrend_rcnn_R_50_FPN_1x_coco.yaml --num-gpus 8
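
When fewer than 8 GPUs are available, the batch size and learning rate usually need to be reduced together. The sketch below is a minimal illustration of the linear scaling rule, not part of the released code. The helper name `scale_solver_for_gpus` is hypothetical, and the reference values (16 images per batch and a base learning rate of 0.02 on 8 GPUs) are assumptions about the base config that should be checked against the YAML file before use.

```python
def scale_solver_for_gpus(num_gpus, ref_gpus=8, ref_batch=16, ref_lr=0.02):
    """Return command-line overrides for training on `num_gpus` GPUs.

    Assumption: the reference config was tuned for `ref_gpus` GPUs with
    `ref_batch` images per batch and learning rate `ref_lr`.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    # Keep the number of images per GPU constant.
    images_per_gpu = ref_batch // ref_gpus
    ims_per_batch = images_per_gpu * num_gpus
    # Linear scaling rule: learning rate proportional to total batch size.
    base_lr = ref_lr * ims_per_batch / ref_batch
    return [
        "--num-gpus", str(num_gpus),
        "SOLVER.IMS_PER_BATCH", str(ims_per_batch),
        "SOLVER.BASE_LR", f"{base_lr:g}",
    ]


if __name__ == "__main__":
    config = "configs/InstanceSegmentation/pointrend_rcnn_R_50_FPN_1x_coco.yaml"
    overrides = scale_solver_for_gpus(1)
    print(" ".join(["python", "train_net.py", "--config-file", config] + overrides))
```

This sketch does not lengthen the iteration schedule. With a smaller batch, the number of training iterations typically has to grow by the same factor to cover the same number of epochs.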

Evaluation

Model evaluation can be done similarly:

bash

cd /path/to/detectron2/projects/PointRend
python train_net.py --config-file configs/InstanceSegmentation/pointrend_rcnn_R_50_FPN_1x_coco.yaml --eval-only MODEL.WEIGHTS /path/to/model_checkpoint
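
The checkpoint passed to MODEL.WEIGHTS can be one of the released models. The download links in the tables of the Pretrained Models section share a common layout (task, config name, model id, file name), so the evaluation command can be assembled programmatically. The following is a rough sketch under that assumption. The helper names `checkpoint_url` and `eval_command` are hypothetical, and it assumes the checkpoint loader can fetch an https URL directly; if it cannot, download the file first and pass the local path instead.

```python
BASE_URL = "https://dl.fbaipublicfiles.com/detectron2/PointRend"


def checkpoint_url(task, config_name, model_id, filename):
    """Build a download URL following the pattern used in the model tables."""
    return f"{BASE_URL}/{task}/{config_name}/{model_id}/{filename}"


def eval_command(task, config_name, weights):
    """Build the argument list for an evaluation-only run of train_net.py."""
    return [
        "python", "train_net.py",
        "--config-file", f"configs/{task}/{config_name}.yaml",
        "--eval-only",
        "MODEL.WEIGHTS", weights,
    ]


if __name__ == "__main__":
    task = "InstanceSegmentation"
    name = "pointrend_rcnn_R_50_FPN_1x_coco"
    # Model id and file name are copied from the COCO table entry for this config.
    url = checkpoint_url(task, name, "164254221", "model_final_736f5a.pkl")
    print(" ".join(eval_command(task, name, url)))
```

The resulting argument list can be passed to `subprocess.run` from within the PointRend project directory, or printed and pasted into a shell.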

Pretrained Models

Instance Segmentation

COCO

<table><tbody>
<tr><th valign="bottom">Mask head</th> <th valign="bottom">Backbone</th> <th valign="bottom">lr sched</th> <th valign="bottom">Output resolution</th> <th valign="bottom">mask AP</th> <th valign="bottom">mask AP&ast;</th> <th valign="bottom">model id</th> <th valign="bottom">download</th></tr>
<tr><td align="left"><a href="configs/InstanceSegmentation/pointrend_rcnn_R_50_FPN_1x_coco.yaml">PointRend</a></td> <td align="center">R50-FPN</td> <td align="center">1×</td> <td align="center">224×224</td> <td align="center">36.2</td> <td align="center">39.7</td> <td align="center">164254221</td> <td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/PointRend/InstanceSegmentation/pointrend_rcnn_R_50_FPN_1x_coco/164254221/model_final_736f5a.pkl">model</a> | <a href="https://dl.fbaipublicfiles.com/detectron2/PointRend/InstanceSegmentation/pointrend_rcnn_R_50_FPN_1x_coco/164254221/metrics.json">metrics</a></td></tr>
<tr><td align="left"><a href="configs/InstanceSegmentation/pointrend_rcnn_R_50_FPN_3x_coco.yaml">PointRend</a></td> <td align="center">R50-FPN</td> <td align="center">3×</td> <td align="center">224×224</td> <td align="center">38.3</td> <td align="center">41.6</td> <td align="center">164955410</td> <td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/PointRend/InstanceSegmentation/pointrend_rcnn_R_50_FPN_3x_coco/164955410/model_final_edd263.pkl">model</a> | <a href="https://dl.fbaipublicfiles.com/detectron2/PointRend/InstanceSegmentation/pointrend_rcnn_R_50_FPN_3x_coco/164955410/metrics.json">metrics</a></td></tr>
<tr><td align="left"><a href="configs/InstanceSegmentation/pointrend_rcnn_R_101_FPN_3x_coco.yaml">PointRend</a></td> <td align="center">R101-FPN</td> <td align="center">3×</td> <td align="center">224×224</td> <td align="center">40.1</td> <td align="center">43.8</td> <td align="center"></td> <td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/PointRend/InstanceSegmentation/pointrend_rcnn_R_101_FPN_3x_coco/28119983/model_final_3f4d2a.pkl">model</a> | <a href="https://dl.fbaipublicfiles.com/detectron2/PointRend/InstanceSegmentation/pointrend_rcnn_R_101_FPN_3x_coco/28119983/metrics.json">metrics</a></td></tr>
<tr><td align="left"><a href="configs/InstanceSegmentation/pointrend_rcnn_X_101_32x8d_FPN_3x_coco.yaml">PointRend</a></td> <td align="center">X101-FPN</td> <td align="center">3×</td> <td align="center">224×224</td> <td align="center">41.1</td> <td align="center">44.7</td> <td align="center"></td> <td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/PointRend/InstanceSegmentation/pointrend_rcnn_X_101_32x8d_FPN_3x_coco/28119989/model_final_ba17b9.pkl">model</a> | <a href="https://dl.fbaipublicfiles.com/detectron2/PointRend/InstanceSegmentation/pointrend_rcnn_X_101_32x8d_FPN_3x_coco/28119989/metrics.json">metrics</a></td></tr>
</tbody></table>

AP* is COCO mask AP evaluated against the higher-quality LVIS annotations; see the paper for details. Run python detectron2/datasets/prepare_cocofied_lvis.py to prepare the ground-truth files for AP* evaluation. Because LVIS annotations are not exhaustive, AP* must be evaluated with lvis-api rather than cocoapi.

Cityscapes

The Cityscapes model is trained with ImageNet pretraining.

Semantic Segmentation

Cityscapes

The Cityscapes model is trained with ImageNet pretraining.

<table><tbody>
<tr><th valign="bottom">Method</th> <th valign="bottom">Backbone</th> <th valign="bottom">Output resolution</th> <th valign="bottom">mIoU</th> <th valign="bottom">model id</th> <th valign="bottom">download</th></tr>
<tr><td align="left"><a href="configs/SemanticSegmentation/pointrend_semantic_R_101_FPN_1x_cityscapes.yaml">SemanticFPN + PointRend</a></td> <td align="center">R101-FPN</td> <td align="center">1024×2048</td> <td align="center">78.9</td> <td align="center">202576688</td> <td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/PointRend/SemanticSegmentation/pointrend_semantic_R_101_FPN_1x_cityscapes/202576688/model_final_cf6ac1.pkl">model</a> | <a href="https://dl.fbaipublicfiles.com/detectron2/PointRend/SemanticSegmentation/pointrend_semantic_R_101_FPN_1x_cityscapes/202576688/metrics.json">metrics</a></td></tr>
</tbody></table>

<a name="CitingPointRend"></a>Citing PointRend

If you use PointRend, please use the following BibTeX entry.

BibTeX

@InProceedings{kirillov2019pointrend,
  title={{PointRend}: Image Segmentation as Rendering},
  author={Alexander Kirillov and Yuxin Wu and Kaiming He and Ross Girshick},
  journal={ArXiv:1912.08193},
  year={2019}
}

<a name="CitingImplicitPointRend"></a>Citing Implicit PointRend

If you use Implicit PointRend, please use the following BibTeX entry.

BibTeX

@InProceedings{cheng2021pointly,
  title={Pointly-Supervised Instance Segmentation},
  author={Bowen Cheng and Omkar Parkhi and Alexander Kirillov},
  journal={ArXiv},
  year={2021}
}