Continuous Surface Embeddings for Dense Pose Estimation for Humans and Animals

<a name="Overview"></a> Overview

The pipeline uses a Faster R-CNN with a Feature Pyramid Network (FPN) meta-architecture, outlined in Figure 1. For each detected object, the model predicts a coarse segmentation S (2 channels: foreground / background) and an embedding E (16 channels). In parallel, the embedder produces vertex embeddings Ê for the corresponding mesh. The universal positional embeddings E are then matched against the vertex embeddings Ê to derive a continuous surface embedding for each pixel.

<p class="image-caption"><b>Figure 1.</b> DensePose continuous surface embeddings architecture based on Faster R-CNN with Feature Pyramid Network (FPN).</p>
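The matching step described in the overview can be illustrated with a small NumPy sketch. It is not the DensePose implementation: the function name `match_pixels_to_vertices`, the array layouts, the use of squared Euclidean distance, and the assumption that channel 1 of S is the foreground are all choices made here for illustration.

```python
import numpy as np


def match_pixels_to_vertices(pixel_emb, vertex_emb, coarse_seg):
    """Assign each foreground pixel to its closest mesh vertex.

    pixel_emb:  (D, H, W) per-pixel embeddings E (D = 16 in the text).
    vertex_emb: (V, D) per-vertex embeddings for the mesh.
    coarse_seg: (2, H, W) coarse segmentation scores S.
                ASSUMPTION: channel 0 is background, channel 1 is foreground.
    Returns an (H, W) integer array with a vertex index per pixel,
    or -1 where the pixel is classified as background.
    """
    d, h, w = pixel_emb.shape
    flat = pixel_emb.reshape(d, -1).T  # (H*W, D)

    # Squared Euclidean distance between every pixel and every vertex,
    # expanded as |a|^2 - 2 a.b + |b|^2 to avoid an (H*W, V, D) tensor.
    dist = (
        (flat ** 2).sum(axis=1, keepdims=True)
        - 2.0 * flat @ vertex_emb.T
        + (vertex_emb ** 2).sum(axis=1)
    )
    closest = dist.argmin(axis=1).reshape(h, w)

    fg_mask = coarse_seg.argmax(axis=0) == 1
    return np.where(fg_mask, closest, -1)
```

A nearest-neighbour lookup is only one way to match the two embedding sets; it is used here because it makes the correspondence between pixels and mesh vertices explicit.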

Datasets

For more details on datasets used for training and validation of continuous surface embeddings models, please refer to the DensePose Datasets page.

<a name="ModelZoo"></a> Model Zoo and Baselines

Human CSE Models

Continuous surface embeddings models for humans, trained using the protocols from Neverova et al., 2020.

Models trained with hard assignment loss ℒ:

<table><tbody> <!-- START TABLE --> <!-- TABLE HEADER --> <th valign="bottom">Name</th> <th valign="bottom">lr sched</th> <th valign="bottom">train time (s/iter)</th> <th valign="bottom">inference time (s/im)</th> <th valign="bottom">train mem (GB)</th> <th valign="bottom">box AP</th> <th valign="bottom">segm AP</th> <th valign="bottom">dp. AP GPS</th> <th valign="bottom">dp. AP GPSm</th> <th valign="bottom">model id</th> <th valign="bottom">download</th> <!-- TABLE BODY --> <!-- ROW: densepose_rcnn_R_50_FPN_s1x --> <tr><td align="left"><a href="../configs/cse/densepose_rcnn_R_50_FPN_s1x.yaml">R_50_FPN_s1x</a></td> <td align="center">s1x</td> <td align="center">0.349</td> <td align="center">0.060</td> <td align="center">6.3</td> <td align="center">61.1</td> <td align="center">67.1</td> <td align="center">64.4</td> <td align="center">65.7</td> <td align="center">251155172</td> <td align="center"><a href="https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_50_FPN_s1x/251155172/model_final_c4ea5f.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_50_FPN_s1x/251155172/metrics.json">metrics</a></td> </tr> <!-- ROW: densepose_rcnn_R_101_FPN_s1x --> <tr><td align="left"><a href="../configs/cse/densepose_rcnn_R_101_FPN_s1x.yaml">R_101_FPN_s1x</a></td> <td align="center">s1x</td> <td align="center">0.461</td> <td align="center">0.071</td> <td align="center">7.4</td> <td align="center">62.3</td> <td align="center">67.2</td> <td align="center">64.7</td> <td align="center">65.8</td> <td align="center">251155500</td> <td align="center"><a href="https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_101_FPN_s1x/251155500/model_final_5c995f.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_101_FPN_s1x/251155500/metrics.json">metrics</a></td> </tr> <!-- ROW: densepose_rcnn_R_50_FPN_DL_s1x --> <tr><td align="left"><a 
href="../configs/cse/densepose_rcnn_R_50_FPN_DL_s1x.yaml">R_50_FPN_DL_s1x</a></td> <td align="center">s1x</td> <td align="center">0.399</td> <td align="center">0.061</td> <td align="center">7.0</td> <td align="center">60.8</td> <td align="center">67.8</td> <td align="center">65.5</td> <td align="center">66.4</td> <td align="center">251156349</td> <td align="center"><a href="https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_50_FPN_DL_s1x/251156349/model_final_e96218.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_50_FPN_DL_s1x/251156349/metrics.json">metrics</a></td> </tr> <!-- ROW: densepose_rcnn_R_101_FPN_DL_s1x --> <tr><td align="left"><a href="../configs/cse/densepose_rcnn_R_101_FPN_DL_s1x.yaml">R_101_FPN_DL_s1x</a></td> <td align="center">s1x</td> <td align="center">0.504</td> <td align="center">0.074</td> <td align="center">8.3</td> <td align="center">61.5</td> <td align="center">68.0</td> <td align="center">65.6</td> <td align="center">66.6</td> <td align="center">251156606</td> <td align="center"><a href="https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_101_FPN_DL_s1x/251156606/model_final_b236ce.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_101_FPN_DL_s1x/251156606/metrics.json">metrics</a></td> </tr> </tbody></table>

Models trained with soft assignment loss ℒ<sub>σ</sub>:

<table><tbody> <!-- START TABLE --> <!-- TABLE HEADER --> <th valign="bottom">Name</th> <th valign="bottom">lr sched</th> <th valign="bottom">train time (s/iter)</th> <th valign="bottom">inference time (s/im)</th> <th valign="bottom">train mem (GB)</th> <th valign="bottom">box AP</th> <th valign="bottom">segm AP</th> <th valign="bottom">dp. AP GPS</th> <th valign="bottom">dp. AP GPSm</th> <th valign="bottom">model id</th> <th valign="bottom">download</th> <!-- TABLE BODY --> <!-- ROW: densepose_rcnn_R_50_FPN_soft_s1x --> <tr><td align="left"><a href="../configs/cse/densepose_rcnn_R_50_FPN_soft_s1x.yaml">R_50_FPN_soft_s1x</a></td> <td align="center">s1x</td> <td align="center">0.357</td> <td align="center">0.057</td> <td align="center">9.7</td> <td align="center">61.3</td> <td align="center">66.9</td> <td align="center">64.3</td> <td align="center">65.4</td> <td align="center">250533982</td> <td align="center"><a href="https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_50_FPN_soft_s1x/250533982/model_final_2c4512.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_50_FPN_soft_s1x/250533982/metrics.json">metrics</a></td> </tr> <!-- ROW: densepose_rcnn_R_101_FPN_soft_s1x --> <tr><td align="left"><a href="../configs/cse/densepose_rcnn_R_101_FPN_soft_s1x.yaml">R_101_FPN_soft_s1x</a></td> <td align="center">s1x</td> <td align="center">0.464</td> <td align="center">0.071</td> <td align="center">10.5</td> <td align="center">62.1</td> <td align="center">67.3</td> <td align="center">64.5</td> <td align="center">66.0</td> <td align="center">250712522</td> <td align="center"><a href="https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_101_FPN_soft_s1x/250712522/model_final_4637da.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_101_FPN_soft_s1x/250712522/metrics.json">metrics</a></td> </tr> <!-- ROW: densepose_rcnn_R_50_FPN_DL_soft_s1x --> <tr><td align="left"><a 
href="../configs/cse/densepose_rcnn_R_50_FPN_DL_soft_s1x.yaml">R_50_FPN_DL_soft_s1x</a></td> <td align="center">s1x</td> <td align="center">0.427</td> <td align="center">0.062</td> <td align="center">11.3</td> <td align="center">60.8</td> <td align="center">68.0</td> <td align="center">66.1</td> <td align="center">66.7</td> <td align="center">250713703</td> <td align="center"><a href="https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_50_FPN_DL_soft_s1x/250713703/model_final_9199f5.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_50_FPN_DL_soft_s1x/250713703/metrics.json">metrics</a></td> </tr> <!-- ROW: densepose_rcnn_R_101_FPN_DL_soft_s1x --> <tr><td align="left"><a href="../configs/cse/densepose_rcnn_R_101_FPN_DL_soft_s1x.yaml">R_101_FPN_DL_soft_s1x</a></td> <td align="center">s1x</td> <td align="center">0.483</td> <td align="center">0.071</td> <td align="center">12.2</td> <td align="center">61.5</td> <td align="center">68.2</td> <td align="center">66.2</td> <td align="center">67.1</td> <td align="center">250713061</td> <td align="center"><a href="https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_101_FPN_DL_soft_s1x/250713061/model_final_1d3314.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_101_FPN_DL_soft_s1x/250713061/metrics.json">metrics</a></td> </tr> </tbody></table>

Animal CSE Models

Models obtained by finetuning human CSE models on animal data from ds1_train (see the DensePose LVIS section for more details on the datasets) with soft assignment loss ℒ<sub>σ</sub>:

<table><tbody> <!-- START TABLE --> <!-- TABLE HEADER --> <th valign="bottom">Name</th> <th valign="bottom">lr sched</th> <th valign="bottom">train time (s/iter)</th> <th valign="bottom">inference time (s/im)</th> <th valign="bottom">train mem (GB)</th> <th valign="bottom">box AP</th> <th valign="bottom">segm AP</th> <th valign="bottom">dp. AP GPS</th> <th valign="bottom">dp. AP GPSm</th> <th valign="bottom">model id</th> <th valign="bottom">download</th> <!-- TABLE BODY --> <!-- ROW: densepose_rcnn_R_50_FPN_soft_chimps_finetune_4k --> <tr><td align="left"><a href="../configs/cse/densepose_rcnn_R_50_FPN_soft_chimps_finetune_4k.yaml">R_50_FPN_soft_chimps_finetune_4k</a></td> <td align="center">4K</td> <td align="center">0.569</td> <td align="center">0.051</td> <td align="center">4.7</td> <td align="center">62.0</td> <td align="center">59.0</td> <td align="center">32.2</td> <td align="center">39.6</td> <td align="center">253146869</td> <td align="center"><a href="https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_50_FPN_soft_chimps_finetune_4k/253146869/model_final_52f649.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_50_FPN_soft_chimps_finetune_4k/253146869/metrics.json">metrics</a></td> </tr> <!-- ROW: densepose_rcnn_R_50_FPN_soft_animals_finetune_4k --> <tr><td align="left"><a href="../configs/cse/densepose_rcnn_R_50_FPN_soft_animals_finetune_4k.yaml">R_50_FPN_soft_animals_finetune_4k</a></td> <td align="center">4K</td> <td align="center">0.381</td> <td align="center">0.061</td> <td align="center">7.3</td> <td align="center">44.9</td> <td align="center">55.5</td> <td align="center">21.3</td> <td align="center">28.8</td> <td align="center">253145793</td> <td align="center"><a href="https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_50_FPN_soft_animals_finetune_4k/253145793/model_final_8f8ba2.pkl">model</a>&nbsp;|&nbsp;<a 
href="https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_50_FPN_soft_animals_finetune_4k/253145793/metrics.json">metrics</a></td> </tr> <!-- ROW: densepose_rcnn_R_50_FPN_soft_animals_CA_finetune_4k --> <tr><td align="left"><a href="../configs/cse/densepose_rcnn_R_50_FPN_soft_animals_CA_finetune_4k.yaml">R_50_FPN_soft_animals_CA_finetune_4k</a></td> <td align="center">4K</td> <td align="center">0.412</td> <td align="center">0.059</td> <td align="center">7.1</td> <td align="center">53.4</td> <td align="center">59.5</td> <td align="center">25.4</td> <td align="center">33.4</td> <td align="center">253498611</td> <td align="center"><a href="https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_50_FPN_soft_animals_CA_finetune_4k/253498611/model_final_6d69b7.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_50_FPN_soft_animals_CA_finetune_4k/253498611/metrics.json">metrics</a></td> </tr> </tbody></table>

Acronyms:

CA: class-agnostic training, in which all annotated instances are mapped to a single category
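As a rough illustration of what class-agnostic training implies for the annotations, the sketch below collapses every instance into one category. It assumes COCO-style annotation dictionaries with a `category_id` field; the function name `to_class_agnostic` and the target id `1` are hypothetical and are not taken from the DensePose code base.

```python
from copy import deepcopy


def to_class_agnostic(annotations, category_id=1):
    """Return a copy of the annotations with every instance in one category.

    annotations: list of dicts, each with a "category_id" key
                 (ASSUMPTION: COCO-style instance annotations).
    category_id: the single category all instances are mapped to
                 (hypothetical default, chosen for illustration).
    """
    remapped = deepcopy(annotations)  # leave the caller's data untouched
    for ann in remapped:
        ann["category_id"] = category_id
    return remapped


# Example: three instances from two different categories.
example = [
    {"id": 10, "category_id": 3},
    {"id": 11, "category_id": 7},
    {"id": 12, "category_id": 3},
]
agnostic = to_class_agnostic(example)
```

After the remapping, the detector sees a single object class, so the per-category box and mask heads reduce to one foreground class.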

Models obtained by finetuning human CSE models on animal data from the ds2_train dataset with soft assignment loss ℒ<sub>σ</sub> and, for some schedules, cycle losses. Please refer to the DensePose LVIS section for details on the dataset and to Neverova et al., 2021 for details on cycle losses.

<table><tbody> <!-- START TABLE --> <!-- TABLE HEADER --> <th valign="bottom">Name</th> <th valign="bottom">lr sched</th> <th valign="bottom">train time (s/iter)</th> <th valign="bottom">inference time (s/im)</th> <th valign="bottom">train mem (GB)</th> <th valign="bottom">box AP</th> <th valign="bottom">segm AP</th> <th valign="bottom">dp. AP GPS</th> <th valign="bottom">dp. AP GPSm</th> <th valign="bottom">GErr</th> <th valign="bottom">GPS</th> <th valign="bottom">model id</th> <th valign="bottom">download</th> <!-- TABLE BODY --> <!-- ROW: densepose_rcnn_R_50_FPN_soft_animals_I0_finetune_16k --> <tr><td align="left"><a href="../configs/cse/densepose_rcnn_R_50_FPN_soft_animals_I0_finetune_16k.yaml">R_50_FPN_soft_animals_I0_finetune_16k</a></td> <td align="center">16k</td> <td align="center">0.386</td> <td align="center">0.058</td> <td align="center">8.4</td> <td align="center">54.2</td> <td align="center">67.0</td> <td align="center">29.0</td> <td align="center">38.6</td> <td align="center">13.2</td> <td align="center">85.4</td> <td align="center">270727112</td> <td align="center"><a href="https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_50_FPN_soft_animals_I0_finetune_16k/270727112/model_final_421d28.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_50_FPN_soft_animals_I0_finetune_16k/270727112/metrics.json">metrics</a></td> </tr> <!-- ROW: densepose_rcnn_R_50_FPN_soft_animals_I0_finetune_m2m_16k --> <tr><td align="left"><a href="../configs/cse/densepose_rcnn_R_50_FPN_soft_animals_I0_finetune_m2m_16k.yaml">R_50_FPN_soft_animals_I0_finetune_m2m_16k</a></td> <td align="center">16k</td> <td align="center">0.508</td> <td align="center">0.056</td> <td align="center">12.2</td> <td align="center">54.1</td> <td align="center">67.3</td> <td align="center">28.6</td> <td align="center">38.4</td> <td align="center">12.5</td> <td align="center">87.6</td> <td align="center">270982215</td> <td align="center"><a 
href="https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_50_FPN_soft_animals_I0_finetune_m2m_16k/270982215/model_final_6fe5f4.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_50_FPN_soft_animals_I0_finetune_m2m_16k/270982215/metrics.json">metrics</a></td> </tr> <!-- ROW: densepose_rcnn_R_50_FPN_soft_animals_I0_finetune_i2m_16k --> <tr><td align="left"><a href="../configs/cse/densepose_rcnn_R_50_FPN_soft_animals_I0_finetune_i2m_16k.yaml">R_50_FPN_soft_animals_I0_finetune_i2m_16k</a></td> <td align="center">16k</td> <td align="center">0.483</td> <td align="center">0.056</td> <td align="center">9.7</td> <td align="center">54.0</td> <td align="center">66.6</td> <td align="center">28.9</td> <td align="center">38.3</td> <td align="center">11.0</td> <td align="center">88.9</td> <td align="center">270727461</td> <td align="center"><a href="https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_50_FPN_soft_animals_I0_finetune_i2m_16k/270727461/model_final_8c9d99.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_50_FPN_soft_animals_I0_finetune_i2m_16k/270727461/metrics.json">metrics</a></td> </tr> </tbody></table>
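The download links in the tables above follow a visible pattern: a common base URL, the config name, the model id, and a file name. The sketch below rebuilds the `metrics.json` link from the first two; the helper name `metrics_url` is hypothetical. Model checkpoint names carry a hash suffix (for example `model_final_c4ea5f.pkl`) that cannot be derived from the other fields, so those links should be copied from the tables directly.

```python
BASE_URL = "https://dl.fbaipublicfiles.com/densepose/cse"


def metrics_url(config_name, model_id):
    """Build the metrics.json URL for a model listed in the tables.

    config_name: config file name without ".yaml",
                 e.g. "densepose_rcnn_R_50_FPN_s1x".
    model_id:    value from the "model id" column.
    """
    return f"{BASE_URL}/{config_name}/{model_id}/metrics.json"


# Example: the first human CSE model in the hard-assignment table.
url = metrics_url("densepose_rcnn_R_50_FPN_s1x", 251155172)
```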

<a name="References"></a> References

If you use DensePose methods based on continuous surface embeddings, please cite the following BibTeX entries:

Continuous surface embeddings:

@InProceedings{Neverova2020ContinuousSurfaceEmbeddings,
    title = {Continuous Surface Embeddings},
    author = {Neverova, Natalia and Novotny, David and Khalidov, Vasil and Szafraniec, Marc and Labatut, Patrick and Vedaldi, Andrea},
    booktitle = {Advances in Neural Information Processing Systems},
    year = {2020},
}

Cycle losses:

@InProceedings{Neverova2021UniversalCanonicalMaps,
    title = {Discovering Relationships between Object Categories via Universal Canonical Maps},
    author = {Neverova, Natalia and Sanakoyeu, Artsiom and Novotny, David and Labatut, Patrick and Vedaldi, Andrea},
    booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year = {2021},
}