Introduction

SCRFD is an efficient, high-accuracy face detection approach that was initially described on arXiv and accepted at ICLR 2022.

Try out the Gradio Web Demo.

Performance

Precision, FLOPs, and inference time are all evaluated at VGA resolution (640x480).

ResNet family

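| Method | Backbone | Easy | Medium | Hard | #Params (M) | #FLOPs (G) | Infer (ms) |
|---|---|---|---|---|---|---|---|
| DSFD (CVPR19) | ResNet152 | 94.29 | 91.47 | 71.39 | 120.06 | 259.55 | 55.6 |
| RetinaFace (CVPR20) | ResNet50 | 94.92 | 91.90 | 64.17 | 29.50 | 37.59 | 21.7 |
| HAMBox (CVPR20) | ResNet50 | 95.27 | 93.76 | 76.75 | 30.24 | 43.28 | 25.9 |
| TinaFace (Arxiv20) | ResNet50 | 95.61 | 94.25 | 81.43 | 37.98 | 172.95 | 38.9 |
| - | - | - | - | - | - | - | - |
| ResNet-34GF | ResNet50 | 95.64 | 94.22 | 84.02 | 24.81 | 34.16 | 11.8 |
| SCRFD-34GF | Bottleneck Res | 96.06 | 94.92 | 85.29 | 9.80 | 34.13 | 11.7 |
| ResNet-10GF | ResNet34x0.5 | 94.69 | 92.90 | 80.42 | 6.85 | 10.18 | 6.3 |
| SCRFD-10GF | Basic Res | 95.16 | 93.87 | 83.05 | 3.86 | 9.98 | 4.9 |
| ResNet-2.5GF | ResNet34x0.25 | 93.21 | 91.11 | 74.47 | 1.62 | 2.57 | 5.4 |
| SCRFD-2.5GF | Basic Res | 93.78 | 92.16 | 77.87 | 0.67 | 2.53 | 4.2 |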

Mobile family

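| Method | Backbone | Easy | Medium | Hard | #Params (M) | #FLOPs (G) | Infer (ms) |
|---|---|---|---|---|---|---|---|
| RetinaFace (CVPR20) | MobileNet0.25 | 87.78 | 81.16 | 47.32 | 0.44 | 0.802 | 7.9 |
| FaceBoxes (IJCB17) | - | 76.17 | 57.17 | 24.18 | 1.01 | 0.275 | 2.5 |
| - | - | - | - | - | - | - | - |
| MobileNet-0.5GF | MobileNetx0.25 | 90.38 | 87.05 | 66.68 | 0.37 | 0.507 | 3.7 |
| SCRFD-0.5GF | Depth-wise Conv | 90.57 | 88.12 | 68.51 | 0.57 | 0.508 | 3.6 |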

X64 CPU Performance of SCRFD-0.5GF:

| Test-Input-Size | CPU Single-Thread | Easy | Medium | Hard |
|---|---|---|---|---|
| Original-Size (scale 1.0) | - | 90.91 | 89.49 | 82.03 |
| 640x480 | 28.3ms | 90.57 | 88.12 | 68.51 |
| 320x240 | 11.4ms | - | - | - |

Precision and inference time are evaluated on an AMD Ryzen 9 3950X using simple PyTorch CPU inference with OMP_NUM_THREADS=1 (no mkldnn).

Installation

Please refer to mmdetection for installation.

  1. Install mmcv. (mmcv-full==1.2.6 and 1.3.3 were tested.)
  2. Install build requirements and then install mmdet.
    pip install -r requirements/build.txt
    pip install -v -e .  # or "python setup.py develop"
    

Data preparation

WIDERFace:

  1. Download the WIDERFace dataset and put it under data/retinaface.
  2. Download the annotation files from gdrive and put them under data/retinaface/, giving the following layout:
  data/retinaface/
      train/
          images/
          labelv2.txt
      val/
          images/
          labelv2.txt
          gt/
              *.mat
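
Before training, it can help to confirm that the directory tree matches the layout above. The following is a minimal sketch and not part of the repository: the function name `missing_entries` is hypothetical, and it checks only for the paths listed above.

```python
from pathlib import Path

# Paths taken from the layout above, relative to data/retinaface/.
EXPECTED = [
    "train/images",
    "train/labelv2.txt",
    "val/images",
    "val/labelv2.txt",
    "val/gt",
]


def missing_entries(root):
    """Return the expected paths that do not exist under `root`."""
    root = Path(root)
    missing = [rel for rel in EXPECTED if not (root / rel).exists()]
    # val/gt/ is expected to contain at least one *.mat file.
    gt_dir = root / "val" / "gt"
    if gt_dir.is_dir() and not any(gt_dir.glob("*.mat")):
        missing.append("val/gt/*.mat")
    return missing


if __name__ == "__main__":
    for rel in missing_entries("data/retinaface"):
        print("missing:", rel)
```

An empty result means every listed path was found; it does not validate the contents of the files.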
          

Annotation Format

Please refer to labelv2.txt for details.

For each image:

# <image_path> image_width image_height
bbox_x1 bbox_y1 bbox_x2 bbox_y2 (<keypoint,3>*N)
...
...
# <image_path> image_width image_height
bbox_x1 bbox_y1 bbox_x2 bbox_y2 (<keypoint,3>*N)
...
...

Keypoints can be omitted if only bbox annotations are available.
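
To make the format concrete, here is a minimal parsing sketch. It is not the loader used by the repository: the function name `parse_labelv2` and the record layout are hypothetical, image width and height are assumed to be integers, and the three values per keypoint are kept as an opaque triple because their meaning is not spelled out here.

```python
def parse_labelv2(lines):
    """Parse labelv2-style lines into a list of per-image records."""
    records = []
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Header line: "# <image_path> image_width image_height"
            path, width, height = line[1:].strip().rsplit(maxsplit=2)
            current = {"path": path, "width": int(width),
                       "height": int(height), "faces": []}
            records.append(current)
            continue
        if current is None:
            raise ValueError("face line found before any image header")
        values = [float(v) for v in line.split()]
        # 4 bbox values, optionally followed by N keypoints of 3 values each.
        if len(values) < 4 or (len(values) - 4) % 3 != 0:
            raise ValueError(f"unexpected field count: {line!r}")
        rest = values[4:]
        keypoints = [tuple(rest[i:i + 3]) for i in range(0, len(rest), 3)]
        current["faces"].append({"bbox": values[:4], "keypoints": keypoints})
    return records


if __name__ == "__main__":
    with open("data/retinaface/train/labelv2.txt") as f:
        records = parse_labelv2(f)
    print(len(records), "images")
```

A face line with only four values yields an empty `keypoints` list, which matches the bbox-only case.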

Training

Example training command, with 4 GPUs:

CUDA_VISIBLE_DEVICES="0,1,2,3" PORT=29701 bash ./tools/dist_train.sh ./configs/scrfd/scrfd_1g.py 4

WIDERFace Evaluation

We use a pure-Python evaluation script that does not require MATLAB.

GPU=0
GROUP=scrfd
TASK=scrfd_2.5g
CUDA_VISIBLE_DEVICES="$GPU" python -u tools/test_widerface.py ./configs/"$GROUP"/"$TASK".py ./work_dirs/"$TASK"/model.pth --mode 0 --out wouts

Pretrained-Models

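| Name | Easy | Medium | Hard | FLOPs | Params (M) | Infer (ms) | Link |
|---|---|---|---|---|---|---|---|
| SCRFD_500M | 90.57 | 88.12 | 68.51 | 500M | 0.57 | 3.6 | download |
| SCRFD_1G | 92.38 | 90.57 | 74.80 | 1G | 0.64 | 4.1 | download |
| SCRFD_2.5G | 93.78 | 92.16 | 77.87 | 2.5G | 0.67 | 4.2 | download |
| SCRFD_10G | 95.16 | 93.87 | 83.05 | 10G | 3.86 | 4.9 | download |
| SCRFD_34G | 96.06 | 94.92 | 85.29 | 34G | 9.80 | 11.7 | download |
| SCRFD_500M_KPS | 90.97 | 88.44 | 69.49 | 500M | 0.57 | 3.6 | download |
| SCRFD_2.5G_KPS | 93.80 | 92.02 | 77.13 | 2.5G | 0.82 | 4.3 | download |
| SCRFD_10G_KPS | 95.40 | 94.01 | 82.80 | 10G | 4.23 | 5.0 | download |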

mAP, FLOPs, and inference latency are all evaluated at VGA resolution. The _KPS suffix means the model also predicts 5 keypoints.

Convert to ONNX

Please refer to tools/scrfd2onnx.py

The generated ONNX model accepts dynamic input shapes by default.

You can also fix the input shape by passing --shape 640 640; the resulting ONNX model can then be optimized with onnx-simplifier.

Inference

Please refer to tools/scrfd.py which uses onnxruntime to do inference.
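
When a model is exported with a fixed shape such as 640 640, each image has to be fitted into that shape before inference, and the detected boxes have to be mapped back afterwards. The sketch below shows one common way to do this (an aspect-preserving resize, with the image placed at the top-left corner and the rest padded). It is an illustration under those assumptions, not a description of what tools/scrfd.py does; the names `fit_to_input` and `to_original` are hypothetical.

```python
def fit_to_input(img_w, img_h, in_w=640, in_h=640):
    """Size of the image after an aspect-preserving resize into in_w x in_h.

    Returns (new_w, new_h, scale), where scale = resized size / original size.
    """
    scale = min(in_w / img_w, in_h / img_h)
    new_w = min(in_w, max(1, round(img_w * scale)))
    new_h = min(in_h, max(1, round(img_h * scale)))
    return new_w, new_h, scale


def to_original(box, scale):
    """Map an (x1, y1, x2, y2) box from resized to original coordinates.

    Assumes the resized image sits at the top-left corner of the padded
    input, so no offset has to be subtracted.
    """
    return tuple(v / scale for v in box)


if __name__ == "__main__":
    new_w, new_h, scale = fit_to_input(1280, 960)
    print(new_w, new_h, scale)
    print(to_original((64, 48, 128, 96), scale))
```

Keeping the scale alongside the resized size means the same value can be reused for boxes and keypoints.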

Network Search

For the two-step search described in the paper, we use hard mAP as the criterion for selecting the best candidate models.

This repo provides an example of searching for SCRFD-2.5GF, as follows.

  1. For searching backbones:

    python search_tools/generate_configs_2.5g.py --mode 1
    

    Here, mode==1 means searching the backbone only. For other parameters, please check the code.

  2. After step 1 finishes, the generated configs are configs/scrfdgen2.5g/scrfdgen2.5g_1.py through configs/scrfdgen2.5g/scrfdgen2.5g_64.py, assuming num_configs is set to 64.

  3. Train every generated config for 80 epochs; see search_tools/search_train.sh.

  4. Test the WIDERFace precision of every generated config using search_tools/search_test.sh.

  5. Select the most accurate config as the base template (assume the 10th config is the best), then run the overall network search.

    python search_tools/generate_configs_2.5g.py --mode 2 --template 10
    
  6. Test the newly generated configs again and select the most accurate one(s).
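
The bookkeeping between steps 2 and 5 can be sketched as follows. This is a hypothetical helper, not part of search_tools: it assumes the hard-set mAP of each trained config has already been collected into a dict keyed by config index, and the function names are invented for illustration.

```python
def generated_config_paths(num_configs=64, group="scrfdgen2.5g"):
    """Config paths produced by step 1, numbered from 1 to num_configs."""
    return [f"configs/{group}/{group}_{i}.py"
            for i in range(1, num_configs + 1)]


def best_template(hard_map_by_index):
    """Index of the config with the highest hard-set mAP."""
    if not hard_map_by_index:
        raise ValueError("no results to select from")
    return max(hard_map_by_index, key=hard_map_by_index.get)


if __name__ == "__main__":
    # Made-up placeholder values for illustration only, not measured results.
    results = {9: 75.1, 10: 76.3, 11: 74.8}
    template = best_template(results)
    print("python search_tools/generate_configs_2.5g.py "
          f"--mode 2 --template {template}")
```

Selecting on hard mAP alone follows the criterion stated above; ties are resolved in favor of the first index encountered.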

Acknowledgments

We thank nihui for the excellent mobile-phone demo.

Demo

  1. ncnn-android-scrfd
  2. scrfd-MNN C++
  3. scrfd-TNN C++
  4. scrfd-NCNN C++
  5. scrfd-ONNXRuntime C++
  6. TensorRT Python
  7. Modelscope demo for rotated face
  8. Modelscope demo for card detection