PP-Structure Model list

docs/version2.x/ppstructure/models_list.en.md


1. Layout Analysis

| model name | description | inference model size | download | dict path |
| --- | --- | --- | --- | --- |
| picodet_lcnet_x1_0_fgd_layout | English layout analysis model trained on the PubLayNet dataset, based on PicoDet LCNet_x1_0 and FGD. The model can recognize 5 types of areas: Text, Title, Table, Picture and List | 9.7M | inference model / trained model | PubLayNet dict |
| ppyolov2_r50vd_dcn_365e_publaynet | English layout analysis model trained on the PubLayNet dataset, based on PP-YOLOv2 | 221.0M | inference model / trained model | same as above |
| picodet_lcnet_x1_0_fgd_layout_cdla | Chinese layout analysis model trained on the CDLA dataset. The model can recognize 10 types of areas, such as Table, Figure, Figure caption, Table caption, Header, Footer, Reference and Equation | 9.7M | inference model / trained model | CDLA dict |
| picodet_lcnet_x1_0_fgd_layout_table | Layout analysis model trained on the table dataset. The model can detect tables in Chinese and English documents | 9.7M | inference model / trained model | Table dict |
| ppyolov2_r50vd_dcn_365e_tableBank_word | Layout analysis model trained on the TableBank Word dataset, based on PP-YOLOv2. The model can detect tables in English documents | 221.0M | inference model | same as above |
| ppyolov2_r50vd_dcn_365e_tableBank_latex | Layout analysis model trained on the TableBank Latex dataset, based on PP-YOLOv2. The model can detect tables in English documents | 221.0M | inference model | same as above |

2. OCR and Table Recognition

2.1 OCR

| model name | description | inference model size | download |
| --- | --- | --- | --- |
| en_ppocr_mobile_v2.0_table_det | Text detection model for English table scenes, trained on the PubTabNet dataset | 4.7M | inference model / trained model |
| en_ppocr_mobile_v2.0_table_rec | Text recognition model for English table scenes, trained on the PubTabNet dataset | 6.9M | inference model / trained model |

To use other OCR models, download one from the PP-OCR model_list or use a model you trained yourself, then set its path in the det_model_dir and rec_model_dir fields.
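As a minimal sketch of what setting these two fields can look like, the snippet below collects the detection and recognition model paths in a dict and renders them as `--name=value` command-line arguments. The helper names `ocr_model_dirs` and `to_cli_args` and the local directory paths are hypothetical, chosen only for illustration; only the field names `det_model_dir` and `rec_model_dir` come from the text above.

```python
def ocr_model_dirs(det_dir, rec_dir):
    """Return the two OCR model fields named in the text as a plain dict."""
    return {
        "det_model_dir": det_dir,
        "rec_model_dir": rec_dir,
    }


def to_cli_args(fields):
    """Render the fields as --name=value arguments (hypothetical helper)."""
    return [f"--{name}={value}" for name, value in fields.items()]


# Hypothetical local paths: wherever the downloaded inference models
# were extracted on your machine.
config = ocr_model_dirs(
    "inference/en_ppocr_mobile_v2.0_table_det_infer",
    "inference/en_ppocr_mobile_v2.0_table_rec_infer",
)

for arg in to_cli_args(config):
    print(arg)
```

Keeping the paths in one dict makes it easy to swap in a self-trained model by changing a single entry. How the fields are finally passed (command line or keyword arguments) depends on the PaddleOCR version in use, so check it against your installation.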

2.2 Table Recognition

| model | description | inference model size | download |
| --- | --- | --- | --- |
| en_ppocr_mobile_v2.0_table_structure | English table recognition model trained on the PubTabNet dataset, based on TableRec-RARE | 6.8M | inference model / trained model |
| en_ppstructure_mobile_v2.0_SLANet | English table recognition model trained on the PubTabNet dataset, based on SLANet | 9.2M | inference model / trained model |
| ch_ppstructure_mobile_v2.0_SLANet | Chinese table recognition model based on SLANet | 9.3M | inference model / trained model |

3. KIE

On the XFUND_zh dataset, the accuracy and time cost of the different models on a V100 GPU are as follows.

| Model | Backbone | Task | Config | Hmean | Time cost (ms) | Download link |
| --- | --- | --- | --- | --- | --- | --- |
| VI-LayoutXLM | VI-LayoutXLM-base | SER | ser_vi_layoutxlm_xfund_zh_udml.yml | 93.19% | 15.49 | trained model |
| LayoutXLM | LayoutXLM-base | SER | ser_layoutxlm_xfund_zh.yml | 90.38% | 19.49 | trained model |
| LayoutLM | LayoutLM-base | SER | ser_layoutlm_xfund_zh.yml | 77.31% | - | trained model |
| LayoutLMv2 | LayoutLMv2-base | SER | ser_layoutlmv2_xfund_zh.yml | 85.44% | 31.46 | trained model |
| VI-LayoutXLM | VI-LayoutXLM-base | RE | re_vi_layoutxlm_xfund_zh_udml.yml | 83.92% | 15.49 | trained model |
| LayoutXLM | LayoutXLM-base | RE | re_layoutxlm_xfund_zh.yml | 74.83% | 19.49 | trained model |
| LayoutLMv2 | LayoutLMv2-base | RE | re_layoutlmv2_xfund_zh.yml | 67.77% | 31.46 | trained model |

- Note: The time cost above covers inference only, excluding preprocessing and postprocessing. Test environment: V100 GPU + CUDA 10.2 + CUDNN 8.1.1 + TRT 7.2.3.4
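The tables in this section report Hmean without defining it. Assuming it is the usual harmonic mean of precision and recall (the F1 score), which is an assumption not stated on this page, it can be computed as in the sketch below. The function name `hmean` and the input values are illustrative only and are not taken from the results above.

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall (assumed definition of Hmean)."""
    if precision + recall == 0:
        # Avoid division by zero when both inputs are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative values only, not measurements from the tables.
score = hmean(0.9, 0.8)
print(f"Hmean: {score:.2%}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a model cannot reach a high Hmean by trading away recall for precision or the reverse.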

On the wildreceipt dataset, the results are as follows:

| Model | Backbone | Config | Hmean | Download link |
| --- | --- | --- | --- | --- |
| SDMGR | VGG6 | configs/kie/sdmgr/kie_unet_sdmgr.yml | 86.70% | trained model |