document/content/docs/self-host/custom-models/marker.en.mdx
PDF is a relatively complex file format. FastGPT's built-in PDF parser relies on the pdfjs library, which uses logical parsing rather than visual analysis and struggles with complex files. For PDFs containing images, tables, formulas, or other non-plain-text content, the parsed results are often poor.
There are several PDF parsing solutions available. Marker uses the Surya model for vision-based parsing, effectively extracting images, tables, formulas, and other complex content.
Starting from FastGPT v4.9.0, community edition users can add the systemEnv.customPdfParse configuration in config.json to use Marker for PDF parsing. Commercial edition users can configure this directly in the Admin panel via the form. You need to pull the latest Marker image, as the API format has changed.
Refer to the Marker installation guide to install the Marker model. The bundled API is already compatible with FastGPT's custom parsing service.
Quick Docker installation:
docker pull crpi-h3snc261q1dosroc.cn-hangzhou.personal.cr.aliyuncs.com/marker11/marker_images:v0.2
docker run --gpus all -itd -p 7231:7232 --name model_pdf_v2 -e PROCESSES_PER_GPU="2" crpi-h3snc261q1dosroc.cn-hangzhou.personal.cr.aliyuncs.com/marker11/marker_images:v0.2
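Once the container is started, it can help to confirm that the published host port accepts connections before configuring FastGPT. The following is a minimal sketch, not part of Marker or FastGPT: the helper name `port_is_open` is hypothetical, and `localhost` is an assumption about where the container runs. The port 7231 is the host side of the `-p 7231:7232` mapping above.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example call (assumes the container runs on the same machine):
#   port_is_open("localhost", 7231)
```

A successful connection only shows that the port is published and accepting TCP connections. It does not prove that the model inside the container has finished loading.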
Add the customPdfParse block to config.json:
{
  xxx
  "systemEnv": {
    xxx
    "customPdfParse": {
      "url": "http://xxxx.com/v2/parse/file", // Custom PDF parsing service URL for Marker v0.2
      "key": "", // Custom PDF parsing service key
      "doc2xKey": "", // doc2x service key
      "price": 0 // PDF parsing service price
    }
  }
}
Restart the service after making changes.
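Before restarting, a quick sanity check of the customPdfParse block can catch simple mistakes such as a URL without a scheme. This is a sketch under stated assumptions: the function name `check_custom_pdf_parse` is hypothetical, the rules are illustrative rather than FastGPT's own validation, and the input is the configuration already loaded into a Python dict (the `//` comments shown above are not valid in strict JSON).

```python
from urllib.parse import urlparse


def check_custom_pdf_parse(config: dict) -> list[str]:
    """Return a list of problems in systemEnv.customPdfParse (empty if none)."""
    section = config.get("systemEnv", {}).get("customPdfParse")
    if not isinstance(section, dict):
        return ["systemEnv.customPdfParse is missing"]

    problems = []

    # Assumed rule: the service URL must be absolute and use http or https.
    parsed = urlparse(str(section.get("url", "")))
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("url must be an absolute http(s) URL")

    # Assumed rule: price is a number and not negative.
    price = section.get("price", 0)
    if isinstance(price, bool) or not isinstance(price, (int, float)) or price < 0:
        problems.append("price must be a non-negative number")

    return problems
```

An empty list means none of the assumed checks failed. It does not guarantee that the Marker service at that URL is reachable.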
Upload a PDF file through the Knowledge Base and enable the Enhanced PDF Parsing option.
After uploading, you should see the following logs (LOG_LEVEL must be set to info or debug):
[Info] 2024-12-05 15:04:42 Parsing files from an external service
[Info] 2024-12-05 15:07:08 Custom file parsing is complete, time: 1316ms
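To track parsing durations across many uploads, the completion line can be picked out of the log with a regular expression. This is a minimal sketch that assumes the log format is exactly as in the sample above. The names `LOG_PATTERN` and `parse_time_ms` are hypothetical.

```python
import re

# Matches the completion message and captures the duration in milliseconds.
LOG_PATTERN = re.compile(r"Custom file parsing is complete, time: (\d+)ms")


def parse_time_ms(line: str):
    """Return the reported parsing time in ms, or None if the line does not match."""
    match = LOG_PATTERN.search(line)
    return int(match.group(1)) if match else None
```

Lines that do not contain the completion message, such as the "Parsing files from an external service" line, return None and can be skipped.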
You'll notice that PDFs parsed by Marker include image links:
Similarly, in apps you can enable Enhanced PDF Parsing in the file upload settings.
Using Tsinghua's ChatDev Communicative Agents for Software Develop.pdf as an example:
The top row shows chunked results; the bottom row shows the original PDF. Images, formulas, and tables are all extracted effectively.
Note that Marker is licensed under GPL-3.0. Please ensure compliance with the license when using it.
For FastGPT versions before v4.9.0, you can use the following method for Marker parsing.
Install and run the Marker service:
docker pull crpi-h3snc261q1dosroc.cn-hangzhou.personal.cr.aliyuncs.com/marker11/marker_images:v0.1
docker run --gpus all -itd -p 7231:7231 --name model_pdf_v1 -e PROCESSES_PER_GPU="2" crpi-h3snc261q1dosroc.cn-hangzhou.personal.cr.aliyuncs.com/marker11/marker_images:v0.1
Then modify the FastGPT environment variables:
CUSTOM_READ_FILE_URL=http://xxxx.com/v1/parse/file
CUSTOM_READ_FILE_EXTENSION=pdf