docs/version3.x/integrations/mcp_server.en.md
PaddleOCR provides a lightweight Model Context Protocol (MCP) server designed to integrate PaddleOCR’s text recognition, layout parsing, and other capabilities into various large-model applications.
Currently Supported Models
| Model | MCP tool name | Description |
|---|---|---|
PP-OCRv5, PP-OCRv5-latin, PP-OCRv6 | ocr | Performs text detection and recognition on images and PDF files. |
PP-StructureV3 | pp_structurev3 | Identifies and extracts text blocks, titles, paragraphs, images, tables, and other layout elements from images or PDF files, converting the input into Markdown documents. |
PaddleOCR-VL, PaddleOCR-VL-1.5, PaddleOCR-VL-1.6 | paddleocr_vl | Performs layout parsing with a VLM-based approach and converts the input into Markdown documents. |
Supported Inference Methods
The following showcases creative use cases built with the PaddleOCR MCP server combined with other tools:
In Claude for Desktop, extract handwritten content from images and save to note-taking software Notion. The PaddleOCR MCP server extracts text, formulas and other information from images while preserving document structure.
<div align="center"> </div>In VSCode, convert handwritten ideas or pseudocode into runnable Python scripts that comply with project coding standards with one click, and upload them to GitHub repositories. The PaddleOCR MCP server extracts explicitly handwritten code from images for subsequent processing.
<div align="center"> </div>In Claude for Desktop, convert PDF documents or images containing complex tables, formulas, handwritten text and other content into locally editable files.
Convert complex PDF documents with tables and watermarks to editable doc/Word format:
<div align="center"> </div>Convert images containing formulas and tables to editable csv/Excel format:
<div align="center"> </div>This section explains how to install the paddleocr-mcp library via pip.
paddleocr-mcp requires Python 3.10 or later.
paddleocr-mcp depends on paddleocr>=3.7.0 by default, so Official API, Qianfan API, and self-hosted API modes do not require installing PaddleOCR separately. Local inference additionally requires the document-parsing dependencies and an inference engine required to run PaddleOCR pipelines locally; see Method 1: Local Inference for details.
Install from PyPI:
pip install -U paddleocr-mcp
Install from source:
git clone https://github.com/PaddlePaddle/PaddleOCR.git
pip install -e mcp_server
For local inference, install the optional extras described in Method 1: Local Inference.
To verify successful installation:
paddleocr_mcp --help
If help information is printed after running the command above, the installation succeeded.
PaddleOCR also supports running the server without installation through methods like uvx; for details, see 2. Using with Claude for Desktop.
This section explains how to use the PaddleOCR MCP server within Claude for Desktop. The steps are also applicable to other MCP hosts with minor adjustments.
The following quick start uses Official API inference as an example to get you started.
Install paddleocr-mcp
Refer to 1. Installation.
Obtain an Access Token
Obtain your access token from the AI Studio Access Token page.
Add MCP Server Configuration
Locate the claude_desktop_config.json configuration file:
~/Library/Application Support/Claude/claude_desktop_config.json%APPDATA%\Claude\claude_desktop_config.json~/.config/Claude/claude_desktop_config.jsonOpen the claude_desktop_config.json file, adjust the configuration according to the example below, and fill it into claude_desktop_config.json.
{
"mcpServers": {
"paddleocr": {
"command": "paddleocr_mcp",
"args": [],
"env": {
"PADDLEOCR_MCP_MODEL": "PP-OCRv5",
"PADDLEOCR_MCP_PPOCR_SOURCE": "aistudio",
"PADDLEOCR_MCP_AISTUDIO_ACCESS_TOKEN": "<your-access-token>"
}
}
}
}
Notes:
<your-access-token> with your access token.PADDLEOCR_MCP_AISTUDIO_BASE_URL environment variable.Important:
paddleocr_mcp is not in your system's PATH, set command to the absolute path of the executable.Restart the MCP Host
Restart Claude for Desktop. The paddleocr server should now be available in the application.
In the configuration file for Claude for Desktop, you need to define how the MCP server is started. The key fields are as follows:
command: paddleocr_mcp (if the executable can be found in the PATH) or the absolute path.args: Configurable command-line arguments, such as ["--verbose"]. See 4. Parameter Reference for details.env: Configurable environment variables. See 4. Parameter Reference for details.You can configure the MCP server according to your requirements to use different inference methods. The operational procedures vary for different methods, which will be explained in detail below.
Install paddleocr-mcp and the local inference dependencies. paddleocr-mcp already depends on PaddleOCR; local inference additionally requires the document-parsing dependencies and an inference engine. You can install them manually by referring to the PaddleOCR installation guide, or use the corresponding optional dependencies:
paddleocr-mcp[local]: includes paddleocr[doc-parser]>=3.7.0 (without the inference engine).paddleocr-mcp[local-cpu]: based on local, additionally includes the CPU PaddlePaddle inference engine (paddlepaddle>=3.2.1).# Install document-parsing dependencies for local inference (inference engine not included):
pip install "paddleocr-mcp[local]"
# Install the CPU PaddlePaddle framework in addition to local:
pip install "paddleocr-mcp[local-cpu]"
To avoid dependency conflicts, it is strongly recommended to install in an isolated virtual environment.
Refer to the configuration example below to modify the claude_desktop_config.json file.
Restart the MCP host.
Configuration example:
{
"mcpServers": {
"paddleocr": {
"command": "paddleocr_mcp",
"args": [],
"env": {
"PADDLEOCR_MCP_MODEL": "PP-OCRv5",
"PADDLEOCR_MCP_PPOCR_SOURCE": "local"
}
}
}
}
Notes:
PADDLEOCR_MCP_MODEL should be set to the model name. See Section 4 for details.
PADDLEOCR_MCP_PIPELINE_CONFIG is optional. If not set, the default pipeline configuration is used. To adjust the configuration, such as changing models, refer to the PaddleOCR documentation to export the pipeline configuration file, and set PADDLEOCR_MCP_PIPELINE_CONFIG to the absolute path of this file.
Inference Performance Tips:
If you encounter long inference time or insufficient memory, consider adjusting the pipeline configuration:
PP-StructureV3 Pipeline:
use_formula_recognition to False to disable formula recognition.mobile version or switching to a lightweight formula recognition model like PP-FormulaNet-S.The following sample code exports a PP-StructureV3 pipeline configuration with most optional features disabled and some key models replaced with lightweight versions.
from paddleocr import PPStructureV3
pipeline = PPStructureV3(
use_doc_orientation_classify=False, # Disable document image orientation classification
use_doc_unwarping=False, # Disable text image unwarping
use_textline_orientation=False, # Disable text line orientation classification
use_formula_recognition=False, # Disable formula recognition
use_seal_recognition=False, # Disable seal text recognition
use_table_recognition=False, # Disable table recognition
use_chart_recognition=False, # Disable chart parsing
# Use lightweight models
text_detection_model_name="PP-OCRv5_mobile_det",
text_recognition_model_name="PP-OCRv5_mobile_rec",
layout_detection_model_name="PP-DocLayout-S",
)
# The configuration file is saved to `PP-StructureV3.yaml`
pipeline.export_paddlex_config_to_yaml("PP-StructureV3.yaml")
For PaddleOCR-VL series, CPU inference is not recommended.
Refer to 2.1 Quick Start.
For tasks other than text recognition, set PADDLEOCR_MCP_MODEL correctly (see Section 4 for parameter details).
paddleocr-mcp.claude_desktop_config.json file.Configuration example:
{
"mcpServers": {
"paddleocr": {
"command": "paddleocr_mcp",
"args": [],
"env": {
"PADDLEOCR_MCP_MODEL": "PaddleOCR-VL",
"PADDLEOCR_MCP_PPOCR_SOURCE": "qianfan",
"PADDLEOCR_MCP_QIANFAN_API_KEY": "<your-api-key>"
}
}
}
}
Notes:
PADDLEOCR_MCP_MODEL should be set to the model name. Qianfan supports only PP-StructureV3 and PaddleOCR-VL.PADDLEOCR_MCP_QIANFAN_BASE_URL is the Qianfan API base URL (optional).PADDLEOCR_MCP_QIANFAN_API_KEY is your Qianfan API key for authentication.paddleocr-mcp in the environment where you need to run the MCP server.claude_desktop_config.json file.Configuration example:
{
"mcpServers": {
"paddleocr": {
"command": "paddleocr_mcp",
"args": [],
"env": {
"PADDLEOCR_MCP_MODEL": "PP-OCRv5",
"PADDLEOCR_MCP_PPOCR_SOURCE": "self_hosted",
"PADDLEOCR_MCP_SELF_HOSTED_BASE_URL": "<your-server-url>"
}
}
}
}
Notes:
PADDLEOCR_MCP_MODEL should be set to the model name. See Section 4 for details.<your-server-url> with the underlying service base URL (e.g. http://127.0.0.1:8080, without path suffixes such as /ocr or /layout-parsing; MCP appends them by pipeline).uvxPaddleOCR also supports starting the MCP server via uvx. With this approach, manual installation of paddleocr-mcp is not required. The main steps are as follows:
claude_desktop_config.json. Examples:Self-hosted API inference example:
{
"mcpServers": {
"paddleocr": {
"command": "uvx",
"args": [
"--from",
"paddleocr-mcp",
"paddleocr_mcp"
],
"env": {
"PADDLEOCR_MCP_MODEL": "PP-OCRv5",
"PADDLEOCR_MCP_PPOCR_SOURCE": "self_hosted",
"PADDLEOCR_MCP_SELF_HOSTED_BASE_URL": "<your-server-url>"
}
}
}
}
Local inference (CPU inference, using the optional local-cpu extra) example:
{
"mcpServers": {
"paddleocr": {
"command": "uvx",
"args": [
"--from",
"paddleocr-mcp[local-cpu]",
"paddleocr_mcp"
],
"env": {
"PADDLEOCR_MCP_MODEL": "PP-OCRv5",
"PADDLEOCR_MCP_PPOCR_SOURCE": "local"
}
}
}
}
For local inference dependencies, performance tuning, and pipeline configuration, refer to Method 1: Local Inference.
Due to the use of a different startup method, the command and args settings in the configuration file differ from the previously described approach. However, the command-line arguments and environment variables supported by the MCP service (such as PADDLEOCR_MCP_SELF_HOSTED_BASE_URL) can still be set in the same way.
In addition to MCP hosts like Claude for Desktop, you can also run the PaddleOCR MCP server via the CLI.
Run the following command to print help information:
paddleocr_mcp --help
Example commands:
# PP-OCRv5 + Official API + stdio
PADDLEOCR_MCP_AISTUDIO_ACCESS_TOKEN=xxxxxx paddleocr_mcp --model PP-OCRv5 --ppocr_source aistudio
# PP-OCRv6 + Official API + stdio
paddleocr_mcp --model PP-OCRv6 --ppocr_source aistudio
# PP-StructureV3 + Local Inference + stdio
paddleocr_mcp --model PP-StructureV3 --ppocr_source local
# OCR + Self-hosted API + Streamable HTTP
paddleocr_mcp --model PP-OCRv5 --ppocr_source self_hosted --self-hosted-base-url http://127.0.0.1:8080 --http
See 4. Parameter Reference for all parameters supported by the PaddleOCR MCP server.
You can control the MCP server via environment variables or CLI arguments.
| Environment Variable | CLI Argument | Type | Description | Options | Default |
|---|---|---|---|---|---|
PADDLEOCR_MCP_MODEL | --model | str | Model to run. MCP selects the tool automatically from the model. | "PP-OCRv5", "PP-OCRv5-latin", "PP-OCRv6", "PP-StructureV3", "PaddleOCR-VL", "PaddleOCR-VL-1.5", "PaddleOCR-VL-1.6" | "PP-OCRv6" |
PADDLEOCR_MCP_PPOCR_SOURCE | --ppocr_source | str | Source of PaddleOCR capabilities. | "local" (local inference), "aistudio" (Official API), "qianfan" (Qianfan API), "self_hosted" (self-hosted API) | "local" |
PADDLEOCR_MCP_AISTUDIO_BASE_URL | --aistudio-base-url | str | AI Studio API base URL (optional for aistudio source). | - | None |
PADDLEOCR_MCP_QIANFAN_BASE_URL | --qianfan-base-url | str | Qianfan API base URL (optional for qianfan source). | - | https://qianfan.baidubce.com/v2/ocr |
PADDLEOCR_MCP_SELF_HOSTED_BASE_URL | --self-hosted-base-url | str | Self-hosted PaddleX serve base URL (required for self_hosted source). | - | None |
PADDLEOCR_MCP_QIANFAN_API_KEY | --qianfan_api_key | str | Qianfan API authentication key (required for qianfan source). | - | None |
PADDLEOCR_MCP_AISTUDIO_ACCESS_TOKEN | --aistudio_access_token | str | AI Studio access token (required for aistudio source). | - | None |
PADDLEOCR_MCP_HTTP_TIMEOUT | --http-timeout | int | HTTP read timeout in seconds for synchronous APIs (qianfan, self_hosted). | - | 600 |
PADDLEOCR_MCP_AISTUDIO_REQUEST_TIMEOUT | --aistudio-request-timeout | int | Per-request HTTP timeout in seconds for AI Studio API calls (job submission, status checks, etc.). | - | 120 |
PADDLEOCR_MCP_AISTUDIO_POLL_TIMEOUT | --aistudio-poll-timeout | int | Total job polling timeout in seconds for AI Studio. | - | 600 |
PADDLEOCR_MCP_DEVICE | --device | str | Device for inference (only effective for local source). | - | None |
PADDLEOCR_MCP_PIPELINE_CONFIG | --pipeline_config | str | PaddleOCR pipeline configuration file path (only effective for local source). | - | None |
| - | --http | bool | Use Streamable HTTP transport instead of stdio (for remote deployment and multiple clients). | - | False |
| - | --host | str | Host address for Streamable HTTP mode. | - | "127.0.0.1" |
| - | --port | int | Port for Streamable HTTP mode. | - | 8000 |
| - | --verbose | bool | Enable verbose logging for debugging. | - | False |
file_type prompt; some complex URLs may fail to process.