PaddleOCR-VL Apple Silicon Usage Tutorial


INFO: Unless otherwise specified, "PaddleOCR-VL" in this tutorial refers to the PaddleOCR-VL model series (e.g., PaddleOCR-VL-1.6). References specific to PaddleOCR-VL v1 are explicitly noted.

This tutorial is a guide for using PaddleOCR-VL on Apple Silicon, covering the complete workflow from environment preparation to service deployment.

Apple Silicon chips include, but are not limited to:

  • Apple M1
  • Apple M2
  • Apple M3
  • Apple M4

PaddleOCR-VL has been verified for accuracy and speed on the Apple M4. Because Apple Silicon hardware varies, compatibility with other chips has not yet been confirmed. We welcome community testing on other hardware configurations and reports of the results.
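Before following the rest of this guide, it can help to confirm that the Python interpreter is actually running natively on Apple Silicon. The sketch below is an illustration only; the function names are hypothetical and not part of PaddleOCR. It assumes that macOS reports `Darwin` as the system name and `arm64` as the machine type for native Apple Silicon builds of Python.

```python
import platform


def is_apple_silicon(system: str, machine: str) -> bool:
    """Return True when the OS/architecture pair looks like macOS on Apple Silicon."""
    return system == "Darwin" and machine == "arm64"


def detect_apple_silicon() -> bool:
    """Apply the check to the interpreter that is currently running."""
    # An x86_64 build of Python running under Rosetta 2 may report "x86_64"
    # here, in which case this check returns False even on an M-series Mac.
    return is_apple_silicon(platform.system(), platform.machine())


if __name__ == "__main__":
    print("Apple Silicon detected:", detect_apple_silicon())
```

The check does not distinguish between M1, M2, M3, and M4; it only reports the operating system and CPU architecture seen by the interpreter.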

Workflow Guide for This Hardware

Use this guide for the workflows below.

| Goal | Support on this hardware | Read this section |
| --- | --- | --- |
| Local direct inference | Supported | Read Section 1. Local Runtime Environment Preparation and Section 2. Quick Start. |
| Client + VLM inference service | Supported | Complete local direct inference first, then read Section 3. Using VLM Inference Services. |
| Full API service | Supported with manual deployment only | Complete Section 1. Local Runtime Environment Preparation first, then read Section 4.1 Manual Deployment; after that, continue with Section 4.2 Client Invocation Methods and Section 4.3 Pipeline Configuration Adjustment Instructions. |
| Model fine-tuning | Supported | Read Section 5. Model Fine-Tuning. |

If you only need to confirm which inference methods are available on this hardware, refer to the PaddleOCR-VL Inference Method and Hardware Support Matrix in the main guide.

1. Local Runtime Environment Preparation

Local Runtime Environment Setup Methods Supported on This Hardware

| Local runtime environment setup method | Status | Notes |
| --- | --- | --- |
| Official Docker image | Not currently supported | This hardware does not currently support this path. |
| Manually install the inference engine and PaddleOCR | Supported with steps in this guide | Continue reading this section. |

Local inference on this hardware currently supports only the PaddlePaddle inference engine.

We strongly recommend installing PaddleOCR-VL in a virtual environment to avoid dependency conflicts. For example, use Python's standard-library venv module to create one:

shell
# Create a virtual environment
python -m venv .venv_paddleocr
# Activate the environment
source .venv_paddleocr/bin/activate
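To confirm that the activated interpreter is the one inside the virtual environment, a small check such as the following can be run. This is a minimal sketch, not part of PaddleOCR; the function name is hypothetical. It relies on the standard behavior of `venv`, where `sys.prefix` points at the environment and `sys.base_prefix` points at the base Python installation.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a venv-style environment."""
    # Inside a venv, sys.prefix is the environment directory, while
    # sys.base_prefix is the installation the environment was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Inside a virtual environment:", in_virtual_environment())
```

If the second line prints `False` after activation, the shell is probably still resolving `python` to a different installation.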

Execute the following commands to complete the installation:

shell
python -m pip install paddlepaddle==3.2.1 -i https://www.paddlepaddle.org.cn/packages/stable/cpu/
python -m pip install -U "paddleocr[doc-parser]"

PaddlePaddle framework version 3.2.1 or above is required; the command above pins version 3.2.1.
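The version requirement can be checked programmatically after installation. The sketch below is an assumption-laden illustration: the helper names are hypothetical, it reads the installed version of the `paddlepaddle` distribution through the standard `importlib.metadata` module, and it compares only the leading numeric release components, so pre-release or local suffixes are ignored.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

MINIMUM_PADDLE = "3.2.1"  # minimum version stated in this tutorial


def parse_release(version: str) -> Tuple[int, ...]:
    """Extract the leading dotted-integer part, e.g. '3.2.1.post1' -> (3, 2, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed: str, minimum: str = MINIMUM_PADDLE) -> bool:
    """Compare release tuples; suffixes such as 'rc0' are ignored (a simplification)."""
    return parse_release(installed) >= parse_release(minimum)


def installed_version(distribution: str = "paddlepaddle") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("paddlepaddle is not installed in this environment")
    else:
        print(f"paddlepaddle {found}; meets minimum {MINIMUM_PADDLE}: {meets_minimum(found)}")
```

For stricter comparisons that account for pre-releases, a dedicated version parser would be a better choice than this tuple comparison.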

2. Quick Start

Please refer to PaddleOCR-VL Usage Tutorial - 2. Quick Start.

3. Using VLM Inference Services

This section explains how to connect PaddleOCR-VL to a dedicated VLM inference service backend. On this hardware, a dedicated backend is typically used in production to improve inference performance over the default configuration. The examples in this hardware-specific guide use MLX-VLM as the backend for the VLM inference service.

3.1 Starting the VLM Inference Service

IMPORTANT: The service started according to this section is responsible only for the VLM inference stage in the PaddleOCR-VL workflow. It does not provide a complete end-to-end document parsing API. We strongly recommend that you do not call this service directly via HTTP requests or OpenAI clients to process document images. If you need to deploy a service with the full PaddleOCR-VL capabilities, refer to the service deployment section later in this document.

Launch Methods Supported on This Hardware

| Launch method | Status | Notes |
| --- | --- | --- |
| Official Docker image | Not currently supported | This hardware does not currently support this path. |
| Install dependencies with the PaddleOCR CLI and launch the service | Not currently supported | This hardware does not currently support this path. |
| Launch the service directly with the acceleration framework | Supported with steps in this guide | This section provides the MLX-VLM launch steps. |

Install the MLX-VLM inference framework (v0.3.11 or later):

shell
python -m pip install "mlx-vlm>=0.3.11"

Start the MLX-VLM inference service:

shell
mlx_vlm.server --port 8111
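Before pointing PaddleOCR-VL at the service, it can be useful to confirm that something is listening on the chosen port. The following sketch is an illustration with a hypothetical helper name. It only tests that a TCP connection to the port succeeds; it does not verify that the listener is MLX-VLM, and it deliberately sends no inference request, in line with the note in Section 3.1.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 8111, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    print("Port 8111 reachable:", port_is_open())
```

The default port of 8111 matches the launch command above; pass a different `port` if the service was started elsewhere.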

3.2 Client Usage Method

The following invocation methods apply to an already launched MLX-VLM inference service.

3.2.1 Command Line Usage

Specify the backend type (mlx-vlm-server) via --vl_rec_backend, the service address via --vl_rec_server_url, and the Hugging Face repo ID or server-side model weights path via --vl_rec_api_model_name. For example:

shell
paddleocr doc_parser \
  --input https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/paddleocr_vl_demo.png \
  --vl_rec_backend mlx-vlm-server \
  --vl_rec_server_url http://localhost:8111/ \
  --vl_rec_api_model_name PaddlePaddle/PaddleOCR-VL-1.6
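When the command line needs to be driven from a script, for example to process several inputs against the same service, the same flags can be assembled in Python. This is a minimal sketch: the helper names are hypothetical, only the flags shown in the command above are used, and it assumes the `paddleocr` executable is on `PATH` in the active environment.

```python
import shlex
import subprocess
from typing import List


def build_doc_parser_command(input_path: str, server_url: str, model_name: str) -> List[str]:
    """Assemble the argument list for the doc_parser call shown above."""
    return [
        "paddleocr", "doc_parser",
        "--input", input_path,
        "--vl_rec_backend", "mlx-vlm-server",
        "--vl_rec_server_url", server_url,
        "--vl_rec_api_model_name", model_name,
    ]


def run_doc_parser(input_path: str, server_url: str, model_name: str) -> int:
    """Run the command and return its exit code (requires paddleocr on PATH)."""
    command = build_doc_parser_command(input_path, server_url, model_name)
    print("Running:", shlex.join(command))
    # Passing a list avoids shell quoting issues with paths and URLs.
    return subprocess.run(command, check=False).returncode


# Example call (not executed here):
# run_doc_parser("paddleocr_vl_demo.png", "http://localhost:8111/",
#                "PaddlePaddle/PaddleOCR-VL-1.6")
```

Building the argument list separately from running it keeps the command easy to log and inspect before anything is executed.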

3.2.2 Python Script Integration

When creating a PaddleOCRVL object, specify the backend type via vl_rec_backend, the service address via vl_rec_server_url, and the Hugging Face repo ID or server-side model weights path via vl_rec_api_model_name. For example:

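python
from paddleocr import PaddleOCRVL

# Route the VLM recognition stage to the MLX-VLM service started in Section 3.1
pipeline = PaddleOCRVL(
    vl_rec_backend="mlx-vlm-server",
    vl_rec_server_url="http://localhost:8111/",
    vl_rec_api_model_name="PaddlePaddle/PaddleOCR-VL-1.6",
)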

3.3 Performance Tuning

Please refer to PaddleOCR-VL Usage Tutorial - 3.3 Performance Tuning.

4. Service Deployment

Deployment Methods Supported on This Hardware

| Deployment method | Status | Notes |
| --- | --- | --- |
| Docker Compose deployment | Not currently supported | This hardware currently supports only the manual deployment path. |
| Manual deployment | Supported | Complete Section 1. Local Runtime Environment Preparation first, then continue with Section 4.1. |

4.1 Manual Deployment

Please complete Section 1. Local Runtime Environment Preparation first, then refer to PaddleOCR-VL Usage Tutorial - 4.2 Method 2: Manual Deployment.

4.2 Client Invocation Methods

Please refer to PaddleOCR-VL Usage Tutorial - 4.3 Client Invocation Methods.

4.3 Pipeline Configuration Adjustment Instructions

Please refer to PaddleOCR-VL Usage Tutorial - 4.4 Pipeline Configuration Adjustment Instructions.

5. Model Fine-Tuning

Please refer to PaddleOCR-VL Usage Tutorial - 5. Model Fine-Tuning.