examples/python/batch/README.md
Demonstrates processing multiple PDFs in a single invocation to avoid repeated Java JVM startup overhead.
batch_processing.py shows two methods for batch conversion:
Both methods use a single JVM invocation, which is significantly faster than calling the CLI once per file.
Run:
pip install -r requirements.txt
python batch_processing.py
Found 4 PDFs in pdf/
==========================================================
Method 1: Batch convert with file list
==========================================================
Document Pages Top-level
----------------------------------------------------------
1901.03003 15 241
2408.02509v1 14 365
chinese_scan 1 1
lorem 1 2
----------------------------------------------------------
Total 31 609
Processed 4 documents
Time: 7.95s (single JVM invocation)