skills/paddleocr-text-recognition/references/output_schema.md
This document defines the output envelope returned by ocr_caller.py.
By default, ocr_caller.py saves the JSON envelope to a unique file under the system temp directory and prints the absolute saved path to stderr. Use --output when you need a custom destination, or --stdout when you want to skip file saving and print JSON directly.
ocr_caller.py wraps provider response in a stable structure:
{
"ok": true,
"text": "Extracted text from all pages",
"result": { ... }, // raw provider response
"error": null
}
On error:
{
"ok": false,
"text": "",
"result": null,
"error": {
"code": "ERROR_CODE",
"message": "Human-readable message"
}
}
| Code | Description |
|---|---|
INPUT_ERROR | Invalid or unusable input (arguments, file source, format, types). |
CONFIG_ERROR | Missing or invalid API / client configuration. |
API_ERROR | Request or response handling failed (network, HTTP, body parsing, schema). |
The result field contains raw provider output.
Raw fields may vary by model version and endpoint.
{
"logId": "request-uuid",
"errorCode": 0,
"errorMsg": "Success",
"result": {
"ocrResults": [
{
"prunedResult": {
"rec_texts": ["First line", "Second line"],
"rec_scores": [0.98, 0.95],
"...": "other OCR fields"
},
"ocrImage": "https://...",
"inputImage": "https://...",
"...": "other model-specific fields"
}
],
"dataInfo": {
"numPages": 1,
"type": "pdf",
"...": "other metadata"
},
"...": "other top-level fields"
}
}
Paths are relative to the output envelope root.
result.result.ocrResults[n].prunedResult
Structured OCR data for page n.
result.result.ocrResults[n].prunedResult.rec_texts
Recognized text lines for page n.
result.result.ocrResults[n].prunedResult.rec_scores
Confidence scores for recognized text lines.
ocr_caller.py extracts top-level text from result.result.ocrResults[n].prunedResult.rec_texts, joins lines with \n, and joins pages with \n\n.
# OCR from URL (result auto-saves to the system temp directory)
uv run scripts/ocr_caller.py --file-url "URL" --pretty
# OCR local file (result auto-saves to the system temp directory)
uv run scripts/ocr_caller.py --file-path "doc.pdf" --pretty
# OCR with explicit file type
uv run scripts/ocr_caller.py --file-url "URL" --file-type 1 --pretty
# Save result to a custom file path
uv run scripts/ocr_caller.py --file-url "URL" --output "./result.json" --pretty
# Print JSON to stdout without saving a file
uv run scripts/ocr_caller.py --file-url "URL" --stdout --pretty