skills/liteparse/references/ocr_and_formats.md
--no-ocr or ocr_enabled=False.lit parse document.pdf
lit parse document.pdf --ocr-language fra
lit parse document.pdf --no-ocr
parser = LiteParse(ocr_enabled=True, ocr_language="eng", num_workers=4)
Use Tesseract codes (not ISO alone): eng, fra, deu, spa, chi_sim, etc. Map HTTP OCR language=en separately (see below).
Pre-download .traineddata files, then either:
export TESSDATA_PREFIX=/path/to/tessdata
lit parse document.pdf --ocr-language eng
or:
lit parse document.pdf --tessdata-path /path/to/tessdata
For higher accuracy or GPU-backed OCR, run a server implementing the LiteParse OCR API and point LiteParse at it:
lit parse document.pdf --ocr-server-url http://localhost:8080/ocr
parser = LiteParse(ocr_server_url="http://localhost:8080/ocr")
{base_url}/ocr (typically http://host:8080/ocr)multipart/form-datafile (image bytes, required), language (optional, ISO 639-1, default en){
"results": [
{
"text": "recognized text",
"bbox": [x1, y1, x2, y2],
"confidence": 0.95
}
]
}
ocr/easyocr/ — EasyOCR wrapperocr/paddleocr/ — PaddleOCR wrapperYou only need a server if you choose HTTP OCR; Tesseract is sufficient for many workflows.
.pdf — no conversion step.
Requires LibreOffice installed and on PATH.
| Type | Extensions |
|---|---|
| Word | .doc, .docx, .docm, .odt, .rtf, .pages |
| PowerPoint | .ppt, .pptx, .pptm, .odp, .key |
| Spreadsheets | .xls, .xlsx, .xlsm, .ods, .csv, .tsv, .numbers |
Install LibreOffice:
# macOS
brew install --cask libreoffice
# Ubuntu/Debian
sudo apt-get install libreoffice
# Windows (Chocolatey)
choco install libreoffice-fresh
On Windows, add LibreOffice program directory to PATH (often C:\Program Files\LibreOffice\program).
Requires ImageMagick.
| Formats |
|---|
.jpg, .jpeg, .png, .gif, .bmp, .tiff, .webp, .svg |
Install ImageMagick:
# macOS
brew install imagemagick
# Ubuntu/Debian
sudo apt-get install imagemagick
# Windows
choco install imagemagick.app
Office / image → (LibreOffice or ImageMagick) → PDF → PDFium extract → optional OCR → grid projection → text + JSON
If conversion fails, install the missing tool and retry. Plain-text-only paths cannot be screenshot-rendered.