Document Query

plugins/_document_query/webui/config.html

Settings for document parsing and retrieval.

Max parser concurrency

Maximum document parser jobs allowed to run at the same time in this Agent Zero process.

Per-document timeout

Seconds allowed for a single document parse before trying the next parser or returning an error.

Batch timeout

Seconds allowed for a document-query call that processes multiple files.
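
As a rough illustration of how these three limits could interact, here is a minimal asyncio sketch. It is not the plugin's implementation: the constant names, their values, and the `parse_one` and `parse_batch` helpers are hypothetical, and each parser is assumed to be an async callable that takes a path.

```python
import asyncio

# Hypothetical names and values mirroring the three settings above.
MAX_PARSER_CONCURRENCY = 2
PER_DOCUMENT_TIMEOUT = 5.0
BATCH_TIMEOUT = 20.0


async def parse_one(path, parsers, semaphore, timeout):
    """Try each parser in order; a timeout or error moves on to the next one."""
    last_error = None
    for parser in parsers:
        try:
            async with semaphore:  # caps how many parser jobs run at once
                return await asyncio.wait_for(parser(path), timeout)
        except Exception as exc:  # also covers asyncio.TimeoutError
            last_error = exc
    raise RuntimeError(f"all parsers failed for {path}") from last_error


async def parse_batch(paths, parsers):
    """Parse several files under one overall deadline."""
    semaphore = asyncio.Semaphore(MAX_PARSER_CONCURRENCY)
    jobs = [parse_one(p, parsers, semaphore, PER_DOCUMENT_TIMEOUT) for p in paths]
    # return_exceptions=True keeps one failed document from cancelling the rest
    return await asyncio.wait_for(
        asyncio.gather(*jobs, return_exceptions=True), BATCH_TIMEOUT
    )
```

In this sketch the per-document timeout applies to each parser attempt, while the batch timeout bounds the whole call regardless of how many files it contains.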

Retrieval

Controls how parsed text is split, indexed, and selected for Q&A.

Chunk size

Target character count for indexed chunks.

Chunk overlap

Characters repeated between neighboring chunks.
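
To make the relationship between chunk size and chunk overlap concrete, the following is a minimal character-based splitter. It is only a sketch: the function name and default values are hypothetical, and the plugin may split on different boundaries (for example sentences or tokens).

```python
def split_into_chunks(text, chunk_size=1000, chunk_overlap=100):
    """Split text into chunks of about chunk_size characters.

    Each chunk repeats the last chunk_overlap characters of the previous one.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks
```

For example, `split_into_chunks("abcdefghij", chunk_size=4, chunk_overlap=1)` returns `["abcd", "defg", "ghij"]`: each chunk starts with the final character of the one before it.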

Search threshold

Minimum vector similarity score for retrieved chunks.

Max index chunks

Adapt chunk size when a parsed document would exceed this many indexed chunks. Use 0 for no cap.
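
One plausible reading of "adapt chunk size" is that the chunk size grows until the document fits under the cap. The sketch below assumes that reading, assumes fixed-size character chunks with a constant overlap, and uses a hypothetical function name; the plugin's actual adaptation rule may differ.

```python
import math


def adapted_chunk_size(text_length, chunk_size, chunk_overlap, max_index_chunks):
    """Return a chunk size that keeps the chunk count at or below the cap.

    With a sliding window, a text of length L yields about
    ceil((L - overlap) / (size - overlap)) chunks, so the smallest size
    that fits under a cap of m chunks is ceil((L - overlap) / m) + overlap.
    """
    if max_index_chunks <= 0:  # 0 means no cap
        return chunk_size
    needed = math.ceil(max(text_length - chunk_overlap, 0) / max_index_chunks)
    return max(chunk_size, needed + chunk_overlap)
```

The configured chunk size is never reduced here; it only grows when the document would otherwise produce more chunks than the cap allows.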

Search limit

Maximum matching chunks considered during document Q&A.

Intro chunks

Leading chunks always included per document to preserve titles, abstracts, and setup context.
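
The search threshold, search limit, and intro chunks settings can be pictured as three steps in one selection function. The sketch below is an assumption about how they combine, not the plugin's code: `select_chunks` and its defaults are hypothetical, and the input is taken to be `(chunk_index, similarity_score)` pairs in document order.

```python
def select_chunks(scored_chunks, threshold=0.5, search_limit=3, intro_chunks=1):
    """Pick chunk indices for Q&A from (index, score) pairs in document order."""
    # Leading chunks are always kept, regardless of their score.
    intro = [index for index, _ in scored_chunks[:intro_chunks]]

    # Keep only chunks at or above the similarity threshold, best first.
    matches = [(index, score) for index, score in scored_chunks if score >= threshold]
    matches.sort(key=lambda pair: pair[1], reverse=True)
    top = [index for index, _ in matches[:search_limit]]

    # Merge, drop duplicates, and restore document order.
    return sorted(set(intro) | set(top))
```

With scores `[(0, 0.1), (1, 0.9), (2, 0.4), (3, 0.7), (4, 0.8)]`, a threshold of 0.5, a limit of 2, and one intro chunk, the result is `[0, 1, 4]`: chunk 0 is kept as the intro, and chunks 1 and 4 are the two best matches.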

Remote Files

Limits for documents loaded from HTTP or HTTPS URLs.

Fetch timeout

Seconds allowed for each remote fetch attempt.

Fetch retries

Attempts made before giving up on a remote document.

Retry backoff

Delay in seconds between remote fetch retries.

Max remote bytes

Maximum size, in bytes, accepted for a remote document.
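
The four remote-file limits fit naturally into a single retry loop. The following sketch uses only the Python standard library; the function name, parameter names, and defaults are hypothetical, a fixed delay between attempts is assumed, and the `opener` and `sleep` parameters exist only so the loop can be exercised without a network.

```python
import time
import urllib.request


def fetch_remote(url, fetch_timeout=10.0, fetch_retries=3, retry_backoff=1.0,
                 max_remote_bytes=10_000_000,
                 opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch a URL with a per-attempt timeout, fixed backoff, and a size cap."""
    last_error = None
    for attempt in range(fetch_retries):
        try:
            with opener(url, timeout=fetch_timeout) as response:
                # Read one byte past the cap so oversized bodies are detectable.
                data = response.read(max_remote_bytes + 1)
        except OSError as exc:  # URLError and socket timeouts subclass OSError
            last_error = exc
            if attempt < fetch_retries - 1:
                sleep(retry_backoff)
            continue
        if len(data) > max_remote_bytes:
            raise ValueError(f"{url} exceeds {max_remote_bytes} bytes")
        return data
    raise RuntimeError(f"giving up on {url} after {fetch_retries} attempts") from last_error
```

An oversized document is rejected immediately rather than retried, since another attempt would return the same body.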

LiteParse and OCR

Parser preference and OCR controls for PDFs, document images, and mixed-format inputs.

Use LiteParse first

Prefer LiteParse before legacy parser fallbacks.

LiteParse workers

Worker count passed to LiteParse for a single parser job.

Max pages

Maximum pages LiteParse should parse from one document.

Target pages

Optional page range expression passed to LiteParse.

Enable OCR

Allow OCR for scanned PDFs and document images.

OCR language

Tesseract language code used by LiteParse OCR.

OCR DPI

Resolution, in dots per inch, used to render pages before OCR.

OCR server URL

Optional external OCR service URL.

Tessdata path

Optional path to Tesseract language data.

Output format

Output format requested from LiteParse.

Preserve very small text

Keep tiny text regions that OCR or PDF extraction might otherwise discard.

Fallbacks

Legacy parser behavior used when the preferred parser cannot produce text.

PDF OCR fallback

Try legacy Tesseract OCR when direct PDF text extraction is empty.

Thread offload

Run synchronous parser fallbacks in a worker thread.
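
The fallback settings describe a chain: try the preferred parser, and if it yields no text, try the legacy parsers, optionally in a worker thread so the event loop is not blocked. A minimal sketch of that idea follows. The function and parameter names are hypothetical, the preferred parser is assumed to be async and the fallbacks synchronous, and `asyncio.to_thread` requires Python 3.9 or later.

```python
import asyncio


async def parse_with_fallbacks(path, preferred, fallbacks, thread_offload=True):
    """Return text from the preferred parser, or from the first fallback that yields any."""
    text = await preferred(path)
    if text.strip():
        return text

    for fallback in fallbacks:
        if thread_offload:
            # Run the blocking parser in a worker thread.
            text = await asyncio.to_thread(fallback, path)
        else:
            # Runs inline and blocks the event loop until it finishes.
            text = fallback(path)
        if text.strip():
            return text
    return ""
```

With thread offload disabled, a slow synchronous fallback would stall every other task on the same event loop, which is the situation the setting appears designed to avoid.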