Docling Speed Experiment Results

docs/hybrid/experiments/speed/speed-experiment-2026-01-03.md

Date: 2026-01-03 14:31:43

Summary

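| Approach | Description | Avg (s/doc) | Target (s/doc) | Status | Speedup |
|---|---|---|---|---|---|
| baseline | docling-serve HTTP | 2.283 | - | - | - |
| fastapi | FastAPI + SDK singleton | 0.685 | 0.8 | PASS | 3.3x |
| subprocess | Persistent subprocess | 0.661 | 1.0 | PASS | 3.5x |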
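
The Speedup and Status columns follow from the averages and targets by simple arithmetic. The sketch below reproduces them, assuming speedup is the baseline average divided by the approach average and that a run passes when its average is strictly below its target. The `FAIL` label and the function names are illustrative; the experiment harness itself is not shown in this report.

```python
# Averages and targets copied from the summary table (seconds per document).
BASELINE_AVG = 2.283

RESULTS = {
    "fastapi": {"avg": 0.685, "target": 0.8},
    "subprocess": {"avg": 0.661, "target": 1.0},
}


def speedup(baseline_avg: float, avg: float) -> float:
    """Return how many times faster an approach is than the baseline."""
    return baseline_avg / avg


def status(avg: float, target: float) -> str:
    """Assumed rule: PASS when the average is strictly below the target."""
    return "PASS" if avg < target else "FAIL"


if __name__ == "__main__":
    for name, row in RESULTS.items():
        factor = speedup(BASELINE_AVG, row["avg"])
        print(f"{name}: {factor:.1f}x {status(row['avg'], row['target'])}")
```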

Decision

Phase 0 PASSED - the FastAPI approach averages 0.685 s/doc, under its 0.8 s/doc target.

Proceed to Phase 1 implementation:

  • Task 1.1: docling_subprocess_worker.py (skipped - FastAPI only)
  • Task 1.2: hybrid_server.py (opendataloader-pdf-hybrid CLI)
  • Task 2.1: DoclingSubprocessClient.java (skipped - FastAPI only)
  • Task 2.2: DoclingFastServerClient.java
  • Task 2.3: HybridClientFactory modification
  • Task 3: Benchmark integration
  • Task 4: Final validation

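The subprocess approach also passed (0.661 s/doc against its 1.0 s/doc target), so both approaches are available for implementation. Phase 1 proceeds with FastAPI only, which is why the subprocess tasks above are marked as skipped.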

Detailed Statistics

Baseline

  • Description: docling-serve HTTP API
  • Timestamp: 2026-01-03 14:23:41
  • Total documents: 200
  • Successful: 200
  • Failed: 0
  • Total elapsed: 456.6s
  • Average per doc: 2.2825s
  • Min: 2.0045s
  • Max: 8.0182s
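
Each run in this section reports the same fields. As a rough sketch of how such a summary could be derived from per-document timings, assuming the average, minimum, and maximum are taken over successful documents only and that a failed document is recorded as `None` (both are assumptions; the actual harness is not shown here):

```python
from statistics import mean
from typing import Optional


def summarize(durations: list[Optional[float]]) -> dict:
    """Summarize per-document durations in seconds.

    A None entry marks a failed document (hypothetical convention).
    """
    ok = [d for d in durations if d is not None]
    return {
        "total_documents": len(durations),
        "successful": len(ok),
        "failed": len(durations) - len(ok),
        "average_per_doc": round(mean(ok), 4) if ok else None,
        "min": min(ok) if ok else None,
        "max": max(ok) if ok else None,
    }
```

Total elapsed is not computed here: in the reported runs it is close to, but not always exactly, the document count times the average, so it is presumably measured separately as wall-clock time.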

FastAPI

  • Description: FastAPI server with docling SDK singleton
  • Timestamp: 2026-01-03 14:27:18
  • Total documents: 200
  • Successful: 200
  • Failed: 0
  • Total elapsed: 137.1s
  • Average per doc: 0.6855s
  • Min: 0.1912s
  • Max: 4.2420s
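
The "SDK singleton" idea can be illustrated without the real dependencies. The sketch below uses a hypothetical `StandInConverter` in place of the docling SDK converter and a plain function in place of a FastAPI route handler. It only shows the pattern of constructing the expensive object once and reusing it for every request; the actual server wiring is not shown in this report, and the simulated startup cost is an assumption.

```python
import functools
import time


class StandInConverter:
    """Hypothetical stand-in for an SDK converter that is costly to construct."""

    instances = 0  # counts constructions, to show the object is built once

    def __init__(self) -> None:
        StandInConverter.instances += 1
        time.sleep(0.05)  # simulated one-time startup cost (assumed)

    def convert(self, path: str) -> dict:
        return {"source": path, "status": "ok"}


@functools.lru_cache(maxsize=None)
def get_converter() -> StandInConverter:
    """Build the converter on first use and return the same instance afterwards."""
    return StandInConverter()


def handle_request(path: str) -> dict:
    """Stand-in for a route handler: reuses the shared converter."""
    return get_converter().convert(path)


if __name__ == "__main__":
    for name in ("a.pdf", "b.pdf", "c.pdf"):
        handle_request(name)
    print("converters constructed:", StandInConverter.instances)
```

Only the first request pays the construction cost in this sketch. That may be part of why the minimum time (0.1912s) sits far below the average, though the report does not break the timings down.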

Subprocess

  • Description: Persistent Python subprocess with docling SDK
  • Timestamp: 2026-01-03 14:30:50
  • Total documents: 200
  • Successful: 200
  • Failed: 0
  • Total elapsed: 132.4s
  • Average per doc: 0.6612s
  • Min: 0.1908s
  • Max: 4.2498s
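
A persistent subprocess keeps one worker process alive and sends it one request per document instead of starting a new process each time. The following is a minimal sketch of that pattern using line-delimited JSON over stdin/stdout. The protocol, the `PersistentWorker` class, and the inline worker script are all hypothetical, and the worker returns a canned reply rather than calling the docling SDK.

```python
import json
import subprocess
import sys

# Hypothetical worker: reads one JSON request per line, writes one JSON reply per line.
WORKER_SOURCE = r'''
import json, sys
# One-time setup (for example, loading the SDK) would happen here, before the loop.
for line in sys.stdin:
    request = json.loads(line)
    reply = {"path": request["path"], "status": "ok"}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
'''


class PersistentWorker:
    """Owns one long-lived child process and exchanges one line per request."""

    def __init__(self) -> None:
        self._proc = subprocess.Popen(
            [sys.executable, "-u", "-c", WORKER_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    def convert(self, path: str) -> dict:
        # Send the request and block until the worker answers with one line.
        self._proc.stdin.write(json.dumps({"path": path}) + "\n")
        self._proc.stdin.flush()
        line = self._proc.stdout.readline()
        if not line:
            raise RuntimeError("worker exited unexpectedly")
        return json.loads(line)

    def close(self) -> None:
        # Closing stdin ends the worker's read loop, so it exits cleanly.
        self._proc.stdin.close()
        self._proc.wait(timeout=10)
        self._proc.stdout.close()


if __name__ == "__main__":
    worker = PersistentWorker()
    try:
        for name in ("a.pdf", "b.pdf"):
            print(worker.convert(name))
    finally:
        worker.close()
```

A production client would also need timeouts, stderr handling, and worker restarts after a crash, none of which are covered here.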