docs/hybrid/hybrid-mode-tasks.md
Each task is independently executable. A new Claude Code session can reference this document to perform the task.
These decisions require execution results before they can be made.
| Decision | How to Verify | Impact |
|---|---|---|
| API endpoint | Check available endpoints in OpenAPI spec | Task 6 client implementation |
| Page filtering support | Check if API options include page filter | Full PDF send required if unsupported |
| Response page structure | Analyze sample response JSON structure | Task 7 per-page separation logic |
| Coordinate system | Check bbox value ranges in sample response | Task 7 coordinate conversion formula |
| docling element types | Extract actual type values from sample response | Task 1 mapping table |

| Decision | How to Verify | Impact |
|---|---|---|
| Triage threshold tuning | Check triage_fn (missed tables) count | Lower thresholds if recall < 95% |

| Decision | How to Verify | Impact |
|---|---|---|
| Triage re-tuning needed | Analyze FN case signal patterns | Phase 2 rework |

| Task | Status | Completed | Notes |
|---|---|---|---|
| Task -1: Pre-research | ✅ completed | 2026-01-02 | See docs/hybrid/research/ |
| Task 0: docling-api skill | ✅ completed | 2026-01-02 | See .claude/skills/docling-api/ |
| Task 1: schema-mapping skill | ✅ completed | 2026-01-02 | See .claude/skills/schema-mapping/ |
| Task 2: triage-criteria skill | ✅ completed | 2026-01-02 | See .claude/skills/triage-criteria/ |
| Task 3: HybridConfig | ✅ completed | 2026-01-02 | See java/.../hybrid/HybridConfig.java |
| Task 4: CLI Options | ✅ completed | 2026-01-02 | See java/.../cli/CLIOptions.java |
| Task 5: TriageProcessor | ✅ completed | 2026-01-02 | See java/.../hybrid/TriageProcessor.java |
| Task 6: DoclingClient | ✅ completed | 2026-01-02 | See java/.../hybrid/DoclingClient.java |
| Task 7: SchemaTransformer | ✅ completed | 2026-01-02 | See java/.../hybrid/DoclingSchemaTransformer.java |
| Task 8: HybridDocumentProcessor | ✅ completed | 2026-01-02 | See java/.../processors/HybridDocumentProcessor.java |
| Task 9: Triage Logging | ✅ completed | 2026-01-02 | See java/.../hybrid/TriageLogger.java |
| Task 10: Triage Evaluator | ✅ completed | 2026-01-02 | See tests/benchmark/src/evaluator_triage.py |
| Task 11: Triage Analyzer Agent | ✅ completed | 2026-01-02 | See .claude/agents/triage-analyzer.md |
Status Legend:
- not_started - Not yet begun
- in_progress - Currently working
- completed - Done and verified
- blocked - Waiting on dependency or issue

Collect all required data and specifications before implementation begins.
tests/benchmark/pdfs/
# Start docling-serve (official container image)
# Reference: https://github.com/docling-project/docling-serve
docker run -d -p 5001:5001 --name docling-serve \
-e DOCLING_SERVE_ENABLE_UI=1 \
quay.io/docling-project/docling-serve
# Wait for startup (model loading takes time)
sleep 30
# Verify server is running (check API docs page)
curl -s http://localhost:5001/docs | head -20
# Access UI playground at: http://localhost:5001/ui
# Collect OpenAPI specification
curl http://localhost:5001/openapi.json > docs/hybrid/research/docling-openapi.json
# Check available endpoints
cat docs/hybrid/research/docling-openapi.json | jq '.paths | keys'
# Alternative: Using pip (if Docker not available)
# pip install "docling-serve[ui]"
# docling-serve run --enable-ui
# Convert using /v1/convert/source endpoint (official API)
# Using file URL source
curl -X POST http://localhost:5001/v1/convert/source \
-H "Content-Type: application/json" \
-d '{
"sources": [{"kind": "file", "path": "samples/pdf/1901.03003.pdf"}],
"options": {"to_formats": ["json", "md"], "do_table_structure": true}
}' \
> docs/hybrid/research/docling-sample-response.json
# If file path doesn't work, try with base64 or HTTP URL
# Alternative: multipart upload through /v1/convert/file, if available
curl -X POST http://localhost:5001/v1/convert/file \
-F "file=@samples/pdf/1901.03003.pdf" \
> docs/hybrid/research/docling-sample-response.json
# Extract response structure
cat docs/hybrid/research/docling-sample-response.json | jq 'keys'
cat docs/hybrid/research/docling-sample-response.json | jq '.document | keys' 2>/dev/null || \
cat docs/hybrid/research/docling-sample-response.json | jq '.[0] | keys'
# Check element types in response
cat docs/hybrid/research/docling-sample-response.json | jq '[.. | .type? // empty] | unique' 2>/dev/null
# List documents containing tables
cat tests/benchmark/ground-truth/reference.json | \
jq -r 'to_entries[] | select(.value[]?.category == "Table") | .key' | \
sort | uniq > docs/hybrid/research/documents-with-tables.txt
# Count
wc -l docs/hybrid/research/documents-with-tables.txt
# Build Java CLI
./scripts/build-java.sh
# Parse the same PDF with Java (JSON output)
java -jar java/opendataloader-pdf-cli/target/opendataloader-pdf-cli-*.jar \
--format json \
-o docs/hybrid/research/ \
samples/pdf/1901.03003.pdf
# Rename for clarity
mv docs/hybrid/research/1901.03003.json docs/hybrid/research/opendataloader-sample-response.json
# Also generate markdown for comparison
java -jar java/opendataloader-pdf-cli/target/opendataloader-pdf-cli-*.jar \
--format md \
-o docs/hybrid/research/ \
samples/pdf/1901.03003.pdf
mv docs/hybrid/research/1901.03003.md docs/hybrid/research/opendataloader-sample-response.md
# Find all semantic types
grep -r "class Semantic" java/opendataloader-pdf-core/ --include="*.java" -l
# Find TableBorder structure
grep -r "class TableBorder" java/opendataloader-pdf-core/ --include="*.java" -A 20
# List all IObject implementations
grep -r "implements.*IObject" java/opendataloader-pdf-core/ --include="*.java"
# Compare element counts
echo "=== Docling elements ==="
cat docs/hybrid/research/docling-sample-response.json | jq '[.document.content[].type] | group_by(.) | map({type: .[0], count: length})'
echo "=== OpenDataLoader elements ==="
cat docs/hybrid/research/opendataloader-sample-response.json | jq '[.kids[].semanticType] | group_by(.) | map({type: .[0], count: length})'
docs/hybrid/research/
├── docling-openapi.json # Full OpenAPI spec
├── docling-sample-response.json # Docling conversion response
├── opendataloader-sample-response.json # OpenDataLoader JSON output
├── opendataloader-sample-response.md # OpenDataLoader markdown output
├── documents-with-tables.txt # List of docs with tables
└── iobject-structure.md # IObject class hierarchy summary
# Verify all files exist
ls -la docs/hybrid/research/
# Verify docling response has expected structure
cat docs/hybrid/research/docling-sample-response.json | jq '.document.content | length'
This research enables:
Create Claude skill for docling-serve API specification so Claude can correctly generate API integration code.
# 1. Start docling-serve
docker run -d -p 5001:5001 ds4sd/docling-serve
# 2. Check available endpoints
curl http://localhost:5001/openapi.json # OpenAPI spec
# 3. Test conversion API
curl -X POST http://localhost:5001/v1/convert/file \
-F "file=@tests/benchmark/pdfs/01030000000001.pdf" \
-F "options={\"to_formats\":[\"json\",\"md\"]}" \
> docling-response-sample.json
# 4. Extract schema structure
cat docling-response-sample.json | jq 'keys'
cat docling-response-sample.json | jq '.document.content[0]'
.claude/skills/docling-api/
├── SKILL.md # API specification and usage guide
├── request-schema.json # Request format reference
└── response-schema.json # Response structure reference
---
name: docling-api
description: docling-serve REST API specification. Use when implementing DoclingClient or calling docling API.
---
# docling-serve API Reference
## Base URL
`http://localhost:5001`
## Endpoints
### POST /v1/convert/file
Convert PDF file to structured output.
**Request:**
- Content-Type: multipart/form-data
- file: PDF binary
- options: JSON string with conversion options
**Options:**
```json
{
"to_formats": ["json", "md"],
"do_table_structure": true,
"do_ocr": false
}
```
**Response:** See response-schema.json for full structure.
| Type | Description |
|---|---|
| paragraph | Text paragraph |
| table | Table with cells |
| heading | Section heading (level 1-6) |
| list | Bulleted or numbered list |
| figure | Image or diagram |
### Success Criteria
- [ ] API endpoints documented with request/response examples
- [ ] Response JSON schema captured from real API call
- [ ] Skill auto-applies when Claude handles docling-related tasks
- [ ] New Claude session can generate correct DoclingClient code using skill
### Test Method
```bash
# In new Claude Code session:
claude "Write a Java method to call docling-serve API"
# Expected: Claude uses SKILL.md to generate correct endpoint, headers, request format
```
Create Claude skill documenting the mapping between docling output schema and Java IObject hierarchy.
# 1. Get docling element types from response
cat docling-response-sample.json | jq '.document.content[].type' | sort | uniq
# 2. List existing IObject types
grep -r "class.*implements IObject" java/ --include="*.java"
grep -r "class Semantic" java/ --include="*.java"
# 3. Compare field structures
# docling table cell structure vs TableBorderCell
# docling paragraph structure vs SemanticParagraph
.claude/skills/schema-mapping/
├── SKILL.md # Mapping rules and guidelines
├── docling-elements.json # Docling element type samples
└── iobject-types.md # IObject type reference
---
name: schema-mapping
description: Mapping between docling output and Java IObject types. Use when implementing DoclingSchemaTransformer.
---
# Schema Mapping: Docling → IObject
## Type Mapping
| Docling Type | IObject Type | Key Fields |
|--------------|--------------|------------|
| `paragraph` | `SemanticParagraph` | text, bbox |
| `table` | `TableBorder` | cells[][], bbox |
| `heading` | `SemanticHeading` | text, level, bbox |
| `list` | `PDFList` | items[], bbox |
| `figure` | `ImageChunk` | bbox, metadata |
## Field Mapping Details
### Table Mapping
docling:
  cells: [{row, col, text, rowspan, colspan}]

IObject (TableBorder):
  rows: [TableBorderRow]
    cells: [TableBorderCell]
      contents: List<IObject>
      colSpan, rowSpan
### Bounding Box
docling: {x, y, width, height} (normalized 0-1)
IObject: BoundingBox(left, bottom, right, top) (PDF points)

Conversion: multiply by page dimensions
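
The conversion rule above can be sketched as follows. This is a minimal illustration in Python, not the Java implementation. It assumes the docling box is normalized to 0-1 as stated above and that the page box starts at (0, 0). The source does not say whether docling measures `y` from the top or the bottom of the page, so the origin is left as a parameter. The name `to_pdf_points` and the `BoundingBox` tuple are hypothetical.

```python
from typing import NamedTuple


class BoundingBox(NamedTuple):
    """PDF-point box in (left, bottom, right, top) order."""
    left: float
    bottom: float
    right: float
    top: float


def to_pdf_points(x, y, width, height, page_width, page_height,
                  origin_top_left=True):
    """Scale a normalized {x, y, width, height} box to PDF points.

    origin_top_left is an assumption to verify against a real response:
    when True, y grows downward and must be flipped, because PDF
    coordinates grow upward from the bottom-left corner.
    """
    left = x * page_width
    right = (x + width) * page_width
    if origin_top_left:
        top = (1.0 - y) * page_height
        bottom = (1.0 - (y + height)) * page_height
    else:
        bottom = y * page_height
        top = (y + height) * page_height
    return BoundingBox(left, bottom, right, top)


# Example: a box on a 600 x 800 point page
box = to_pdf_points(0.1, 0.2, 0.5, 0.25, page_width=600, page_height=800)
```

Checking the bbox value ranges in the sample response (see the coordinate system decision) should settle which branch applies.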
# In new Claude Code session:
claude "Transform this docling table JSON to TableBorder"
# Expected: Claude uses mapping rules to generate correct transformation code
Create Claude skill documenting triage decision rules for routing pages to Java vs Docling.
# 1. Analyze page content types
grep -r "LineChunk\|TextChunk\|TableBorder" java/ --include="*.java" | head -20
# 2. Find existing table detection logic
grep -r "detectTable\|TableBorder" java/opendataloader-pdf-core/ --include="*.java"
# 3. Review ground truth for table presence patterns
cat tests/benchmark/ground-truth/reference.json | jq '[.[][] | select(.category=="Table")] | length'
.claude/skills/triage-criteria/
├── SKILL.md # Triage rules and thresholds
└── signals.md # Signal extraction methods
---
name: triage-criteria
description: Page triage decision rules. Use when implementing or tuning TriageProcessor.
---
# Triage Criteria
## Strategy
**Conservative**: Minimize false negatives (missing tables). Accept false positives (unnecessary docling calls).
## Decision Signals
| Signal | Extraction | Threshold | Action |
|--------|------------|-----------|--------|
| Line/Text ratio | (double) lineChunks.size() / textChunks.size() | > 0.3 | → DOCLING |
| Grid pattern | aligned horizontal + vertical lines | >= 3 groups | → DOCLING |
| TableBorder detected | existing detector finds border | any | → DOCLING |
| Default | - | - | → JAVA |
## Threshold Tuning Guide
### If FN (missed tables) is high:
- Lower line/text ratio threshold
- Lower grid pattern threshold
- Add more signals
### If too slow (too many docling calls):
- Raise thresholds
- Add early-exit conditions for simple pages
## Benchmark Metrics
- `triage_recall`: Tables correctly sent to docling (target: >= 0.95)
- `triage_fn`: Tables missed (target: <= 5)
# In new Claude Code session:
claude "The triage FN is too high, how should I adjust thresholds?"
# Expected: Claude references skill to suggest specific threshold changes
Add configuration classes for hybrid processing.
- Config.java has no hybrid concept
- HybridConfig to store backend connection settings
- java/opendataloader-pdf-core/src/main/java/org/opendataloader/pdf/api/Config.java
- java/opendataloader-pdf-core/src/main/java/org/opendataloader/pdf/hybrid/HybridConfig.java
// HybridConfig.java
public class HybridConfig {
private String url; // null = use backend default
private int timeoutMs = 0; // 0 = no timeout
private boolean fallbackToJava = true;
private int maxConcurrentRequests = 4;
// getters, setters, builder pattern
// Backend-specific default URLs
public static String getDefaultUrl(String hybrid) {
return switch (hybrid) {
case "docling" -> "http://localhost:5001";
case "hancom" -> null; // requires explicit URL
case "azure" -> null; // requires explicit URL
case "google" -> null; // requires explicit URL
default -> null;
};
}
}
// Config.java additions
public static final String HYBRID_OFF = "off";
public static final String HYBRID_DOCLING = "docling";
public static final String HYBRID_HANCOM = "hancom";
public static final String HYBRID_AZURE = "azure";
public static final String HYBRID_GOOGLE = "google";
private static Set<String> hybridOptions = new HashSet<>();
private String hybrid = HYBRID_OFF;
private HybridConfig hybridConfig = new HybridConfig();
static {
hybridOptions.add(HYBRID_OFF);
hybridOptions.add(HYBRID_DOCLING);
// hancom, azure, google added when implemented
}
public boolean isHybridEnabled() {
return !HYBRID_OFF.equals(hybrid);
}
- [ ] HybridConfig class created with all fields
- [ ] Config.java has hybrid field with validation
- [ ] Config.java has HybridConfig field
- [ ] isHybridEnabled() helper method
- [ ] ./scripts/test-java.sh
./scripts/test-java.sh
Add CLI options to enable hybrid processing.
- --hybrid option
- OptionDefinition pattern in CLIOptions.java
- java/opendataloader-pdf-cli/src/main/java/org/opendataloader/pdf/cli/CLIOptions.java

New options:
--hybrid <off|docling|hancom|...> Hybrid backend to use (default: off)
--hybrid-url <url> Backend server URL (default: backend-specific)
--hybrid-timeout <ms> Request timeout in ms (default: 0, no timeout)
--hybrid-fallback Fallback to Java on error (default: true)
// Add to OPTION_DEFINITIONS list
new OptionDefinition("hybrid", null, "string", "off",
"Hybrid backend for AI processing. Values: off (default), docling, hancom", true),
new OptionDefinition("hybrid-url", null, "string", null,
"Hybrid backend server URL (overrides default)", true),
new OptionDefinition("hybrid-timeout", null, "string", "0",
"Hybrid backend request timeout in milliseconds (0 = no timeout)", true),
new OptionDefinition("hybrid-fallback", null, "boolean", true,
"Fallback to Java on hybrid backend error", true),
- [ ] --hybrid option parsing and Config reflection
- [ ] --hybrid-url option parsing
- [ ] --hybrid-timeout option parsing
- [ ] --hybrid-fallback option parsing
- [ ] --help shows new options
- [ ] JSON export includes new options (--export-options)
# Build
./scripts/build-java.sh
# Check options
java -jar java/opendataloader-pdf-cli/target/opendataloader-pdf-cli-*.jar --help
# Test run (hybrid off, default behavior)
java -jar java/opendataloader-pdf-cli/target/opendataloader-pdf-cli-*.jar \
--hybrid off \
tests/benchmark/pdfs/01030000000001.pdf
# Verify JSON export includes new options
java -jar java/opendataloader-pdf-cli/target/opendataloader-pdf-cli-*.jar --export-options | jq '.options[] | select(.name | startswith("hybrid"))'
Implement page-level triage decision logic (JAVA vs BACKEND routing).
- ContentFilterProcessor.getFilteredContents(), before table processing
- java/opendataloader-pdf-core/src/main/java/org/opendataloader/pdf/processors/TriageProcessor.java
public class TriageProcessor {
public enum TriageDecision { JAVA, BACKEND }
public record TriageResult(
int pageNumber,
TriageDecision decision,
double confidence,
TriageSignals signals
) {}
public record TriageSignals(
int lineChunkCount,
int textChunkCount,
double lineToTextRatio,
int alignedLineGroups,
boolean hasTableBorder
) {}
/**
* Classify a page for processing path.
* Conservative: bias toward BACKEND when uncertain.
*/
public static TriageResult classifyPage(
List<IObject> filteredContents,
int pageNumber,
HybridConfig config
) {
// Extract signals from content
// Apply thresholds (from triage-criteria skill)
// Return decision with confidence
}
/**
* Batch triage for all pages.
*/
public static Map<Integer, TriageResult> triageAllPages(
Map<Integer, List<IObject>> pageContents,
HybridConfig config
) {
// Triage all pages, return map of results
}
}
| Signal | Threshold | Action |
|---|---|---|
| LineChunk / TextChunk ratio | > 0.3 | → BACKEND |
| Aligned line groups (grid pattern) | >= 3 | → BACKEND |
| TableBorder detected | any | → BACKEND |
| Default | - | → JAVA |
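
The threshold table above can be read as an ordered rule chain. The sketch below shows that reading in Python for brevity; the actual `classifyPage()` is Java and may differ. The constant names, the rule order, and the handling of pages with no text chunks are assumptions, chosen to match the conservative bias toward BACKEND.

```python
from dataclasses import dataclass

# Thresholds taken from the table above
LINE_TEXT_RATIO_THRESHOLD = 0.3
ALIGNED_GROUPS_THRESHOLD = 3


@dataclass
class TriageSignals:
    line_chunk_count: int
    text_chunk_count: int
    aligned_line_groups: int
    has_table_border: bool

    @property
    def line_to_text_ratio(self) -> float:
        # Assumption: a page with lines but no text is treated as
        # maximally table-like, so it is routed to the backend.
        if self.text_chunk_count == 0:
            return float("inf") if self.line_chunk_count > 0 else 0.0
        return self.line_chunk_count / self.text_chunk_count


def classify(signals: TriageSignals) -> str:
    """Return "BACKEND" if any signal fires, otherwise "JAVA"."""
    if signals.has_table_border:
        return "BACKEND"
    if signals.aligned_line_groups >= ALIGNED_GROUPS_THRESHOLD:
        return "BACKEND"
    if signals.line_to_text_ratio > LINE_TEXT_RATIO_THRESHOLD:
        return "BACKEND"
    return "JAVA"
```

Because any single signal is enough to route a page to the backend, lowering one threshold can only increase the number of BACKEND pages, which is the direction the tuning guide recommends when FN is high.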
- [ ] TriageProcessor class created
- [ ] TriageResult, TriageSignals records defined
- [ ] classifyPage() method implemented
- [ ] triageAllPages() batch method implemented
cd java && mvn test -Dtest=TriageProcessorTest
./scripts/test-java.sh
Implement REST API client for docling-serve with batch processing support.
- java/opendataloader-pdf-core/src/main/java/org/opendataloader/pdf/hybrid/HybridClient.java (interface)
- java/opendataloader-pdf-core/src/main/java/org/opendataloader/pdf/hybrid/DoclingClient.java
// HybridClient.java - interface for all hybrid backends
public interface HybridClient {
record HybridRequest(
byte[] pdfBytes,
Set<Integer> pageNumbers, // 1-indexed, pages to process
boolean doTableStructure,
boolean doOcr
) {}
record HybridResponse(
String markdown,
JsonNode json, // Full structured output
Map<Integer, JsonNode> pageContents // Per-page content
) {}
HybridResponse convert(HybridRequest request) throws IOException;
CompletableFuture<HybridResponse> convertAsync(HybridRequest request);
boolean isAvailable();
}
// DoclingClient.java - docling-serve implementation
public class DoclingClient implements HybridClient {
private final String baseUrl;
private final HttpClient httpClient;
private final ObjectMapper objectMapper;
private final int timeoutMs;
public DoclingClient(HybridConfig config) { ... }
// Implements HybridClient interface
}
// Factory for creating hybrid clients
public class HybridClientFactory {
public static HybridClient create(String hybrid, HybridConfig config) {
return switch (hybrid) {
case "docling" -> new DoclingClient(config);
// case "hancom" -> new HancomClient(config);
// case "azure" -> new AzureClient(config);
default -> throw new IllegalArgumentException("Unknown hybrid backend: " + hybrid);
};
}
}
- [ ] HybridClient interface created
- [ ] DoclingClient class implements interface
- [ ] HybridClientFactory for creating clients
- [ ] convert() method implemented (HTTP request)
- [ ] convertAsync() method for parallel processing
- [ ] isAvailable() health check implemented
# Start docling-serve
docker run -d -p 5001:5001 ds4sd/docling-serve
# Integration test
cd java && mvn test -Dtest=DoclingClientIntegrationTest
# Manual test
curl -X POST http://localhost:5001/v1/convert/file \
-F "file=@tests/benchmark/pdfs/01030000000001.pdf"
Transform docling JSON output to IObject hierarchy.
- java/opendataloader-pdf-core/src/main/java/org/opendataloader/pdf/hybrid/HybridSchemaTransformer.java (interface)
- java/opendataloader-pdf-core/src/main/java/org/opendataloader/pdf/hybrid/DoclingSchemaTransformer.java
// HybridSchemaTransformer.java - interface for all hybrid backends
public interface HybridSchemaTransformer {
Map<Integer, List<IObject>> transformAll(
HybridResponse response,
Map<Integer, BoundingBox> pageBoundingBoxes
);
}
// DoclingSchemaTransformer.java - docling implementation
public class DoclingSchemaTransformer implements HybridSchemaTransformer {
@Override
public Map<Integer, List<IObject>> transformAll(
HybridResponse response,
Map<Integer, BoundingBox> pageBoundingBoxes
) { ... }
/**
* Transform single page content.
*/
public List<IObject> transformPage(
JsonNode pageContent,
int pageNumber,
BoundingBox pageBoundingBox
) { ... }
// Type-specific transformers
private TableBorder transformTable(JsonNode tableNode, int pageNumber) { ... }
private SemanticParagraph transformParagraph(JsonNode paragraphNode, int pageNumber) { ... }
private SemanticHeading transformHeading(JsonNode headingNode, int level, int pageNumber) { ... }
private PDFList transformList(JsonNode listNode, int pageNumber) { ... }
private ImageChunk transformFigure(JsonNode figureNode, int pageNumber) { ... }
// Coordinate conversion
private BoundingBox convertBoundingBox(JsonNode bbox, BoundingBox pageBox) { ... }
}
- [ ] HybridSchemaTransformer interface created
- [ ] DoclingSchemaTransformer class implements interface
- [ ] transformAll() batch method implemented
cd java && mvn test -Dtest=DoclingSchemaTransformerTest
Implement hybrid processing pipeline with parallel execution.
┌─ Java pages (parallel) ────────────┐
│ ExecutorService │
All Pages Triage ───┤ ├──→ Merge
│ │
└─ Hybrid pages (batch async) ───────┘
Single API call
- java/opendataloader-pdf-core/src/main/java/org/opendataloader/pdf/processors/HybridDocumentProcessor.java
- java/opendataloader-pdf-core/src/main/java/org/opendataloader/pdf/processors/DocumentProcessor.java
public class HybridDocumentProcessor {
public static List<List<IObject>> processDocument(
String inputPdfName,
Config config,
Set<Integer> pagesToProcess
) throws IOException {
// Phase 1: Filter all pages + Triage
Map<Integer, List<IObject>> filteredContents = filterAllPages(pagesToProcess);
Map<Integer, TriageResult> triageResults = TriageProcessor.triageAllPages(
filteredContents, config.getHybridConfig()
);
// Phase 2: Split by decision
Set<Integer> javaPages = filterByDecision(triageResults, JAVA);
Set<Integer> hybridPages = filterByDecision(triageResults, BACKEND);
// Phase 3: Process in parallel
HybridClient client = HybridClientFactory.create(
config.getHybrid(), config.getHybridConfig()
);
CompletableFuture<Map<Integer, List<IObject>>> hybridFuture =
CompletableFuture.supplyAsync(() ->
processHybridPath(inputPdfName, hybridPages, client, config)
);
Map<Integer, List<IObject>> javaResults =
processJavaPathParallel(filteredContents, javaPages, config);
Map<Integer, List<IObject>> hybridResults = hybridFuture.join();
// Phase 4: Merge results
return mergeResults(javaResults, hybridResults, pagesToProcess);
}
private static Map<Integer, List<IObject>> processHybridPath(
String pdfPath,
Set<Integer> pageNumbers,
HybridClient client,
Config config
) {
if (pageNumbers.isEmpty()) return Map.of();
try {
byte[] pdfBytes = Files.readAllBytes(Path.of(pdfPath));
HybridResponse response = client.convert(new HybridRequest(
pdfBytes, pageNumbers, true, false
));
// Get appropriate transformer for the hybrid backend
HybridSchemaTransformer transformer = getTransformer(config.getHybrid());
// pageBoundingBoxes is assumed to be collected elsewhere (not shown in this outline)
return transformer.transformAll(response, pageBoundingBoxes);
} catch (IOException e) {
// This method runs inside supplyAsync, so checked exceptions must be wrapped
throw new UncheckedIOException(e);
}
}
}
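
The outline above calls `mergeResults` without showing it. One plausible reading is sketched below in Python for brevity: walk the requested pages in ascending order and take the hybrid result where one exists, otherwise the Java result. The precedence rule, the empty-list default for a page missing from both maps, and the function name are assumptions, not the documented behavior.

```python
def merge_results(java_results, hybrid_results, pages_to_process):
    """Combine per-page results into one list ordered by page number.

    java_results / hybrid_results: dict mapping page number -> list of objects.
    pages_to_process: iterable of page numbers requested by the caller.
    """
    merged = []
    for page in sorted(pages_to_process):
        if page in hybrid_results:
            # Assumption: backend output wins for pages triaged to it
            merged.append(hybrid_results[page])
        else:
            # Assumption: a page missing from both maps yields an empty page
            merged.append(java_results.get(page, []))
    return merged


# Example: page 2 was routed to the backend, pages 1 and 3 to Java
pages = merge_results({1: ["p1"], 3: ["p3"]}, {2: ["t2"]}, {3, 1, 2})
```

Sorting by page number keeps the output order independent of which path finishes first.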
- [ ] HybridDocumentProcessor class created
- [ ] DocumentProcessor.processFile() hybrid branching
# Full test suite
./scripts/test-java.sh
# E2E test with docling hybrid
docker run -d -p 5001:5001 ds4sd/docling-serve
java -jar java/opendataloader-pdf-cli/target/opendataloader-pdf-cli-*.jar \
--hybrid docling \
--hybrid-url http://localhost:5001 \
tests/benchmark/pdfs/01030000000001.pdf
Log triage decisions to JSON for benchmark evaluation.
- java/opendataloader-pdf-core/src/main/java/org/opendataloader/pdf/processors/HybridDocumentProcessor.java
{
"document": "01030000000001.pdf",
"hybrid": "docling",
"triage": [
{
"page": 1,
"decision": "JAVA",
"confidence": 0.95,
"signals": {
"lineChunkCount": 2,
"textChunkCount": 45,
"lineToTextRatio": 0.04,
"alignedLineGroups": 0,
"hasTableBorder": false
}
},
{
"page": 2,
"decision": "BACKEND",
"confidence": 0.82,
"signals": {
"lineChunkCount": 28,
"textChunkCount": 32,
"lineToTextRatio": 0.875,
"alignedLineGroups": 4,
"hasTableBorder": true
}
}
],
"summary": {
"totalPages": 10,
"javaPages": 8,
"hybridPages": 2
}
}
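
The `summary` block in the example above appears to be derivable from the per-page `triage` entries. A minimal Python sketch of that derivation follows. It assumes every processed page has exactly one entry and that `BACKEND` decisions are what `hybridPages` counts; the function name `summarize` is hypothetical.

```python
from collections import Counter


def summarize(triage_entries):
    """Build the summary object from a list of per-page triage entries."""
    counts = Counter(entry["decision"] for entry in triage_entries)
    return {
        "totalPages": len(triage_entries),
        "javaPages": counts["JAVA"],
        "hybridPages": counts["BACKEND"],
    }


# Example with the two entries shown in the log above (signals omitted)
summary = summarize([
    {"page": 1, "decision": "JAVA"},
    {"page": 2, "decision": "BACKEND"},
])
```

Deriving the summary from the entries, rather than counting separately, keeps the two parts of the log from drifting apart.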
- [ ] Triage log written to the output directory (triage.json)
java -jar ... --hybrid docling input.pdf -o output/
cat output/triage.json | jq '.summary'
Add Python evaluator for triage accuracy measurement.
- reference.json table presence per page
- triage.json decisions
- triage_fn (tables missed by triage)
- tests/benchmark/src/evaluator_triage.py
- tests/benchmark/run.py (integrate triage evaluation)
- tests/benchmark/thresholds.json (add thresholds)
# evaluator_triage.py
from dataclasses import dataclass
from pathlib import Path
import json
@dataclass
class TriageMetrics:
recall: float # Table pages correctly sent to hybrid
precision: float # Hybrid pages that actually had tables
fn_count: int # Tables missed (sent to JAVA)
fp_count: int # Non-table pages sent to hybrid
java_pages: int
hybrid_pages: int
def get_pages_with_tables(reference_path: Path) -> dict[str, set[int]]:
"""Extract page numbers with tables from ground truth."""
...
def evaluate_triage(
reference_path: Path,
triage_path: Path
) -> TriageMetrics:
"""Evaluate triage accuracy against ground truth."""
# 1. Extract page-level table presence from reference.json
# 2. Compare with triage.json decisions
# 3. Calculate FN, FP, recall, precision
...
{
"triage_recall": 0.95,
"triage_fn_max": 5
}
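
The metric calculation and threshold check described above can be sketched with plain set arithmetic. This is an illustration under stated assumptions, not the contents of `evaluator_triage.py`: pages are identified by hypothetical `(document, page)` pairs, recall and precision default to 1.0 when their denominator is zero, and the helper names `compute_metrics` and `passes_thresholds` are invented for the example.

```python
def compute_metrics(table_pages, backend_pages, all_pages):
    """Compare ground-truth table pages with pages routed to the backend.

    Each argument is a set of (document, page) pairs.
    """
    tp = len(table_pages & backend_pages)   # table pages sent to backend
    fn = len(table_pages - backend_pages)   # table pages left on the Java path
    fp = len(backend_pages - table_pages)   # non-table pages sent to backend
    return {
        "recall": tp / (tp + fn) if (tp + fn) else 1.0,
        "precision": tp / (tp + fp) if (tp + fp) else 1.0,
        "fn_count": fn,
        "fp_count": fp,
        "java_pages": len(all_pages - backend_pages),
        "hybrid_pages": len(backend_pages),
    }


def passes_thresholds(metrics, thresholds):
    """Apply the two limits from thresholds.json."""
    return (metrics["recall"] >= thresholds["triage_recall"]
            and metrics["fn_count"] <= thresholds["triage_fn_max"])


# Example: one table page (b, 1) was missed by triage
metrics = compute_metrics(
    table_pages={("a", 1), ("a", 2), ("b", 1)},
    backend_pages={("a", 1), ("a", 2), ("b", 3)},
    all_pages={("a", 1), ("a", 2), ("a", 3), ("b", 1), ("b", 3)},
)
ok = passes_thresholds(metrics, {"triage_recall": 0.95, "triage_fn_max": 5})
```

In this example recall is 2/3, so the run fails the 0.95 recall limit even though the FN count is within its limit.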
- [ ] evaluator_triage.py created
- [ ] triage_recall, triage_fn calculation
- [ ] Triage evaluation integrated into run.py
- [ ] Thresholds added to thresholds.json
# Run benchmark with docling hybrid
./scripts/bench.sh --hybrid docling
# Or test evaluator directly
cd tests/benchmark
python -c "
from src.evaluator_triage import evaluate_triage
from pathlib import Path
result = evaluate_triage(
Path('ground-truth/reference.json'),
Path('prediction/opendataloader-hybrid-docling/triage.json')
)
print(result)
"
Create Claude agent for analyzing triage accuracy and identifying improvement opportunities.
.claude/agents/triage-analyzer.md
---
name: triage-analyzer
description: Analyze triage accuracy, identify false negative cases, suggest threshold adjustments
tools: Read, Grep, Glob, Bash(python:*)
---
# Triage Analyzer
Analyze triage results and identify improvement opportunities.
## Capabilities
1. Compare triage.json with reference.json
2. List all FN cases (missed tables)
3. Analyze common patterns in FN cases
4. Suggest threshold adjustments
5. Generate tuning recommendations
## Analysis Workflow
1. Load triage results and ground truth
2. Identify FN cases
3. For each FN, extract page signals
4. Find common signal patterns
5. Recommend threshold changes
## Output Format
- FN case list with signals
- Pattern analysis
- Specific threshold adjustment recommendations
# In Claude Code:
claude "Analyze triage results and find why FN is high"
# Expected: Agent is used to analyze and provide recommendations
Phase -1: Pre-research
└── Task -1: Data Collection ─────────┐
│
Phase 0: Tool Setup (Skills) ▼
├── Task 0: docling-api skill ────────┐
├── Task 1: schema-mapping skill ─────┤ (parallel)
└── Task 2: triage-criteria skill ────┘
│
Phase 1: Infrastructure ▼
├── Task 3: HybridConfig ─────────────┬──→ Task 4: CLI Options
│ │
Phase 2: Core Components │
├── Task 5: TriageProcessor ──────────┤
├── Task 6: DoclingClient ────────────┤ (parallel)
└── Task 7: SchemaTransformer ────────┘
│
Phase 3: Integration ▼
└── Task 8: HybridDocumentProcessor ──┬──→ Task 9: Triage Logging
│
Phase 4: Evaluation ▼
└── Task 10: Triage Evaluator ────────┬──→ Task 11: Triage Analyzer Agent