docs/hybrid/research/comparison-summary.md
01030000000045.pdf (1 page with table)

| Category | Docling | OpenDataLoader |
|---|---|---|
| Tables | 1 | 1 |
| Text elements | 5 | 4 paragraphs |
| Images | 0 | 1 |
| Headings | (N/A - uses labels) | 1 |

Docling text elements by label:

| Label | Count |
|---|---|
| caption | 1 |
| footnote | 1 |
| page_footer | 1 |
| page_header | 1 |
| text | 1 |

Table structure:

| Property | Docling | OpenDataLoader |
|---|---|---|
| Rows | 9 | 3 |
| Columns | 3 | 3 |
| Total cells | 26 | 9 |

Note: Docling detects more rows in the table structure. This may be due to:

Table bounding box (values listed in each system's native order):

| System | l/left | t/top | r/right | b/bottom | Origin |
|---|---|---|---|---|---|
| Docling | 53.22 | 439.98 | 373.94 | 234.74 | BOTTOMLEFT |
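| OpenDataLoader | 54.0 | 234.44 | 372.73 | 440.21 | BOTTOMLEFT |

Coordinate mapping: Both use BOTTOMLEFT origin, but the order of the values differs.

- Docling: `{l, t, r, b}` where t=top, b=bottom
- OpenDataLoader: `[left, bottom, right, top]`

So the OpenDataLoader row above is really left, bottom, right, top, and the actual coordinates match closely: left 53.22 vs 54.0, bottom 234.74 vs 234.44, right 373.94 vs 372.73, top 439.98 vs 440.21.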
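The reordering can be checked with a short script. This is a minimal sketch under stated assumptions: `docling_to_lbrt` and `max_abs_diff` are hypothetical helper names, not part of either library, and the numbers are copied from the bounding-box table.

```python
from typing import Dict, List


def docling_to_lbrt(bbox: Dict[str, float]) -> List[float]:
    """Reorder a Docling-style {l, t, r, b} box into [left, bottom, right, top].

    Hypothetical helper: assumes both boxes already share a BOTTOMLEFT origin,
    so only the order of the values changes, not the values themselves.
    """
    return [bbox["l"], bbox["b"], bbox["r"], bbox["t"]]


def max_abs_diff(a: List[float], b: List[float]) -> float:
    """Largest per-edge absolute difference between two boxes in the same order."""
    return max(abs(x - y) for x, y in zip(a, b))


# Values from the table (Docling as {l, t, r, b}, OpenDataLoader as a list).
docling_bbox = {"l": 53.22, "t": 439.98, "r": 373.94, "b": 234.74}
odl_bbox = [54.0, 234.44, 372.73, 440.21]

converted = docling_to_lbrt(docling_bbox)
diff = max_abs_diff(converted, odl_bbox)
print(converted, round(diff, 2))
```

With these values the largest per-edge difference is about 1.21 units, on the right edge.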

Element type mapping:

| Docling Type | OpenDataLoader Type |
|---|---|
| texts (label: text) | paragraph |
| texts (label: section_header) | heading |
| tables | table |
| pictures | image |
| texts (label: page_header) | paragraph (filtered as header) |
| texts (label: page_footer) | paragraph (filtered as footer) |
| texts (label: caption) | paragraph |
| texts (label: footnote) | paragraph |

- Docling uses the `label` field for text types, OpenDataLoader uses `type`
- Docling uses `SectionHeaderItem` with level, OpenDataLoader uses the `heading` type with level
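The type mapping in the table can be written as a small lookup. This is a sketch only: `to_odl_type`, `is_filtered`, and the dictionary names are hypothetical and exist in neither library; the entries simply restate the table rows.

```python
from typing import Optional

# Docling text labels -> OpenDataLoader types, as listed in the mapping table.
LABEL_TO_ODL_TYPE = {
    "text": "paragraph",
    "section_header": "heading",
    "page_header": "paragraph",
    "page_footer": "paragraph",
    "caption": "paragraph",
    "footnote": "paragraph",
}

# Non-text Docling collections -> OpenDataLoader types.
COLLECTION_TO_ODL_TYPE = {"tables": "table", "pictures": "image"}

# Labels whose OpenDataLoader paragraphs are filtered as header/footer.
FILTERED_LABELS = {"page_header", "page_footer"}


def to_odl_type(collection: str, label: Optional[str] = None) -> str:
    """Map a Docling collection (and label, for texts) to an OpenDataLoader type."""
    if collection == "texts":
        if label is None:
            raise ValueError("texts items need a label")
        return LABEL_TO_ODL_TYPE[label]
    return COLLECTION_TO_ODL_TYPE[collection]


def is_filtered(label: str) -> bool:
    """True if the table marks this label as filtered on the OpenDataLoader side."""
    return label in FILTERED_LABELS


print(to_odl_type("texts", "section_header"))
print(to_odl_type("tables"))
print(is_filtered("page_footer"))
```

Keeping the filtered labels in a separate set leaves the type lookup a plain one-to-one restatement of the table.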