README

metadata-ingestion/docs/sources/excel/README.md

Overview

Excel is Microsoft's spreadsheet application; its workbook files store tabular data in one or more worksheets. Learn more in the official Excel documentation.

The DataHub integration for Excel covers file/lakehouse metadata entities such as datasets, paths, and containers. It also supports data profiling and stateful deletion detection.

Concept Mapping

| Excel Entity | DataHub Entity | Description |
| --- | --- | --- |
| Excel Worksheet | Dataset | Each worksheet becomes a dataset with URN pattern: `urn:li:dataset:(urn:li:dataPlatform:excel,{path}/[{filename}]{sheet_name},PROD)` |
| File/Directory Structure | Container | Directory hierarchy creates containers with obfuscated URNs for organizing datasets |

:::info Excel workbook

The Excel workbook file itself does not become a separate DataHub entity - only the individual worksheets within it are ingested as datasets.

:::
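
To make the worksheet URN pattern from the Concept Mapping table concrete, here is a minimal Python sketch that fills in the placeholders by plain string formatting. The helper name `excel_worksheet_urn` and the sample path, file, and sheet names are hypothetical and are not part of the DataHub API; the actual connector may normalize paths or escape characters differently.

```python
def excel_worksheet_urn(path: str, filename: str, sheet_name: str, env: str = "PROD") -> str:
    """Fill in the documented URN pattern for one worksheet.

    Hypothetical helper for illustration only; it performs plain string
    substitution and does no escaping or path normalization.
    """
    # Dataset name portion: {path}/[{filename}]{sheet_name}
    name = f"{path.rstrip('/')}/[{filename}]{sheet_name}"
    return f"urn:li:dataset:(urn:li:dataPlatform:excel,{name},{env})"


# One workbook with two worksheets yields two dataset URNs;
# no URN is produced for the workbook file itself.
for sheet in ["Summary", "Details"]:
    print(excel_worksheet_urn("reports/2024", "sales.xlsx", sheet))
# prints:
# urn:li:dataset:(urn:li:dataPlatform:excel,reports/2024/[sales.xlsx]Summary,PROD)
# urn:li:dataset:(urn:li:dataPlatform:excel,reports/2024/[sales.xlsx]Details,PROD)
```

The loop mirrors the note above: each worksheet maps to its own dataset, while the workbook appears only as the bracketed filename inside each dataset name.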