README

metadata-ingestion/docs/sources/fabric-onelake/README.md


Overview

Microsoft Fabric OneLake is a storage and lakehouse platform. Learn more in the official Microsoft Fabric OneLake documentation.

The DataHub integration for Microsoft Fabric OneLake covers workspace, lakehouse, and warehouse containers, table datasets with schema metadata, and view datasets with view definitions and view-to-table lineage parsed from the view SQL. It also extracts query usage statistics from the SQL Analytics Endpoint's queryinsights views and supports stateful deletion detection.

Concept Mapping

| Microsoft Fabric | DataHub Entity | Notes |
| --- | --- | --- |
| Workspace | Container (subtype: Fabric Workspace) | Top-level organizational unit |
| Lakehouse | Container (subtype: Fabric Lakehouse) | Contains schemas and tables |
| Warehouse | Container (subtype: Fabric Warehouse) | Contains schemas and tables |
| Schema | Container (subtype: Fabric Schema) | Logical grouping within lakehouse/warehouse |
| Table | Dataset | Tables within schemas |
| View | Dataset (subtype: View) | Lakehouse and Warehouse views; lineage extracted from view definition via SQL parsing |

Hierarchy Structure

Platform (fabric-onelake)
└── Workspace (Container)
    ├── Lakehouse (Container)
    │   └── Schema (Container)
    │       └── Table/View (Dataset)
    └── Warehouse (Container)
        └── Schema (Container)
            └── Table/View (Dataset)
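
To make the containment rules concrete, the hierarchy above can be modeled as a small tree and walked to list each dataset together with its container path. This is only an illustrative sketch: the `Node` class, the `dataset_paths` helper, and the workspace, lakehouse, and table names are hypothetical and are not part of the DataHub connector.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One entry in the hierarchy (hypothetical helper, not a DataHub class)."""
    name: str
    kind: str  # "Workspace", "Lakehouse", "Warehouse", "Schema", "Table", or "View"
    children: list["Node"] = field(default_factory=list)


# Tables and views map to Datasets; every other kind maps to a Container.
DATASET_KINDS = {"Table", "View"}


def dataset_paths(node, parents=()):
    """Yield the full container path (as a tuple of names) for each dataset."""
    path = parents + (node.name,)
    if node.kind in DATASET_KINDS:
        yield path
    for child in node.children:
        yield from dataset_paths(child, path)


# Made-up example workspace with one lakehouse and one warehouse.
workspace = Node("sales-ws", "Workspace", [
    Node("raw_lakehouse", "Lakehouse", [
        Node("dbo", "Schema", [Node("orders", "Table"), Node("orders_v", "View")]),
    ]),
    Node("reporting_wh", "Warehouse", [
        Node("dbo", "Schema", [Node("revenue", "Table")]),
    ]),
])

paths = list(dataset_paths(workspace))
for p in paths:
    print(" / ".join(p))
```

Every dataset sits exactly four levels deep (workspace, lakehouse or warehouse, schema, then the table or view), which matches the tree shown above.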

Platform Instance as Tenant

The Fabric REST API does not expose tenant-level endpoints. To represent tenant-level organization in DataHub, set the platform_instance configuration field to your tenant identifier (e.g., "contoso-tenant"). The value is then included in every container and dataset URN, which groups all workspaces under that platform instance (tenant).
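
As a rough illustration of how platform_instance ends up in a dataset URN, the sketch below builds a URN by hand using DataHub's general dataset URN layout, in which the platform instance is prefixed to the dataset name. Several things here are assumptions rather than connector behavior: the `dataset_urn` helper is hypothetical, the source `type` value is assumed to match the platform name `fabric-onelake`, and the `workspace.lakehouse.schema.table` name layout is made up (the real connector may use identifiers of a different shape). Container URNs are derived differently and are not shown.

```python
def dataset_urn(platform, name, platform_instance=None, env="PROD"):
    """Build a dataset URN string by hand (hypothetical helper for illustration)."""
    # When a platform instance is set, it is prefixed to the dataset name.
    qualified = f"{platform_instance}.{name}" if platform_instance else name
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{qualified},{env})"


# Source section of an ingestion recipe, written as a Python dict.
# Only platform_instance comes from the text above; "type" is an assumption.
recipe_source = {
    "type": "fabric-onelake",
    "config": {"platform_instance": "contoso-tenant"},
}

urn = dataset_urn(
    "fabric-onelake",
    "sales-ws.raw_lakehouse.dbo.orders",  # assumed name layout, for illustration only
    recipe_source["config"]["platform_instance"],
)
print(urn)
```

Because the tenant identifier is baked into the URN, changing platform_instance after an initial ingestion produces new URNs rather than updating the existing entities, so it is worth choosing the value before the first run.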