Overview

Microsoft Fabric Data Factory is a cloud-based data integration service within the Microsoft Fabric platform. Learn more in the official Microsoft Fabric Data Factory documentation.

The DataHub integration for Fabric Data Factory covers pipeline and orchestration entities such as workspaces, data pipelines, and activities. It also captures table-level lineage and supports stateful deletion detection.

Concept Mapping

| Fabric Data Factory Concept | DataHub Entity | Notes |
| --- | --- | --- |
| Workspace | Container (subtype: Fabric Workspace) | Top-level organizational unit |
| Data Pipeline | DataFlow | Orchestration pipeline containing activities |
| Activity | DataJob | Individual task within a pipeline (Copy, Lookup, Spark, etc.) |
| Pipeline Run | DataProcessInstance | Execution record for a pipeline run |
| Activity Run | DataProcessInstance | Execution record for an individual activity within a pipeline |
| Connection | (resolved to external Dataset) | Used for lineage resolution to datasets on external platforms |
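
For readers who prefer code to tables, the mapping can be sketched as a plain lookup. This is only an illustration of the table, not code from the connector: the names `CONCEPT_TO_ENTITY` and `datahub_entity_for` are hypothetical, and Connection is deliberately left out because it is resolved to an external Dataset rather than mapped to an entity of its own.

```python
from typing import Optional

# Illustrative only: mirrors the concept mapping table.
# These names are hypothetical and are not part of the DataHub connector.
CONCEPT_TO_ENTITY = {
    "Workspace": "Container",
    "Data Pipeline": "DataFlow",
    "Activity": "DataJob",
    "Pipeline Run": "DataProcessInstance",
    "Activity Run": "DataProcessInstance",
}


def datahub_entity_for(concept: str) -> Optional[str]:
    """Return the DataHub entity type for a Fabric Data Factory concept.

    Returns None for concepts that have no entity of their own, such as
    Connection, which is only used to resolve lineage to external datasets.
    """
    return CONCEPT_TO_ENTITY.get(concept)


if __name__ == "__main__":
    for concept in ("Data Pipeline", "Activity Run", "Connection"):
        print(concept, "->", datahub_entity_for(concept))
```

Note that both run concepts map to the same entity type, so the entity type alone does not distinguish a pipeline run from an activity run.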

Hierarchy Structure

Platform (fabric-data-factory)
└── Workspace (Container)
    └── Data Pipeline (DataFlow)
        └── Activity (DataJob)
            ├── Pipeline Run (DataProcessInstance)
            └── Activity Run (DataProcessInstance)
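
The parent-child relationship between a Data Pipeline (DataFlow) and its Activities (DataJob) is also visible in the URNs, since a DataJob URN embeds the URN of its DataFlow. The sketch below builds such URNs by hand in the general DataHub `dataFlow` / `dataJob` shape. It makes several assumptions: the orchestrator is taken to be the platform name shown in the tree, `PROD` is an assumed default cluster, and the `flow_id` and `job_id` values are hypothetical placeholders, since the identifiers the connector actually generates are not described here.

```python
# A minimal sketch, not the connector's implementation.
# Assumption: the orchestrator equals the platform name from the hierarchy.
PLATFORM = "fabric-data-factory"


def data_flow_urn(flow_id: str, cluster: str = "PROD") -> str:
    """Build a DataFlow URN for a pipeline (cluster default is an assumption)."""
    return f"urn:li:dataFlow:({PLATFORM},{flow_id},{cluster})"


def data_job_urn(flow_urn: str, job_id: str) -> str:
    """Build a DataJob URN nested under its parent DataFlow URN."""
    return f"urn:li:dataJob:({flow_urn},{job_id})"


if __name__ == "__main__":
    # Hypothetical identifiers, chosen only for illustration.
    flow = data_flow_urn("my-workspace.my-pipeline")
    job = data_job_urn(flow, "CopyActivity")
    print(flow)
    print(job)
```

In practice, the helpers in the DataHub Python package are a better choice than hand-built strings; the manual version is shown only to make the nesting explicit.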