metadata-ingestion/docs/sources/fabric-data-factory/fabric-data-factory_post.md
Use the Important Capabilities table above as the source of truth for supported features and whether additional configuration is required.
The connector extracts dataset-level lineage from these Fabric activity types:
| Activity Type | Lineage Behavior |
|---|---|
| Copy | Creates lineage from input dataset(s) to output dataset |
| InvokePipeline | Creates pipeline-to-pipeline lineage to the child pipeline |
Lineage is enabled by default (include_lineage: true).
For lineage to connect properly to datasets ingested from other sources (e.g., Snowflake, BigQuery), the connector resolves Fabric connections to DataHub platforms.
Step 1: Automatic Connection Mapping
The connector automatically maps Fabric connection types to DataHub platforms (e.g., a Snowflake connection maps to the snowflake platform). See FABRIC_CONNECTION_PLATFORM_MAP for the full list of supported mappings. Unsupported connection types fall back to using the connection type string as the platform name.
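The fallback behavior described above can be sketched as a dictionary lookup with a default. This is a minimal illustration, not the connector's implementation: `EXAMPLE_CONNECTION_PLATFORM_MAP`, its keys, and `resolve_platform` are hypothetical names, and the real FABRIC_CONNECTION_PLATFORM_MAP may contain different entries.

```python
# Hypothetical subset of a connection-type -> platform table.
# The connector's real FABRIC_CONNECTION_PLATFORM_MAP may differ.
EXAMPLE_CONNECTION_PLATFORM_MAP = {
    "Snowflake": "snowflake",
    "GoogleBigQuery": "bigquery",
}


def resolve_platform(connection_type: str) -> str:
    """Return the mapped platform, or the connection type string if unmapped."""
    return EXAMPLE_CONNECTION_PLATFORM_MAP.get(connection_type, connection_type)
```

An unmapped connection type therefore still yields a platform name, but that name only lines up with other DataHub sources if it happens to equal their platform identifier.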
Step 2: Platform Instance Mapping (for cross-recipe lineage)
If you're ingesting the same data sources with other DataHub connectors (e.g., Snowflake, BigQuery), you need to ensure the platform_instance values match. Use platform_instance_map to map your Fabric connection names to the platform instance used in your other recipes:
```yaml
# Fabric Data Factory Recipe
source:
  type: fabric-data-factory
  config:
    credential:
      authentication_method: service_principal
      client_id: ${AZURE_CLIENT_ID}
      client_secret: ${AZURE_CLIENT_SECRET}
      tenant_id: ${AZURE_TENANT_ID}
    platform_instance_map:
      # Key: Your Fabric connection name (exact match required)
      # Value: The platform_instance from your other source recipe
      "snowflake-prod-connection": "prod_warehouse"
      "bigquery-analytics": "analytics_project"
```

```yaml
# Corresponding Snowflake Recipe (platform_instance must match)
source:
  type: snowflake
  config:
    platform_instance: "prod_warehouse" # Must match the value in platform_instance_map
    # ... other config
```
Without matching platform_instance values, the connector creates separate dataset entities for lineage instead of linking to the datasets you have already ingested.
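To see why a mismatch splits entities, it helps to look at how a platform instance becomes part of a dataset URN. The sketch below builds URNs by hand in the standard DataHub dataset URN shape. The helper names and the table name `analytics.public.orders` are illustrative assumptions, and the connector's own resolution logic may differ in detail.

```python
from typing import Dict, Optional

# Mirrors the platform_instance_map entry from the recipe above.
platform_instance_map: Dict[str, str] = {
    "snowflake-prod-connection": "prod_warehouse",
}


def dataset_urn(platform: str, name: str,
                platform_instance: Optional[str] = None,
                env: str = "PROD") -> str:
    """Build a dataset URN; the instance, if any, prefixes the dataset name."""
    full_name = f"{platform_instance}.{name}" if platform_instance else name
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{full_name},{env})"


def lineage_target_urn(connection_name: str, platform: str, table: str) -> str:
    """Resolve the URN a lineage edge would point at for a Fabric connection."""
    instance = platform_instance_map.get(connection_name)  # exact-name lookup
    return dataset_urn(platform, table, instance)


mapped = lineage_target_urn(
    "snowflake-prod-connection", "snowflake", "analytics.public.orders")
unmapped = lineage_target_urn(
    "some-other-connection", "snowflake", "analytics.public.orders")
```

Here `mapped` contains the `prod_warehouse.` prefix and `unmapped` does not, so the two URNs identify different entities even though they describe the same table.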
Pipeline and activity runs are extracted as DataProcessInstance entities by default:
```yaml
source:
  type: fabric-data-factory
  config:
    include_execution_history: true # default
    execution_history_days: 7 # 1-90 days
```
This provides run status, duration, timestamps, invoke type, and activity-level details including error messages and retry attempts.
:::note
The Fabric API returns at most 100 recently completed runs per pipeline. Run ingestion more frequently to capture deeper history.
:::
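The interaction between the 100-run cap and execution_history_days can be modeled as a cap followed by a time-window filter. This is a simplified simulation under that assumption, not the connector's code; `API_RUN_LIMIT` and `runs_in_scope` are hypothetical names.

```python
from datetime import datetime, timedelta, timezone
from typing import List

API_RUN_LIMIT = 100  # per-pipeline cap stated in the note above


def runs_in_scope(end_times: List[datetime],
                  execution_history_days: int,
                  now: datetime) -> List[datetime]:
    """Keep the most recent runs up to the cap, then apply the day window."""
    cutoff = now - timedelta(days=execution_history_days)
    most_recent = sorted(end_times, reverse=True)[:API_RUN_LIMIT]
    return [t for t in most_recent if t >= cutoff]


# A pipeline that ran hourly for 150 hours (about 6.25 days).
now = datetime(2024, 1, 8, tzinfo=timezone.utc)
hourly_runs = [now - timedelta(hours=i) for i in range(150)]

week = runs_in_scope(hourly_runs, execution_history_days=7, now=now)
day = runs_in_scope(hourly_runs, execution_history_days=1, now=now)
```

In this simulation the 7-day window covers all 150 runs but only 100 survive the cap, while the 1-day window is narrow enough that the cap never applies. For frequently running pipelines, a shorter ingestion interval matters more than a larger execution_history_days value.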
Use the connector's platform_instance config to distinguish separate Fabric tenants when ingesting from multiple environments:
| Scenario | Risk | Solution |
|---|---|---|
| Single tenant | None | Not needed |
| Multiple tenants | High - name collision risk | Required |
```yaml
# Multi-tenant example
source:
  type: fabric-data-factory
  config:
    platform_instance: "contoso-tenant" # Prevents URN collisions
```
:::warning
Different Fabric tenants could have identically-named workspaces and pipelines. Use platform_instance to prevent entity overwrites.
:::
Pipeline URNs follow this format:
urn:li:dataFlow:(fabric-data-factory,{workspace_id}.{pipeline_id},{env})
With platform_instance:
urn:li:dataFlow:(fabric-data-factory,{platform_instance}.{workspace_id}.{pipeline_id},{env})
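The two URN formats above differ only in the optional instance prefix, which a small helper makes explicit. This is an illustrative sketch that formats the string by hand; `pipeline_urn` and the IDs `ws-1` and `pl-1` are hypothetical placeholders.

```python
from typing import Optional

PLATFORM = "fabric-data-factory"


def pipeline_urn(workspace_id: str, pipeline_id: str, env: str = "PROD",
                 platform_instance: Optional[str] = None) -> str:
    """Format a pipeline (dataFlow) URN, with an optional instance prefix."""
    flow_id = f"{workspace_id}.{pipeline_id}"
    if platform_instance:
        flow_id = f"{platform_instance}.{flow_id}"
    return f"urn:li:dataFlow:({PLATFORM},{flow_id},{env})"


plain = pipeline_urn("ws-1", "pl-1")
scoped = pipeline_urn("ws-1", "pl-1", platform_instance="contoso-tenant")
```

Because the instance is embedded in the URN, adding or changing platform_instance after an initial ingestion produces new entities rather than updating the existing ones.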
- If execution_history_days covers more runs than the Fabric API's limit of 100 runs per pipeline, only the most recent 100 are returned. Run ingestion more frequently to capture deeper history.
- The ExecutePipeline activity type is marked as legacy in Fabric and is not supported for cross-pipeline lineage.
- Only the InvokeFabricPipeline operation type is supported for cross-pipeline lineage. Other operation types (InvokeAdfPipeline, InvokeExternalPipeline) are not resolved and will be skipped.
- When a Copy activity uses sqlReaderQuery or sqlReaderStoredProcedureName instead of a direct table reference, lineage is not extracted.
- Use platform_instance_map to explicitly map connection names.

- Verify that workspace_pattern and pipeline_pattern are not filtering out all items.
- Verify that include_lineage: true is set and that Fabric connections are properly configured for the pipelines. Also review the Lineage limitations section for unsupported activity types and scenarios.
- Enable stateful_ingestion to automatically remove entities that no longer exist in Fabric.
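One way to check whether a pattern is filtering out everything is to test it against a few known names before running ingestion. The sketch below assumes workspace_pattern and pipeline_pattern follow the allow/deny regex convention common to DataHub sources (deny wins, allow defaults to `.*`). That is an assumption about this connector, and `is_allowed` and the sample names are hypothetical.

```python
import re
from typing import Dict, List


def is_allowed(name: str, pattern: Dict[str, List[str]]) -> bool:
    """Return True if name passes the allow list and matches no deny regex."""
    if any(re.match(p, name) for p in pattern.get("deny", [])):
        return False
    return any(re.match(p, name) for p in pattern.get("allow", [".*"]))


# Hypothetical pipeline names and a pattern that is too strict for them.
pipeline_names = ["dev-etl", "staging-etl"]
strict_pattern = {"allow": ["^prod-.*"]}

kept = [n for n in pipeline_names if is_allowed(n, strict_pattern)]
kept_by_default = [n for n in pipeline_names if is_allowed(n, {})]
```

An empty `kept` list for names you expect to ingest points to the pattern, rather than permissions or connectivity, as the cause of missing pipelines.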