README

metadata-ingestion/docs/sources/informatica/README.md

Overview

Informatica Intelligent Data Management Cloud (IDMC) is a cloud-native data integration and management platform. Learn more in the official Informatica documentation.

The DataHub integration for Informatica models projects and folders as containers, Mapping Tasks as DataFlows with one transform DataJob per task, and Taskflows as DataFlows with a single orchestrate DataJob that chains the step order via `inputDatajobs`. It resolves table-level lineage across the data estate from mapping source/target connections, and it supports ownership extraction and stateful deletion detection.

Concept Mapping

| Source Concept | DataHub Concept | Notes |
| --- | --- | --- |
| "informatica" | Data Platform | |
| Project | Container | SubType "Project" |
| Folder | Container | SubType "Folder" |
| Taskflow | DataFlow + one orchestrate DataJob | SubTypes "Taskflow" / "Taskflow Orchestration"; the orchestrate DataJob sits at the end of the chain with `inputDatajobs` = [last MT] |
| Mapping Task | DataFlow + inner transform DataJob | SubTypes "Mapping Task" / "Task Logic"; MTs chain to each other via `inputDatajobs` in Taskflow step order |
| Mapping | not emitted as a standalone entity | Only Mapping Tasks (runnable schedules) are emitted; the Mapping reference is surfaced via `customProperties` on the Task |
| Mapplet | not emitted | Internal sub-mappings included in other mappings; skipped |
| Source/Target | Dataset | Upstream/downstream lineage; external dataset URNs receive a minimal `Status` stub so they resolve in lineage search |
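
To make the chaining described for Taskflows and Mapping Tasks concrete, here is a minimal Python sketch that builds the `inputDatajobs` relationships for one Taskflow. It is an illustration, not the connector's implementation: the helper names (`flow_urn`, `job_urn`, `chain_taskflow`), the job IDs `transform` and `orchestrate`, the `PROD` environment, and the example Taskflow and Mapping Task IDs are all assumptions. Only the general DataFlow and DataJob URN shapes follow DataHub's standard format.

```python
from typing import Dict, List, Optional

PLATFORM = "informatica"
ENV = "PROD"  # assumed environment; the real value depends on the recipe


def flow_urn(flow_id: str) -> str:
    """Build a DataFlow URN in DataHub's standard shape."""
    return f"urn:li:dataFlow:({PLATFORM},{flow_id},{ENV})"


def job_urn(flow_id: str, job_id: str) -> str:
    """Build a DataJob URN nested under its parent DataFlow."""
    return f"urn:li:dataJob:({flow_urn(flow_id)},{job_id})"


def chain_taskflow(taskflow_id: str, mapping_task_ids: List[str]) -> Dict[str, List[str]]:
    """Map each DataJob URN to its inputDatajobs, following Taskflow step order.

    Each Mapping Task (MT) is its own DataFlow with an inner transform DataJob.
    Every MT points at the previous MT; the Taskflow's single orchestrate
    DataJob closes the chain and points at the last MT.
    """
    inputs: Dict[str, List[str]] = {}
    previous: Optional[str] = None
    for mt_id in mapping_task_ids:
        mt_job = job_urn(mt_id, "transform")  # hypothetical job ID
        inputs[mt_job] = [previous] if previous else []
        previous = mt_job
    orchestrate = job_urn(taskflow_id, "orchestrate")  # hypothetical job ID
    inputs[orchestrate] = [previous] if previous else []
    return inputs


if __name__ == "__main__":
    # Hypothetical Taskflow with two Mapping Task steps.
    chain = chain_taskflow("tf_daily_load", ["mt_extract", "mt_load"])
    for urn, upstream in chain.items():
        print(urn, "<-", upstream)
```

In this sketch the first Mapping Task has no upstream job, each later Mapping Task lists the one before it, and the orchestrate DataJob lists only the last Mapping Task, matching the `inputDatajobs = [last MT]` note in the table.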