:::info
This feature is currently in open beta in DataHub Cloud. Reach out to your DataHub Cloud representative to get access.
:::
Documentation Propagation is an automation that propagates column and asset (coming soon) descriptions based on downstream column-level lineage and sibling relationships. It simplifies metadata management by keeping descriptions consistent and reducing the manual effort of documenting data assets, which supports Data Governance & Compliance as well as Data Discovery.
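
To make the propagation idea concrete, here is a minimal conceptual sketch in Python. It is not DataHub's implementation: the column names, the `downstream` and `siblings` dictionaries, and the `propagate` function are all hypothetical, and the rule that an already documented column is left untouched (and not traversed further) is an assumption made for illustration.

```python
from collections import deque

# Hypothetical toy metadata: identifiers and edges are invented for illustration.
downstream = {
    "raw.users.email": ["staging.users.email"],
    "staging.users.email": ["mart.customers.email"],
}
siblings = {
    "mart.customers.email": ["dbt.customers.email"],
}
descriptions = {
    "raw.users.email": "Primary contact email of the user.",
}


def propagate(source, downstream, siblings, descriptions):
    """Copy the source column's description to undocumented columns reachable
    through downstream lineage or sibling edges. Returns the updated columns."""
    text = descriptions.get(source)
    if not text:
        return []  # nothing to propagate
    updated = []
    seen = {source}
    queue = deque([source])
    while queue:
        col = queue.popleft()
        for nxt in downstream.get(col, []) + siblings.get(col, []):
            if nxt in seen:
                continue
            seen.add(nxt)
            # Assumption: a column with its own description is left alone
            # and the walk does not continue past it.
            if descriptions.get(nxt):
                continue
            descriptions[nxt] = text
            updated.append(nxt)
            queue.append(nxt)
    return updated


updated = propagate("raw.users.email", downstream, siblings, descriptions)
```

In this toy graph the description of `raw.users.email` reaches the staging column, the mart column, and the mart column's sibling, in breadth-first order.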
This feature is enabled by default in Open Source DataHub.
| Feature | Open Source | DataHub Cloud |
|---|---|---|
| Column-Level Docs Propagation | ✔️ | ✔️ |
| Asset-Level Docs Propagation | ✔️ | ✔️ |
| Downstream Lineage + Siblings | ✔️ | ✔️ |
| Historical Backfilling | ❌ | ✔️ |
Note that you must have the Manage Ingestion permission to view and enable the feature.
In DataHub Cloud, you can back-fill historical data for existing assets to ensure that all existing column descriptions are propagated to downstreams when you start the automation. Note that it may take some time to complete the initial back-filling process, depending on the number of assets and the complexity of your lineage.
To do this, navigate to the Automation you created in Step 3 above, click the 3-dot "more" menu, and then click "Initialize".
This one-time step kicks off the back-filling process for existing descriptions. If you only want to begin propagating descriptions going forward, you can skip this step.
Once the automation is enabled, you'll be able to recognize propagated descriptions as those with the thunderbolt icon next to them.
The tooltip will provide additional information, including where the description originated and any intermediate hops that were used to propagate the description.
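
The origin and intermediate hops shown in the tooltip can be thought of as a path through the lineage graph. The following is a rough sketch of how such provenance could be recorded during a breadth-first walk. The `Provenance` class, the `trace_propagation` function, and the single-letter column names are hypothetical and do not reflect DataHub's internal data model.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Provenance:
    origin: str                               # column where the description was written
    hops: list = field(default_factory=list)  # columns between origin and target


def trace_propagation(origin, edges):
    """Walk the graph breadth-first from origin and record, for every reached
    column, the origin and the intermediate hops on the path that reached it."""
    paths = {origin: [origin]}
    result = {}
    queue = deque([origin])
    while queue:
        col = queue.popleft()
        for nxt in edges.get(col, []):
            if nxt in paths:
                continue  # already reached by a path at least as short
            paths[nxt] = paths[col] + [nxt]
            # Intermediate hops exclude the origin and the target itself.
            result[nxt] = Provenance(origin=origin, hops=paths[nxt][1:-1])
            queue.append(nxt)
    return result


# Hypothetical lineage: a -> b -> c
provenance = trace_propagation("a", {"a": ["b"], "b": ["c"]})
```

Here `provenance["c"]` has origin `a` and a single intermediate hop `b`, while `provenance["b"]` has no intermediate hops because it is directly downstream of the origin.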