datahub-actions/src/datahub_actions/plugin/action/propagation/docs/README.md
The Documentation Propagation Action allows you to automatically propagate documentation from schema fields to related schema fields. For example, when you add or update documentation for a column in a dataset, this action can automatically propagate that documentation to upstream, downstream, or sibling columns.
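This action listens for documentation changes on schema fields and propagates those changes to related fields based on your configuration. It supports propagation along upstream, downstream, and sibling relationships, bounded by configurable depth, fan-out, and time limits.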
The Documentation Propagation Action provides several configuration options:
- `enabled`: Controls whether documentation propagation is enabled (default: `true`)
- `columns_enabled`: Controls whether column-level documentation propagation is enabled (default: `true`)
- `datasets_enabled`: Controls whether dataset-level documentation propagation is enabled (default: `false`). Note: currently not implemented.
- `column_propagation_relationships`: Specifies which relationships to use for propagation. Valid values are:
  - `UPSTREAM`: Propagate to upstream columns
  - `DOWNSTREAM`: Propagate to downstream columns
  - `SIBLING`: Propagate to sibling columns
- `max_propagation_depth`: Maximum depth for propagation chains (default: 5)
- `max_propagation_fanout`: Maximum number of entities to propagate to in a single hop (default: 1000)
- `max_propagation_time_millis`: Maximum time in milliseconds for a propagation chain (default: 3600000, i.e. 1 hour)

Example configuration:

    name: "documentation_propagation"
    source:
      type: "kafka"
      config:
        connection:
          bootstrap: ${KAFKA_BOOTSTRAP_SERVER:-localhost:9092}
          schema_registry_url: ${SCHEMA_REGISTRY_URL:-http://localhost:8081}
        # Topic Routing - which topics to read from.
        topic_routes:
          #mcl: ${METADATA_CHANGE_LOG_VERSIONED_TOPIC_NAME:-MetadataChangeLog_Versioned_v1} # Topic name for MetadataChangeLogEvent_v1 events.
          pe: ${PLATFORM_EVENT_TOPIC_NAME:-PlatformEvent_v1} # Topic name for PlatformEvent_v1 events.
    filter:
      event_type: "EntityChangeEvent_v1"
      event:
        entityType: "schemaField"
        category: "DOCUMENTATION"
    action:
      type: "doc_propagation"
      config:
        enabled: true
        columns_enabled: true
        max_propagation_depth: 3 # Optional: Limit propagation depth
    datahub:
      server: ${DATAHUB_GMS_HOST:-http://localhost:8080}
      token: ${DATAHUB_GMS_TOKEN}
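
To illustrate what the `filter` block in the configuration above selects, here is a minimal Python sketch of exact-match filtering. It is not the DataHub Actions implementation, and the real framework may match fields differently. The function `matches_filter` and the two sample event payloads are hypothetical and exist only for this example.

```python
from typing import Any, Mapping

# Hypothetical in-memory copy of the `filter` block from the configuration above.
FILTER = {
    "event_type": "EntityChangeEvent_v1",
    "event": {"entityType": "schemaField", "category": "DOCUMENTATION"},
}


def matches_filter(
    event_type: str,
    event: Mapping[str, Any],
    flt: Mapping[str, Any] = FILTER,
) -> bool:
    """Return True if the event type and every listed event field match exactly."""
    if event_type != flt["event_type"]:
        return False
    return all(event.get(key) == expected for key, expected in flt["event"].items())


# Made-up payloads: only the documentation change on a schema field should pass.
doc_change = {"entityType": "schemaField", "category": "DOCUMENTATION", "operation": "ADD"}
tag_change = {"entityType": "schemaField", "category": "TAG", "operation": "ADD"}

print(matches_filter("EntityChangeEvent_v1", doc_change))  # True
print(matches_filter("EntityChangeEvent_v1", tag_change))  # False
```

Under this reading, the action only sees documentation changes on schema fields; other change categories on the same topic are ignored.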
When a documentation change is detected on a schema field:
When documentation is propagated, the action adds attribution metadata to track:
This attribution information is stored with the propagated documentation and can be viewed in the DataHub UI.
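
As a rough mental model of how `max_propagation_depth`, `max_propagation_fanout`, and `max_propagation_time_millis` could bound a propagation chain, the sketch below walks a toy lineage graph breadth-first. It is an illustration under stated assumptions, not the action's actual code: the real action may enforce these limits differently, and the `Propagated` record with its `origin` and `depth` fields is a hypothetical stand-in for the attribution metadata, whose exact fields are not listed in this section.

```python
import time
from collections import deque
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Limits:
    # Defaults copied from the configuration options listed earlier.
    max_propagation_depth: int = 5
    max_propagation_fanout: int = 1000
    max_propagation_time_millis: int = 3_600_000


@dataclass(frozen=True)
class Propagated:
    target: str  # field that received the documentation
    origin: str  # field where the documentation was first written (assumed field)
    depth: int   # number of hops from the origin (assumed field)


def propagate(
    origin: str,
    related: Callable[[str], Iterable[str]],
    limits: Limits = Limits(),
) -> List[Propagated]:
    """Breadth-first walk from `origin`, bounded by depth, fan-out, and time."""
    deadline = time.monotonic() + limits.max_propagation_time_millis / 1000
    seen = {origin}
    queue = deque([(origin, 0)])
    results: List[Propagated] = []
    while queue:
        if time.monotonic() > deadline:
            break  # time budget for the whole chain is used up
        field, depth = queue.popleft()
        if depth >= limits.max_propagation_depth:
            continue  # do not hop past the configured depth
        # Fan-out limit: consider at most N related fields per hop.
        neighbours = list(related(field))[: limits.max_propagation_fanout]
        for neighbour in neighbours:
            if neighbour in seen:
                continue  # never propagate to the same field twice
            seen.add(neighbour)
            results.append(Propagated(neighbour, origin, depth + 1))
            queue.append((neighbour, depth + 1))
    return results


# Toy lineage a -> b -> c -> d; with a depth limit of 2, "d" is never reached.
lineage = {"a": ["b"], "b": ["c"], "c": ["d"]}
result = propagate("a", lambda f: lineage.get(f, []), Limits(max_propagation_depth=2))
print([(p.target, p.depth) for p in result])  # [('b', 1), ('c', 2)]
```

Here `related` stands in for whichever relationships are enabled in `column_propagation_relationships`; the sketch does not distinguish upstream, downstream, and sibling edges.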