metadata-models/docs/entities/structuredProperty.md
Structured Properties are custom, typed metadata fields that can be attached to any entity type in DataHub. They enable organizations to extend the core metadata model with domain-specific attributes that support governance, compliance, and data discovery initiatives.
Structured Properties are identified by a single piece of information:
- Qualified name: a dot-separated, namespaced name (e.g., io.acryl.privacy.retentionTime or companyName.department.propertyName). The qualified name becomes part of the URN and must be globally unique across all structured properties in your DataHub instance.

An example of a structured property identifier is urn:li:structuredProperty:io.acryl.privacy.retentionTime.
Unlike other entities that may require multiple pieces of information (platform, name, environment), structured properties have a simple, flat identity model based solely on their qualified name.
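As a minimal sketch of this flat identity model, the helpers below convert between a qualified name and its URN by plain string handling. Both function names are hypothetical and are not part of the DataHub SDK; the only facts taken from the text are the urn:li:structuredProperty: prefix and the example qualified name.

```python
# Prefix taken from the example identifier in this document.
STRUCTURED_PROPERTY_URN_PREFIX = "urn:li:structuredProperty:"


def make_structured_property_urn(qualified_name: str) -> str:
    """Build a structured property URN from a qualified name (hypothetical helper)."""
    if not qualified_name:
        raise ValueError("qualified name must not be empty")
    return STRUCTURED_PROPERTY_URN_PREFIX + qualified_name


def qualified_name_from_urn(urn: str) -> str:
    """Recover the qualified name from a structured property URN (hypothetical helper)."""
    if not urn.startswith(STRUCTURED_PROPERTY_URN_PREFIX):
        raise ValueError(f"not a structured property URN: {urn!r}")
    return urn[len(STRUCTURED_PROPERTY_URN_PREFIX):]


if __name__ == "__main__":
    print(make_structured_property_urn("io.acryl.privacy.retentionTime"))
```

Because the qualified name is the only key, the two functions are exact inverses of each other.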
The core metadata for a structured property is captured in the propertyDefinition aspect, which defines:
Structured properties support strongly-typed values through the valueType field, which must reference a valid dataType entity. The supported types are:
The type is specified as a URN: urn:li:dataType:datahub.string, urn:li:dataType:datahub.number, etc.
For URN-type properties, you can restrict which entity types are allowed using the typeQualifier field. This enables creating properties that reference only specific entity types:
{
  "valueType": "urn:li:dataType:datahub.urn",
  "typeQualifier": {
    "allowedTypes": [
      "urn:li:entityType:datahub.corpuser",
      "urn:li:entityType:datahub.corpGroup"
    ]
  }
}
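The effect of a type qualifier can be illustrated with a small client-side check. This is a sketch only, not DataHub's actual validation logic: the function names are hypothetical, and treating a missing qualifier as "any entity type is accepted" is an assumption made for the example.

```python
from typing import Any, Dict

# Same fragment as the JSON example, expressed as a Python dict.
property_definition_fragment: Dict[str, Any] = {
    "valueType": "urn:li:dataType:datahub.urn",
    "typeQualifier": {
        "allowedTypes": [
            "urn:li:entityType:datahub.corpuser",
            "urn:li:entityType:datahub.corpGroup",
        ]
    },
}


def entity_type_of(urn: str) -> str:
    """Return the entity type segment of a URN such as urn:li:corpuser:jdoe."""
    parts = urn.split(":", 3)
    if len(parts) < 4 or parts[0] != "urn" or parts[1] != "li":
        raise ValueError(f"not a DataHub-style URN: {urn!r}")
    return parts[2]


def is_allowed_reference(urn: str, definition: Dict[str, Any]) -> bool:
    """Check a URN value against the definition's allowedTypes (hypothetical helper)."""
    allowed = definition.get("typeQualifier", {}).get("allowedTypes", [])
    if not allowed:
        # Assumption for this sketch: no qualifier means no restriction.
        return True
    # "urn:li:entityType:datahub.corpuser" -> "corpuser"
    allowed_names = {entity_type.rsplit(".", 1)[-1] for entity_type in allowed}
    return entity_type_of(urn) in allowed_names
```

With the fragment above, a user or group URN passes the check, while a dataset URN is rejected.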
Structured properties can accept either single or multiple values via the cardinality field:
The entityTypes array specifies which entity types this property can be applied to. For example, a property might only be applicable to:
- urn:li:entityType:datahub.dataset
- urn:li:entityType:datahub.schemaField
- urn:li:entityType:datahub.dashboard

To enforce data quality and consistency, you can define a whitelist of acceptable values using the allowedValues array. Each allowed value can include:
When allowed values are defined, the system validates that any property value assignment matches one of the allowed values.
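The validation rule can be sketched as a simple membership test. The function below is a hypothetical illustration, not the server-side implementation; the allowed values reuse the retention-time example that appears elsewhere in this document.

```python
from typing import Any, Dict, Iterable, List

# Allowed values modeled on the retention-time example in this document.
allowed_values: List[Dict[str, Any]] = [
    {"value": 30, "description": "30 days for ephemeral data"},
    {"value": 90, "description": "90 days for regular data"},
]


def validate_assignment(values: Iterable[Any], allowed: List[Dict[str, Any]]) -> None:
    """Raise ValueError if any assigned value is not in the allowed list.

    Hypothetical helper: an empty allowed list is treated as "no restriction".
    """
    if not allowed:
        return
    permitted = {entry["value"] for entry in allowed}
    rejected = [value for value in values if value not in permitted]
    if rejected:
        raise ValueError(f"values not in allowedValues: {rejected}")


if __name__ == "__main__":
    validate_assignment([30], allowed_values)  # accepted, no exception
    try:
        validate_assignment([45], allowed_values)
    except ValueError as error:
        print(error)
```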
The immutable field (default: false) determines whether a property value can be changed once set. When true, the property value becomes permanent and cannot be modified or removed, ensuring data integrity for critical metadata.
The version field enables breaking schema changes by allowing you to update the property definition in backwards-incompatible ways. Versions must follow the format yyyyMMddhhmmss (e.g., 20240614080000). When a new version is applied:
The structuredPropertySettings aspect controls how properties appear in the DataHub UI:
- isHidden: when true, the property is not visible in the UI (useful for internal metadata)
- showInSearchFilters: when true, users can filter search results by this property's values
- showInAssetSummary: when true, displays the property in the asset's sidebar
- hideInAssetSummaryWhenEmpty: when true and showInAssetSummary is enabled, hides the property from the sidebar when it has no value
- showAsAssetBadge: when true, displays the property value as a badge on the asset card (only available for string/number types with allowed values)
- showInColumnsTable: when true, displays the property as a column in dataset schema tables (useful for column-level properties)

Structured properties also support standard DataHub aspects:
- status: supports soft deletion by setting removed: true, which hides the property without deleting underlying data

This example creates a structured property for tracking data retention time with numeric values and validation:
<details>
<summary>Python SDK: Create a structured property with allowed values</summary>

{{ inline /metadata-ingestion/examples/library/structured_property_create_basic.py show_path_as_comment }}

</details>
This example creates a property that accepts URN values but only allows references to users and groups:
<details>
<summary>Python SDK: Create a structured property with type qualifiers</summary>

{{ inline /metadata-ingestion/examples/library/structured_property_create_with_type_qualifier.py show_path_as_comment }}

</details>
Once structured properties are defined, you can assign them to entities:
<details>
<summary>Python SDK: Apply structured properties to a dataset</summary>

{{ inline /metadata-ingestion/examples/library/dataset_update_structured_properties.py show_path_as_comment }}

</details>
For more granular control, use patch operations to add or remove individual properties:
<details>
<summary>Python SDK: Add and remove structured properties using patches</summary>

{{ inline /metadata-ingestion/examples/library/dataset_add_structured_properties_patch.py show_path_as_comment }}

</details>
Retrieve structured property definitions to understand their configuration:
<details>
<summary>Python SDK: Query a structured property</summary>

{{ inline /metadata-ingestion/examples/library/structured_property_query.py show_path_as_comment }}

</details>
You can also retrieve structured property definitions using the REST API:
<details>
<summary>REST API: Get structured property definition</summary>

curl -X 'GET' \
  'http://localhost:8080/openapi/v3/entity/structuredProperty/urn%3Ali%3AstructuredProperty%3Aio.acryl.privacy.retentionTime/propertyDefinition' \
  -H 'accept: application/json'

Example Response:

{
  "urn": "urn:li:structuredProperty:io.acryl.privacy.retentionTime",
  "propertyDefinition": {
    "value": {
      "qualifiedName": "io.acryl.privacy.retentionTime",
      "displayName": "Retention Time",
      "valueType": "urn:li:dataType:datahub.number",
      "description": "Number of days to retain data",
      "entityTypes": ["urn:li:entityType:datahub.dataset"],
      "cardinality": "SINGLE",
      "allowedValues": [
        {
          "value": { "double": 30 },
          "description": "30 days for ephemeral data"
        },
        {
          "value": { "double": 90 },
          "description": "90 days for regular data"
        }
      ]
    }
  }
}

</details>
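The URN in the request path must be percent-encoded, which is easy to get wrong by hand. The sketch below builds the same URL with the standard library and reads the allowed values out of a response shaped like the example. The helper names are hypothetical, the sample payload is a trimmed copy of the example response, and no request is actually sent.

```python
import json
from typing import List
from urllib.parse import quote


def property_definition_url(base_url: str, urn: str) -> str:
    """Build the propertyDefinition endpoint URL for a structured property URN."""
    # safe="" forces ':' to be encoded as %3A, matching the curl example.
    encoded_urn = quote(urn, safe="")
    return (
        f"{base_url.rstrip('/')}/openapi/v3/entity/structuredProperty/"
        f"{encoded_urn}/propertyDefinition"
    )


def allowed_numbers(response_body: str) -> List[float]:
    """Extract numeric allowed values from a response shaped like the example."""
    definition = json.loads(response_body)["propertyDefinition"]["value"]
    return [entry["value"]["double"] for entry in definition.get("allowedValues", [])]


# Trimmed copy of the example response, used here instead of a live call.
SAMPLE_RESPONSE = """
{
  "urn": "urn:li:structuredProperty:io.acryl.privacy.retentionTime",
  "propertyDefinition": {
    "value": {
      "qualifiedName": "io.acryl.privacy.retentionTime",
      "allowedValues": [
        {"value": {"double": 30}, "description": "30 days for ephemeral data"},
        {"value": {"double": 90}, "description": "90 days for regular data"}
      ]
    }
  }
}
"""

if __name__ == "__main__":
    print(property_definition_url(
        "http://localhost:8080",
        "urn:li:structuredProperty:io.acryl.privacy.retentionTime",
    ))
    print(allowed_numbers(SAMPLE_RESPONSE))
```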
Structured properties enable flexible metadata extension across the entire DataHub ecosystem:
Structured properties can be attached to any core data asset:
Properties also extend to organizational and governance entities:
Structured properties work in conjunction with DataHub Forms to enable:
Structured properties enhance search capabilities:
- Properties with showInSearchFilters: true become available as faceted search filters

The GraphQL layer exposes structured properties through several resolvers:
- CreateStructuredPropertyResolver: Creates new structured property definitions
- UpdateStructuredPropertyResolver: Modifies existing property definitions (within allowed constraints)
- DeleteStructuredPropertyResolver: Removes structured property definitions
- UpsertStructuredPropertiesResolver: Assigns property values to entities
- RemoveStructuredPropertiesResolver: Removes property values from entities

These resolvers are located in /datahub-graphql-core/src/main/java/com/linkedin/datahub/graphql/resolvers/structuredproperties/.
Once a structured property is created and in use, certain fields cannot be modified to prevent data inconsistency:
Immutable Fields:
Modifiable Fields:
- structuredPropertySettings can be freely modified

To make backwards-incompatible changes (like changing cardinality or removing allowed values), you must:
- Apply a new version value in the format yyyyMMddhhmmss

This is a destructive operation and should be carefully planned.
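Since a malformed version string is easy to produce by hand, a small pre-flight check can help. This is a sketch under assumptions: the helper names are hypothetical, and the hour field is interpreted on a 24-hour clock, which the format string in the text does not state explicitly.

```python
import re
from datetime import datetime

# Exactly 14 digits: yyyyMMddhhmmss.
_VERSION_PATTERN = re.compile(r"\d{14}")
# Assumption: hours are parsed on a 24-hour clock.
_VERSION_FORMAT = "%Y%m%d%H%M%S"


def is_valid_version(version: str) -> bool:
    """Return True if the string is 14 digits that form a real date and time."""
    if not _VERSION_PATTERN.fullmatch(version):
        return False
    try:
        datetime.strptime(version, _VERSION_FORMAT)
    except ValueError:
        return False
    return True


def new_version(moment: datetime) -> str:
    """Format a datetime as a version string (hypothetical helper)."""
    return moment.strftime(_VERSION_FORMAT)


if __name__ == "__main__":
    print(is_valid_version("20240614080000"))
    print(new_version(datetime(2024, 6, 14, 8, 0, 0)))
```

Generating the version from a timestamp, rather than typing it, keeps successive versions naturally ordered.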
Structured properties support two deletion modes:
Soft Delete (via status aspect with removed: true):
- Can be reversed by setting removed: false

Hard Delete (via entity deletion):
When creating properties for schema fields (columns), be aware:
- showInColumnsTable: true displays the property in all dataset schema views

Structured property values are indexed in Elasticsearch using a special naming convention:
- Unversioned: structuredProperties.io_acryl_privacy_retentionTime
- Versioned: structuredProperties._versioned.io_acryl_privacy_retentionTime02.20240614080000.string

Understanding this convention is important for:
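The unversioned field name can be derived mechanically from the qualified name. The sketch below assumes, based only on the example in this section, that dots in the qualified name are replaced with underscores and the result is prefixed with structuredProperties.; the function name is hypothetical and the versioned form is deliberately not covered.

```python
def unversioned_field_name(qualified_name: str) -> str:
    """Derive the unversioned Elasticsearch field name (hypothetical helper).

    Assumption inferred from the documented example: '.' becomes '_'.
    """
    return "structuredProperties." + qualified_name.replace(".", "_")


if __name__ == "__main__":
    print(unversioned_field_name("io.acryl.privacy.retentionTime"))
```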
For more information, refer to: