metadata-models/docs/entities/role.md
The role entity represents external access management roles from source systems (e.g., Snowflake, BigQuery) that control access to data assets. This entity enables DataHub to model and display which roles provide access to datasets, helping data consumers understand what permissions they need to access specific data resources.
Roles are identified by a single piece of information:
An example of a role identifier is urn:li:role:snowflake_reader_role.
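
As an illustration of the identifier format, the sketch below assembles a role URN from its id and percent-encodes it for use in a URL path. The helper names `make_role_urn` and `encode_urn_for_path` are hypothetical and are not part of the DataHub SDK; only the `urn:li:role:` prefix comes from the example above.

```python
from urllib.parse import quote

ROLE_URN_PREFIX = "urn:li:role:"


def make_role_urn(role_id: str) -> str:
    """Build a role URN from its identifier (hypothetical helper)."""
    if not role_id:
        raise ValueError("role_id must be a non-empty string")
    return f"{ROLE_URN_PREFIX}{role_id}"


def encode_urn_for_path(urn: str) -> str:
    """Percent-encode a URN so it can be placed in a URL path segment."""
    # safe="" forces ":" to be encoded as %3A.
    return quote(urn, safe="")


urn = make_role_urn("snowflake_reader_role")
print(urn)
print(encode_urn_for_path(urn))
```

Encoding the colons matters whenever the URN is embedded in a REST path rather than sent in a request body.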
Role properties are stored in the roleProperties aspect and contain key information about the external access management role:
The following code snippet shows how to create a role with properties:
<details>
<summary>Python SDK: Create a role with properties</summary>

{{ inline /metadata-ingestion/examples/library/role_create.py show_path_as_comment }}

</details>
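
For readers without the SDK at hand, here is a minimal stand-in for the shape of a role properties payload. The class `RolePropertiesSketch`, its field names, and the sample values are assumptions made for illustration; the authoritative definition is the roleProperties aspect itself.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class RolePropertiesSketch:
    """Illustrative stand-in for the roleProperties aspect (not the SDK class)."""

    name: str
    description: Optional[str] = None
    type: Optional[str] = None
    requestUrl: Optional[str] = None  # assumed field name for the request URL

    def to_payload(self) -> dict:
        # Drop unset optional fields so the payload only carries known values.
        return {k: v for k, v in asdict(self).items() if v is not None}


# Sample values are hypothetical.
props = RolePropertiesSketch(
    name="Snowflake Reader Role",
    description="Read-only access to analytics tables",
    type="READ",
    requestUrl="https://example.com/request-access",
)
print(json.dumps(props.to_payload(), indent=2))
```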
Roles can be assigned to users and groups through the actors aspect. This tracks which users (corpuser entities) and groups (corpGroup entities) have been provisioned with the role in the external system.
The actors aspect contains:
The following code snippet shows how to assign users and groups to a role:
<details>
<summary>Python SDK: Assign users and groups to a role</summary>

{{ inline /metadata-ingestion/examples/library/role_assign_actors.py show_path_as_comment }}

</details>
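
The following sketch shows one way an actors-style payload could be assembled from plain user and group names. The helper `build_actors_payload` and the nested `user` / `group` keys are assumptions for illustration, and the sample names are hypothetical; check the actors aspect definition before relying on this layout.

```python
from typing import Dict, Iterable, List


def build_actors_payload(
    users: Iterable[str], groups: Iterable[str]
) -> Dict[str, List[Dict[str, str]]]:
    """Assemble an actors-style payload from user and group names (hypothetical helper)."""
    return {
        # Assumed layout: one entry per provisioned user or group.
        "users": [{"user": f"urn:li:corpuser:{u}"} for u in users],
        "groups": [{"group": f"urn:li:corpGroup:{g}"} for g in groups],
    }


payload = build_actors_payload(users=["jdoe"], groups=["analytics-team"])
print(payload)
```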
Roles are connected to datasets through the access aspect on dataset entities. This aspect lists which roles provide access to a specific dataset, creating a clear view of the access control landscape.
The following code snippet shows how to associate roles with a dataset:
<details>
<summary>Python SDK: Associate roles with a dataset</summary>

{{ inline /metadata-ingestion/examples/library/role_assign_to_dataset.py show_path_as_comment }}

</details>
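
As a rough sketch of the dataset side of this link, the helper below builds an access-style payload from a list of role URNs. The function name `build_access_payload` and the `roles` / `urn` keys are assumptions for illustration, not a confirmed schema.

```python
from typing import Dict, List


def build_access_payload(role_urns: List[str]) -> Dict[str, List[Dict[str, str]]]:
    """Assemble an access-style payload listing role URNs (hypothetical helper)."""
    for urn in role_urns:
        # Only external role URNs belong here, not dataHubRole URNs.
        if not urn.startswith("urn:li:role:"):
            raise ValueError(f"not a role URN: {urn}")
    # Deduplicate while preserving the caller's order.
    unique = list(dict.fromkeys(role_urns))
    return {"roles": [{"urn": urn} for urn in unique]}


access = build_access_payload(
    ["urn:li:role:snowflake_reader_role", "urn:li:role:snowflake_reader_role"]
)
print(access)
```

Rejecting anything that is not a `urn:li:role:` URN guards against accidentally attaching a DataHub platform role to a dataset.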
You can retrieve role information using the standard REST API endpoints. The response includes all aspects of the role entity.
<details>
<summary>Query a role entity via REST API</summary>

curl 'http://localhost:8080/entities/urn%3Ali%3Arole%3Asnowflake_reader_role'

This will return the complete role entity including:

- roleKey: The identity aspect
- roleProperties: Name, description, type, and request URL
- actors: Users and groups assigned to the role

{{ inline /metadata-ingestion/examples/library/role_query.py show_path_as_comment }}

curl -X POST 'http://localhost:8080/entities?action=search' \
  -H 'Content-Type: application/json' \
  -d '{
    "entity": "role",
    "input": "reader",
    "start": 0,
    "count": 10
  }'
This searches across role names and returns matching role URNs.
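
The same search request can be sketched in Python with only the standard library. This mirrors the curl command above; the helper name `build_role_search_request` is hypothetical, and the `GMS_URL` value assumes a local instance as in the curl example. The request is built but not sent, so the snippet runs without a DataHub server.

```python
import json
import urllib.request

GMS_URL = "http://localhost:8080"  # assumed local GMS address, as in the curl example


def build_role_search_request(
    query: str, start: int = 0, count: int = 10
) -> urllib.request.Request:
    """Build (but do not send) the POST request for a role search."""
    body = {"entity": "role", "input": query, "start": start, "count": count}
    return urllib.request.Request(
        url=f"{GMS_URL}/entities?action=search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_role_search_request("reader")
print(request.get_method(), request.full_url)

# Sending the request requires a running DataHub instance:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```
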
</details>

Roles have direct relationships with user and group entities:

- corpuser entity via the "Has" relationship
- corpGroup entity via the "Has" relationship

These relationships enable:
It's important to distinguish between the role entity and the dataHubRole entity:
The roleMembership aspect on corpuser and corpGroup entities refers to dataHubRole entities, not the external role entities documented here.
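
Because both entity types have "role" in their names, it can help to tell them apart by the entity type segment of the URN. The sketch below does this with plain string handling; the helper names `urn_entity_type` and `is_external_role` are hypothetical and not part of the DataHub SDK.

```python
def urn_entity_type(urn: str) -> str:
    """Return the entity type segment of a URN shaped like urn:li:<type>:<id>."""
    parts = urn.split(":", 3)
    if len(parts) != 4 or parts[0] != "urn" or parts[1] != "li" or not parts[3]:
        raise ValueError(f"not a valid URN: {urn}")
    return parts[2]


def is_external_role(urn: str) -> bool:
    """True for external role entities, False for any other entity type."""
    return urn_entity_type(urn) == "role"


print(is_external_role("urn:li:role:snowflake_reader_role"))
print(is_external_role("urn:li:dataHubRole:Admin"))
```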
Common usage patterns for the role entity include:
The role entity is exposed through DataHub's GraphQL API with the Role type. Key resolvers include:
- RoleType: Provides search and batch load capabilities for role entities
- ListRolesResolver: Queries all roles in the system
- BatchAssignRoleResolver: Bulk assignment of roles to users/groups
- AcceptRoleResolver: Workflow for accepting role assignments

The role entity and access management features currently support only dataset entities. Although roles could conceptually apply to other data assets (dashboards, charts, etc.), the access aspect is defined only for datasets.
Future enhancements may extend role-based access management to additional entity types.
The role entity is designed to represent roles that exist in external systems. DataHub does not create or manage these roles directly; it only models them for discovery and documentation purposes. The actual provisioning and de-provisioning of role memberships must be performed in the source systems.
The Access Management UI features are disabled by default in self-hosted deployments. To enable role visualization in the UI, set the SHOW_ACCESS_MANAGEMENT environment variable to true for the datahub-gms service.
The role entity and access management features are under active development and subject to change. Planned enhancements include: