import FeatureAvailability from '@site/src/components/FeatureAvailability';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
Available starting in DataHub Cloud v0.3.17, DataHub Core v1.4.0.
Service Accounts provide a secure way to enable programmatic access to DataHub APIs without using personal user credentials. They are designed for automated workflows, CI/CD pipelines, data ingestion processes, and any other use case where a non-human identity needs to interact with DataHub.
Key benefits of using service accounts:
To manage service accounts, a user must have the Manage Service Accounts platform privilege. This privilege allows users to:
By default, users with the Admin role have this privilege. You can grant this privilege to other users or groups through Policies.
Service accounts can be managed from the Settings page in DataHub:
To create a new service account:
The service account will be assigned a unique identifier (URN) automatically. This URN follows the format:
urn:li:corpuser:service_<uuid>
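A script that handles a mix of actor URNs may want to tell service accounts apart from regular users. The following is a minimal sketch of such a check. The helper name `is_service_account_urn` is hypothetical, and the code assumes that `<uuid>` is a standard hyphenated UUID, which the format above does not spell out.

```python
import re
import uuid

# Assumption: the <uuid> suffix is a canonical 36-character hyphenated UUID.
_SERVICE_URN = re.compile(r"^urn:li:corpuser:service_([0-9a-fA-F-]{36})$")


def is_service_account_urn(urn: str) -> bool:
    """Return True if `urn` matches urn:li:corpuser:service_<uuid>."""
    match = _SERVICE_URN.match(urn)
    if match is None:
        return False
    try:
        # uuid.UUID rejects suffixes that are not well-formed UUIDs.
        uuid.UUID(match.group(1))
    except ValueError:
        return False
    return True
```

If your deployment uses a different suffix format, relax the pattern to a plain `urn:li:corpuser:service_` prefix check instead.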
After creating a service account, you need to generate an access token for it to authenticate with DataHub APIs:
:::caution Important
Copy and store the generated token securely. It will only be displayed once and cannot be retrieved later.
:::
Service accounts can be assigned DataHub roles to control their permissions. To assign a role:
Alternatively, you can add service accounts to Policies for more granular permission control.
To delete a service account:
:::warning
Deleting a service account will immediately invalidate all access tokens associated with it. Any automated workflows using those tokens will stop working.
:::
To revoke a specific token without deleting the entire service account:
Once you have generated an access token for a service account, you can use it to authenticate with DataHub APIs.
# Set the token as an environment variable
export DATAHUB_GMS_TOKEN="<your-service-account-token>"
# Subsequent CLI commands read the token from the environment
datahub get --urn "urn:li:dataset:..."
from datahub.emitter.rest_emitter import DatahubRestEmitter
# Create an emitter with service account token
emitter = DatahubRestEmitter(
    gms_server="http://localhost:8080",
    token="<your-service-account-token>",
)
# Use the emitter for API calls
emitter.emit(...)
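Because the token is shown only once and should not be committed to source control, a pipeline would typically read it from the environment rather than hard-code it. The sketch below shows one way to do that. The helper name `emitter_settings` is hypothetical, and using `DATAHUB_GMS_URL` for the server address is an assumption; adjust both to your setup.

```python
import os
from typing import Mapping


def emitter_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect emitter keyword arguments from environment variables."""
    token = env.get("DATAHUB_GMS_TOKEN", "").strip()
    if not token:
        # Fail early instead of sending unauthenticated requests.
        raise RuntimeError("DATAHUB_GMS_TOKEN is not set")
    return {
        # Assumption: DATAHUB_GMS_URL holds the server address.
        "gms_server": env.get("DATAHUB_GMS_URL", "http://localhost:8080"),
        "token": token,
    }


# Usage (requires the DataHub Python SDK to be installed):
# from datahub.emitter.rest_emitter import DatahubRestEmitter
# emitter = DatahubRestEmitter(**emitter_settings())
```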
curl -X POST 'https://your-datahub-instance/api/graphql' \
  -H 'Authorization: Bearer <your-service-account-token>' \
  -H 'Content-Type: application/json' \
  -d '{"query": "{ me { corpUser { urn } } }"}'

curl 'https://your-datahub-instance/openapi/v3/entity/dataset/...' \
  -H 'Authorization: Bearer <your-service-account-token>'
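The GraphQL `curl` call above can also be issued from Python using only the standard library. The following is a rough equivalent, not an official client: the function name `build_graphql_request` is hypothetical, and the base URL is the same placeholder used in the `curl` examples.

```python
import json
import urllib.request
from typing import Optional


def build_graphql_request(
    base_url: str, token: str, query: str, variables: Optional[dict] = None
) -> urllib.request.Request:
    """Build an authenticated POST request for the /api/graphql endpoint."""
    body = {"query": query}
    if variables is not None:
        body["variables"] = variables
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/graphql",
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Same bearer-token header as in the curl example.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Usage (performs a network call, so it is left commented out):
# req = build_graphql_request(
#     "https://your-datahub-instance", token, "{ me { corpUser { urn } } }"
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

Separating request construction from sending keeps the token handling easy to test without a live DataHub instance.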
Service accounts can also be managed programmatically using the GraphQL API.
mutation createServiceAccount($input: CreateServiceAccountInput!) {
  createServiceAccount(input: $input) {
    urn
    type
    name
    displayName
    description
    createdBy
    createdAt
  }
}
Variables:
{
  "input": {
    "displayName": "My Ingestion Pipeline",
    "description": "Service account for automated data ingestion"
  }
}
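When calling this mutation from a script, the query and variables are sent together as one JSON body, and the new URN is read from the response. The sketch below illustrates both steps. The helper names are hypothetical, the selection set is trimmed to two of the fields shown above, and the response shape assumed is the standard GraphQL `data`/`errors` envelope.

```python
CREATE_SERVICE_ACCOUNT = """
mutation createServiceAccount($input: CreateServiceAccountInput!) {
  createServiceAccount(input: $input) {
    urn
    displayName
  }
}
"""


def create_service_account_payload(display_name: str, description: str) -> dict:
    """Build the JSON body for the createServiceAccount mutation."""
    return {
        "query": CREATE_SERVICE_ACCOUNT,
        "variables": {
            "input": {"displayName": display_name, "description": description}
        },
    }


def created_urn(response: dict) -> str:
    """Extract the new URN from a decoded GraphQL response, or raise on errors."""
    if response.get("errors"):
        raise RuntimeError(f"createServiceAccount failed: {response['errors']}")
    return response["data"]["createServiceAccount"]["urn"]
```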
query listServiceAccounts($input: ListServiceAccountsInput!) {
  listServiceAccounts(input: $input) {
    start
    count
    total
    serviceAccounts {
      urn
      name
      displayName
      description
      createdBy
      createdAt
      updatedAt
    }
  }
}
Variables:
{
  "input": {
    "start": 0,
    "count": 20,
    "query": ""
  }
}
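The `start`, `count`, and `total` fields suggest offset-based pagination. Below is a minimal sketch of walking every page. It assumes that `start` is a zero-based offset and `total` is the overall number of matches, and it takes a caller-supplied `execute(query, variables)` function (hypothetical) that returns the `data` object of the GraphQL response.

```python
from typing import Callable, Iterator

LIST_SERVICE_ACCOUNTS = """
query listServiceAccounts($input: ListServiceAccountsInput!) {
  listServiceAccounts(input: $input) {
    start
    count
    total
    serviceAccounts {
      urn
      displayName
    }
  }
}
"""


def iter_service_accounts(
    execute: Callable[[str, dict], dict], page_size: int = 20
) -> Iterator[dict]:
    """Yield every service account, requesting one page at a time."""
    start = 0
    while True:
        variables = {"input": {"start": start, "count": page_size, "query": ""}}
        page = execute(LIST_SERVICE_ACCOUNTS, variables)["listServiceAccounts"]
        accounts = page["serviceAccounts"]
        yield from accounts
        start += len(accounts)
        # Stop on an empty page or once the reported total has been reached.
        if not accounts or start >= page["total"]:
            break
```

Passing `execute` in as a parameter keeps the loop independent of the HTTP client and lets it be exercised with a stub.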
query getServiceAccount($urn: String!) {
  getServiceAccount(urn: $urn) {
    urn
    name
    displayName
    description
    createdBy
    createdAt
    updatedAt
  }
}
mutation deleteServiceAccount($urn: String!) {
  deleteServiceAccount(urn: $urn)
}
mutation createAccessToken($input: CreateAccessTokenInput!) {
  createAccessToken(input: $input) {
    accessToken
    metadata {
      id
      actorUrn
      ownerUrn
      name
      description
    }
  }
}
Variables:
{
  "input": {
    "type": "SERVICE_ACCOUNT",
    "actorUrn": "urn:li:corpuser:service_<uuid>",
    "name": "my-token-name",
    "duration": "ONE_YEAR"
  }
}
Valid duration options: ONE_HOUR, ONE_DAY, ONE_MONTH, THREE_MONTHS, SIX_MONTHS, ONE_YEAR, NO_EXPIRY
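A script that generates tokens can validate the duration before sending the mutation, so that a typo fails locally rather than at the API. The following is a small sketch. The helper name `access_token_variables` is hypothetical, and the set of durations is copied from the list above.

```python
VALID_DURATIONS = frozenset(
    {
        "ONE_HOUR",
        "ONE_DAY",
        "ONE_MONTH",
        "THREE_MONTHS",
        "SIX_MONTHS",
        "ONE_YEAR",
        "NO_EXPIRY",
    }
)


def access_token_variables(
    actor_urn: str, name: str, duration: str = "ONE_YEAR"
) -> dict:
    """Build the variables object for the createAccessToken mutation."""
    if duration not in VALID_DURATIONS:
        raise ValueError(
            f"Unknown duration {duration!r}; expected one of {sorted(VALID_DURATIONS)}"
        )
    return {
        "input": {
            "type": "SERVICE_ACCOUNT",
            "actorUrn": actor_urn,
            "name": name,
            "duration": duration,
        }
    }
```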
Use descriptive names that indicate the purpose of each service account:
- ingestion-snowflake-prod - For Snowflake production ingestion
- cicd-github-actions - For GitHub Actions CI/CD pipelines
- airflow-metadata-sync - For Airflow metadata synchronization

Regularly rotate service account tokens to maintain security:
Assign the minimum permissions required for each service account:
The Service Accounts tab is only visible to users with the Manage Service Accounts privilege. Contact your DataHub administrator to request access.
Check the following:
Yes, navigate to Settings > Access Tokens and filter by the service account name to see all tokens associated with it. Note that you must also filter by the token owner for the tokens to appear. The token owner is whoever created the service account.
All tokens associated with a service account are invalidated when the service account is deleted.
Yes, service accounts can be added to policies just like regular users. Use the service account's URN (e.g., urn:li:corpuser:service_<uuid>) when configuring policy actors.
| Feature | Personal Access Token | Service Account Token |
|---|---|---|
| Associated with | A human user | A service account |
| Permissions | Inherits user's permissions | Based on service account's role/policies |
| Revocation impact | Only affects the token holder | Only affects the specific automation |
| Use case | Individual user API access | Automated workflows and integrations |