metadata-ingestion/docs/sources/dbt/dbt-cloud_pre.md
The dbt-cloud module ingests metadata from dbt into DataHub. It is intended for production ingestion workflows, and module-specific capabilities are documented below.
Before running ingestion, ensure network connectivity to the source, valid authentication credentials, and read permissions for metadata APIs required by this module.
Extracts dbt metadata from dbt Cloud APIs.
Create a service account token with "Metadata Only" permission (read-only).
The dbt Cloud source supports two modes of operation:
Specify a single dbt Cloud job to ingest metadata from. The job must have "Generate docs on run" enabled and should process all or most models; otherwise, you may need to ingest from multiple jobs.
To get the required IDs, go to the job details page (this is the one with the "Run History" table), and look at the URL. It should look something like this: https://cloud.getdbt.com/next/deploy/107298/projects/175705/jobs/148094. In this example, the account ID is 107298, the project ID is 175705, and the job ID is 148094.
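The URL breakdown above can be sketched as a small helper. This is only an illustration: the function name `parse_job_url` and the regular expression are assumptions made for this example, not part of the dbt-cloud module, and the pattern assumes the URL follows the `/deploy/<account>/projects/<project>/jobs/<job>` layout shown in the example URL.

```python
import re
from typing import Dict

# Matches the path layout of the example job details URL:
# .../deploy/<account ID>/projects/<project ID>/jobs/<job ID>
_JOB_URL = re.compile(
    r"/deploy/(?P<account_id>\d+)/projects/(?P<project_id>\d+)/jobs/(?P<job_id>\d+)"
)


def parse_job_url(url: str) -> Dict[str, int]:
    """Extract the account, project, and job IDs from a job details URL.

    Hypothetical helper for illustration; raises ValueError when the URL
    does not contain the expected path segments.
    """
    match = _JOB_URL.search(url)
    if match is None:
        raise ValueError(f"not a dbt Cloud job details URL: {url!r}")
    return {name: int(value) for name, value in match.groupdict().items()}


ids = parse_job_url(
    "https://cloud.getdbt.com/next/deploy/107298/projects/175705/jobs/148094"
)
print(ids)  # {'account_id': 107298, 'project_id': 175705, 'job_id': 148094}
```

If your dbt Cloud URLs use a different layout, adjust the pattern accordingly or copy the three IDs by hand.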
Automatically discovers and ingests metadata from all eligible jobs in a dbt Cloud project. This mode:
- (jobs with `generate_docs=True`)
- … (`run_id` configuration)

When to use auto-discovery:
Requirements: