The cube_dbt package simplifies defining the data model in the semantic layer
on top of dbt models. It provides convenient tools for
loading the metadata of a dbt project, inspecting dbt models, and rendering
them as cubes in YAML.
cube_dbt package from PyPI
cube_dbt on GitHub
cube on GitHub
{/*
Run the following command in the root directory of your Cube project:
echo "cube_dbt" > requirements.txt
pip install -r requirements.txt
*/}
Add the cube_dbt package to the requirements.txt file in the root
directory of your Cube project. Cube Cloud will install the dependencies
automatically.
Dbt class
Encapsulates tools for working with the metadata of a dbt project.
Dbt.__init__
The constructor accepts the metadata of a dbt project as a dict with the
contents of a manifest.json file.
import json
from cube_dbt import Dbt
manifest_path = './manifest.json'
with open(manifest_path, 'r') as file:
    manifest = json.loads(file.read())
dbt = Dbt(manifest)
Use the constructor when Dbt.from_file and Dbt.from_url aren't applicable,
e.g., when manifest.json is loaded from a private AWS S3 bucket.
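As a minimal sketch of that case, the snippet below parses raw manifest bytes into the dict that the constructor accepts. The helper name manifest_from_bytes and the sample payload are hypothetical; in practice the bytes would come from whatever storage client you use (with boto3, for instance, from the body of a get_object response).

```python
import json


def manifest_from_bytes(raw: bytes) -> dict:
    """Parse raw manifest.json bytes into a dict (hypothetical helper)."""
    manifest = json.loads(raw.decode("utf-8"))
    if not isinstance(manifest, dict):
        raise ValueError("manifest.json must contain a JSON object at the top level")
    return manifest


# Stand-in for bytes fetched by a storage client (hypothetical payload).
# With boto3 this could be: s3.get_object(Bucket=..., Key=...)["Body"].read()
raw = b'{"nodes": {"model.jaffle_shop.orders": {"name": "orders"}}}'

manifest = manifest_from_bytes(raw)
print(sorted(manifest["nodes"]))
```

The resulting dict can then be passed to the constructor as Dbt(manifest).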
Dbt.from_file
This static method loads the metadata of a dbt project from a manifest.json
file by its path and returns an instance of the Dbt class.
from cube_dbt import Dbt
manifest_path = './manifest.json'
dbt = Dbt.from_file(manifest_path)
Dbt.from_url
This static method loads the metadata of a dbt project from a manifest.json
file by its URL and returns an instance of the Dbt class.
from cube_dbt import Dbt
manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'
dbt = Dbt.from_url(manifest_url)
Dbt.filter
This method filters loaded dbt models by their path prefixes, tags, or names.
from cube_dbt import Dbt
manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'
dbt = Dbt.from_url(manifest_url).filter(
    paths=['marts/'],  # Only models under the 'marts/' path
    tags=['cube'],     # Only models with the 'cube' tag
    names=['orders']   # Only the 'orders' model
)
Use this method to expose only the necessary dbt models to the semantic layer.
Note that values in paths should not be prefixed with models/.
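To make the filtering rules concrete, here is a rough stdlib-only sketch of how such a selection could work on the nodes of a manifest. It is not the package's implementation: the function select_models and the sample nodes are hypothetical, and it assumes that the filters are combined (a model must pass every filter that is given) and that a model must carry all listed tags. It also skips models materialized as ephemeral.

```python
def select_models(nodes, paths=None, tags=None, names=None):
    """Return names of model nodes that pass every filter that was given."""
    selected = []
    for node in nodes.values():
        if node.get("resource_type") != "model":
            continue
        # Ephemeral models are skipped
        if node.get("config", {}).get("materialized") == "ephemeral":
            continue
        # Paths are matched as prefixes, without the leading 'models/'
        if paths and not any(node["path"].startswith(p) for p in paths):
            continue
        # Assumption: every listed tag must be present on the model
        if tags and not all(t in node.get("tags", []) for t in tags):
            continue
        if names and node["name"] not in names:
            continue
        selected.append(node["name"])
    return selected


# Hypothetical fragment of the 'nodes' section of a manifest.json file
nodes = {
    "model.jaffle_shop.orders": {
        "resource_type": "model", "name": "orders",
        "path": "marts/orders.sql", "tags": ["cube"],
        "config": {"materialized": "table"},
    },
    "model.jaffle_shop.stg_orders": {
        "resource_type": "model", "name": "stg_orders",
        "path": "staging/stg_orders.sql", "tags": [],
        "config": {"materialized": "view"},
    },
    "model.jaffle_shop.tmp_orders": {
        "resource_type": "model", "name": "tmp_orders",
        "path": "marts/tmp_orders.sql", "tags": ["cube"],
        "config": {"materialized": "ephemeral"},
    },
}

print(select_models(nodes, paths=["marts/"], tags=["cube"]))
```

In this sample, only the orders model passes: stg_orders is outside marts/ and tmp_orders is ephemeral.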
Dbt.models
This property exposes a list of loaded dbt models as instances of the
Model class.
from cube_dbt import Dbt
manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'
dbt = Dbt.from_url(manifest_url)
for model in dbt.models:
    print(model)
Only dbt models that comply with Dbt.filter rules and are not
materialized as ephemeral will be returned.
Dbt.model
This method returns a loaded dbt model by its name as an instance of the
Model class.
from cube_dbt import Dbt
manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'
dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
print(model)
Only dbt models that comply with Dbt.filter rules and are not
materialized as ephemeral will be returned.
Model class
Encapsulates tools for working with the metadata of a dbt model.
Model.name
This property exposes the name of a dbt model.
from cube_dbt import Dbt
manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'
dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
print(model.name)
# For example, 'orders'
Model.description
This property exposes the description of a dbt model.
from cube_dbt import Dbt
manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'
dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
print(model.description)
# For example, 'All Jaffle Shop orders'
Model.sql_table
This property exposes the fully-qualified SQL relation name of a dbt model
that can be used as the sql_table parameter of a cube.
from cube_dbt import Dbt
manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'
dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
print(model.sql_table)
# For example, '"db"."public"."orders"'
Model.columns
This property exposes a list of columns that belong to this dbt model as
instances of the Column class.
from cube_dbt import Dbt
manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'
dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
for column in model.columns:
    print(column)
Model.column
This method exposes a column that belongs to this dbt model by its name as
an instance of the Column class.
from cube_dbt import Dbt
manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'
dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
column = model.column('status')
print(column)
Model.primary_key
This property exposes the primary key column, if this dbt model has any, as an
instance of the Column class. Its value is None if there's no primary key in
this dbt model.
from cube_dbt import Dbt
manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'
dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
print(model.primary_key)
See Column.primary_key for details on the detection of
primary key columns.
Model.as_cube
This method renders this dbt model as a YAML snippet that can be inserted
into YAML data models. Includes name, description (if present), and
sql_table.
from cube_dbt import Dbt
manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'
dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
print(model.as_cube())
In the returned multiline string, all lines except for the first one are left-padded with 4 spaces for easier use in YAML data models:
# Jinja template
cubes:
  - {{ model.as_cube() }}
# YAML
cubes:
  - name: orders
    description: All Jaffle Shop orders
    sql_table: '"db"."public"."orders"'
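The padding rule can be illustrated with a short stdlib-only sketch. The helper pad_after_first_line is hypothetical and only mimics the described behavior; it is not taken from the package. It shows why padding every line except the first one by 4 spaces makes the snippet line up after the "- " list marker in a template.

```python
def pad_after_first_line(snippet: str, spaces: int) -> str:
    """Left-pad every line except the first one with the given number of spaces."""
    first, *rest = snippet.split("\n")
    return "\n".join([first] + [" " * spaces + line for line in rest])


# An unpadded cube snippet, one attribute per line
cube_snippet = "\n".join([
    "name: orders",
    "description: All Jaffle Shop orders",
    "sql_table: '\"db\".\"public\".\"orders\"'",
])

# The first line lands right after '- ' in the template; the remaining
# lines need 4 spaces to line up with it.
rendered = "cubes:\n  - " + pad_after_first_line(cube_snippet, 4)
print(rendered)
```

The same idea applies to the 6-space and 8-space padding used by the dimension-rendering methods, which are inserted at deeper nesting levels.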
Model.as_dimensions
This method renders the list of columns that belong to this dbt model as a YAML snippet that can be inserted into YAML data models.
Optionally, accepts a list of column names to ignore in the skip parameter.
from cube_dbt import Dbt
manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'
dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
print(model.as_dimensions(skip=['status']))
See Column.as_dimension for details on the
dimension rendering.
In the returned multiline string, all lines except for the first one are left-padded with 6 spaces for easier use in YAML data models:
# Jinja template
cubes:
  - {{ model.as_cube() }}
    dimensions:
      {{ model.as_dimensions() }}
# YAML
cubes:
  - name: orders
    description: All Jaffle Shop orders
    sql_table: '"db"."public"."orders"'
    dimensions:
      - name: id
        sql: id
        type: number
        primary_key: true
Column class
Encapsulates tools for working with the metadata of a column that belongs to a dbt model.
Column.name
This property exposes the name of a column.
from cube_dbt import Dbt
manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'
dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
column = model.column('status')
print(column.name)
# For example, 'status'
Column.description
This property exposes the description of a column.
from cube_dbt import Dbt
manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'
dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
column = model.column('status')
print(column.description)
# For example, 'Order execution status: new, in progress, delivered'
Column.sql
This property exposes the name of a column that can be used as the
sql parameter of a dimension.
from cube_dbt import Dbt
manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'
dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
column = model.column('status')
print(column.sql)
# For example, 'status'
Column.type
This property exposes the data type of a column that can be used as the
type parameter of a dimension.
from cube_dbt import Dbt
manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'
dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
column = model.column('status')
print(column.type)
# For example, 'string'
The cube_dbt package applies a set of heuristics to map database-specific
types to dimension types. You can check the source code for implementation
details.
If a column type is not defined in the metadata of a dbt project, string
is used by default.
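For intuition only, a type-mapping heuristic of this kind might look like the sketch below. The function to_dimension_type and the type lists are hypothetical and do not reproduce the package's actual rules; the only behavior taken from this page is the fallback to string when no type is defined.

```python
# Illustrative lookup sets; the real package may recognize other types
NUMBER_TYPES = {"smallint", "integer", "int", "bigint", "numeric",
                "decimal", "real", "float", "double precision", "number"}
TIME_TYPES = {"date", "datetime", "timestamp", "timestamptz"}
BOOLEAN_TYPES = {"boolean", "bool"}


def to_dimension_type(db_type=None):
    """Map a database column type to a dimension type (illustrative rules only)."""
    if not db_type:
        return "string"  # No type in the metadata: fall back to string
    # Drop parameters such as precision and scale, e.g. 'numeric(10, 2)'
    base = db_type.lower().split("(")[0].strip()
    if base in NUMBER_TYPES:
        return "number"
    if base in TIME_TYPES:
        return "time"
    if base in BOOLEAN_TYPES:
        return "boolean"
    return "string"


for db_type in ("NUMERIC(10, 2)", "timestamp", "boolean", "varchar(255)", None):
    print(db_type, "->", to_dimension_type(db_type))
```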
Column.meta
This property exposes the metadata of a column as a dict that can be
used as the meta parameter of a dimension.
from cube_dbt import Dbt
manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'
dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
column = model.column('status')
print(column.meta)
# For example, {'some': 'data'}
Column.primary_key
This property exposes a bool value that indicates whether a column is
a primary key.
from cube_dbt import Dbt
manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'
dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
column = model.column('status')
print(column.primary_key)
# For example, False
By convention, the column is considered a primary key if it has the
primary_key tag in the metadata of a dbt project.
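The convention can be sketched without the package as a lookup over the columns of a model node. The helper find_primary_key and the sample columns are hypothetical, and the sketch assumes that each column entry in manifest.json carries its tags as a list under the tags key.

```python
def find_primary_key(columns: dict):
    """Return the name of the first column tagged 'primary_key', or None."""
    for name, column in columns.items():
        if "primary_key" in column.get("tags", []):
            return name
    return None


# Hypothetical 'columns' entry of a model node in manifest.json
columns = {
    "id": {"name": "id", "tags": ["primary_key"]},
    "status": {"name": "status", "tags": []},
}

print(find_primary_key(columns))
```

How the package behaves when several columns carry the tag is not covered by this sketch.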
Column.as_dimension
This method renders this column as a YAML snippet that can be inserted
into YAML data models. Includes name, description (if present), sql,
type, primary_key (if True), and meta (if present).
from cube_dbt import Dbt
manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'
dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
column = model.column('status')
print(column.as_dimension())
In the returned multiline string, all lines except for the first one are left-padded with 8 spaces for easier use in YAML data models:
# Jinja template
cubes:
  - {{ model.as_cube() }}
    dimensions:
      {% for column in model.columns() %}
      - {{ column.as_dimension() }}
      {% endfor %}
# YAML
cubes:
  - name: orders
    description: All Jaffle Shop orders
    sql_table: '"db"."public"."orders"'
    dimensions:
      - name: id
        sql: id
        type: number
        primary_key: true
      - name: status
        description: 'Order execution status: new, in progress, delivered'
        sql: status
        type: string
        meta:
          some: data