`cube_dbt` package

The cube_dbt package simplifies defining the data model in the semantic layer on top of dbt models. It provides tools for loading the metadata of a dbt project, inspecting dbt models, and rendering them as cubes in YAML.

Install the cube_dbt package from PyPI
Check the source code in cube_dbt on GitHub
Submit issues to cube on GitHub

Installation

{/*

Cube Core

Run the following command in the root directory of your Cube project:

bash

echo "cube_dbt" > requirements.txt
pip install -r requirements.txt

*/}

Cube Cloud

Add the cube_dbt package to the requirements.txt file in the root directory of your Cube project. Cube Cloud will install the dependencies automatically.

Reference

`Dbt` class

Encapsulates tools for working with the metadata of a dbt project.

`Dbt.__init__`

The constructor accepts the metadata of a dbt project as a dict with the contents of a manifest.json file.

python

import json
from cube_dbt import Dbt

manifest_path = './manifest.json'

# Parse the manifest straight from the file object
with open(manifest_path, 'r') as file:
  manifest = json.load(file)

dbt = Dbt(manifest)

Use the constructor when Dbt.from_file and Dbt.from_url aren't applicable, e.g., when manifest.json is loaded from a private AWS S3 bucket.
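As a rough sketch of that case, the helper below parses manifest bytes that were fetched by some other means and checks for the top-level `nodes` key that dbt writes to manifest.json. The names `parse_manifest` and `build_dbt` are hypothetical, and how the bytes are fetched (for example, with an S3 client) is left to the caller.

```python
import json


def parse_manifest(raw: bytes) -> dict:
  """Parse manifest.json content that was fetched by any custom means."""
  manifest = json.loads(raw)
  # dbt stores models, seeds, tests, etc. under the top-level "nodes" key
  if not isinstance(manifest, dict) or 'nodes' not in manifest:
    raise ValueError('This does not look like a dbt manifest.json')
  return manifest


def build_dbt(raw: bytes):
  """Pass the parsed manifest to the Dbt constructor."""
  # Imported lazily so that parse_manifest works without cube_dbt installed
  from cube_dbt import Dbt
  return Dbt(parse_manifest(raw))
```

Validating the payload before constructing Dbt makes a wrong or truncated download fail early with a readable error.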

`Dbt.from_file`

This static method loads the metadata of a dbt project from a manifest.json file by its path and returns an instance of the Dbt class.

python

from cube_dbt import Dbt

manifest_path = './manifest.json'

dbt = Dbt.from_file(manifest_path)

`Dbt.from_url`

This static method loads the metadata of a dbt project from a manifest.json file by its URL and returns an instance of the Dbt class.

python

from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)

`Dbt.filter`

This method filters loaded dbt models by their path prefixes, tags, or names.

python

from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url).filter(
  paths=['marts/'],  # Only models under the 'marts/' path
  tags=['cube'],     # Only models with the 'cube' tag
  names=['orders']   # Only the 'orders' model 
)

Use this method to expose only the necessary dbt models to the semantic layer.

Note that values in paths should not be prefixed with models/.
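If paths come from another tool that includes the models/ prefix, they can be cleaned up before filtering. This is a minimal sketch; `normalize_paths` is a hypothetical helper and not part of cube_dbt.

```python
def normalize_paths(paths: list[str]) -> list[str]:
  """Strip a leading 'models/' from each path, leaving other paths unchanged."""
  prefix = 'models/'
  return [p[len(prefix):] if p.startswith(prefix) else p for p in paths]


print(normalize_paths(['models/marts/', 'staging/']))
# ['marts/', 'staging/']
```

The resulting list can then be passed as the paths argument of Dbt.filter.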

`Dbt.models`

This property exposes a list of loaded dbt models as instances of the Model class.

python

from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)

for model in dbt.models:
  print(model)

Only dbt models that comply with Dbt.filter rules and are not materialized as ephemeral will be returned.
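If a model is unexpectedly missing from the list, it can help to look at the raw manifest. The sketch below does not use cube_dbt. It assumes the usual manifest.json layout, where each entry under `nodes` has a `resource_type` and a `config.materialized` value. `ephemeral_model_names` is a hypothetical helper name.

```python
def ephemeral_model_names(manifest: dict) -> list[str]:
  """List names of dbt models materialized as ephemeral in a raw manifest."""
  return sorted(
    node['name']
    for node in manifest.get('nodes', {}).values()
    if node.get('resource_type') == 'model'
    and node.get('config', {}).get('materialized') == 'ephemeral'
  )


# A tiny hand-written manifest fragment, for illustration only
manifest = {
  'nodes': {
    'model.shop.orders': {
      'name': 'orders', 'resource_type': 'model',
      'config': {'materialized': 'table'}
    },
    'model.shop.tmp_orders': {
      'name': 'tmp_orders', 'resource_type': 'model',
      'config': {'materialized': 'ephemeral'}
    }
  }
}

print(ephemeral_model_names(manifest))
# ['tmp_orders']
```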

`Dbt.model`

This method returns a loaded dbt model by its name as an instance of the Model class.

python

from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)

model = dbt.model('orders')
print(model)

Only dbt models that comply with Dbt.filter rules and are not materialized as ephemeral will be returned.

`Model` class

Encapsulates tools for working with the metadata of a dbt model.

`Model.name`

This property exposes the name of a dbt model.

python

from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')

print(model.name)
# For example, 'orders'

`Model.description`

This property exposes the description of a dbt model.

python

from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')

print(model.description)
# For example, 'All Jaffle Shop orders'

`Model.sql_table`

This property exposes the fully-qualified SQL relation name of a dbt model that can be used as the sql_table parameter of a cube.

python

from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')

print(model.sql_table)
# For example, '"db"."public"."orders"'

`Model.columns`

This property exposes a list of columns that belong to this dbt model as instances of the Column class.

python

from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')

for column in model.columns:
  print(column)

`Model.column`

This method returns a column that belongs to this dbt model by its name as an instance of the Column class.

python

from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')

column = model.column('status')
print(column)

`Model.primary_key`

This property exposes the primary key column of this dbt model, if it has one, as an instance of the Column class. It is None if the dbt model has no primary key.

python

from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')

print(model.primary_key)

See Column.primary_key for details on the detection of primary key columns.
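Because the value can be None, calling code usually needs a guard. The sketch below uses stand-in objects that mimic only the `primary_key` and `name` attributes described here. `primary_key_name` is a hypothetical helper.

```python
from types import SimpleNamespace
from typing import Optional


def primary_key_name(model) -> Optional[str]:
  """Return the primary key column name, or None if the model has none."""
  column = model.primary_key
  return column.name if column is not None else None


# Stand-in objects that mimic only the attributes used above
with_key = SimpleNamespace(primary_key=SimpleNamespace(name='id'))
without_key = SimpleNamespace(primary_key=None)

print(primary_key_name(with_key))     # id
print(primary_key_name(without_key))  # None
```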

`Model.as_cube`

This method renders this dbt model as a YAML snippet that can be inserted into YAML data models. Includes name, description (if present), and sql_table.

python

from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')

print(model.as_cube())

In the returned multiline string, all lines except for the first one are left-padded with 4 spaces for easier use in YAML data models:

yaml

# Jinja template
cubes:
  - {{ model.as_cube() }}

# YAML
cubes:
  - name: orders
    description: All Jaffle Shop orders
    sql_table: '"db"."public"."orders"'
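To see why the padding matters, the sketch below mimics the convention on a plain string and substitutes it into the template line. It illustrates the padding rule only and is not the package's implementation. `pad_continuation_lines` is a hypothetical name.

```python
def pad_continuation_lines(snippet: str, spaces: int) -> str:
  """Left-pad every line except the first with the given number of spaces."""
  first, *rest = snippet.split('\n')
  padded = [' ' * spaces + line if line else line for line in rest]
  return '\n'.join([first] + padded)


snippet = pad_continuation_lines('name: orders\nsql_table: public.orders', 4)
template = 'cubes:\n  - {{ model.as_cube() }}'

print(template.replace('{{ model.as_cube() }}', snippet))
# cubes:
#   - name: orders
#     sql_table: public.orders
```

The first line lands right after the list marker, and the 4 spaces of padding align the remaining lines with it.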

`Model.as_dimensions`

This method renders the list of columns that belong to this dbt model as a YAML snippet that can be inserted into YAML data models.

Optionally accepts a list of column names to exclude via the skip parameter.

python

from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')

print(model.as_dimensions(skip=['status']))

See Column.as_dimension for details on the dimension rendering.

In the returned multiline string, all lines except for the first one are left-padded with 6 spaces for easier use in YAML data models:

yaml

# Jinja template
cubes:
  - {{ model.as_cube() }}

    dimensions:
      {{ model.as_dimensions() }}

# YAML
cubes:
  - name: orders
    description: All Jaffle Shop orders
    sql_table: '"db"."public"."orders"'

    dimensions:
      - name: id
        sql: id
        type: number
        primary_key: true
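The same layout can be assembled in plain Python, which may be handy for previewing the output outside of a Jinja template. This is a sketch under assumptions: `render_cubes` is a hypothetical helper, and the stand-in object only imitates the as_cube and as_dimensions return values described above.

```python
from types import SimpleNamespace


def render_cubes(models) -> str:
  """Render models that provide as_cube() and as_dimensions() as YAML."""
  parts = ['cubes:']
  for model in models:
    parts.append(f'  - {model.as_cube()}')
    parts.append('')
    parts.append('    dimensions:')
    parts.append(f'      {model.as_dimensions()}')
    parts.append('')
  return '\n'.join(parts)


# Stand-in with pre-padded strings (4 and 6 extra spaces, as described above)
orders = SimpleNamespace(
  as_cube=lambda: 'name: orders\n    sql_table: public.orders',
  as_dimensions=lambda: '- name: id\n        sql: id\n        type: number'
)

print(render_cubes([orders]))
```

With real data, the list of stand-ins would be replaced by dbt.models.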

`Column` class

Encapsulates tools for working with the metadata of a column that belongs to a dbt model.

`Column.name`

This property exposes the name of a column.

python

from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
column = model.column('status')

print(column.name)
# For example, 'status'

`Column.description`

This property exposes the description of a column.

python

from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
column = model.column('status')

print(column.description)
# For example, 'Order execution status: new, in progress, delivered'

`Column.sql`

This property exposes the name of a column that can be used as the sql parameter of a dimension.

python

from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
column = model.column('status')

print(column.sql)
# For example, 'status'

`Column.type`

This property exposes the data type of a column that can be used as the type parameter of a dimension.

python

from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
column = model.column('status')

print(column.type)
# For example, 'string'

The cube_dbt package applies a set of heuristics to map database-specific types to dimension types. You can check the source code for implementation details.

If a column type is not defined in the metadata of a dbt project, string is used by default.
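To give a feel for what such a mapping might look like, here is a purely illustrative sketch. The rules below are assumptions made up for this example and are not the package's actual heuristics. Only the fallback to string for a missing type follows the behavior described above. `guess_dimension_type` is a hypothetical name.

```python
from typing import Optional


def guess_dimension_type(data_type: Optional[str]) -> str:
  """Map a database column type to a dimension type (illustrative rules only)."""
  if not data_type:
    # Mirrors the documented default for columns without a type
    return 'string'
  t = data_type.lower()
  if 'bool' in t:
    return 'boolean'
  # Checked before numeric tokens so that 'timestamp' is not treated as a number
  if 'date' in t or 'time' in t:
    return 'time'
  if any(token in t for token in ('int', 'numeric', 'decimal', 'float', 'double')):
    return 'number'
  return 'string'


for db_type in ('BIGINT', 'timestamp', 'boolean', 'varchar', None):
  print(db_type, '->', guess_dimension_type(db_type))
```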

`Column.meta`

This property exposes the metadata of a column as a dict that can be used as the meta parameter of a dimension.

python

from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
column = model.column('status')

print(column.meta)
# For example, {'some': 'data'}

`Column.primary_key`

This property exposes a bool value that indicates if a column is a primary key or not.

python

from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
column = model.column('status')

print(column.primary_key)
# For example, False

By convention, the column is considered a primary key if it has the primary_key tag in the metadata of a dbt project.
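The convention can be checked against the raw manifest as well. The sketch below does not use cube_dbt. It assumes that each column entry in manifest.json carries its tags as a list under the `tags` key. `is_primary_key` is a hypothetical helper.

```python
def is_primary_key(column: dict) -> bool:
  """Check whether a raw manifest column entry carries the 'primary_key' tag."""
  return 'primary_key' in column.get('tags', [])


# Hand-written column entries, for illustration only
columns = {
  'id': {'name': 'id', 'tags': ['primary_key']},
  'status': {'name': 'status', 'tags': []}
}

print([name for name, column in columns.items() if is_primary_key(column)])
# ['id']
```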

`Column.as_dimension`

This method renders this column as a YAML snippet that can be inserted into YAML data models. Includes name, description (if present), sql, type, primary_key (if True), and meta (if present).

python

from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
column = model.column('status')

print(column.as_dimension())

In the returned multiline string, all lines except for the first one are left-padded with 8 spaces for easier use in YAML data models:

yaml

# Jinja template
cubes:
  - {{ model.as_cube() }}

    dimensions:
      {% for column in model.columns %}
      - {{ column.as_dimension() }}
      {% endfor %}

# YAML
cubes:
  - name: orders
    description: All Jaffle Shop orders
    sql_table: '"db"."public"."orders"'

    dimensions:
      - name: id
        sql: id
        type: number
        primary_key: true

      - name: status
        description: 'Order execution status: new, in progress, delivered'
        sql: status
        type: string
        meta:
          some: data
