Back to Cube

Using dynamic union tables

docs-mintlify/recipes/data-modeling/dynamic-union-tables.mdx

1.6.433.2 KB
Original Source

Use case

Sometimes, you may have a lot of tables in a database, which actually relate to the same entity.

For example, you can have “per client” tables with the same data, but related to different customers: elon_musk_table, john_doe_table, steve_jobs_table, etc. In this case, it would make sense to create a single cube for customers, which should be backed by a union table from all customers tables.

Data modeling

You can use the sql parameter to define a cube over an arbitrary SQL query, e.g., a query that includes UNION or UNION ALL operators:

<CodeGroup>
yaml
cubes:
  - name: customers
    sql: |
      SELECT *, 'Einstein' AS name FROM einstein_data UNION ALL
      SELECT *, 'Pascal'   AS name FROM pascal_data   UNION ALL
      SELECT *, 'Newton'   AS name FROM newton_data
  
    measures:
      - name: count
        type: count
  
    dimensions:
      - name: name
        sql: name
        type: string





javascript
cube(`customers`, {
  sql: `
    SELECT *, 'Einstein' AS name FROM einstein_data UNION ALL
    SELECT *, 'Pascal'   AS name FROM pascal_data   UNION ALL
    SELECT *, 'Newton'   AS name FROM newton_data
  `,
 
  measures: {
    count: {
      type: `count`
    }
  },
 
  dimensions: {
    name: {
      sql: `name`,
      type: `string`
    }
  }
})
</CodeGroup>

However, it can be quite annoying to write the SQL to union all tables manually. Luckily, you can use dynamic data modeling to generate necessary SQL based on a list of tables:

<CodeGroup>
yaml
{%- set customer_tables = {
  "einstein_data": "Einstein",
  "pascal_data": "Pascal",
  "newton_data": "Newton"
} -%}
 
cubes:
  - name: customers
    sql: |
      {%- for table, name in customer_tables | items %}
      SELECT *, '{{ name | safe }}' AS name FROM {{ table | safe }}
      {% if not loop.last %}UNION ALL{% endif %}
      {% endfor %}
  
    measures:
      - name: count
        type: count
  
    dimensions:
      - name: name
        sql: name
        type: string
 
 
 
 
 
javascript
const customer_tables = [
  { table: "einstein_data", name: "Einstein" },
  { table: "pascal_data", name: "Pascal" },
  { table: "newton_data", name: "Newton" }
]

cube(`customers`, {
  sql: customer_tables
    .map(entry => `SELECT *, '${entry.name}' AS name FROM ${entry.table}`)
    .join(` UNION ALL `),
 
  measures: {
    count: {
      type: `count`
    }
  },
 
  dimensions: {
    name: {
      sql: `name`,
      type: `string`
    }
  }
})
</CodeGroup>

Result

Querying count and name members of the dynamically defined customers cube would result in the following generated SQL:

sql
SELECT
  "customers".name "customers__name",
  count(*) "customers__count"
FROM
  (
    SELECT
      *,
      'Einstein' AS name
    FROM
      einstein_data
    UNION ALL
    SELECT
      *,
      'Pascal' AS name
    FROM
      pascal_data
    UNION ALL
    SELECT
      *,
      'Newton' AS name
    FROM
      newton_data
  ) AS "customers"
GROUP BY
  1
ORDER BY
  2 DESC