Back to Cube

Context variables

docs/content/product/data-modeling/reference/context-variables.mdx

1.6.4315.6 KB
Original Source

Context variables

You can use the following context variables within cube definitions:

CUBE

You can use the CUBE context variable to reference columns or members of the current cube so you don't have to repeat the its name over and over.

It helps reference members while keeping the data model code DRY and easy to maintain.

<CodeTabs>
javascript
cube(`users`, {
  sql_table: `users`,

  joins: {
    contacts: {
      sql: `${CUBE}.contact_id = ${contacts.id}`,
      relationship: `one_to_one`
    }
  },

  dimensions: {
    id: {
      sql: `${CUBE}.id`,
      type: `number`,
      primary_key: true
    },

    name: {
      sql: `COALESCE(${CUBE}.name, ${contacts.name})`,
      type: `string`
    }
  }
})

cube(`contacts`, {
  sql_table: `contacts`,

  dimensions: {
    id: {
      sql: `${CUBE}.id`,
      type: `number`,
      primary_key: true
    },

    name: {
      sql: `${CUBE}.name`,
      type: `string`
    }
  }
})
yaml
cubes:
  - name: users
    sql_table: users

    joins:
      - name: contacts
        sql: "{CUBE}.contact_id = {contacts.id}"
        relationship: one_to_one

    dimensions:
      - name: id
        sql: "{CUBE}.id"
        type: number
        primary_key: true

      - name: name
        sql: "COALESCE({CUBE.name}, {contacts.name})"
        type: string

  - name: contacts
    sql_table: contacts

    dimensions:
      - name: id
        sql: "{CUBE}.id"
        type: number
        primary_key: true

      - name: name
        sql: "{CUBE}.name"
        type: string









</CodeTabs>

FILTER_PARAMS

FILTER_PARAMS context variable allows you to use filter values from the Cube query during SQL generation.

This is useful for hinting your database optimizer to use a specific index or filter out partitions or shards in your cloud data warehouse so you won't be billed for scanning those.

<WarningBox>

Heavy usage of FILTER_PARAMS is considered a bad practice. It usually leads to hard-to-maintain data models. Good rule of thumb is to use FILTER_PARAMS only for predicate pushdown performance optimizations.

If you find yourself relying a lot on FILTER_PARAMS, it might mean that you need to rethink your approach to data modeling and potentially move some transformations upstream. Also, you might reconsider the choice of the data source.

</WarningBox>

FILTER_PARAMS has to be a top-level expression in WHERE and it has the following syntax:

<CodeTabs>
yaml
cubes:
  - name: cube_name
    sql: |
      SELECT *
      FROM table
      WHERE {FILTER_PARAMS.cube_name.member_name.filter(sql_expression)}

    dimensions:
      - name: member_name
        # ...



javascript
cube(`cube_name`, {
  sql: `
    SELECT *
    FROM table
    WHERE ${FILTER_PARAMS.cube_name.member_name.filter(sql_expression)}
  `,

  dimensions: {
    member_name: {
      // ...
    }
  }
})
</CodeTabs>

The filter() function accepts sql_expression, which could be either a string or a function returning a string.

Example with string

See the example below for the case when a string is passed to filter():

<CodeTabs>
javascript
cube(`order_facts`, {
  sql: `
    SELECT *
    FROM orders
    WHERE ${FILTER_PARAMS.order_facts.date.filter('date')}
  `,

  measures: {
    count: {
      type: `count`
    }
  },

  dimensions: {
    date: {
      sql: `date`,
      type: `time`
    }
  }
})
yaml
cubes:
  - name: order_facts
    sql: |
      SELECT *
      FROM orders
      WHERE {FILTER_PARAMS.order_facts.date.filter('date')}

    measures:
      - name: count
        type: count

    dimensions:
      - name: date
        sql: date
        type: time





</CodeTabs>

This will generate the following SQL...

sql
SELECT COUNT(*) AS orders__count
FROM orders
WHERE
  date >= '2018-01-01 00:00:00' AND
  date <= '2018-12-31 23:59:59'

...for the ['2018-01-01', '2018-12-31'] date range passed for the order_facts.date dimension as in following query:

json
{
  "measures": ["order_facts.count"],
  "time_dimensions": [
    {
      "dimension": "order_facts.date",
      "dateRange": ["2018-01-01", "2018-12-31"]
    }
  ]
}

Example with function

You can also pass a function as a filter() argument. This way, you can add BigQuery shard filtering, which will reduce your billing cost.

<CodeTabs>
yaml
cubes:
  - name: events
    sql: |
      SELECT *
      FROM schema.`events*`
      WHERE {FILTER_PARAMS.events.date.filter(
        lambda x, y: f"""
          _TABLE_SUFFIX >= FORMAT_TIMESTAMP('%Y%m%d', TIMESTAMP({x})) AND
          _TABLE_SUFFIX <= FORMAT_TIMESTAMP('%Y%m%d', TIMESTAMP({y}))
        """
      )}

    dimensions:
      - name: date
        sql: date
        type: time
        


javascript
cube(`events`, {
  sql: `
    SELECT *
    FROM schema.\`events*\`
    WHERE ${FILTER_PARAMS.events.date.filter(
      (x, y) => `
        _TABLE_SUFFIX >= FORMAT_TIMESTAMP('%Y%m%d', TIMESTAMP(${x})) AND
        _TABLE_SUFFIX <= FORMAT_TIMESTAMP('%Y%m%d', TIMESTAMP(${y}))
      `
    )}
  `,

  dimensions: {
    date: {
      sql: `date`,
      type: `time`
    }
  }
})
</CodeTabs> <InfoBox>

When a function is passed to filter(), its arguments are passed as strings from the data source driver and it's your responsibility to handle type conversions in this case.

</InfoBox>

In the example above, the filter on a time dimension accepts two values: the lower and the upper bounds of a date range. If a filter accepts multiple values, they are passed to the function as individual parameters:

javascript
cube(`multi_filter`, {
  sql: `
    SELECT 123 AS value
    -- Multiple values: ${FILTER_PARAMS.multi_filter.dummy.filter(
      (...args) => JSON.stringify(args)
    )}
  `,

  dimensions: {
    dummy: {
      sql: `1`,
      type: `number`
    }
  }
})

FILTER_GROUP

If you use FILTER_PARAMS in your query more than once, you must wrap them with FILTER_GROUP.

<WarningBox>

Otherwise, if you combine FILTER_PARAMS with any logical operators other than AND in SQL or if you use filters with boolean operators in your Cube queries, incorrect SQL might be generated.

</WarningBox>

FILTER_GROUP has to be a top-level expression in WHERE and it has the following syntax:

<CodeTabs>
yaml
cubes:
  - name: cube_name
    sql: |
      SELECT *
      FROM table
      WHERE {FILTER_GROUP(
        FILTER_PARAMS.cube_name.member_name.filter(sql_expression),
        FILTER_PARAMS.cube_name.another_member_name.filter(sql_expression)
      )}

    dimensions:
      - name: member_name
        # ...

      - name: another_member_name
        # ...




javascript
cube(`cube_name`, {
  sql: `
    SELECT *
    FROM table
    WHERE ${FILTER_GROUP(
      FILTER_PARAMS.cube_name.member_name.filter(sql_expression),
      FILTER_PARAMS.cube_name.another_member_name.filter(sql_expression)
    )}
  `,

  dimensions: {
    member_name: {
      // ...
    },
    
    another_member_name: {
      // ...
    }
  }
})
</CodeTabs>

Example

To understand the value of FILTER_GROUP, consider the following data model where two FILTER_PARAMS are combined in SQL using the OR operator:

<CodeTabs>
yaml
cubes:
  - name: filter_group
    sql: |
      SELECT *
        FROM (
          SELECT 1 AS a, 3 AS b UNION ALL
          SELECT 2 AS a, 2 AS b UNION ALL
          SELECT 3 AS a, 1 AS b
        ) AS data
        WHERE
          {FILTER_PARAMS.filter_group.a.filter("a")} OR
          {FILTER_PARAMS.filter_group.b.filter("b")}

    dimensions:
      - name: a
        sql: a
        type: number

      - name: b
        sql: b
        type: number




javascript
cube(`filter_group`, {
  sql: `
    SELECT *
      FROM (
        SELECT 1 AS a, 3 AS b UNION ALL
        SELECT 2 AS a, 2 AS b UNION ALL
        SELECT 3 AS a, 1 AS b
      ) AS data
      WHERE
        ${FILTER_PARAMS.filter_group.a.filter('a')} OR
        ${FILTER_PARAMS.filter_group.b.filter('b')}
  `,

  dimensions: {
    a: {
      sql: `a`,
      type: `number`
    },

    b: {
      sql: `b`,
      type: `number`
    }
  }
})
</CodeTabs>

If the following query is run...

json
{
  "dimensions": [
    "filter_group.a",
    "filter_group.b"
  ],
  "filters": [
    {
      "member": "filter_group.a",
      "operator": "gt",
      "values": ["1"]
    },
    {
      "member": "filter_group.b",
      "operator": "gt",
      "values": ["1"]
    }
  ]
}

...the following (logically incorrect) SQL will be generated:

sql
SELECT
  "filter_group".a,
  "filter_group".b
FROM (
  SELECT *
  FROM (
    SELECT 1 AS a, 3 AS b UNION ALL
    SELECT 2 AS a, 2 AS b UNION ALL
    SELECT 3 AS a, 1 AS b
  ) AS data
  WHERE
    (a > 1) OR  -- Incorrect logical operator here
    (b > 1)
) AS "filter_group"
WHERE
  "filter_group".a > 1 AND
  "filter_group".b > 1
GROUP BY 1, 2

As you can see, since an array of filters has AND semantics, Cube has correctly used the AND operator in the "outer" WHERE. At the same time, the hardcoded OR operator has propagated to the "inner" WHERE, leading to a logically incorrect query.

Now, if the cube is defined the following way...

<CodeTabs>
yaml
cubes:
  - name: filter_group
    sql: |
      SELECT *
        FROM (
          SELECT 1 AS a, 3 AS b UNION ALL
          SELECT 2 AS a, 2 AS b UNION ALL
          SELECT 3 AS a, 1 AS b
        ) AS data
        WHERE
          {FILTER_GROUP(
            FILTER_PARAMS.filter_group.a.filter("a"),
            FILTER_PARAMS.filter_group.b.filter("b")
          )}

    # ...
javascript
cube(`filter_group`, {
  sql: `
    SELECT *
      FROM (
        SELECT 1 AS a, 3 AS b UNION ALL
        SELECT 2 AS a, 2 AS b UNION ALL
        SELECT 3 AS a, 1 AS b
      ) AS data
      WHERE
        ${FILTER_GROUP(
          FILTER_PARAMS.filter_group.a.filter('a'),
          FILTER_PARAMS.filter_group.b.filter('b')
        )}
  `,

  // ...
</CodeTabs>

...the following correct SQL will be generated for the same query:

sql
SELECT
  "filter_group".a,
  "filter_group".b
FROM (
  SELECT *
  FROM (
    SELECT 1 AS a, 3 AS b UNION ALL
    SELECT 2 AS a, 2 AS b UNION ALL
    SELECT 3 AS a, 1 AS b
  ) AS data
  WHERE
    (a > 1) AND  -- Correct logical operator here
    (b > 1)
) AS "filter_group"
WHERE
  "filter_group".a > 1 AND
  "filter_group".b > 1
GROUP BY 1, 2

You can also use boolean operators in the Cube query to express more complex filtering logic:

json
{
  "dimensions": [
    "filter_group.a",
    "filter_group.b"
  ],
  "filters": [
    {
      "or": [
        {
          "member": "filter_group.a",
          "operator": "gt",
          "values": ["1"]
        },
        {
          "member": "filter_group.b",
          "operator": "gt",
          "values": ["1"]
        }
      ]
    }
  ]
}

With FILTER_GROUP, the following correct SQL will be generated:

sql
SELECT
  "filter_group".a,
  "filter_group".b
FROM (
  SELECT *
  FROM (
    SELECT 1 AS a, 3 AS b UNION ALL
    SELECT 2 AS a, 2 AS b UNION ALL
    SELECT 3 AS a, 1 AS b
  ) AS data
  WHERE
    (a > 1) OR
    (b > 1)
) AS "filter_group"
WHERE
  "filter_group".a > 1 OR
  "filter_group".b > 1
GROUP BY 1, 2

SQL_UTILS

convertTz

In case you need to convert your timestamp to user request timezone in cube or member SQL you can use SQL_UTILS.convertTz() method. Note that Cube will automatically convert timezones for timeDimensions fields in queries.

<WarningBox>

Dimensions that use SQL_UTILS.convertTz() should not be used as timeDimensions in queries. Doing so will apply the conversion multiple times and yield wrong results.

</WarningBox>

In case the same database field needs to be queried in dimensions and timeDimensions, create dedicated dimensions in the cube definition for the respective use:

<CodeTabs>
javascript
cube(`visitors`, {
  // ...

  dimensions: {
    // Do not use in timeDimensions query property
    created_at_converted: {
      sql: SQL_UTILS.convertTz(`created_at`),
      type: `time`
    },

    // Use in timeDimensions query property
    created_at: {
      sql: `created_at`,
      type: "time"
    }
  }
})
yaml
cubes:
  - name: visitors
    # ...

    dimensions:
      # Do not use in timeDimensions query property
      - name: created_at_converted
        sql: "{SQL_UTILS.convertTz(`created_at`)}"
        type: time

      # Use in timeDimensions query property
      - name: created_at
        sql: created_at
        type: time



</CodeTabs>

COMPILE_CONTEXT

<WarningBox>

COMPILE_CONTEXT is evaluated only once per each key generated by context_to_app_id. The securityContext defined in COMPILE_CONTEXT doesn't change its value for different users, however, it will change for different tenants as defined in context_to_app_id.

</WarningBox>

A global COMPILE_CONTEXT contains securityContext and any other variables provided by extendContext.

Use Jinja {{ }} syntax to access COMPILE_CONTEXT variable.

<CodeTabs>
yaml
cubes:
  - name: users
    sql_table: "user_{{ COMPILE_CONTEXT.securityContext.deployment_id }}.users"
javascript
cube(`users`, {
  sql_table: `user_${COMPILE_CONTEXT.securityContext.deployment_id}.users`
})
</CodeTabs>

SECURITY_CONTEXT

<WarningBox>

SECURITY_CONTEXT is deprecated and can be removed without further notice. Use query_rewrite instead.

</WarningBox>

SECURITY_CONTEXT global variable holds a security context that is passed to Cube via API. Please read the Security Context page for more information on how to provide security context to Cube.

javascript
cube(`orders`, {
  sql: `
    SELECT *
    FROM orders
    WHERE ${SECURITY_CONTEXT.email.filter("email")}
  `,

  dimensions: {
    date: {
      sql: `date`,
      type: `time`
    }
  }
})

To ensure filter value presents for all requests requiredFilter can be used:

javascript
cube(`orders`, {
  sql: `
    SELECT *
    FROM orders
    WHERE ${SECURITY_CONTEXT.email.requiredFilter("email")}
  `,

  dimensions: {
    date: {
      sql: `date`,
      type: `time`
    }
  }
})

You can access values of context variables directly in JavaScript in order to use it during your SQL generation. For example:

<WarningBox>

Use of this feature entails SQL injection security risk. Use it with caution.

</WarningBox>
javascript
cube(`orders`, {
  sql: `
    SELECT *
    FROM ${
      SECURITY_CONTEXT.type.unsafeValue() === "employee" ? "employee" : "public"
    }.orders
  `,

  dimensions: {
    date: {
      sql: `date`,
      type: `time`
    }
  }
})