docs-mintlify/admin/account-billing/ai-tokens.mdx
AI features in Cube use a token-based system to measure and manage consumption.
Cube's AI-powered features consume tokens based on the resources required to complete each request. Token allocation differs by customer type, as described below.
Token usage depends on several factors, including the number of internal steps a request triggers and the conversation context carried into it.
Not all AI features consume tokens, and the set of token-consuming features may change as the product evolves.
Self-serve customers on paid plans receive per-seat token grants worth half of the seat price. Each user receives an individual monthly token allocation based on their role.
Per-seat grants:
When a self-serve user exceeds their monthly per-seat grant, usage automatically continues as on-demand consumption, billed to the credit card on file.
Administrators can set a monthly on-demand spending limit to control additional costs. This limit caps the total on-demand spend across the account for each billing cycle.
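To make the order of consumption concrete, here is a minimal sketch in Python with hypothetical dollar amounts (the seat price, grant value, and spending limit below are illustrative, not actual Cube pricing): a request draws down the user's per-seat grant first, and any remainder is served as on-demand usage until the account-wide limit is reached.

```python
# Illustrative only: hypothetical amounts, not actual Cube pricing or accounting.
SEAT_PRICE = 50.00                 # hypothetical monthly seat price
PER_SEAT_GRANT = SEAT_PRICE / 2    # grant worth half the seat price: $25.00
ON_DEMAND_LIMIT = 100.00           # hypothetical admin-set monthly cap for the whole account

def apply_request(cost, grant_remaining, on_demand_spent):
    """Return (grant_remaining, on_demand_spent, allowed) after one request."""
    from_grant = min(cost, grant_remaining)
    overage = cost - from_grant
    if on_demand_spent + overage > ON_DEMAND_LIMIT:
        # Spending limit reached: the request is not served as on-demand usage.
        return grant_remaining, on_demand_spent, False
    return grant_remaining - from_grant, on_demand_spent + overage, True

# Example: a $30 request against a fresh $25 grant uses $25 of grant and $5 on-demand.
print(apply_request(30.00, PER_SEAT_GRANT, 0.00))  # (0.0, 5.0, True)
```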
Order form customers can purchase add-on token packages. Purchased tokens are added to a shared pool accessible to all users in the account.
Contact your account executive for details on purchasing token packages.
Each user on a free plan receives an individual monthly token allowance. This allowance resets at the start of each calendar month.
Administrators can monitor token consumption through the AI Tokens Usage tab on the billing settings page, which shows a dashboard of token usage.
When a user exhausts all available token sources (per-seat grant and token packages), AI requests will return an error indicating the token limit has been exceeded.
Yes. When using a Bring Your Own Model (BYOM) configuration, AI requests bypass the token quota system entirely — no tokens are consumed or tracked for those requests. You are billed directly by your model provider.
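A conceptual sketch of how this bypass works (hypothetical names and a stub model call, not Cube's internal API): when BYOM is configured, the request goes straight to the customer's model and no token accounting takes place.

```python
from dataclasses import dataclass

@dataclass
class Account:
    byom_configured: bool
    tokens_remaining: int

def call_model(prompt: str) -> str:
    # Stand-in for a real model call.
    return f"response to: {prompt}"

def route_ai_request(prompt: str, account: Account) -> str:
    if account.byom_configured:
        # BYOM: skip quota accounting entirely; the model provider bills directly.
        return call_model(prompt)
    # Managed model: deduct a (hypothetical) token cost before serving the request.
    account.tokens_remaining -= 1_000
    return call_model(prompt)
```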
Cube's AI features use an agentic architecture. A single prompt may trigger multiple internal steps — such as searching the data model, building a query, and summarizing results — each of which consumes tokens independently.
Token usage can vary between identical prompts due to differences in conversation context (earlier messages in the session) or the AI choosing a different reasoning path.
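As a rough illustration (the step names and token counts below are hypothetical, not real measurements), the cost of one prompt is the sum of the tokens each internal step consumes, so a different reasoning path yields a different total.

```python
# Hypothetical steps triggered by a single prompt; each consumes tokens independently.
steps = [
    ("search_data_model", 850),
    ("build_query", 1_200),
    ("summarize_results", 600),
]
total = sum(tokens for _, tokens in steps)
print(f"Tokens consumed by this prompt: {total}")  # 2650
```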