All queries to data APIs are processed asynchronously via a query queue, which helps optimize the load and increase querying performance.
The query queue deduplicates queries to API instances and insulates upstream data sources from query spikes. It also executes queries to data sources concurrently for increased performance.
By default, Cube uses a single query queue for queries from all API instances and the refresh worker to all configured data sources.
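
To make the two behaviors described above concrete, here is a minimal conceptual sketch of a queue that deduplicates identical in-flight queries and caps how many reach the data source at once. It is not Cube's implementation: the `QueryQueue` class, the `fake_data_source` function, and the `demo` helper are all hypothetical names invented for illustration.

```python
import asyncio


class QueryQueue:
    """Toy queue: shares results of identical in-flight queries, caps concurrency."""

    def __init__(self, run_query, concurrency):
        self._run_query = run_query                    # coroutine function: sql -> result
        self._slots = asyncio.Semaphore(concurrency)   # limits simultaneous executions
        self._in_flight = {}                           # sql text -> running asyncio.Task

    async def submit(self, sql):
        # Deduplication: a query identical to one already running reuses its task.
        task = self._in_flight.get(sql)
        if task is None:
            task = asyncio.ensure_future(self._execute(sql))
            self._in_flight[sql] = task
            task.add_done_callback(lambda _: self._in_flight.pop(sql, None))
        return await task

    async def _execute(self, sql):
        # Insulation: at most `concurrency` queries hit the data source at once.
        async with self._slots:
            return await self._run_query(sql)


async def demo():
    calls = []  # records what actually reached the (fake) data source

    async def fake_data_source(sql):
        calls.append(sql)
        await asyncio.sleep(0.01)
        return f"rows for {sql}"

    queue = QueryQueue(fake_data_source, concurrency=2)
    results = await asyncio.gather(
        queue.submit("SELECT 1"),
        queue.submit("SELECT 1"),  # duplicate: served by the first query's task
        queue.submit("SELECT 2"),
    )
    return results, calls
```

Running `asyncio.run(demo())` returns three results, while the fake data source is called only twice because the duplicate `SELECT 1` shares the first query's task.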
<ReferenceBox>

You can read more about the query queue in this blog post.

</ReferenceBox>

You can use the `context_to_orchestrator_id` configuration option to route queries to multiple queues based on the security context.
If you're configuring multiple connections to data sources via the `driver_factory` configuration option, you must also configure `context_to_orchestrator_id` to ensure that queries are routed to the correct queues.
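
As a rough sketch of how these two options relate, the functions below derive both the queue identifier and the data source connection from the same field of the security context, so each tenant gets its own queue and its own database. Several details are assumptions made for illustration: the `tenant_id` field, the `CUBE_APP_` prefix, the shape of the `ctx` argument, and the connection settings returned by `driver_factory`. In a real project these functions would be registered in the Cube configuration file; registration is omitted here so the sketch runs on its own.

```python
def context_to_orchestrator_id(ctx: dict) -> str:
    # One queue per tenant. `tenant_id` is a hypothetical security context field.
    tenant_id = ctx["securityContext"]["tenant_id"]
    return f"CUBE_APP_{tenant_id}"


def driver_factory(ctx: dict) -> dict:
    # Assumed return shape: a driver type plus connection settings.
    # The database naming scheme is made up for this example.
    tenant_id = ctx["securityContext"]["tenant_id"]
    return {"type": "postgres", "database": f"{tenant_id}_db"}
```

Because both functions key off the same value, queries for tenant `acme` and tenant `globex` never share a queue while pointing at different databases.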
Cube supports various kinds of data sources, ranging from cloud data warehouses to embedded databases. Each data source scales differently, so Cube provides sound defaults for each kind of data source out of the box.
By default, Cube uses the following concurrency settings for data sources:
| Data source | Default concurrency |
|---|---|
| Amazon Athena | 10 |
| Amazon Redshift | 5 |
| Apache Pinot | 10 |
| ClickHouse | 10 |
| Databricks | 10 |
| Firebolt | 10 |
| Google BigQuery | 10 |
| Snowflake | 8 |
| All other data sources | 5 or less, if specified in the driver |
You can use the <EnvVar>CUBEJS_CONCURRENCY</EnvVar> environment variable to adjust the maximum number of concurrent queries to a data source. It's recommended to use the default configuration unless you're sure that your data source can handle more concurrent queries.
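
The precedence between the environment variable and the defaults in the table above can be sketched as a simple lookup. This is an illustration only, not Cube's code: `effective_concurrency` is a hypothetical helper, the dictionary keys are the display names from the table rather than real driver identifiers, and the fallback of 5 is the upper bound from the last table row (a driver may specify less).

```python
# Defaults copied from the table above, keyed by display name (hypothetical keys).
DEFAULT_CONCURRENCY = {
    "Amazon Athena": 10,
    "Amazon Redshift": 5,
    "Apache Pinot": 10,
    "ClickHouse": 10,
    "Databricks": 10,
    "Firebolt": 10,
    "Google BigQuery": 10,
    "Snowflake": 8,
}

# Upper bound for all other data sources; a driver may specify a lower value.
FALLBACK_CONCURRENCY = 5


def effective_concurrency(data_source: str, env: dict) -> int:
    """Return the concurrency for a data source, letting CUBEJS_CONCURRENCY override."""
    override = env.get("CUBEJS_CONCURRENCY")
    if override:
        return int(override)
    return DEFAULT_CONCURRENCY.get(data_source, FALLBACK_CONCURRENCY)
```

For example, `effective_concurrency("Snowflake", {})` yields the table default, while passing `{"CUBEJS_CONCURRENCY": "4"}` yields 4 for any data source.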
For data sources that support connection pooling, the maximum number of concurrent connections to the database can also be set by using the <EnvVar>CUBEJS_DB_MAX_POOL</EnvVar> environment variable. If changing this from the default, you must ensure that the new value is greater than the number of concurrent connections used by Cube's query queues and the refresh worker.
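
The pool size constraint above can be checked before deployment with a few lines of arithmetic. The sketch below assumes that the connections in use add up as the sum of each query queue's concurrency plus the refresh worker's concurrency; that summation is an assumption made for illustration, and `pool_is_large_enough` is a hypothetical helper, not part of Cube.

```python
def pool_is_large_enough(max_pool, queue_concurrencies, refresh_worker_concurrency):
    """Check that the pool size exceeds the connections the queues may use.

    Assumption: required connections are the sum of all queue concurrencies
    plus the refresh worker's concurrency.
    """
    required = sum(queue_concurrencies) + refresh_worker_concurrency
    # The text above asks for a value strictly greater than the required count.
    return max_pool > required
```

With a single queue at concurrency 5 and a refresh worker at concurrency 5, a pool of 16 passes this check and a pool of 10 does not.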
By default, the refresh worker uses the same concurrency settings as API instances. However, you can override this behavior in the refresh worker configuration.