docs/src/main/sphinx/connector/clickhouse.md
The ClickHouse connector allows querying tables in an external ClickHouse server. This can be used to query data in the databases on that server, or combine it with other data from different catalogs accessing ClickHouse or any other supported data source.
To connect to a ClickHouse server, you need:
The connector can query a ClickHouse server. Create a catalog properties file
that specifies the ClickHouse connector by setting the connector.name to
clickhouse.
For example, create the file etc/catalog/example.properties. Replace the
connection properties as appropriate for your setup:
connector.name=clickhouse
connection-url=jdbc:clickhouse://host1:8123/
connection-user=exampleuser
connection-password=examplepassword
The connection-url defines the connection information and parameters to pass
to the ClickHouse JDBC driver. The supported parameters for the URL are
available in the ClickHouse JDBC driver configuration.
The connection-user and connection-password are typically required and
determine the user credentials for the connection, often a service user. You can
use {doc}secrets </security/secrets> to avoid actual values in the catalog
properties files.
(clickhouse-tls)=
If you have TLS configured with a globally-trusted certificate installed on your
data source, you can enable TLS between your cluster and the data
source by appending a parameter to the JDBC connection string set in the
connection-url catalog configuration property.
For example, with version 2.6.4 of the ClickHouse JDBC driver, enable TLS by
appending the ssl=true parameter to the connection-url configuration
property:
connection-url=jdbc:clickhouse://host1:8443/?ssl=true
For more information on TLS configuration options, see the Clickhouse JDBC driver documentation
If you have multiple ClickHouse servers you need to configure one catalog for each server. To add another catalog:
etc/catalog.propertiesFor example, if you name the property file sales.properties, Trino uses the
configured connector to create a catalog named sales.
The ClickHouse connector provides a schema for every ClickHouse database.
Run SHOW SCHEMAS to see the available ClickHouse databases:
SHOW SCHEMAS FROM example;
If you have a ClickHouse database named web, run SHOW TABLES to view the
tables in this database:
SHOW TABLES FROM example.web;
Run DESCRIBE or SHOW COLUMNS to list the columns in the clicks table
in the web databases:
DESCRIBE example.web.clicks;
SHOW COLUMNS FROM example.web.clicks;
Run SELECT to access the clicks table in the web database:
SELECT * FROM example.web.clicks;
:::{note}
If you used a different name for your catalog properties file, use
that catalog name instead of example in the above examples.
:::
Table property usage example:
CREATE TABLE default.trino_ck (
id int NOT NULL,
birthday DATE NOT NULL,
name VARCHAR,
age BIGINT,
logdate DATE NOT NULL
)
WITH (
engine = 'MergeTree',
order_by = ARRAY['id', 'birthday'],
partition_by = ARRAY['toYYYYMM(logdate)'],
primary_key = ARRAY['id'],
sample_by = 'id'
);
The following are supported ClickHouse table properties from https://clickhouse.tech/docs/en/engines/table-engines/mergetree-family/mergetree/
| Property name | Default value | Description |
|---|---|---|
engine | Log | Name and parameters of the engine. |
order_by | (none) | Array of columns or expressions to concatenate to create the sorting key. tuple() is used by default if order_by is not specified. |
partition_by | (none) | Array of columns or expressions to use as nested partition keys. Optional. |
primary_key | (none) | Array of columns or expressions to concatenate to create the primary key. Optional. |
sample_by | (none) | An expression to use for sampling. Optional. |
Currently the connector only supports Log and MergeTree table engines
in create table statement. ReplicatedMergeTree engine is not yet supported.
(clickhouse-type-mapping)=
Because Trino and ClickHouse each support types that the other does not, this
connector {ref}modifies some types <type-mapping-overview> when reading or
writing data. Data types may not map the same way in both directions between
Trino and the data source. Refer to the following sections for type mapping in
each direction.
The connector maps ClickHouse types to the corresponding Trino types according to the following table:
:::{list-table} ClickHouse type to Trino type mapping :widths: 30, 25, 50 :header-rows: 1
BoolBOOLEANInt8TINYINTTINYINT and INT1 are aliases of Int8Int16SMALLINTSMALLINT and INT2 are aliases of Int16Int32INTEGERINT, INT4, and INTEGER are aliases of Int32Int64BIGINTBIGINT is an alias of Int64UInt8SMALLINTUInt16INTEGERUInt32BIGINTUInt64DECIMAL(20,0)Float32REALFLOAT is an alias of Float32Float64DOUBLEDOUBLE is an alias of Float64DecimalDECIMALFixedStringVARBINARYclickhouse.map-string-as-varchar config property changes the
mapping to VARCHARStringVARBINARYclickhouse.map-string-as-varchar config property changes the
mapping to VARCHARDateDATEDateTime[(timezone)]TIMESTAMP(0) [WITH TIME ZONE]IPv4IPADDRESSIPv6IPADDRESSEnum8VARCHAREnum16VARCHARUUIDUUID:::
No other types are supported.
The connector maps Trino types to the corresponding ClickHouse types according to the following table:
:::{list-table} Trino type to ClickHouse type mapping :widths: 30, 25, 50 :header-rows: 1
BOOLEANBoolTINYINTInt8TINYINT and INT1 are aliases of Int8SMALLINTInt16SMALLINT and INT2 are aliases of Int16INTEGERInt32INT, INT4, and INTEGER are aliases of Int32BIGINTInt64BIGINT is an alias of Int64REALFloat32FLOAT is an alias of Float32DOUBLEFloat64DOUBLE is an alias of Float64DECIMAL(p,s)Decimal(p,s)VARCHARStringCHARStringVARBINARYStringclickhouse.map-string-as-varchar config property changes the
mapping to VARCHARDATEDateTIMESTAMP(0)DateTimeUUIDUUID:::
No other types are supported.
(clickhouse-sql-support)=
The connector provides read and write access to data and metadata in a ClickHouse catalog. In addition to the globally available and read operation statements, the connector supports the following features:
(clickhouse-insert)=
(clickhouse-alter-table)=
(clickhouse-procedures)=
(clickhouse-table-functions)=
The connector provides specific {doc}table functions </functions/table> to
access ClickHouse.
(clickhouse-query-function)=
query(varchar) -> tableThe query function allows you to query the underlying database directly. It
requires syntax native to ClickHouse, because the full query is pushed down and
processed in ClickHouse. This can be useful for accessing native features which
are not available in Trino or for improving query performance in situations
where running a query natively may be faster.
As a simple example, query the example catalog and select an entire table:
SELECT
*
FROM
TABLE(
example.system.query(
query => 'SELECT
*
FROM
tpch.nation'
)
);
The connector includes a number of performance improvements, detailed in the following sections.
(clickhouse-pushdown)=
The connector supports pushdown for a number of operations:
limit-pushdown{ref}Aggregate pushdown <aggregation-pushdown> for the following functions:
avgcountmaxminsum