docs/en/engines/database-engines/datalake.md
The DataLakeCatalog database engine enables you to connect ClickHouse to external
data catalogs and query open table format data without duplicating it. This turns
ClickHouse into a powerful query engine that works seamlessly with your existing
data lake infrastructure.
The DataLakeCatalog engine supports the following data catalogs:

- Iceberg REST catalogs
- Unity Catalog (Delta)
- AWS Glue
- Hive Metastore (HMS)
- Paimon REST catalogs
- OneLake (Iceberg)
You will need to enable the relevant settings below to use the DataLakeCatalog engine:

```sql
SET allow_experimental_database_iceberg = 1;
SET allow_experimental_database_unity_catalog = 1;
SET allow_experimental_database_glue_catalog = 1;
SET allow_experimental_database_hms_catalog = 1;
SET allow_experimental_database_paimon_rest_catalog = 1;
```
Databases with the DataLakeCatalog engine can be created using the following syntax:

```sql
CREATE DATABASE database_name
ENGINE = DataLakeCatalog(catalog_endpoint[, user, password])
SETTINGS
    catalog_type,
    [...]
```
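As an illustration of the syntax above, connecting to an Iceberg REST catalog might look like the sketch below. The endpoint URL, warehouse name, and token are placeholders, not real values.

```sql
-- Hypothetical example: endpoint, warehouse, and token are placeholders.
SET allow_experimental_database_iceberg = 1;

CREATE DATABASE iceberg_demo
ENGINE = DataLakeCatalog('https://rest-catalog.example.com/v1')
SETTINGS
    catalog_type = 'rest',
    warehouse = 'demo_warehouse',
    catalog_credential = '<token>';
```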
The following settings are supported:

| Setting | Description |
|---|---|
| `catalog_type` | Type of catalog: `glue`, `unity` (Delta), `rest` (Iceberg), `hive`, `onelake` (Iceberg) |
| `warehouse` | The warehouse/database name to use in the catalog |
| `catalog_credential` | Authentication credential for the catalog (e.g., API key or token) |
| `auth_header` | Custom HTTP header for authentication with the catalog service |
| `auth_scope` | OAuth2 scope for authentication (if using OAuth) |
| `storage_endpoint` | Endpoint URL for the underlying storage |
| `oauth_server_uri` | URI of the OAuth2 authorization server for authentication |
| `vended_credentials` | Boolean indicating whether to use vended credentials from the catalog (supports AWS S3 and Azure ADLS Gen2) |
| `aws_access_key_id` | AWS access key ID for S3/Glue access (if not using vended credentials) |
| `aws_secret_access_key` | AWS secret access key for S3/Glue access (if not using vended credentials) |
| `region` | AWS region for the service (e.g., `us-east-1`) |
| `dlf_access_key_id` | Access key ID for DLF access |
| `dlf_access_key_secret` | Access key secret for DLF access |
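To show how these settings combine, here is a sketch of connecting to an AWS Glue catalog with explicit credentials rather than vended ones. The endpoint URL, region, and key values are illustrative assumptions, not working credentials.

```sql
-- Sketch only: endpoint, region, and keys below are placeholders.
SET allow_experimental_database_glue_catalog = 1;

CREATE DATABASE glue_demo
ENGINE = DataLakeCatalog('https://glue.us-east-1.amazonaws.com')
SETTINGS
    catalog_type = 'glue',
    region = 'us-east-1',
    aws_access_key_id = '<key-id>',
    aws_secret_access_key = '<secret>';
```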
See the section below for an example of using the DataLakeCatalog engine.

To connect to a OneLake catalog, first enable `allow_experimental_database_iceberg` or `allow_database_iceberg`, then create the database:

```sql
CREATE DATABASE database_name
ENGINE = DataLakeCatalog(catalog_endpoint)
SETTINGS
    catalog_type = 'onelake',
    warehouse = warehouse,
    onelake_tenant_id = tenant_id,
    oauth_server_uri = server_uri,
    auth_scope = auth_scope,
    onelake_client_id = client_id,
    onelake_client_secret = client_secret;
```
```sql
SHOW TABLES IN database_name;
SELECT count() FROM database_name.table_name;
```
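Catalogs often organize tables under namespaces. Assuming the catalog exposes such tables under a single `namespace.table` identifier (the `analytics.events` name here is hypothetical), the dotted name can be quoted with backticks so the dot is not parsed as a database separator:

```sql
-- Assumes a hypothetical "analytics" namespace containing an "events" table.
SELECT count() FROM database_name.`analytics.events`;
```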