Iceberg DDL operations

StarRocks Iceberg Catalog supports a variety of Data Definition Language (DDL) operations, including creating and managing databases, tables, and views.

You must have the appropriate privileges to perform DDL operations. For more information about privileges, see Privileges.

CREATE DATABASE

Creates a database in an Iceberg catalog. This feature is supported from v3.1 onwards.

Syntax

SQL

CREATE DATABASE [IF NOT EXISTS] <database_name>
[PROPERTIES ("location" = "<prefix>://<path_to_database>/<database_name.db>/")]

Parameters

location: Specifies the file path where the database will be created. Both HDFS and cloud storage are supported. If not specified, the database is created in the default file path of the Iceberg catalog.

The prefix varies based on the storage system:

HDFS: hdfs
Google GCS: gs
Azure Blob Storage (HTTP): wasb
Azure Blob Storage (HTTPS): wasbs
Azure Data Lake Storage Gen1: adl
Azure Data Lake Storage Gen2 (HTTP): abfs
Azure Data Lake Storage Gen2 (HTTPS): abfss
AWS S3 or S3-compatible storage: s3

Example

SQL

CREATE DATABASE iceberg_db
PROPERTIES ("location" = "s3://my_bucket/iceberg_db/");

DROP DATABASE

Drops an empty database from an Iceberg catalog. This feature is supported from v3.1 onwards.

:::note Only empty databases can be dropped. When you drop a database, the file path in the remote storage is not deleted. :::

Syntax

SQL

DROP DATABASE [IF EXISTS] <database_name>

Example

SQL

DROP DATABASE iceberg_db;

CREATE TABLE

Creates a table in an Iceberg database. This feature is supported from v3.1 onwards.

Syntax

SQL

CREATE TABLE [IF NOT EXISTS] [database.]table_name
(
    column_definition1[, column_definition2, ...],
    partition_column_definition1, partition_column_definition2, ...
)
[partition_desc]
[ORDER BY sort_desc]
[PROPERTIES ("key" = "value", ...)]
[AS SELECT query]

Parameters

`column_definition`

SQL

col_name col_type [COMMENT 'comment'] [DEFAULT default_value]

:::note All non-partition columns must use NULL as the default value. Partition columns must be defined after non-partition columns and cannot use NULL as the default value. :::

Default values

From v4.1 onwards, StarRocks supports setting default values for columns in Iceberg tables. This feature requires Iceberg format version 3 ("format-version" = "3").

Usage:

Write-time filling: When executing an INSERT statement, if a value is not specified for a column, the system automatically uses the column's default value.
Schema Evolution filling: When a new column is added to an existing table, reading old data files (that do not contain the new column) will use the new column's default value.

Syntax:

SQL

col_name col_type DEFAULT default_value

Requirements:

The table must use Iceberg format version 3 ("format-version" = "3").
Default values for numeric types (INT, BIGINT, FLOAT, DOUBLE), BOOLEAN, STRING, and DATE/TIMESTAMP types must be wrapped in quotes. For example: DEFAULT "18", DEFAULT "100.0", DEFAULT "true".

Examples:

Create a table with default values:

SQL

CREATE TABLE user_info (
    id INT,
    name STRING,
    age INT DEFAULT "18",
    score DOUBLE DEFAULT "100.0",
    status STRING DEFAULT 'active',
    is_active BOOLEAN DEFAULT "true"
) PROPERTIES ("format-version" = "3");

Add a column with a default value:

SQL

ALTER TABLE user_info ADD COLUMN bonus DOUBLE DEFAULT "50.5";

Modify a column's default value:

SQL

ALTER TABLE user_info MODIFY COLUMN status STRING DEFAULT "inactive";

`partition_desc`

SQL

PARTITION BY (partition_expr[, partition_expr...])

Each partition_expr can be:

column_name (identity transform)
transform_expr(column_name)
transform_expr(column_name, parameter)

StarRocks supports partition transformation expressions defined in the Apache Iceberg specification.

:::note Partition columns support all data types except FLOAT, DOUBLE, DECIMAL, and DATETIME. :::

`ORDER BY`

Specifies sort keys for the Iceberg table. This feature is supported from v4.0 onwards.

SQL

ORDER BY (column_name [ASC | DESC] [NULLS FIRST | NULLS LAST], ...)

`PROPERTIES`

Key table properties:

location: File path for the table. Required when using AWS Glue without database-level location.
file_format: File format. Only parquet (Default) is supported.
compression_codec: Compression algorithm. Options: SNAPPY, GZIP, ZSTD, LZ4 (Default: zstd).

Examples

Create a non-partitioned table:

SQL

CREATE TABLE unpartition_tbl
(
    id int,
    score double
);

Create a partitioned table:

SQL

CREATE TABLE partition_tbl
(
    action varchar(20),
    id int,
    dt date
)
PARTITION BY (id, dt);

Create a table with hidden partitions:

SQL

CREATE TABLE hidden_partition_tbl
(
    action VARCHAR(20),
    id INT,
    dt DATE
)
PARTITION BY bucket(id, 10), year(dt);

CREATE TABLE AS SELECT:

SQL

CREATE TABLE new_tbl
PARTITION BY (id, dt)
AS SELECT * FROM existing_tbl;

ALTER TABLE to evolve partition spec

Modifies an Iceberg table's partition spec by adding or dropping partition columns.

Syntax

SQL

ALTER TABLE [catalog.][database.]table_name
ADD PARTITION COLUMN partition_expr [, partition_expr ...];

ALTER TABLE [catalog.][database.]table_name
DROP PARTITION COLUMN partition_expr [, partition_expr ...];

Supported partition_expr formats:

Column name (identity transform)
Transform expressions: year(), month(), day(), hour(), truncate(), bucket()

Examples

Add partition columns:

SQL

ALTER TABLE sales_data
ADD PARTITION COLUMN month(sale_date), bucket(customer_id, 10);

Drop partition column:

SQL

ALTER TABLE sales_data
DROP PARTITION COLUMN day(sale_date);

DROP TABLE

Drops an Iceberg table. This feature is supported from v3.1 onwards.

When you drop a table, the file path and data in the remote storage are not deleted by default.

Syntax

SQL

DROP TABLE [IF EXISTS] <table_name> [FORCE]

Parameters

FORCE: When specified, the table data in the remote storage is deleted, while the file path is retained.

Example

SQL

DROP TABLE iceberg_db.sales_data;

-- Force drop the table with its data
DROP TABLE iceberg_db.temp_data FORCE;

CREATE VIEW

Creates an Iceberg view. This feature is supported from v3.5 onwards. Creating an Iceberg view with PROPERTIES is supported from v4.0.3 onwards.

Syntax

SQL

CREATE VIEW [IF NOT EXISTS]
[<catalog>.<database>.]<view_name>
(
    <column_name> [COMMENT 'column comment']
    [, <column_name> [COMMENT 'column comment'], ...]
)
[COMMENT 'view comment']
[PROPERTIES ("key" = "value", ...)]
AS <query_statement>

Example

Create a regular Iceberg view:

SQL

CREATE VIEW IF NOT EXISTS iceberg_db.sales_summary AS
SELECT region, SUM(amount) as total_sales
FROM iceberg_db.sales
GROUP BY region;

Create an Iceberg view with properties:

SQL

CREATE VIEW IF NOT EXISTS iceberg_db.sales_summary
PROPERTIES (
  "key1" = "value1"
)
AS
SELECT region, SUM(amount) as total_sales
FROM iceberg_db.sales
GROUP BY region;

ALTER VIEW to update StarRocks dialect

Adds or modifies StarRocks dialect for an existing Iceberg view. This feature is supported from v3.5 onwards.

:::note You can define only one StarRocks dialect for each Iceberg view. :::

Syntax

SQL

ALTER VIEW [<catalog>.<database>.]<view_name>
(
    <column_name> [, <column_name>]
)
{ ADD | MODIFY } DIALECT
<query_statement>

Examples

Add StarRocks dialect:

SQL

ALTER VIEW iceberg_db.spark_view ADD DIALECT
SELECT k1, k2 FROM iceberg_db.source_table;

Modify StarRocks dialect:

SQL

ALTER VIEW iceberg_db.spark_view MODIFY DIALECT
SELECT k1, k2, k3 FROM iceberg_db.source_table;