x-pack/plugin/esql-datasource-iceberg/README.md
This plugin provides Apache Iceberg table catalog support for ESQL external data sources.
The Iceberg plugin enables ESQL to query Apache Iceberg tables stored in S3. Iceberg is an open table format for large analytic datasets that provides ACID transactions, schema evolution, and efficient metadata management.
With the plugin enabled, ESQL can query an Iceberg table by pointing FROM at the table's S3 location:
FROM "s3://my-bucket/warehouse/db/sales_table"
| WHERE sale_date >= "2024-01-01" AND region = "EMEA"
| STATS total = SUM(amount) BY product
The plugin automatically detects Iceberg tables by looking for the metadata/ directory structure.
s3://bucket/warehouse/db/table/
├── data/
│   ├── part-00000.parquet
│   ├── part-00001.parquet
│   └── ...
└── metadata/
    ├── v1.metadata.json
    ├── v2.metadata.json
    ├── snap-*.avro
    └── version-hint.text
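
The detection rule above can be sketched in plain Java. This is only an illustration under stated assumptions, not the plugin's code: the class `IcebergLayoutSketch` and its methods are hypothetical, the caller is assumed to already hold a listing of object keys relative to the table root, and only the `vN.metadata.json` naming shown in the layout is recognized.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the plugin: decides whether a listing of
// object keys (relative to the table root) looks like an Iceberg table.
final class IcebergLayoutSketch {

    // Matches files such as "metadata/v2.metadata.json" and captures the version number.
    private static final Pattern METADATA_FILE =
        Pattern.compile("^metadata/v(\\d+)\\.metadata\\.json$");

    private IcebergLayoutSketch() {}

    // True when at least one versioned metadata file sits under metadata/.
    static boolean looksLikeIcebergTable(List<String> relativeKeys) {
        return latestMetadataFile(relativeKeys).isPresent();
    }

    // Returns the metadata file with the highest version number, if any.
    static Optional<String> latestMetadataFile(List<String> relativeKeys) {
        String best = null;
        int bestVersion = -1;
        for (String key : relativeKeys) {
            Matcher m = METADATA_FILE.matcher(key);
            if (m.matches()) {
                int version = Integer.parseInt(m.group(1));
                if (version > bestVersion) {
                    bestVersion = version;
                    best = key;
                }
            }
        }
        return Optional.ofNullable(best);
    }
}
```

With the layout shown above, `latestMetadataFile` would pick `metadata/v2.metadata.json`, while a directory holding only a standalone Parquet file would not be treated as an Iceberg table.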
This plugin bundles significant dependencies for Iceberg, Arrow, and AWS support:
| Dependency | Version | Purpose |
|---|---|---|
| iceberg-core | 1.x | Iceberg table operations |
| iceberg-aws | 1.x | S3FileIO implementation |
| iceberg-parquet | 1.x | Parquet file support |
| iceberg-arrow | 1.x | Arrow vectorized reading |
| arrow-vector | 18.x | Arrow vector types |
| arrow-memory-core | 18.x | Arrow memory management |
| arrow-memory-unsafe | 18.x | Off-heap memory allocation |
| parquet-hadoop-bundle | 1.16.0 | Parquet file reading |
| hadoop-client-api | 3.4.1 | Hadoop Configuration |
| hadoop-client-runtime | 3.4.1 | Hadoop runtime |
| software.amazon.awssdk:s3 | 2.x | S3 client |
| software.amazon.awssdk:sts | 2.x | STS for role assumption |
| software.amazon.awssdk:kms | 2.x | KMS for encryption |
┌─────────────────────────────────────────┐
│  IcebergDataSourcePlugin                │
│  implements DataSourcePlugin            │
└─────────────────┬───────────────────────┘
                  │
                  │ provides
                  ▼
┌─────────────────────────────────────────┐
│  IcebergTableCatalog                    │
│  implements TableCatalog                │
│                                         │
│  - metadata(tablePath, config)          │
│  - planScan(tablePath, config, preds)   │
│  - catalogType() → "iceberg"            │
│  - canHandle(path)                      │
└─────────────────┬───────────────────────┘
                  │
                  │ uses
                  ▼
┌─────────────────────────────────────────┐
│  IcebergCatalogAdapter                  │
│                                         │
│  Adapts Iceberg's StaticTableOperations │
│  to work with S3 metadata locations     │
└─────────────────┬───────────────────────┘
                  │
                  │ uses
                  ▼
┌─────────────────────────────────────────┐
│  S3FileIOFactory                        │
│                                         │
│  Creates S3FileIO instances for         │
│  Iceberg table operations               │
└─────────────────────────────────────────┘
| Feature | Status |
|---|---|
| Schema discovery | Supported |
| Column projection | Supported |
| Partition pruning | Supported |
| Predicate pushdown | Supported |
| Time travel | Not yet supported |
| Schema evolution | Read-only |
| Hidden partitioning | Supported |
| Row-level deletes | Not yet supported |
| Iceberg Type | ESQL Type |
|---|---|
| boolean | BOOLEAN |
| int | INTEGER |
| long | LONG |
| float | DOUBLE |
| double | DOUBLE |
| decimal | DOUBLE |
| date | DATE |
| time | TIME |
| timestamp | DATETIME |
| timestamptz | DATETIME |
| string | KEYWORD |
| uuid | KEYWORD |
| fixed | KEYWORD |
| binary | KEYWORD (base64) |
| list | Not yet supported |
| map | Not yet supported |
| struct | Not yet supported |
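
The mapping table can be read as a simple lookup. The sketch below is a hypothetical illustration, not the plugin's implementation: the class `IcebergTypeMappingSketch` and the method `esqlTypeFor` are invented names, and the handling of parameterized type strings such as `decimal(10, 2)` or `fixed[16]` by base name is an assumption.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical lookup mirroring the mapping table; not the plugin's implementation.
final class IcebergTypeMappingSketch {

    private static final Map<String, String> ESQL_TYPE_BY_ICEBERG_TYPE = new HashMap<>();

    static {
        ESQL_TYPE_BY_ICEBERG_TYPE.put("boolean", "BOOLEAN");
        ESQL_TYPE_BY_ICEBERG_TYPE.put("int", "INTEGER");
        ESQL_TYPE_BY_ICEBERG_TYPE.put("long", "LONG");
        ESQL_TYPE_BY_ICEBERG_TYPE.put("float", "DOUBLE");
        ESQL_TYPE_BY_ICEBERG_TYPE.put("double", "DOUBLE");
        ESQL_TYPE_BY_ICEBERG_TYPE.put("decimal", "DOUBLE");
        ESQL_TYPE_BY_ICEBERG_TYPE.put("date", "DATE");
        ESQL_TYPE_BY_ICEBERG_TYPE.put("time", "TIME");
        ESQL_TYPE_BY_ICEBERG_TYPE.put("timestamp", "DATETIME");
        ESQL_TYPE_BY_ICEBERG_TYPE.put("timestamptz", "DATETIME");
        ESQL_TYPE_BY_ICEBERG_TYPE.put("string", "KEYWORD");
        ESQL_TYPE_BY_ICEBERG_TYPE.put("uuid", "KEYWORD");
        ESQL_TYPE_BY_ICEBERG_TYPE.put("fixed", "KEYWORD");
        ESQL_TYPE_BY_ICEBERG_TYPE.put("binary", "KEYWORD"); // base64-encoded per the table
    }

    private IcebergTypeMappingSketch() {}

    // Returns the ESQL type name, or empty for types the table marks as not yet supported.
    static Optional<String> esqlTypeFor(String icebergType) {
        String normalized = icebergType.trim().toLowerCase(Locale.ROOT);
        // Keep only the leading letters so "decimal(10, 2)" and "fixed[16]" map by base name.
        int end = 0;
        while (end < normalized.length() && Character.isLetter(normalized.charAt(end))) {
            end++;
        }
        return Optional.ofNullable(ESQL_TYPE_BY_ICEBERG_TYPE.get(normalized.substring(0, end)));
    }
}
```

Returning an empty `Optional` for `list`, `map`, and `struct` keeps the unsupported cases explicit, so a caller can decide whether to reject the table or drop the column.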
The plugin supports pushing filter predicates to Iceberg for partition pruning and data skipping:
// Partition pruning: only scans partitions matching the predicate
FROM "s3://bucket/table"
| WHERE sale_date >= "2024-01-01"

// Data skipping: uses column statistics to skip row groups
FROM "s3://bucket/table"
| WHERE amount > 1000
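
To make the data-skipping idea concrete, here is a minimal sketch of a min/max check. It is not the plugin's or Iceberg's code: the class `DataSkippingSketch`, the `Op` enum, and the use of `long` values are assumptions made for illustration. The idea is that a unit of data (a file or row group) can be skipped only when its recorded minimum and maximum prove that no row can satisfy the comparison.

```java
// Hypothetical illustration of min/max based data skipping.
final class DataSkippingSketch {

    // Comparison operators covered by this sketch.
    enum Op { EQ, LT, LT_EQ, GT, GT_EQ }

    private DataSkippingSketch() {}

    // Returns false only when the [min, max] range proves that no row can
    // satisfy "column op value"; true means the data must still be read.
    static boolean mightMatch(Op op, long min, long max, long value) {
        switch (op) {
            case EQ:
                return min <= value && value <= max;
            case LT:
                return min < value;
            case LT_EQ:
                return min <= value;
            case GT:
                return max > value;
            case GT_EQ:
                return max >= value;
            default:
                return true; // unknown operator: never skip
        }
    }
}
```

For the query `WHERE amount > 1000`, a file whose `amount` values range from 10 to 1000 yields `mightMatch(Op.GT, 10, 1000, 1000) == false` and can be skipped, while a file with a maximum of 1001 must be scanned.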
Supported predicates:
- =, !=
- <, <=, >, >=
- IS NULL, IS NOT NULL
- field IN (value1, value2, ...)

S3 access is configured via environment variables or Elasticsearch settings:
AWS_ACCESS_KEY_ID=your-access-key
AWS_SECRET_ACCESS_KEY=your-secret-key
AWS_REGION=us-east-1
| Setting | Default | Description |
|---|---|---|
| esql.iceberg.s3.endpoint | (AWS default) | Custom S3 endpoint (for MinIO, etc.) |
| esql.iceberg.s3.path_style_access | false | Use path-style S3 access |
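
One plausible way these two settings could reach Iceberg is as S3FileIO catalog properties. The sketch below is an assumption about that translation, not a description of what the plugin does: the class `S3SettingsSketch` and its method are hypothetical, while `s3.endpoint` and `s3.path-style-access` are the property keys Iceberg's AWS module uses for S3FileIO.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical translation of the Elasticsearch settings above into
// Iceberg S3FileIO properties; the plugin's actual wiring may differ.
final class S3SettingsSketch {

    private S3SettingsSketch() {}

    // endpoint: value of esql.iceberg.s3.endpoint, or null/empty for the AWS default.
    // pathStyleAccess: value of esql.iceberg.s3.path_style_access.
    static Map<String, String> toS3FileIOProperties(String endpoint, boolean pathStyleAccess) {
        Map<String, String> props = new LinkedHashMap<>();
        if (endpoint != null && !endpoint.isEmpty()) {
            props.put("s3.endpoint", endpoint); // leave unset to keep the AWS default
        }
        props.put("s3.path-style-access", Boolean.toString(pathStyleAccess));
        return props;
    }
}
```

For a local MinIO instance, a call such as `toS3FileIOProperties("http://localhost:9000", true)` (the endpoint here is only an example value) would set both properties, since MinIO deployments commonly need path-style access.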
./gradlew :x-pack:plugin:esql-datasource-iceberg:build
# Unit tests
./gradlew :x-pack:plugin:esql-datasource-iceberg:test
# Integration tests (requires S3 fixture)
./gradlew :x-pack:plugin:esql-datasource-iceberg:qa:javaRestTest
The qa/ directory contains test fixtures for integration testing:
qa/src/javaRestTest/resources/iceberg-fixtures/
├── employees/               # Sample Iceberg table
│   ├── data/
│   │   └── data.parquet
│   └── metadata/
│       ├── v1.metadata.json
│       └── ...
└── standalone/
    └── employees.parquet    # Standalone Parquet file
The plugin is bundled with Elasticsearch and enabled by default when the ESQL feature is available.
Elastic License 2.0