airbyte-integrations/connectors/destination-s3-data-lake/src/test-integration/resources/polaris/README.md
This directory contains the Docker Compose configuration and supporting code for running integration tests against Apache Polaris, an open-source catalog service for Apache Iceberg.
Apache Polaris is a catalog management service that implements the Iceberg REST Catalog API. It provides multi-tenancy, role-based access control, and credential vending for Iceberg tables, making it suitable for production data lake deployments.
The docker-compose.yml file orchestrates a complete Polaris environment with the following services:
- Image: minio/minio:RELEASE.2025-09-07T16-13-09Z
- Ports: 9000 (S3 API), 9001 (Web Console)
- Credentials: minio_root / m1n1opwd
MinIO provides S3-compatible object storage for Iceberg table data and metadata files. The service includes a health check that polls /minio/health/ready to ensure readiness before dependent services start.
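The same readiness probe can be reproduced from the host when debugging the stack by hand. The following is a minimal sketch, not the health check defined in docker-compose.yml: the function name wait_for_ready, its arguments, and the default attempt count and delay are all illustrative assumptions.

```shell
# Poll a readiness URL until it answers with a successful HTTP status
# or the attempts run out. wait_for_ready and its defaults are
# hypothetical names chosen for this sketch.
wait_for_ready() {
  url=$1
  attempts=${2:-30}   # maximum number of polls (assumed default)
  delay=${3:-2}       # seconds to sleep after a failed poll (assumed default)
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors; -s hides progress output.
    if curl -fs -o /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out waiting for $url" >&2
  return 1
}
```

For example, wait_for_ready http://localhost:9000/minio/health/ready blocks until MinIO reports ready on the S3 API port listed above, or returns 1 after the attempts are exhausted.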
- Image: minio/mc:latest
- Purpose: One-shot initialization job
- Creates: bucket123
This service uses the MinIO Client (mc) to automatically create the required S3 bucket, bucket123, with automatic retries.
- Image: postgres:16-alpine
- Port: 5432
- Database: POLARIS
- Credentials: postgres / postgres
PostgreSQL stores Polaris metadata including catalog definitions, principal information, role assignments, and privilege grants. All Polaris configuration persists here, making it the source of truth for the catalog's state.
- Image: apache/polaris:1.1.0-incubating
- Ports: 8181 (Catalog API), 8182 (Health/Metrics)
- Bootstrap Credentials: POLARIS realm, root principal, s3cr3t secret
The Polaris service provides:
- /api/catalog/v1/oauth/tokens for credential exchange
- /api/management/v1/* for catalog/principal/role administration
Polaris is configured with:
Understanding Polaris requires familiarity with its security and organizational model:
A realm is a top-level tenant in Polaris, providing complete isolation between different organizations or environments. Each realm has:
In this setup, we use the POLARIS realm (configured via Polaris-Realm HTTP header).
A principal is an authenticated identity (user or service account) in Polaris. Each principal has:
- OAuth credentials (clientId and clientSecret)
Principals authenticate using the OAuth 2.0 client credentials flow and receive bearer tokens for API access.
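The client credentials exchange can be tried by hand with curl. This is a sketch under assumptions rather than the request PolarisEnvironment.kt actually sends: fetch_token and POLARIS_URL are illustrative names, and passing the credentials as HTTP Basic auth is an assumption. The endpoint path, the Polaris-Realm header, and the PRINCIPAL_ROLE:ALL scope are the ones this README describes.

```shell
# Base URL of the Polaris catalog API (assumed default for this setup).
POLARIS_URL=${POLARIS_URL:-http://localhost:8181}

# Request a bearer token via the OAuth 2.0 client credentials flow.
# fetch_token is a hypothetical helper; it prints the raw JSON response.
fetch_token() {
  client_id=$1
  client_secret=$2
  curl -fsS -X POST "$POLARIS_URL/api/catalog/v1/oauth/tokens" \
    -H "Polaris-Realm: POLARIS" \
    --user "$client_id:$client_secret" \
    -d "grant_type=client_credentials" \
    -d "scope=PRINCIPAL_ROLE:ALL"
}
```

Calling fetch_token with a principal's clientId and clientSecret should print a JSON body whose access_token field holds the bearer token, which is then sent as an Authorization: Bearer header on later requests.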
A principal role is a collection of catalog role assignments that can be granted to principals. Think of it as a group membership:
Example: quickstart_user_role is a principal role that contains the assignment of the quickstart_catalog_role for the quickstart_catalog.
A catalog role is a named set of privileges within a specific catalog. It defines what operations can be performed on catalog resources:
Example: quickstart_catalog_role might have TABLE_CREATE, TABLE_READ_PROPERTIES, and NAMESPACE_LIST privileges.
Privileges are atomic permissions that control access to catalog operations. Polaris supports privileges such as:
| Privilege | Description |
|---|---|
| CATALOG_MANAGE_CONTENT | Manage tables and namespaces in the catalog |
| TABLE_CREATE | Create new tables |
| TABLE_DROP | Delete tables |
| TABLE_LIST | List tables in a namespace |
| TABLE_READ_PROPERTIES | Read table metadata |
| TABLE_WRITE_PROPERTIES | Update table metadata |
| TABLE_WRITE_DATA | Write data to tables |
| NAMESPACE_CREATE | Create new namespaces |
| NAMESPACE_LIST | List namespaces |
| NAMESPACE_READ_PROPERTIES | Read namespace metadata |
The permission flow in Polaris follows this hierarchy:
Principal (user/service account)
↓ assigned to
Principal Role (group)
↓ contains
Catalog Role Assignment (catalog-specific permissions)
↓ grants
Catalog Role (role within catalog)
↓ has
Privileges (specific operations)
The PolarisEnvironment.kt object (src/test-integration/kotlin/.../PolarisEnvironment.kt) orchestrates the test environment setup. Here's the detailed workflow:
startServices()
- Runs docker compose up -d using the docker-compose.yml file
- Waits for MinIO: GET http://localhost:9000/minio/health/ready
- Waits for Polaris: GET http://localhost:8182/q/health/ready
fetchToken(scope = "PRINCIPAL_ROLE:ALL")
The bootstrap credentials are used for initial authentication:
- Endpoint: http://localhost:8181/api/catalog/v1/oauth/tokens
- Credentials: root:s3cr3t (bootstrap principal)
- Sends the Polaris-Realm: POLARIS header to specify the realm
createCatalogIfNeeded()
Creates the quickstart_catalog via POST /api/management/v1/catalogs:
{
"catalog": {
"type": "INTERNAL",
"name": "quickstart_catalog",
"properties": {
"default-base-location": "s3://bucket123/"
},
"storageConfigInfo": {
"storageType": "S3",
"region": "us-east-1",
"endpoint": "http://localhost:9000",
"endpointInternal": "http://minio:9000",
"pathStyleAccess": true,
"stsUnavailable": true
}
}
}
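For manual experiments, the request body above can be posted with curl. This is only a sketch: create_catalog is a hypothetical helper, its token argument is assumed to be a bearer token obtained from the OAuth token endpoint, and it does not perform the existence check that the name createCatalogIfNeeded() implies.

```shell
# POST the catalog definition to the Polaris management API.
# create_catalog is an illustrative name; $1 is a bearer token.
create_catalog() {
  token=$1
  # --data @- tells curl to read the request body from stdin (the heredoc).
  curl -fsS -X POST "http://localhost:8181/api/management/v1/catalogs" \
    -H "Authorization: Bearer $token" \
    -H "Polaris-Realm: POLARIS" \
    -H "Content-Type: application/json" \
    --data @- <<'JSON'
{
  "catalog": {
    "type": "INTERNAL",
    "name": "quickstart_catalog",
    "properties": {
      "default-base-location": "s3://bucket123/"
    },
    "storageConfigInfo": {
      "storageType": "S3",
      "region": "us-east-1",
      "endpoint": "http://localhost:9000",
      "endpointInternal": "http://minio:9000",
      "pathStyleAccess": true,
      "stsUnavailable": true
    }
  }
}
JSON
}
```

Because of -f, curl exits non-zero if Polaris rejects the request, for example when the catalog already exists, so a caller would need to treat that case as acceptable.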
Key configuration points:
- Two endpoints: external (localhost:9000) for host access, internal (minio:9000) for container-to-container communication
createPrincipalAndGrants()
This is the most complex step, creating a complete authorization chain:
POST /api/management/v1/principals
{
"name": "quickstart_user-{timestamp}"
}
- Returns clientId and clientSecret for OAuth
- Stores them as appClientId and appClientSecret for test use
POST /api/management/v1/principal-roles
{
"name": "quickstart_user_role"
}
POST /api/management/v1/catalogs/quickstart_catalog/catalog-roles
{
"name": "quickstart_catalog_role"
}
PUT /api/management/v1/principal-roles/quickstart_user_role/catalog-roles/quickstart_catalog
{
"catalogRole": {
"name": "quickstart_catalog_role"
}
}
This grants the catalog role to the principal role for the specific catalog.
grantAirbytePrivileges(catalogName, catalogRole)
Iterates through required privileges and grants each one:
PUT /api/management/v1/catalogs/quickstart_catalog/catalog-roles/quickstart_catalog_role/grants
{
"grant": {
"type": "catalog",
"privilege": "TABLE_CREATE" // repeated for each privilege
}
}
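The per-privilege iteration can be sketched as a shell loop that issues one PUT per privilege. The privilege list here is copied from the table in this README and may not match the authoritative list in PolarisEnvironment.kt; grant_privileges is a hypothetical helper, and its argument is assumed to be a bearer token for a principal allowed to manage grants.

```shell
# Grant each privilege to quickstart_catalog_role, one request per privilege.
# grant_privileges is an illustrative name; $1 is a bearer token.
grant_privileges() {
  token=$1
  for privilege in \
    CATALOG_MANAGE_CONTENT \
    TABLE_CREATE TABLE_DROP TABLE_LIST \
    TABLE_READ_PROPERTIES TABLE_WRITE_PROPERTIES TABLE_WRITE_DATA \
    NAMESPACE_CREATE NAMESPACE_LIST NAMESPACE_READ_PROPERTIES
  do
    # Stop at the first failed grant so a partial setup is noticed early.
    curl -fsS -X PUT \
      "http://localhost:8181/api/management/v1/catalogs/quickstart_catalog/catalog-roles/quickstart_catalog_role/grants" \
      -H "Authorization: Bearer $token" \
      -H "Polaris-Realm: POLARIS" \
      -H "Content-Type: application/json" \
      -d "{\"grant\": {\"type\": \"catalog\", \"privilege\": \"$privilege\"}}" \
      || return 1
  done
}
```

Failing fast on the first rejected grant is a design choice of this sketch; it keeps a half-configured catalog role from going unnoticed.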
Grants include:
- CATALOG_MANAGE_CONTENT: Overall catalog management
- TABLE_*: Table operations (list, create, drop, read/write properties, write data)
- NAMESPACE_*: Namespace operations (list, create, read properties)
PUT /api/management/v1/principals/quickstart_user-{timestamp}/principal-roles
{
"principalRole": {
"name": "quickstart_user_role"
}
}
Final step: assigns the principal role to the newly created principal.
getConfig()
Returns JSON configuration for the S3 Data Lake connector:
{
"catalog_type": {
"catalog_type": "POLARIS",
"server_uri": "http://localhost:8181/api/catalog",
"catalog_name": "quickstart_catalog",
"client_id": "{dynamically-generated-clientId}",
"client_secret": "{dynamically-generated-clientSecret}",
"namespace": "<DEFAULT_NAMESPACE_PLACEHOLDER>"
},
"s3_bucket_name": "bucket123",
"s3_bucket_region": "us-east-1",
"access_key_id": "minio_root",
"secret_access_key": "m1n1opwd",
"s3_endpoint": "http://localhost:9000",
"warehouse_location": "s3://bucket123/",
"main_branch_name": "main"
}
Configuration highlights:
- Uses the quickstart_catalog created during setup
- Populates the client_id and client_secret fields with dynamically created principal credentials
stopServices()
- Runs docker compose down -v to stop and remove containers and volumes
Polaris supports two credential modes:
The current configuration uses client-side credentials, where the client provides S3 credentials (access_key_id and secret_access_key) directly in the configuration. The connector uses these MinIO root credentials to access the underlying storage directly.
Polaris also supports server-side credential vending. When header.X-Iceberg-Access-Delegation: vended-credentials is added to the catalog configuration, Polaris generates temporary S3 credentials for each operation:
- Polaris needs its own storage credentials (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY)
The docker-compose.yml is already configured with AWS_ACCESS_KEY_ID: minio_root and AWS_SECRET_ACCESS_KEY: m1n1opwd to support this mode, but it is not currently enabled in the test configuration.
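At the HTTP level, the header.X-Iceberg-Access-Delegation property corresponds to sending an X-Iceberg-Access-Delegation header on Iceberg REST requests such as loading a table. The sketch below shows what such a request might look like; the helper name, the namespace and table arguments, and the assumption that the catalog name serves as the REST path prefix are all illustrative, and whether temporary credentials actually come back depends on the catalog's storage configuration.

```shell
# Load a table while asking the catalog to vend storage credentials.
# load_table_with_vended_credentials is a hypothetical helper; $1 is a
# bearer token, $2 a namespace, and $3 a table name that already exist.
load_table_with_vended_credentials() {
  token=$1
  namespace=$2
  table=$3
  curl -fsS \
    "http://localhost:8181/api/catalog/v1/quickstart_catalog/namespaces/$namespace/tables/$table" \
    -H "Authorization: Bearer $token" \
    -H "Polaris-Realm: POLARIS" \
    -H "X-Iceberg-Access-Delegation: vended-credentials"
}
```

If vending is active, the credentials would be expected in the config section of the load-table response rather than in the connector configuration.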
The Polaris environment is automatically managed by integration tests:
# Run all Polaris integration tests
./gradlew :airbyte-integrations:connectors:destination-s3-data-lake:integrationTestNonDocker \
--tests "*PolarisWriteTest"
- Tests call PolarisEnvironment.getConfig() for catalog configuration
- startServices() is called (idempotent, runs once per JVM)
- stopServices() can be called to clean up (optional)