IcebergWriterCompatV1

protocol_rfcs/iceberg-writer-compat-v1.md

Associated GitHub issue for discussions: https://github.com/delta-io/delta/issues/4284

This protocol change introduces a compatibility flag that ensures a Delta table can be safely read and written as an Apache Iceberg™ format table, similar to IcebergCompatV1 and IcebergCompatV2.


IcebergWriterCompatV1

New Section after Iceberg Compatibility V2

This table feature (icebergWriterCompatV1) ensures that Delta tables can be converted to Apache Iceberg™ format, though this table feature does not implement or specify that conversion.

To support this feature:

  • Since this table feature depends on Column Mapping, the table must be on Reader Version = 2, or it must be on Reader Version >= 3 and the feature columnMapping must exist in the protocol's readerFeatures.
  • The table must be on Writer Version 7.
  • The feature icebergCompatV2 must exist in the table protocol's writerFeatures.
  • The feature icebergWriterCompatV1 must exist in the table protocol's writerFeatures.

This table feature is enabled when the table property delta.enableIcebergWriterCompatV1 is set to true.
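
The protocol requirements above can be sketched as a small check. This is an illustrative sketch, not a reference implementation: the function names are hypothetical, the protocol is assumed to be the parsed `protocol` action as a dict with the standard `minReaderVersion`, `minWriterVersion`, `readerFeatures` and `writerFeatures` fields, and the case-insensitive comparison of the property value is an assumption.

```python
def protocol_supports_iceberg_writer_compat_v1(protocol: dict) -> bool:
    """Check the protocol-level prerequisites listed above (hypothetical helper)."""
    reader_version = protocol.get("minReaderVersion", 0)
    writer_version = protocol.get("minWriterVersion", 0)
    reader_features = set(protocol.get("readerFeatures") or [])
    writer_features = set(protocol.get("writerFeatures") or [])

    # Column Mapping: Reader Version 2, or Reader Version >= 3 with the
    # columnMapping feature listed in readerFeatures.
    column_mapping_readable = reader_version == 2 or (
        reader_version >= 3 and "columnMapping" in reader_features
    )
    return (
        column_mapping_readable
        and writer_version == 7
        and {"icebergCompatV2", "icebergWriterCompatV1"} <= writer_features
    )


def is_iceberg_writer_compat_v1_enabled(configuration: dict) -> bool:
    """True when the table property is set to true (case handling is an assumption)."""
    value = configuration.get("delta.enableIcebergWriterCompatV1", "false")
    return value.lower() == "true"
```

For example, a protocol with `minReaderVersion` 3 that omits `columnMapping` from `readerFeatures` fails the check even if both writer features are present.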

Writer Requirements for IcebergWriterCompatV1

When IcebergWriterCompatV1 is enabled, writers must ensure:

  • The table is using Column Mapping and that it is set to id mode.

    • Note this is a tightening of the IcebergCompatV2 requirement which supports name and id mode.
  • Each field must have a column mapping physical name that is exactly col-[column id]. That is the delta.columnMapping.physicalName in the column metadata must be equal to col-[delta.columnMapping.id]. The following is an example compliant schema definition:

```json
{
  "type": "struct",
  "fields": [
    {
      "name": "a",
      "type": "integer",
      "nullable": false,
      "metadata": {
        "delta.columnMapping.id": 1,
        "delta.columnMapping.physicalName": "col-1"
      }
    },
    {
      "name": "b",
      "type": "string",
      "nullable": false,
      "metadata": {
        "delta.columnMapping.id": 2,
        "delta.columnMapping.physicalName": "col-2"
      }
    }
  ]
}
```
  • The table does not contain any columns with the type byte or short.

    • Note that these types are allowed by IcebergCompatV2.
    • Therefore the list of allowed types for a table with IcebergWriterCompatV1 enabled is: [integer, long, float, double, decimal, string, binary, boolean, timestamp, timestampNTZ, date, array, map, struct].
  • Iceberg Compatibility V2 is enabled on the table.

  • The writer must block any schema changes to a struct that is used as a map key.

    • For example, if the schema contains a column map of type MAP<STRUCT<s: STRING>, INT>, then any schema change to map.key must be disallowed.
    • Changes to the schema of the value are allowed.
    • This matches Iceberg's behavior, which is documented here. In practice Iceberg writers block any changes, not just column additions.
  • Every enabled feature is in the allowlist (see below).

  • Every disallowed feature is either not supported or, where permitted, supported but inactive (see below).
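
As a rough illustration, the static schema requirements above (id mode, `col-[column id]` physical names, no byte or short columns) could be checked as follows. This is a sketch under assumptions: the function names are hypothetical, the schema is assumed to be the parsed JSON schema in the form shown in the example above, and nested struct fields are assumed to carry column mapping metadata as well. The map-key schema-change rule and the feature checks are not covered here.

```python
DISALLOWED_TYPES = {"byte", "short"}


def _check_type(data_type, problems):
    """Recursively collect violations from a parsed Delta schema type."""
    if isinstance(data_type, str):  # primitive types are serialized as strings
        if data_type in DISALLOWED_TYPES:
            problems.append(f"disallowed type: {data_type}")
        return
    kind = data_type["type"]
    if kind == "struct":
        for field in data_type["fields"]:
            meta = field.get("metadata", {})
            col_id = meta.get("delta.columnMapping.id")
            physical = meta.get("delta.columnMapping.physicalName")
            if col_id is None or physical != f"col-{col_id}":
                problems.append(f"field {field['name']!r}: physical name must be col-<id>")
            _check_type(field["type"], problems)
    elif kind == "array":
        _check_type(data_type["elementType"], problems)
    elif kind == "map":
        _check_type(data_type["keyType"], problems)
        _check_type(data_type["valueType"], problems)


def writer_schema_violations(configuration: dict, schema: dict) -> list:
    """Return a list of human-readable violations; empty means compliant."""
    problems = []
    if configuration.get("delta.columnMapping.mode") != "id":
        problems.append("column mapping mode must be id")
    _check_type(schema, problems)
    return problems
```

Returning a list of problems rather than a boolean lets a writer report every violation at once before rejecting a commit.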

Disallowed Features

For this section, we use the specific meanings of "supported" and "active" from Supported Features. None of the following features may be used in the table. A legacy feature (any feature introduced before writer version 7) can be "supported", but must not be "active".

| Feature | Legacy | Can be "supported"? | Not Active Check |
|---|---|---|---|
| Column Invariants | Yes | Yes, if not active | No column includes delta.invariants in its Metadata |
| Change Data Feed | Yes | Yes, if not active | The delta.enableChangeDataFeed configuration flag in the Metadata of the table does not exist (or has a value of false) |
| CHECK Constraints | Yes | Yes, if not active | No key in the configuration field of Metadata starts with delta.constraints. (including the trailing dot) |
| Identity Columns | Yes | Yes, if not active | No column in the schema has any of the properties specified in Identity Columns in its column metadata: delta.identity.start, delta.identity.step, delta.identity.highWaterMark, delta.identity.allowExplicitInsert |
| Generated Columns | Yes | Yes, if not active | No column metadata contains the key delta.generationExpression |
| Default Columns | No | No | N/A |
| Row Tracking | No | Yes, if not active | The delta.enableRowTracking configuration flag in the Metadata of the table does not exist (or has a value of false) |
| Collations | No | No | N/A |
| Variant Types | No | No | N/A |
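
The "Not Active" checks in the table can be sketched as a single pass over the table's Metadata. This is an illustrative sketch only: the function names are hypothetical, `configuration` is assumed to be the string-to-string configuration map of the Metadata action, `schema` is assumed to be the parsed JSON schema, and walking nested fields (rather than only top-level columns) is a conservative assumption.

```python
IDENTITY_KEYS = {
    "delta.identity.start",
    "delta.identity.step",
    "delta.identity.highWaterMark",
    "delta.identity.allowExplicitInsert",
}


def iter_fields(data_type):
    """Yield every struct field in a parsed schema, including nested ones."""
    if not isinstance(data_type, dict):
        return
    kind = data_type.get("type")
    if kind == "struct":
        for field in data_type["fields"]:
            yield field
            yield from iter_fields(field["type"])
    elif kind == "array":
        yield from iter_fields(data_type["elementType"])
    elif kind == "map":
        yield from iter_fields(data_type["keyType"])
        yield from iter_fields(data_type["valueType"])


def active_disallowed_features(configuration: dict, schema: dict) -> set:
    """Return the names of disallowed features that are active on the table."""
    active = set()
    if configuration.get("delta.enableChangeDataFeed", "false").lower() == "true":
        active.add("changeDataFeed")
    if configuration.get("delta.enableRowTracking", "false").lower() == "true":
        active.add("rowTracking")
    if any(key.startswith("delta.constraints.") for key in configuration):
        active.add("checkConstraints")
    for field in iter_fields(schema):
        meta = field.get("metadata", {})
        if "delta.invariants" in meta:
            active.add("invariants")
        if IDENTITY_KEYS & meta.keys():
            active.add("identityColumns")
        if "delta.generationExpression" in meta:
            active.add("generatedColumns")
    return active
```

An empty result means every "supported" feature from the table passes its inactivity check; features that cannot be supported at all (Default Columns, Collations, Variant Types) have to be rejected at the protocol level instead.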

Allowed Supported list of features

To ensure that future features do not break tables with IcebergWriterCompatV1 enabled, all enabled features must also be checked against an allowlist. Any enabled table features must be in the list: [appendOnly, columnMapping, icebergWriterCompatV1, icebergCompatV2, domainMetadata, vacuumProtocolCheck, v2Checkpoint, inCommitTimestamp, clustering, timestampNtz, typeWidening]

Additionally, the following features are allowed to be "supported", but must not be "active" (see Disallowed Features): [invariants, changeDataFeed, checkConstraints, identityColumns, generatedColumns, rowTracking]. These features, if supported, must be verified to be "inactive" via the checks specified above.

We allow these legacy features to be "supported" because protocol updates can cause features to be carried over even though they are not in use. For example, if a table is on writer version 2, and then is updated to version 7, invariants can appear in the writerFeatures list because it was implicitly supported at version 2, even if it was not in use.
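
A minimal sketch of the allowlist check, using the two feature lists given above. The function name and the shape of its return value are hypothetical, and whether reader features should be checked alongside writer features is left to the caller here.

```python
ALLOWED_FEATURES = {
    "appendOnly", "columnMapping", "icebergWriterCompatV1", "icebergCompatV2",
    "domainMetadata", "vacuumProtocolCheck", "v2Checkpoint", "inCommitTimestamp",
    "clustering", "timestampNtz", "typeWidening",
}

# May be "supported", but must be verified to be inactive.
SUPPORTED_IF_INACTIVE = {
    "invariants", "changeDataFeed", "checkConstraints",
    "identityColumns", "generatedColumns", "rowTracking",
}


def classify_features(features):
    """Split a feature list into (not allowed, needs an inactivity check)."""
    features = set(features)
    not_allowed = features - ALLOWED_FEATURES - SUPPORTED_IF_INACTIVE
    needs_inactive_check = features & SUPPORTED_IF_INACTIVE
    return not_allowed, needs_inactive_check
```

For a table upgraded from writer version 2 whose writerFeatures include invariants, the feature lands in the second set, so it is accepted only if the corresponding inactivity check passes.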