Back to Langfuse

Order Columns by Cardinality (Low to High)

.agents/skills/clickhouse-best-practices/rules/schema-pk-cardinality-order.md

3.172.11.4 KB
Original Source

Order Columns by Cardinality (Low to High)

Impact: CRITICAL

Since the sparse primary index operates on data blocks (granules) rather than individual rows, low-cardinality leading columns create more useful index entries that can skip entire blocks. Place lower-cardinality columns before higher-cardinality ones in the ordering key.

Incorrect (high cardinality first):

sql
-- UUID first means no pruning benefit
CREATE TABLE events (...)
ENGINE = MergeTree()
ORDER BY (event_id, event_type, timestamp);
-- Every granule has different event_id values, index can't skip anything

Correct (low cardinality first):

sql
-- Low cardinality first enables pruning
CREATE TABLE events (...)
ENGINE = MergeTree()
ORDER BY (event_type, event_date, event_id);
-- Index can skip entire event_type groups

Column Order Guidelines:

PositionCardinalityExamples
1stLow (few distinct values)event_type, status, country
2ndDate (coarse granularity)toDate(timestamp)
3rd+Medium-Highuser_id, session_id
LastHigh (if needed)event_id, uuid

Tip: Use toDate(timestamp) instead of raw DateTime columns when day-level filtering suffices - this reduces index size from 32-bit to 16-bit representations.

Reference: Choosing a Primary Key