.agents/skills/clickhouse-best-practices/rules/schema-partition-lifecycle.md
Impact: HIGH
Partitioning is primarily a data management technique, not a query optimization tool. It excels at:
Incorrect (no time alignment for lifecycle):
-- Cannot efficiently drop old data by time
CREATE TABLE events (...)
ENGINE = MergeTree()
PARTITION BY event_type -- No time alignment
ORDER BY (timestamp);
-- Slow: must scan and delete row by row
DELETE FROM events WHERE timestamp < '2023-01-01';
Correct (time-based for lifecycle):
CREATE TABLE events (
timestamp DateTime,
event_type LowCardinality(String)
)
ENGINE = MergeTree()
PARTITION BY toStartOfMonth(timestamp)
ORDER BY (event_type, timestamp)
TTL timestamp + INTERVAL 1 YEAR DELETE; -- Drops whole partitions
-- Fast: metadata-only operation
ALTER TABLE events DROP PARTITION '202301';
-- Archive to cold storage
ALTER TABLE events_archive ATTACH PARTITION '202301' FROM events;
Reference: Choosing a Partitioning Key