documentation/operations/data-retention.md
The nature of time-series data is that the relevance of information diminishes over time. If stale data is no longer required, users can delete old data from QuestDB to either save disk space or adhere to a data retention policy. This is achieved in QuestDB by removing data partitions from a table.
QuestDB offers two approaches for data retention:
DROP PARTITION commands as described on this page for
explicit control over which partitions to remove and when.This page provides a high-level overview of partitioning with examples to drop data by date. For more details on partitioning, see the partitioning page.
This section covers the manual approach to removing stale data by dropping
partitions. A table must have a
designated timestamp assigned and a
partitioning strategy specified during a CREATE TABLE operation to achieve
this.
:::note
Users cannot alter the partitioning strategy after a table is created.
:::
Tables can be partitioned by one of the following:
YEARMONTHWEEKDAYHOURCREATE TABLE my_table(ts TIMESTAMP, symb SYMBOL, price DOUBLE) timestamp(ts)
PARTITION BY DAY;
:::caution
Use DROP PARTITION with care, as QuestDB cannot recover data from dropped
partitions.
:::
To drop partitions, users can use the ALTER TABLE DROP PARTITION syntax. Partitions may be dropped by:
DROP PARTITION LIST - specifying a comma-separated list of partitions to
drop
--Delete a partition
ALTER TABLE my_table DROP PARTITION LIST '2021-01-01';
--Delete a list of two partitions
ALTER TABLE my_table DROP PARTITION LIST '2021-01-01', '2021-01-02';
WHERE timestamp = - exact date matching by timestamp
ALTER TABLE my_table DROP PARTITION
WHERE timestamp = to_timestamp('2021-01-01', 'yyyy-MM-dd');
WHERE timestamp < - using comparison operators (< / >) to delete by time
range relative to a timestamp. Note that the now() function may be used to
automate dropping of partitions relative to the current time, i.e.:
--Drop partitions older than 30 days
ALTER TABLE my_table DROP PARTITION
WHERE timestamp < dateadd('d', -30, now());
Usage notes:
DROP PARTITION commands must be triggered manually or via
external scheduling (e.g., cron jobs). For fully automated retention, consider
using TTL instead.The following example demonstrates how to create a table with partitioning and
to drop partitions based on time. This example produces 5 days' worth of data
with one incrementing LONG value inserted per hour.
CREATE TABLE my_table (timestamp TIMESTAMP, x LONG) timestamp(timestamp)
PARTITION BY DAY;
INSERT INTO my_table
SELECT timestamp_sequence(
to_timestamp('2021-01-01T00:00:00', 'yyyy-MM-ddTHH:mm:ss'),100000L * 36000), x
FROM long_sequence(120);
For reference, the following functions are used to generate the example data:
x:long columnThe result of partitioning is visible when listing as directories on disk:
my_table
├── 2021-01-01
├── 2021-01-02
├── 2021-01-03
├── 2021-01-04
└── 2021-01-05
Partitions can be dropped using the following query:
--Delete days before 2021-01-03
ALTER TABLE my_table DROP PARTITION
WHERE timestamp < to_timestamp('2021-01-03', 'yyyy-MM-dd');