docs/database_partitioning.md
The enriched_events table uses PostgreSQL native range partitioning managed by pg_partman. Partitions are created monthly based on the timestamp column.
| Parameter | Value |
|---|---|
| Partition key | timestamp |
| Interval | 1 month |
| Type | range |
| Pre-creation | 3 months |
| Retention | 14 months |
| Retention keeps tables | yes |
| Infinite time partitions | yes |
This configuration lives in partman.part_config.
Before any setup, verify that pg_partman is available on your PostgreSQL server:
SELECT * FROM pg_available_extensions WHERE name = 'pg_partman';
If the query returns no rows, the extension is not installed on the server and you need to install it before proceeding. All Lago migrations that depend on pg_partman check this and skip gracefully when the extension is absent.
Once installed as an extension, verify it is enabled in your database:
SELECT * FROM pg_extension WHERE extname = 'pg_partman';
Lago migrations skip partitioning gracefully when pg_partman is not available at migration time. The enriched_events table is then created as a regular (non-partitioned) table. If you install pg_partman later, follow the steps below to convert the existing table to a partitioned one and register it with pg_partman.
All SQL below must be run by a role with ownership on the
enriched_eventstable.
CREATE SCHEMA IF NOT EXISTS partman;
CREATE EXTENSION IF NOT EXISTS pg_partman SCHEMA partman;
ALTER TABLE public.enriched_events RENAME TO enriched_events_old;
CREATE TABLE public.enriched_events (
id uuid DEFAULT gen_random_uuid() NOT NULL,
organization_id uuid NOT NULL,
event_id uuid NOT NULL,
transaction_id character varying NOT NULL,
external_subscription_id character varying NOT NULL,
code character varying NOT NULL,
"timestamp" timestamp(6) without time zone NOT NULL,
subscription_id uuid NOT NULL,
plan_id uuid NOT NULL,
charge_id uuid NOT NULL,
charge_filter_id uuid,
grouped_by jsonb DEFAULT '{}'::jsonb NOT NULL,
value character varying,
decimal_value numeric(40,15) DEFAULT 0.0 NOT NULL,
enriched_at timestamp(6) without time zone NOT NULL,
PRIMARY KEY (id, "timestamp")
) PARTITION BY RANGE ("timestamp");
CREATE TABLE public.enriched_events_default PARTITION OF public.enriched_events DEFAULT;
CREATE INDEX idx_billing_on_enriched_events
ON public.enriched_events (organization_id, subscription_id, charge_id, charge_filter_id, "timestamp");
CREATE INDEX idx_lookup_on_enriched_events
ON public.enriched_events (organization_id, external_subscription_id, code, "timestamp");
CREATE UNIQUE INDEX idx_unique_on_enriched_events
ON public.enriched_events (organization_id, external_subscription_id, transaction_id, "timestamp", charge_id);
CREATE INDEX index_enriched_events_on_event_id
ON public.enriched_events (event_id);
INSERT INTO public.enriched_events
SELECT * FROM public.enriched_events_old;
If the table is large, consider batching inserts or running this during a maintenance window.
DROP TABLE public.enriched_events_old;
SELECT partman.create_parent(
p_parent_table := 'public.enriched_events',
p_control := 'timestamp',
p_interval := '1 month',
p_type := 'range',
p_premake := 3,
p_start_partition := '2024-12-01'
);
UPDATE partman.part_config
SET infinite_time_partitions = true,
retention = '14 months',
retention_keep_table = true
WHERE parent_table = 'public.enriched_events';
Trigger a first maintenance run to create the monthly partitions and move data out of the default partition into the correct ones:
CALL partman.run_maintenance_proc();
After this, configure one of the two scheduled maintenance approaches described below.
pg_partman requires periodic execution of partman.run_maintenance_proc() to:
p_premake)retention)If maintenance does not run, inserts will fall into the enriched_events_default default partition, degrading query performance and making future partition creation harder to reconcile.
There are two approaches to schedule this.
pg_partman_bgw)This is a built-in background worker shipped with pg_partman. It requires no additional extension but needs PostgreSQL server-level configuration (i.e. access to postgresql.conf).
postgresql.confAdd pg_partman_bgw to shared_preload_libraries and set its parameters:
shared_preload_libraries = 'pg_partman_bgw'
pg_partman_bgw.dbname = lago
pg_partman_bgw.interval = 3600 # seconds (1 hour)
pg_partman_bgw.role = lago
dbname — the database(s) to run maintenance on (comma-separated for multiple).interval — how often to run, in seconds. 3600 = hourly.role — the PostgreSQL role used to execute maintenance. Must have ownership or sufficient privileges on the partitioned tables and the partman schema.Changes to shared_preload_libraries require a full server restart.
SELECT * FROM pg_stat_activity WHERE backend_type = 'pg_partman_bgw';
You should see one active row. You can also check the PostgreSQL logs for entries like:
LOG: pg_partman_bgw: running maintenance on database "lago"
pg_partman_bgw is not provided by some managed PostgresSQL providers, in this case or if you prefer a SQL-level scheduling interface you could rely on pg_cron.
pg_cron also requires being loaded at server start. In postgresql.conf:
shared_preload_libraries = 'pg_cron'
cron.database_name = 'postgres'
Changes to shared_preload_libraries require a full server restart.
Open a connection to the postgres database
CREATE EXTENSION IF NOT EXISTS pg_cron;
SELECT cron.schedule_in_database(
'partman-maintenance',
'@hourly',
$$CALL partman.run_maintenance_proc()$$,
'lago'
);
SELECT jobid, schedule, command, nodename, active
FROM cron.job
WHERE jobname = 'partman-maintenance';
SELECT jobid, start_time, end_time, status, return_message
FROM cron.job_run_details
WHERE jobid = (SELECT jobid FROM cron.job WHERE jobname = 'partman-maintenance')
ORDER BY start_time DESC
LIMIT 10;
The Lago Docker image (getlago/postgres-partman) ships with pg_partman pre-installed. The provided scripts/postgresql.conf already configures the pg_partman_bgw approach with hourly maintenance. No additional setup is required when using the default Docker Compose configuration.