docs/documentation/performance-tuning/create-index.mdx
These actions can improve the performance and memory consumption of CREATE INDEX and REINDEX statements.
ParadeDB uses Postgres' max_parallel_maintenance_workers setting to determine the degree of parallelism during CREATE INDEX/REINDEX. Postgres' default is 2, which may be too low for large tables.
SET max_parallel_maintenance_workers = 8;
In order for max_parallel_maintenance_workers to take effect, it must be less than or equal to both max_parallel_workers and max_worker_processes.
The default Postgres maintenance_work_mem value of 64MB is quite conservative and can slow down parallel index builds. We recommend at least 64MB per
parallel indexing worker.
SET maintenance_work_mem = '2GB';
If possible, creating the BM25 index should be deferred until after a table has been populated. To illustrate:
-- This is preferred
CREATE TABLE test (id SERIAL, data text);
INSERT INTO test (data) VALUES ('hello world'), ('many more values');
CREATE INDEX ON test USING bm25 (id, data) WITH (key_field = 'id');
-- ...to this
CREATE TABLE test (id SERIAL, data text);
CREATE INDEX ON test USING bm25 (id, data) WITH (key_field = 'id');
INSERT INTO test (data) VALUES ('hello world'), ('many more values');
This allows the BM25 index to create a more tightly packed, efficient representation on disk and will lead to faster build times.