doc/development/database/partitioning/_index.md
[!warning] If you have questions not answered below, check for and add them to this issue. Tag
@gitlab-org/database-team/triageand we'll get back to you with an answer as soon as possible. If you get an answer in Slack, document it on the issue as well so we can update this document in the future.
Table partitioning is a powerful database feature that allows a table's data to be split into smaller physical tables that act as a single large table. If the application is designed to work with partitioning in mind, there can be multiple benefits, such as:
VACUUM and index rebuilds can operate on
individual partitions, rather than across a single massive table.Unfortunately, not all models fit a partitioning scheme, and there are significant drawbacks if implemented incorrectly. Additionally, tables can only be partitioned at their creation, making it nontrivial to apply partitioning to a busy database. A suite of migration tools are available to enable backend developers to partition existing tables, but the migration process is rather heavy, taking multiple steps split across several releases. Due to the limitations of partitioning and the related migrations, you should understand how partitioning fits your use case before attempting to leverage this feature.
The partitioning migration helpers work by creating a partitioned duplicate of the original table and using a combination of a trigger and a background migration to copy data into the new table. Changes to the original table schema can be made in parallel with the partitioning migration, but they must take care to not break the underlying mechanism that makes the migration work. For example, if a column is added to the table that is being partitioned, both the partitioned table and the trigger definition must be updated to match.
While partitioning can be very useful when properly applied, it's imperative to identify if the data and workload of a table naturally fit a partitioning scheme. Understand a few details to decide if partitioning is a good fit for your particular problem:
WHERE
clause on almost all queries accessing that table.The available partitioning strategy choices are date range, int range, hash, and list.