When writing data to {{< product-name >}}, the InfluxDB 3 storage engine stores data in Apache Parquet format in the Object store. Each Parquet file represents a partition, which is a logical grouping of data. By default, InfluxDB partitions each table by day. If this default strategy yields unsatisfactory performance for single-series queries, you can define a custom partitioning strategy that uses tag values and different time intervals to optimize query performance for your specific schema and workload.
[!Note]
When to consider custom partitioning
Consider custom partitioning if:
- You have taken steps to optimize your queries, and
- Performance for single-series queries (querying for a specific tag value or tag set) is still unsatisfactory.
Before choosing a partitioning strategy, weigh the advantages, disadvantages, and limitations of custom partitioning.
The primary advantage of custom partitioning is that it lets you customize your storage structure to improve query performance specific to your schema and workload.
Using custom partitioning may increase the load on other parts of the InfluxDB 3 storage engine, but you can scale each part individually to address the added load.
[!Note] The weight of these disadvantages depends upon the cardinality of tags and the specificity of time intervals used for partitioning.
Custom partitioning has the following limitations:
After you have considered the advantages, disadvantages, and limitations of custom partitioning, use the guides in this section to:
A partition template defines the pattern used for partition keys and determines the time interval that InfluxDB partitions data by. Partition templates use tag values and Rust strftime date and time formatting syntax.
For more detailed information, see Partition templates.
A partition key uniquely identifies a partition.
A partition template defines the partition key format.
Partition keys are
composed of up to 8 dimensions (1 time part and up to 7 tag or tag bucket parts).
A partition key uses the partition key separator (|) to delimit parts.
The default format for partition keys is %Y-%m-%d (for example, 2024-01-01),
which creates 1 partition for each day.
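
As a rough illustration of how a partition key is assembled from tag values and a time part, the following Python sketch builds a key for a single point. The function name `partition_key` and its parameters are hypothetical and not part of any InfluxDB API. The sketch places tag parts before the time part and joins them with ` | `, matching how keys are written in the examples in this section. It uses Python's `strftime`, which is assumed to behave like the Rust implementation for the simple specifiers shown here.

```python
from datetime import datetime, timezone

MAX_TAG_PARTS = 7  # from the text: up to 7 tag or tag bucket parts


def partition_key(tags, timestamp_ns, tag_parts, time_format="%Y-%m-%d", sep=" | "):
    """Build an illustrative partition key: tag values followed by one time part."""
    if len(tag_parts) > MAX_TAG_PARTS:
        raise ValueError("a partition key allows at most 7 tag parts")
    # Convert nanoseconds to whole seconds; sub-second precision does not
    # affect day or month formatting.
    moment = datetime.fromtimestamp(timestamp_ns // 1_000_000_000, tz=timezone.utc)
    parts = [tags[name] for name in tag_parts]
    parts.append(moment.strftime(time_format))
    return sep.join(parts)


point_tags = {"line": "A", "station": "cnc"}
print(partition_key(point_tags, 1704063600000000000, []))
print(partition_key(point_tags, 1704063600000000000, ["line", "station"]))
print(partition_key(point_tags, 1704067200000000000, ["line", "station"], "%Y-%m"))
```

The first call uses only the default time format, so the key is just the UTC day of the point. The other two calls add tag parts and switch the time part to a monthly interval.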
{{< expand-wrapper >}} {{% expand "View example partition templates and keys" %}}
Given the following line protocol, with timestamps that correspond to 2023-12-31T23:00:00Z, 2024-01-01T00:00:00Z, and 2024-01-01T01:00:00Z:
production,line=A,station=cnc temp=81.2,qty=35i 1704063600000000000
production,line=A,station=wld temp=92.8,qty=35i 1704063600000000000
production,line=B,station=cnc temp=101.1,qty=43i 1704063600000000000
production,line=B,station=wld temp=102.4,qty=43i 1704063600000000000
production,line=A,station=cnc temp=81.9,qty=36i 1704067200000000000
production,line=A,station=wld temp=110.0,qty=22i 1704067200000000000
production,line=B,station=cnc temp=101.8,qty=44i 1704067200000000000
production,line=B,station=wld temp=105.7,qty=44i 1704067200000000000
production,line=A,station=cnc temp=82.2,qty=35i 1704070800000000000
production,line=A,station=wld temp=92.1,qty=30i 1704070800000000000
production,line=B,station=cnc temp=102.4,qty=43i 1704070800000000000
production,line=B,station=wld temp=106.5,qty=43i 1704070800000000000
{{% flex %}}
<!---------------------- BEGIN PARTITION EXAMPLES GROUP 1 --------------------->{{% flex-content "half" %}}
- %Y-%m-%d <em class="op50">time (by day, default format)</em>

{{% /flex-content %}}
{{% flex-content %}}

- 2023-12-31
- 2024-01-01

{{% /flex-content %}}
<!----------------------- END PARTITION EXAMPLES GROUP 1 ---------------------->{{% /flex %}}
{{% flex %}}
<!---------------------- BEGIN PARTITION EXAMPLES GROUP 2 --------------------->{{% flex-content "half" %}}
- line <em class="op50">tag</em>
- %d %b %Y <em class="op50">time (by day, non-default format)</em>

{{% /flex-content %}}
{{% flex-content %}}

- A | 31 Dec 2023
- B | 31 Dec 2023
- A | 01 Jan 2024
- B | 01 Jan 2024

{{% /flex-content %}}
<!----------------------- END PARTITION EXAMPLES GROUP 2 ---------------------->{{% /flex %}}
{{% flex %}}
<!---------------------- BEGIN PARTITION EXAMPLES GROUP 3 --------------------->{{% flex-content "half" %}}
- line <em class="op50">tag</em>
- station <em class="op50">tag</em>
- %Y-%m-%d <em class="op50">time (by day, default format)</em>

{{% /flex-content %}}
{{% flex-content %}}

- A | cnc | 2023-12-31
- A | wld | 2023-12-31
- B | cnc | 2023-12-31
- B | wld | 2023-12-31
- A | cnc | 2024-01-01
- A | wld | 2024-01-01
- B | cnc | 2024-01-01
- B | wld | 2024-01-01

{{% /flex-content %}}
<!----------------------- END PARTITION EXAMPLES GROUP 3 ---------------------->{{% /flex %}}
{{% flex %}}
<!---------------------- BEGIN PARTITION EXAMPLES GROUP 4 --------------------->{{% flex-content "half" %}}
- line <em class="op50">tag</em>
- station,3 <em class="op50">tag bucket</em>
- %Y-%m-%d <em class="op50">time (by day, default format)</em>

{{% /flex-content %}}
{{% flex-content %}}

- A | 0 | 2023-12-31
- B | 0 | 2023-12-31
- A | 0 | 2024-01-01
- B | 0 | 2024-01-01

{{% /flex-content %}}
<!----------------------- END PARTITION EXAMPLES GROUP 4 ---------------------->{{% /flex %}}
{{% flex %}}
<!---------------------- BEGIN PARTITION EXAMPLES GROUP 5 --------------------->{{% flex-content "half" %}}
- line <em class="op50">tag</em>
- station <em class="op50">tag</em>
- %Y-%m <em class="op50">time (by month)</em>

{{% /flex-content %}}
{{% flex-content %}}

- A | cnc | 2023-12
- A | wld | 2023-12
- B | cnc | 2023-12
- B | wld | 2023-12
- A | cnc | 2024-01
- A | wld | 2024-01
- B | cnc | 2024-01
- B | wld | 2024-01

{{% /flex-content %}}
<!----------------------- END PARTITION EXAMPLES GROUP 5 ---------------------->{{% /flex %}}
{{% flex %}}
<!---------------------- BEGIN PARTITION EXAMPLES GROUP 6 --------------------->{{% flex-content "half" %}}
- line <em class="op50">tag</em>
- station,50 <em class="op50">tag bucket</em>
- %Y-%m <em class="op50">time (by month)</em>

{{% /flex-content %}}
{{% flex-content %}}

- A | 47 | 2023-12
- A | 9 | 2023-12
- B | 47 | 2023-12
- B | 9 | 2023-12
- A | 47 | 2024-01
- A | 9 | 2024-01
- B | 47 | 2024-01
- B | 9 | 2024-01

{{% /flex-content %}}
<!----------------------- END PARTITION EXAMPLES GROUP 6 ---------------------->{{% /flex %}}
{{% /expand %}} {{< /expand-wrapper >}}
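
To see how a template groups points into partitions, the following Python sketch parses a subset of the sample line protocol above and counts the points that fall under each key. It is a simplified illustration, not InfluxDB's implementation. The helper names `parse_point` and `count_points_per_key` are hypothetical, and the parser assumes there are no escaped spaces or commas. Tag buckets are left out because the hashing that InfluxDB uses for them is not described here.

```python
from collections import Counter
from datetime import datetime, timezone

# A subset of the sample line protocol above.
SAMPLE = """\
production,line=A,station=cnc temp=81.2,qty=35i 1704063600000000000
production,line=B,station=wld temp=102.4,qty=43i 1704063600000000000
production,line=A,station=cnc temp=81.9,qty=36i 1704067200000000000
production,line=B,station=wld temp=105.7,qty=44i 1704067200000000000
"""


def parse_point(line):
    """Split one line into (tags, timestamp_ns). Assumes no escaped spaces or commas."""
    series, _fields, timestamp = line.split(" ")
    _measurement, *tag_pairs = series.split(",")
    tags = dict(pair.split("=", 1) for pair in tag_pairs)
    return tags, int(timestamp)


def count_points_per_key(text, tag_parts, time_format):
    """Count how many points land under each illustrative partition key."""
    counts = Counter()
    for line in text.splitlines():
        tags, timestamp_ns = parse_point(line)
        moment = datetime.fromtimestamp(timestamp_ns // 1_000_000_000, tz=timezone.utc)
        parts = [tags[name] for name in tag_parts] + [moment.strftime(time_format)]
        counts[" | ".join(parts)] += 1
    return dict(counts)


print(count_points_per_key(SAMPLE, [], "%Y-%m-%d"))
print(count_points_per_key(SAMPLE, ["line"], "%Y-%m"))
```

With the default daily format, the four points split across two keys because the first timestamp falls one hour before midnight UTC. Adding the `line` tag and switching to a monthly time part produces one key per line per month.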
When querying data:
The faster the query engine can identify which partitions to read and then read the data in those partitions, the more performant queries are.
For more information about the query lifecycle, see InfluxDB 3 query life cycle.
Consider the following query that selects everything in the production table
where the line tag is A and the station tag is cnc:
SELECT *
FROM production
WHERE
time >= now() - INTERVAL '1 week'
AND line = 'A'
AND station = 'cnc'
Using the default partitioning strategy (by day), the query engine reads eight separate partitions (one partition for today and one for each of the last seven days).
The query engine must then scan all rows in those partitions to identify rows
where line is A and station is cnc. This scan takes time
and results in less performant queries.
However, including tags in your partitioning strategy allows the query engine to identify partitions containing only the required tag values. This avoids scanning rows for tag values.
For example, if you partition data by line, station, and day, although
the number of files increases, the query engine can quickly identify and read
only those with data relevant to the query:
{{% columns 4 %}}
{{% /columns %}}
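
The trade-off described above (more files, but fewer rows to scan) can be sketched with a small Python simulation. All numbers are made up for illustration: the row count per series, the fixed date, and the assumption that every series writes the same amount of data each day. It models partitions as dictionary entries and does not reflect how the query engine actually reads Parquet files.

```python
from datetime import date, timedelta
from itertools import product

LINES = ["A", "B"]
STATIONS = ["cnc", "wld"]
ROWS_PER_SERIES_PER_DAY = 24  # hypothetical: one row per hour

# Eight days: "today" plus the previous seven, matching the query window above.
today = date(2024, 1, 8)  # fixed, made-up date so the output is reproducible
days = [(today - timedelta(days=n)).isoformat() for n in range(8)]

# Default strategy: one partition per day that holds every series.
by_day = {day: len(LINES) * len(STATIONS) * ROWS_PER_SERIES_PER_DAY for day in days}

# Custom strategy: one partition per (line, station, day).
by_tags_and_day = {
    (line, station, day): ROWS_PER_SERIES_PER_DAY
    for line, station, day in product(LINES, STATIONS, days)
}

# The query filters on line = 'A' AND station = 'cnc'.
default_scan = sum(by_day.values())  # every row in every daily partition
custom_match = {
    key: rows
    for key, rows in by_tags_and_day.items()
    if key[0] == "A" and key[1] == "cnc"
}
custom_scan = sum(custom_match.values())

print("by day:", len(by_day), "partitions,", default_scan, "rows scanned")
print(
    "by line, station, day:", len(by_tags_and_day), "partitions,",
    len(custom_match), "read,", custom_scan, "rows scanned",
)
```

In this toy model, the custom strategy quadruples the number of partitions but reads only those whose key matches the filter, so it touches a quarter of the rows.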
{{< children >}}