docs/sources/get-started/labels/cardinality.md
The cardinality of a data attribute is the number of distinct values that the attribute can have. For example, a boolean column in a database, which can only have a value of either true or false has a cardinality of 2.
High cardinality refers to a column or row in a database that can have many possible values. For an online shopping system, fields like userId, shoppingCartId, and orderId are often high-cardinality columns that can have hundreds of thousands of distinct values.
Other examples of high cardinality attributes include the following:
When we talk about cardinality in Loki we are referring to the combination of labels and values and the number of log streams they create. In Loki, the fewer labels you use, the better. This is why Loki has a default limit of 15 index labels.
High cardinality can result from using labels with a large range of possible values, or combining many labels, even if they have a small and finite set of values, such as combining status_code and action. A typical set of status codes (200, 404, 500) and actions (GET, POST, PUT, PATCH, DELETE) would create 15 unique streams. But, adding just one more label like endpoint (/cart, /products, /customers) would triple this to 45 unique streams.
To see an example of series labels and cardinality, refer to the LogCLI tutorial. As you can see, the cardinality for individual labels can be quite high, even before you begin combining labels for a particular log stream, which increases the cardinality even further.
To view the cardinality of your current labels, you can use logcli.
logcli series '{}' --since=1h --analyze-labels
High cardinality causes Loki to create many streams, especially when labels have many unique values, and when those values are short-lived (for example, active for seconds or minutes). This causes Loki to build a huge index, and to flush thousands of tiny chunks to the object store.
Loki was not designed or built to support high cardinality label values. In fact, it was built for exactly the opposite. It was built for very long-lived streams and very low cardinality in the labels. In Loki, the fewer labels you use, the better.
High cardinality can lead to significant performance degradation.
To avoid high cardinality in Loki, you should:
{{< admonition type="note" >}}
Structured metadata is a feature in Loki and Cloud Logs that allows customers to store metadata that is too high cardinality for log lines, without needing to embed that information in log lines themselves.
It is a great home for metadata which is not easily embeddable in a log line, but is too high cardinality to be used effectively as a label. Query acceleration with Blooms also utilizes structured metadata.
{{< /admonition >}}