docs/sources/community/lids/0006-api-expose-split.md
Author: Karsten Jeschkies ([email protected])
Date: 03/2025
Sponsor(s): @trevorwhitney
Type: API
Status: Review
Related issues/PRs: N/A
Thread from mailing list: N/A
Loki has internal logic to split log and metric queries by time, and to shard them, into multiple queries. However, this logic is not accessible outside the code base. This proposal introduces an API that exposes the internal split logic so that clients can split queries.
A split query is divided by time. The results of a split query can be concatenated in order to form the final result.
A sharded query is divided by label values. The results of a sharded query cannot always be concatenated but require some
extra logic to form the final result. Some queries, such as topk, cannot be sharded at all.
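The time-based split defined above can be illustrated with a minimal sketch. It is not Loki's internal split logic: the helper names `split_by_time` and `concatenate` are hypothetical, and the sketch assumes half-open intervals `[start, end)` in nanoseconds so that adjacent sub-ranges neither overlap nor leave gaps.

```python
from typing import Any, List, Tuple


def split_by_time(start_ns: int, end_ns: int, splits: int) -> List[Tuple[int, int]]:
    """Divide [start_ns, end_ns) into at most `splits` contiguous intervals.

    Hypothetical helper for illustration only; Loki's own logic may align
    intervals differently.
    """
    if splits < 1:
        raise ValueError("splits must be at least 1")
    if end_ns <= start_ns:
        raise ValueError("end_ns must be greater than start_ns")

    total = end_ns - start_ns
    # Return fewer splits than requested rather than empty intervals.
    splits = min(splits, total)
    step, remainder = divmod(total, splits)

    intervals = []
    cursor = start_ns
    for i in range(splits):
        # Spread the remainder over the first intervals.
        width = step + (1 if i < remainder else 0)
        intervals.append((cursor, cursor + width))
        cursor += width
    return intervals


def concatenate(parts: List[List[Any]]) -> List[Any]:
    """Join per-interval results, given in time order, into one result."""
    return [entry for part in parts for entry in part]
```

Because the intervals are contiguous and ordered, concatenating the per-interval results in the same order reproduces the result for the full range, which is the property that distinguishes a split query from a sharded one.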
Loki clients such as the Grafana Loki datasource or the Trino Loki connector benefit from splitting LogQL queries into multiple sub-queries either to process smaller chunks or to distribute work on query results.
Splitting a query requires parsing the LogQL query first, but LogQL parsers exist only for Go and JavaScript.
The intended goal is to enable any client to split a query into multiple sub-queries that can be executed either sequentially or in parallel. The joined result of the sub-queries must be the same as the result of executing the original query.
This proposal does not aim to provide pagination for query results.
Without an API, each client will have to use a LogQL parser.
Pros
Cons
A new endpoint GET /loki/api/v1/split_query is introduced that takes a splits parameter and the same parameters as the /loki/api/v1/query_range endpoint. The new endpoint returns sub-queries split by time.
The splits parameter optionally defines the number of desired splits. The API is allowed to return fewer splits than requested.
The limit parameter has extended semantics: setting it to 0 for a log stream query indicates that all logs should be returned.
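A request to the proposed endpoint could be built as in the following sketch. The function name `build_split_query_url` and the base URL in the usage note are assumptions for illustration; the parameter names `query`, `start`, `end`, and `limit` are taken from the existing query_range endpoint, and `splits` from this proposal.

```python
from typing import Optional
from urllib.parse import urlencode


def build_split_query_url(
    base_url: str,
    query: str,
    start_ns: int,
    end_ns: int,
    splits: Optional[int] = None,
    limit: Optional[int] = None,
) -> str:
    """Build a GET URL for the proposed /loki/api/v1/split_query endpoint.

    Hypothetical helper: the endpoint is a proposal and may change.
    """
    params = {"query": query, "start": str(start_ns), "end": str(end_ns)}
    if splits is not None:
        # Optional: the API may return fewer splits than requested.
        params["splits"] = str(splits)
    if limit is not None:
        # Under the proposal, limit=0 on a log stream query requests all logs.
        params["limit"] = str(limit)
    return f"{base_url.rstrip('/')}/loki/api/v1/split_query?{urlencode(params)}"
```

For example, `build_split_query_url("http://localhost:3100", '{app="foo"}', 0, 10, splits=4, limit=0)` asks for up to four sub-queries covering all logs of the selected streams.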
The response body is JSON encoded:
{
  "resultType": "matrix" | "streams" | "vector",
  "subqueries": [
    {
      "start": <timestamp nanoseconds>,
      "end": <timestamp nanoseconds>,
      "limit": <number>,
      "query": <query string>
    },
    {
      "start": <timestamp nanoseconds>,
      "end": <timestamp nanoseconds>,
      "limit": <number>,
      "query": <query string>
    }
  ]
}
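A client could consume this response roughly as sketched below. The function `run_subqueries` and the `execute` callback are hypothetical; the sketch assumes the endpoint returns valid JSON with the fields shown above, that `execute` runs one sub-query (for example against /loki/api/v1/query_range) and returns its entries as a list, and that results are wanted in ascending time order.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List


def run_subqueries(
    response_body: str,
    execute: Callable[[Dict[str, Any]], List[Any]],
    max_workers: int = 4,
) -> List[Any]:
    """Execute all sub-queries in parallel and concatenate their results.

    `execute` is a caller-supplied function (an assumption of this sketch)
    that runs a single sub-query and returns its result entries.
    """
    plan = json.loads(response_body)
    # Order by start time so the concatenated result is in time order.
    subqueries = sorted(plan["subqueries"], key=lambda sq: sq["start"])
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, not completion order.
        parts = list(pool.map(execute, subqueries))
    return [entry for part in parts for entry in part]
```

Passing `max_workers=1` gives sequential execution with the same result, which reflects the requirement that sub-queries can be run either sequentially or in parallel.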
Pros
Cons
Loki could support Apache Arrow Flight RPC which is designed to exchange large data sets in shards between services.
Pros
Cons