docs/sources/reference-pyroscope-v2-architecture/metadata-index/index.md
The metadata index stores information about all data objects (blocks and segments) in object storage. It is maintained by the metastore service and provides fast lookups for query planning.
The metadata index enables:
The index is implemented using:
BoltDB was chosen for its simplicity and efficiency with a single writer and concurrent readers. For better performance, the index can be stored on an in-memory volume since it's recovered from the Raft log on startup.
Each block in object storage has a corresponding metadata entry containing:
Each dataset within a block includes:
- `service_name` label identifying the application

The index is partitioned by time, with each partition covering a 6-hour window:
Partition (6h window)
├── Tenant A
│   ├── Shard 0
│   ├── Shard 1
│   └── Shard N
└── Tenant B
    ├── Shard 0
    └── Shard N
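The partition, tenant, and shard hierarchy above can be illustrated with a minimal sketch of how a timestamp might map to its 6-hour partition. The function names are hypothetical, and aligning windows to the Unix epoch in UTC is an assumption, not something this page states.

```python
from datetime import datetime, timedelta, timezone

PARTITION_DURATION = timedelta(hours=6)  # window size stated in the text


def partition_start(ts: datetime) -> datetime:
    """Return the start of the 6-hour partition containing ts.

    Assumption: windows are aligned to the Unix epoch in UTC.
    ts must be timezone-aware.
    """
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    windows = (ts - epoch) // PARTITION_DURATION  # whole windows since epoch
    return epoch + windows * PARTITION_DURATION


def index_path(ts: datetime, tenant: str, shard: int) -> tuple:
    """Hypothetical lookup key: partition first, then tenant, then shard."""
    return (partition_start(ts), tenant, shard)
```

Keying by partition first means that a time-range query only needs to visit the partitions that overlap the range before narrowing down by tenant and shard.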
Within each shard:
Index writes are performed by segment-writers when new segments are created:
{{< mermaid >}}
sequenceDiagram
    participant SW as segment-writer
    participant M as Metastore
    participant R as Raft
    participant I as Index

    SW->>M: AddBlock(metadata)
    M->>R: Propose ADD_BLOCK
    R->>R: Commit to log
    R->>I: Insert block
    I-->>R: Success
    R-->>M: Committed
    M-->>SW: Success
{{< /mermaid >}}
Before adding a block, the index checks for tombstones to prevent re-adding blocks that were already compacted. This handles cases where:
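The tombstone check can be sketched with a toy in-memory shard. This is an illustration only: `ShardIndex`, `add_block`, and `replace_blocks` are hypothetical names, and the assumption that compaction leaves one tombstone per source block is inferred from the sentence above rather than taken from the implementation.

```python
class ShardIndex:
    """Toy in-memory stand-in for one shard of the index (hypothetical)."""

    def __init__(self):
        self.blocks = {}         # block_id -> metadata
        self.tombstones = set()  # IDs of blocks already removed by compaction

    def add_block(self, block_id: str, metadata: dict) -> bool:
        """Insert a block unless a tombstone says it was already compacted."""
        if block_id in self.tombstones:
            return False  # rejected: re-adding would resurrect compacted data
        self.blocks[block_id] = metadata
        return True

    def replace_blocks(self, source_ids, new_id: str, metadata: dict) -> None:
        """Assumed compaction step: drop sources, tombstone them, add the result."""
        for block_id in source_ids:
            self.blocks.pop(block_id, None)
            self.tombstones.add(block_id)
        self.blocks[new_id] = metadata
```

With this shape, a late or retried `add_block` call for a block that has already been compacted is rejected instead of reintroducing the block.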
Queries use the linearizable read pattern to ensure consistency:
This allows both leader and follower replicas to serve queries while ensuring they see the latest committed state.
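One common way to get linearizable reads from any Raft replica is the read-index approach: learn the leader's commit index, wait until the local state machine has applied at least that far, then read locally. The sketch below models that idea with a toy replica. Whether the metastore follows exactly these steps is an assumption, and all names are hypothetical.

```python
class Replica:
    """Toy replica that applies entries from a shared committed log."""

    def __init__(self, log):
        self.log = log      # committed entries (stand-in for the Raft log)
        self.applied = 0    # number of entries applied to local state
        self.state = set()  # local state machine: set of known block IDs

    def apply_up_to(self, index: int) -> None:
        """Apply committed entries until `index` entries have been applied."""
        while self.applied < index:
            self.state.add(self.log[self.applied])
            self.applied += 1

    def linearizable_read(self, leader_commit_index: int, block_id: str) -> bool:
        # 1. The leader's commit index is passed in here; a real replica
        #    would request it from the leader.
        # 2. Catch up to that index. A real replica would wait for its
        #    apply loop instead of applying entries inline.
        self.apply_up_to(leader_commit_index)
        # 3. Serve the read from local state.
        return block_id in self.state
```

Because the read is served only after the replica has caught up to the commit index observed at request time, a follower cannot return state older than what was committed when the query arrived.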
The index supports two main query patterns:
Metadata queries: Find blocks matching criteria
Query:
- Time range: [start, end]
- Tenant: ["tenant-1"]
- Labels: {service_name="frontend"}
Label queries: List available labels without reading data
Query:
- Return: distinct values for "profile_type" label
- Filter: {service_name="frontend"}
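Both query patterns can be sketched over a plain list of block entries. This is a minimal illustration, assuming each block carries a tenant, a time range, and one label set per dataset. `BlockMeta` and the function names are hypothetical and do not reflect the real schema.

```python
from dataclasses import dataclass, field


@dataclass
class BlockMeta:
    block_id: str
    tenant: str
    min_time: int  # inclusive, arbitrary time units
    max_time: int  # inclusive
    datasets: list = field(default_factory=list)  # one label dict per dataset


def matches(labels: dict, selector: dict) -> bool:
    """True if every selector label is present with the same value."""
    return all(labels.get(k) == v for k, v in selector.items())


def query_metadata(blocks, start, end, tenants, selector):
    """Return IDs of blocks overlapping [start, end] with a matching dataset."""
    return [
        b.block_id
        for b in blocks
        if b.tenant in tenants
        and b.min_time <= end
        and b.max_time >= start
        and any(matches(ds, selector) for ds in b.datasets)
    ]


def query_label_values(blocks, label, selector):
    """Return sorted distinct values of `label` among matching datasets."""
    return sorted({
        ds[label]
        for b in blocks
        for ds in b.datasets
        if label in ds and matches(ds, selector)
    })
```

Neither function touches profile data: both are answered from the metadata entries alone, which is what makes label listing cheap.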
When blocks are compacted:
Retention policies delete entire partitions based on:
Retention policies are tenant-specific and configurable per tenant.
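As a rough sketch of partition-level retention, the function below selects partitions whose entire 6-hour window is older than a tenant's retention period. The exact criteria are not given here, so the age-based rule, the `retention_by_tenant` mapping, and the function name are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

PARTITION_DURATION = timedelta(hours=6)  # window size stated in the text


def expired_partitions(partitions, retention_by_tenant, now):
    """Return (tenant, partition_start) pairs whose whole window is past retention.

    partitions: iterable of (tenant, partition_start) pairs.
    retention_by_tenant: hypothetical per-tenant retention periods; tenants
    without an entry are never truncated in this sketch.
    """
    expired = []
    for tenant, start in partitions:
        retention = retention_by_tenant.get(tenant)
        if retention is None:
            continue  # assumption: no policy configured means keep everything
        # Only drop a partition once its newest possible data is too old.
        if start + PARTITION_DURATION <= now - retention:
            expired.append((tenant, start))
    return expired
```

Deleting whole partitions rather than individual entries keeps the check cheap: the decision depends only on the partition's time window and the tenant's policy.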
The cleaner runs on the Raft leader and:
{{< mermaid >}}
sequenceDiagram
    participant C as Cleaner
    participant M as Metastore
    participant R as Raft
    participant I as Index

    C->>M: TruncateIndex(policy)
    M->>I: List partitions
    I-->>M: Partition list
    M->>M: Apply retention policy
    M->>R: Propose TRUNCATE_INDEX
    R->>I: Delete partitions
    R->>I: Add tombstones
    R-->>M: Committed
    M-->>C: Success
{{< /mermaid >}}
The index uses several caches:
For detailed implementation information, including the protobuf schema and internal structures, refer to the internal documentation.