Compaction is the process of merging multiple small segments into larger, optimized blocks. This is essential for maintaining query performance and controlling metadata index size.
The ingestion pipeline creates many small segments, potentially millions of objects per hour at scale. Without compaction, query performance degrades and the metadata index keeps growing.
Compaction in Pyroscope v2 is coordinated by the metastore and executed by compaction-workers.
{{< mermaid >}}
sequenceDiagram
participant W as Compaction Worker
participant M as Metastore
participant S as Object Storage
loop Continuous
W->>M: Poll for jobs
M->>W: Assign job with source blocks
W->>S: Download source segments
W->>W: Merge segments into block
W->>S: Upload compacted block
W->>M: Report completion
M->>M: Update metadata index
end
{{< /mermaid >}}
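
The loop in the diagram can be sketched in a few lines of Python. This is an illustration only, not Pyroscope's implementation: `InMemoryMetastore`, `poll_job`, `report_completion`, and the dict-backed object storage are hypothetical stand-ins, and "merging" is reduced to sorting the combined records.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Job:
    name: str
    source_keys: list


@dataclass
class InMemoryMetastore:
    """Hypothetical stand-in for the metastore: hands out jobs, tracks the index."""
    pending: list = field(default_factory=list)
    index: set = field(default_factory=set)

    def poll_job(self) -> Optional[Job]:
        return self.pending.pop(0) if self.pending else None

    def report_completion(self, job: Job, compacted_key: str) -> None:
        # Swap the source entries for the compacted block in the metadata index.
        self.index.difference_update(job.source_keys)
        self.index.add(compacted_key)


def run_worker_once(metastore: InMemoryMetastore, storage: dict) -> bool:
    """Run one iteration of the loop; return False when no job was available."""
    job = metastore.poll_job()                            # poll for jobs
    if job is None:
        return False
    segments = [storage[key] for key in job.source_keys]  # download source segments
    merged = sorted(x for seg in segments for x in seg)   # merge (toy: sort records)
    compacted_key = f"blocks/{job.name}"                  # hypothetical key layout
    storage[compacted_key] = merged                       # upload compacted block
    metastore.report_completion(job, compacted_key)       # report completion
    return True
```

Note that the worker in this sketch never deletes the source segments itself; it only reports completion, and cleanup is left to a separate step.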
The compaction service runs within the metastore and is responsible for:
The compaction service relies on Raft to guarantee consistency:
This ensures all replicas maintain consistent views of compaction state.
The job planner maintains a queue of blocks eligible for compaction:
Profiling data from each service is stored as a separate dataset within a block. During compaction:
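
As a rough illustration of the per-service dataset layout, the sketch below models a block as a mapping from service name to a list of records. It assumes that datasets belonging to the same service in different source blocks are combined into a single dataset in the output block; that behavior, the `merge_blocks` name, and the record representation are assumptions made for the example.

```python
from collections import defaultdict


def merge_blocks(blocks):
    """Merge source blocks, keeping one dataset per service in the result.

    Each block is modeled as {service_name: [records]} (a toy representation).
    """
    merged = defaultdict(list)
    for block in blocks:
        for service, records in block.items():
            merged[service].extend(records)
    # Sort services and records so the output is deterministic.
    return {service: sorted(records) for service, records in sorted(merged.items())}
```

For example, merging `{"api": [3, 1], "db": [7]}` with `{"api": [2]}` yields one `api` dataset holding all three records and an unchanged `db` dataset.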
The scheduler uses a Small Job First strategy:
Workers specify available capacity when polling for jobs. The scheduler:
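
A Small Job First queue that respects worker capacity can be sketched with a binary heap. The ordering key used here (compaction level first, then number of source blocks) is an assumption about what "small" means, and `QueuedJob`, `SmallJobFirstQueue`, and `assign` are hypothetical names.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class QueuedJob:
    # Assumption: job "size" is approximated by (compaction level, block count).
    level: int
    num_blocks: int
    name: str = field(compare=False)


class SmallJobFirstQueue:
    def __init__(self) -> None:
        self._heap = []

    def add(self, job: QueuedJob) -> None:
        heapq.heappush(self._heap, job)

    def assign(self, capacity: int) -> list:
        """Pop at most `capacity` jobs, smallest first."""
        assigned = []
        while self._heap and len(assigned) < capacity:
            assigned.append(heapq.heappop(self._heap))
        return assigned
```

A worker that polls with `capacity=2` receives the two smallest queued jobs; larger jobs stay in the queue until a later poll.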
Jobs are assigned using a lease-based model:
When a worker fails:
Jobs that repeatedly fail are deprioritized to prevent blocking the queue.
{{< mermaid >}}
stateDiagram-v2
[*] --> Unassigned : Create Job
Unassigned --> InProgress : Assign Job
InProgress --> Success : Job Completed
InProgress --> LeaseExpired: Job Lease Expires
LeaseExpired: Abandoned Job
LeaseExpired --> Excluded: Failure Threshold Exceeded
Excluded: Faulty Job
Success --> [*] : Remove Job from Schedule
LeaseExpired --> InProgress : Reassign Job
{{< /mermaid >}}
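
The state diagram above maps onto a small state machine. The sketch below is a minimal model of it, assuming a wall-clock lease and a simple failure counter; `lease_duration`, `max_failures`, and the method names are hypothetical and do not correspond to Pyroscope configuration options.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class State(Enum):
    UNASSIGNED = "Unassigned"
    IN_PROGRESS = "InProgress"
    SUCCESS = "Success"
    LEASE_EXPIRED = "LeaseExpired"
    EXCLUDED = "Excluded"


@dataclass
class LeasedJob:
    name: str
    state: State = State.UNASSIGNED
    lease_expires_at: Optional[float] = None
    failures: int = 0

    def assign(self, now: float, lease_duration: float) -> None:
        """Assign (or reassign) the job and start a new lease."""
        if self.state not in (State.UNASSIGNED, State.LEASE_EXPIRED):
            raise ValueError(f"cannot assign job in state {self.state.value}")
        self.state = State.IN_PROGRESS
        self.lease_expires_at = now + lease_duration

    def complete(self) -> None:
        if self.state is not State.IN_PROGRESS:
            raise ValueError(f"cannot complete job in state {self.state.value}")
        self.state = State.SUCCESS
        self.lease_expires_at = None

    def tick(self, now: float, max_failures: int) -> None:
        """Expire an overdue lease; exclude the job once failures exceed the threshold."""
        if self.state is State.IN_PROGRESS and now >= self.lease_expires_at:
            self.failures += 1
            self.state = State.LEASE_EXPIRED
        if self.state is State.LEASE_EXPIRED and self.failures > max_failures:
            self.state = State.EXCLUDED
```

Because an expired job returns to an assignable state, another worker can pick it up without operator intervention, while the failure counter keeps a persistently faulty job from cycling forever.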
After successful compaction:
This two-phase deletion prevents query failures during compaction.
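
One way to picture two-phase deletion is a mark step followed by a delayed sweep. The sketch below assumes that source objects are first recorded as tombstones and only removed from storage after a grace period, so that queries already reading them can finish. The `TwoPhaseDeleter` class, the `grace_period` parameter, and the dict-backed storage are hypothetical.

```python
class TwoPhaseDeleter:
    def __init__(self, storage: dict, grace_period: float) -> None:
        self.storage = storage
        self.grace_period = grace_period
        self.tombstones = {}  # object key -> time it was marked for deletion

    def mark(self, keys, now: float) -> None:
        """Phase one: record the intent to delete; objects stay readable."""
        for key in keys:
            self.tombstones.setdefault(key, now)

    def sweep(self, now: float) -> list:
        """Phase two: delete objects whose grace period has elapsed."""
        due = [k for k, t in self.tombstones.items() if now - t >= self.grace_period]
        for key in due:
            self.storage.pop(key, None)
            del self.tombstones[key]
        return sorted(due)
```

Between `mark` and the first successful `sweep`, both the source segments and the compacted block exist in storage, which is what lets in-flight queries complete.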
For detailed implementation information, including job scheduling algorithms and lease management, refer to the internal documentation.