# Compaction worker
The compaction-worker is a stateless component responsible for merging small segments into larger blocks. This improves query performance by reducing the number of objects that need to be read from object storage.
The ingestion pipeline creates many small segments, potentially millions of objects per hour at scale. Without compaction, this leads to:

- Poor query performance, because each query must fetch and merge many small objects
- High request rates against object storage
- A growing volume of block metadata for the metastore to track
Compaction workers compact data as soon as possible after it's written to object storage. This ensures that query performance remains consistent even at high ingestion rates.
Compaction jobs are coordinated by the metastore, which:

- Creates compaction jobs as new blocks are added to its index
- Assigns jobs to workers when they poll for work
- Tracks job progress and reassigns jobs that aren't completed in time
Workers specify their available capacity when polling for jobs, allowing the amount of assigned work to adapt to the resources at hand.
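
The polling protocol can be pictured with a short sketch. This is a minimal illustration, not the actual worker code: the `MetastoreClient` interface, the `PollJobs` method, and the poll interval are assumptions made for the example.

```go
package worker

import (
	"context"
	"log"
	"time"
)

// CompactionJob and MetastoreClient are hypothetical stand-ins for the
// metastore's job API; the real RPC names and message fields differ.
type CompactionJob struct {
	Name         string
	SourceBlocks []string
}

type MetastoreClient interface {
	// PollJobs reports the worker's free capacity and returns at most
	// that many jobs to execute.
	PollJobs(ctx context.Context, capacity int) ([]CompactionJob, error)
}

// pollLoop periodically asks the metastore for work, sized to the
// number of job slots currently free on this worker. The slots channel
// starts full: one token per concurrent job the worker can run.
func pollLoop(ctx context.Context, client MetastoreClient, slots chan struct{}, compact func(CompactionJob)) {
	ticker := time.NewTicker(5 * time.Second)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			capacity := len(slots) // free slots right now
			if capacity == 0 {
				continue // fully busy: don't ask for more work
			}
			jobs, err := client.PollJobs(ctx, capacity)
			if err != nil {
				log.Printf("poll failed, retrying next tick: %v", err)
				continue
			}
			for _, job := range jobs {
				<-slots // claim a slot
				go func(j CompactionJob) {
					defer func() { slots <- struct{}{} }() // release the slot
					compact(j)
				}(job)
			}
		}
	}
}
```

Sizing each request to the number of free slots means a fully loaded worker stops pulling work rather than queueing jobs it can't start yet.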
Profiling data from each service (identified by the `service_name` label) is stored as a separate dataset within a block. During compaction:

- Datasets with the same `service_name` are merged across the source blocks
- Datasets from different services are never mixed into one dataset
The output block contains non-overlapping, independent datasets optimized for efficient reading.
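
The grouping step can be sketched under simplified assumptions: the `Dataset` type here is a hypothetical flattened view, whereas real datasets carry indexes, symbols, and profile tables that are merged structurally rather than concatenated.

```go
package compaction

import "sort"

// Dataset is a hypothetical, flattened view of a per-service dataset
// inside a block; the real on-disk format is richer than this.
type Dataset struct {
	ServiceName string   // value of the service_name label
	Profiles    []string // stand-in for the actual profile data
}

// mergeDatasets turns the datasets of several source blocks into the
// dataset layout of one output block: a single merged dataset per
// service, never mixing data across services.
func mergeDatasets(sources [][]Dataset) []Dataset {
	byService := make(map[string]*Dataset)
	for _, block := range sources {
		for _, ds := range block {
			out, ok := byService[ds.ServiceName]
			if !ok {
				out = &Dataset{ServiceName: ds.ServiceName}
				byService[ds.ServiceName] = out
			}
			out.Profiles = append(out.Profiles, ds.Profiles...)
		}
	}
	// Sort by service name so each dataset can be located
	// deterministically when the block is read.
	merged := make([]Dataset, 0, len(byService))
	for _, ds := range byService {
		merged = append(merged, *ds)
	}
	sort.Slice(merged, func(i, j int) bool {
		return merged[i].ServiceName < merged[j].ServiceName
	})
	return merged
}
```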
Compaction workers are completely stateless:

- All input is read from object storage, and all output is written back to it
- Workers keep no durable local state, so they can be added, removed, or restarted at any time
- Capacity can be scaled horizontally by simply running more replicas
If a compaction worker fails:

- No data is lost: the source blocks remain untouched in object storage
- The metastore still owns the job and reassigns it to another worker
- The reassigned job is simply re-executed from scratch
Jobs that repeatedly fail are deprioritized to prevent blocking the compaction queue.
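
One way to picture the deprioritization is a queue ordered by failure count. This is a sketch of the idea only, not the metastore's actual scheduler; the `job` fields are assumptions.

```go
package scheduler

import "container/heap"

// job is a hypothetical view of how a compaction job could be tracked;
// the failures counter drives deprioritization.
type job struct {
	name     string
	failures int // times this job has already failed
}

// jobQueue is a min-heap: jobs with fewer failures are handed out
// first, so a repeatedly failing job cannot block the queue.
type jobQueue []*job

func (q jobQueue) Len() int           { return len(q) }
func (q jobQueue) Less(i, j int) bool { return q[i].failures < q[j].failures }
func (q jobQueue) Swap(i, j int)      { q[i], q[j] = q[j], q[i] }
func (q *jobQueue) Push(x any)        { *q = append(*q, x.(*job)) }
func (q *jobQueue) Pop() any {
	old := *q
	n := len(old)
	item := old[n-1]
	*q = old[:n-1]
	return item
}

// requeueFailed returns a failed job to the queue with an incremented
// failure count, pushing it behind healthier jobs.
func requeueFailed(q *jobQueue, j *job) {
	j.failures++
	heap.Push(q, j)
}
```

Because a failed job re-enters the queue behind jobs with fewer failures, fresh work keeps flowing even while a problematic job is retried.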
After compaction completes, the original source blocks are not immediately deleted. Instead, tombstones are created in the metastore. The actual deletion happens after a configurable delay, giving queries time to discover the new compacted blocks and stop accessing the original ones. Eventually, tombstones are included in compaction jobs, and the worker removes the source objects from object storage.
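
The delayed deletion can be sketched as follows, assuming a hypothetical `Tombstone` record and `ObjectStore` interface; as described above, this logic runs as part of a compaction job rather than a dedicated cleaner.

```go
package cleanup

import (
	"context"
	"time"
)

// Tombstone is a hypothetical record of blocks replaced by compaction;
// the actual metastore schema differs.
type Tombstone struct {
	Blocks    []string  // object storage paths of the replaced blocks
	CreatedAt time.Time // when the compaction job completed
}

// ObjectStore is a minimal stand-in for the object storage client.
type ObjectStore interface {
	Delete(ctx context.Context, path string) error
}

// deleteExpired removes the objects behind tombstones older than the
// configured delay, leaving younger tombstones for a later job so that
// in-flight queries have time to switch to the compacted block.
func deleteExpired(ctx context.Context, store ObjectStore, tombstones []Tombstone, delay time.Duration) ([]Tombstone, error) {
	var remaining []Tombstone
	now := time.Now()
	for _, t := range tombstones {
		if now.Sub(t.CreatedAt) < delay {
			remaining = append(remaining, t) // too young: keep waiting
			continue
		}
		for _, path := range t.Blocks {
			if err := store.Delete(ctx, path); err != nil {
				// Keep the tombstone so a later job retries the delete.
				remaining = append(remaining, t)
				return remaining, err
			}
		}
	}
	return remaining, nil
}
```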
For detailed information about the compaction process, refer to Compaction.