# Compaction worker
The compaction-worker is a stateless component responsible for merging small segments into larger blocks. This improves query performance by reducing the number of objects that need to be read from object storage.
The ingestion pipeline creates many small segments, potentially millions of objects per hour at scale. Without compaction, this leads to:

- Poor query performance, because each query must fetch and merge many small objects
- High request rates against object storage
- A growing volume of block metadata for the metastore to track
Compaction workers compact data as soon as possible after it's written to object storage. This ensures that query performance remains consistent even at high ingestion rates.
Compaction jobs are coordinated by the metastore, which:

- Creates compaction jobs as new blocks are added to its index
- Assigns jobs to workers when they poll for work
- Tracks job progress and reassigns jobs that aren't completed in time
Workers specify their available capacity when polling for jobs, allowing the amount of assigned work to adapt to the resources at hand.
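
The polling protocol can be pictured with a short sketch. This is a minimal illustration, not the actual worker code: the `MetastoreClient` interface, the `PollJobs` method, and the poll interval are assumptions made for the example.

```go
package worker

import (
	"context"
	"log"
	"time"
)

// CompactionJob and MetastoreClient are hypothetical stand-ins for the
// metastore's job API; the real RPC names and message fields differ.
type CompactionJob struct {
	Name         string
	SourceBlocks []string
}

type MetastoreClient interface {
	// PollJobs reports the worker's free capacity and returns at most
	// that many jobs to execute.
	PollJobs(ctx context.Context, capacity int) ([]CompactionJob, error)
}

// pollLoop periodically asks the metastore for work, sized to the
// number of job slots currently free on this worker. The slots channel
// starts full: one token per concurrent job the worker can run.
func pollLoop(ctx context.Context, client MetastoreClient, slots chan struct{}, compact func(CompactionJob)) {
	ticker := time.NewTicker(5 * time.Second)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			capacity := len(slots) // free slots right now
			if capacity == 0 {
				continue // fully busy: don't ask for more work
			}
			jobs, err := client.PollJobs(ctx, capacity)
			if err != nil {
				log.Printf("poll failed, retrying next tick: %v", err)
				continue
			}
			for _, job := range jobs {
				<-slots // claim a slot
				go func(j CompactionJob) {
					defer func() { slots <- struct{}{} }() // release the slot
					compact(j)
				}(job)
			}
		}
	}
}
```

Sizing each request to the number of free slots means a fully loaded worker stops pulling work rather than queueing jobs it can't start yet.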
Profiling data from each service (identified by the `service_name` label) is stored as a separate dataset within a block. During compaction:

- Datasets with the same `service_name` are merged across the source blocks
- Datasets from different services are never mixed into one dataset
The output block contains non-overlapping, independent datasets optimized for efficient reading.
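
The grouping step can be sketched under simplified assumptions: the `Dataset` type here is a hypothetical flattened view, whereas real datasets carry indexes, symbols, and profile tables that are merged structurally rather than concatenated.

```go
package compaction

import "sort"

// Dataset is a hypothetical, flattened view of a per-service dataset
// inside a block; the real on-disk format is richer than this.
type Dataset struct {
	ServiceName string   // value of the service_name label
	Profiles    []string // stand-in for the actual profile data
}

// mergeDatasets turns the datasets of several source blocks into the
// dataset layout of one output block: a single merged dataset per
// service, never mixing data across services.
func mergeDatasets(sources [][]Dataset) []Dataset {
	byService := make(map[string]*Dataset)
	for _, block := range sources {
		for _, ds := range block {
			out, ok := byService[ds.ServiceName]
			if !ok {
				out = &Dataset{ServiceName: ds.ServiceName}
				byService[ds.ServiceName] = out
			}
			out.Profiles = append(out.Profiles, ds.Profiles...)
		}
	}
	// Sort by service name so each dataset can be located
	// deterministically when the block is read.
	merged := make([]Dataset, 0, len(byService))
	for _, ds := range byService {
		merged = append(merged, *ds)
	}
	sort.Slice(merged, func(i, j int) bool {
		return merged[i].ServiceName < merged[j].ServiceName
	})
	return merged
}
```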
Compaction workers are completely stateless:

- All input is read from object storage, and all output is written back to it
- Workers keep no durable local state, so they can be added, removed, or restarted at any time
- Capacity can be scaled horizontally by simply running more replicas
If a compaction worker fails:

- No data is lost: the source blocks remain untouched in object storage
- The metastore still owns the job and reassigns it to another worker
- The reassigned job is simply re-executed from scratch
Jobs that repeatedly fail are deprioritized to prevent blocking the compaction queue.
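
One way to picture the deprioritization is a queue ordered by failure count. This is a sketch of the idea only, not the metastore's actual scheduler; the `job` fields are assumptions.

```go
package scheduler

import "container/heap"

// job is a hypothetical view of how a compaction job could be tracked;
// the failures counter drives deprioritization.
type job struct {
	name     string
	failures int // times this job has already failed
}

// jobQueue is a min-heap: jobs with fewer failures are handed out
// first, so a repeatedly failing job cannot block the queue.
type jobQueue []*job

func (q jobQueue) Len() int           { return len(q) }
func (q jobQueue) Less(i, j int) bool { return q[i].failures < q[j].failures }
func (q jobQueue) Swap(i, j int)      { q[i], q[j] = q[j], q[i] }
func (q *jobQueue) Push(x any)        { *q = append(*q, x.(*job)) }
func (q *jobQueue) Pop() any {
	old := *q
	n := len(old)
	item := old[n-1]
	*q = old[:n-1]
	return item
}

// requeueFailed returns a failed job to the queue with an incremented
// failure count, pushing it behind healthier jobs.
func requeueFailed(q *jobQueue, j *job) {
	j.failures++
	heap.Push(q, j)
}
```

Because a failed job re-enters the queue behind jobs with fewer failures, fresh work keeps flowing even while a problematic job is retried.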
After compaction completes, the original source blocks are not immediately deleted. Instead, tombstones are created in the metastore. The actual deletion happens after a configurable delay, giving queries time to discover the new compacted blocks and stop accessing the original ones. Eventually, tombstones are included in compaction jobs, and the worker removes the source objects from object storage.
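
The delayed deletion can be sketched as follows, assuming a hypothetical `Tombstone` record and `ObjectStore` interface; as described above, this logic runs as part of a compaction job rather than a dedicated cleaner.

```go
package cleanup

import (
	"context"
	"time"
)

// Tombstone is a hypothetical record of blocks replaced by compaction;
// the actual metastore schema differs.
type Tombstone struct {
	Blocks    []string  // object storage paths of the replaced blocks
	CreatedAt time.Time // when the compaction job completed
}

// ObjectStore is a minimal stand-in for the object storage client.
type ObjectStore interface {
	Delete(ctx context.Context, path string) error
}

// deleteExpired removes the objects behind tombstones older than the
// configured delay, leaving younger tombstones for a later job so that
// in-flight queries have time to switch to the compacted block.
func deleteExpired(ctx context.Context, store ObjectStore, tombstones []Tombstone, delay time.Duration) ([]Tombstone, error) {
	var remaining []Tombstone
	now := time.Now()
	for _, t := range tombstones {
		if now.Sub(t.CreatedAt) < delay {
			remaining = append(remaining, t) // too young: keep waiting
			continue
		}
		for _, path := range t.Blocks {
			if err := store.Delete(ctx, path); err != nil {
				// Keep the tombstone so a later job retries the delete.
				remaining = append(remaining, t)
				return remaining, err
			}
		}
	}
	return remaining, nil
}
```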
For detailed information about the compaction process, refer to Compaction.