crates/obs/src/cleaner/README.md
The cleaner module is a production-focused background lifecycle manager for RustFS log archives.
It periodically discovers rolled files, applies retention constraints, compresses candidates, and then deletes sources safely.
The subsystem is designed to be conservative by default. The pipeline has four stages:

## Discovery (`scanner.rs`)

A single non-recursive `read_dir` scan keeps latency predictable; regular rolled files and existing archives (`.gz` / `.zst`) are classified in one pass.

## Selection (`core.rs`)

Applies the retention constraints (described below) to decide which files are kept, compressed, or deleted.

## Compression + Deletion (`core.rs` + `compress.rs`)

Supports two codecs, plus an optional work-stealing parallel path built on `Injector` + `Worker::new_fifo` + `Stealer`:

- `zstd` (default) for a better ratio and faster decompression.
- `gzip` when the zstd fallback is enabled.

File types are resolved via `symlink_metadata`; a `*.gz` or `*.zst` target means the file is treated as already compressed.

The parallel path in `core.rs` uses this fixed lookup sequence per worker:

1. `local_worker.pop()`
2. `injector.steal_batch_and_pop(&local_worker)`
3. `Steal::from_iter(...)` across the other workers' stealers

This strategy keeps local cache affinity while still balancing stragglers.

## Archive Expiry (`core.rs`)

Compressed archives older than the configured retention window (`RUSTFS_OBS_LOG_COMPRESSED_FILE_RETENTION_DAYS`) are removed.
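The lookup order above can be sketched with a simplified, std-only model. This is illustrative only: the real path uses crossbeam-deque's lock-free `Injector`/`Worker`/`Stealer` types, whereas `Queue` here is a hypothetical mutex-backed stand-in that just demonstrates the priority order (local first, then the global injector, then peers).

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

// Hypothetical stand-in for crossbeam-deque's Worker/Injector/Stealer.
struct Queue(Mutex<VecDeque<u32>>);

impl Queue {
    fn new(items: Vec<u32>) -> Self {
        Queue(Mutex::new(items.into()))
    }
    fn pop(&self) -> Option<u32> {
        self.0.lock().unwrap().pop_front()
    }
}

// Mirrors the per-worker lookup sequence: local queue first,
// then the global injector, then stealing from peer workers.
fn find_task(local: &Queue, injector: &Queue, peers: &[&Queue]) -> Option<u32> {
    local
        .pop()
        .or_else(|| injector.pop())
        .or_else(|| peers.iter().find_map(|p| p.pop()))
}

fn main() {
    let local = Queue::new(vec![]);
    let injector = Queue::new(vec![]);
    let peer = Queue::new(vec![42]);
    // Local and injector are empty, so the task is stolen from a peer.
    println!("{:?}", find_task(&local, &injector, &[&peer])); // Some(42)
}
```

Checking the local queue before the injector is what preserves cache affinity; stealing from peers only as a last resort is what rebalances stragglers.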
## Metrics

The cleaner emits tracing events and runtime metrics:

- `rustfs_log_cleaner_deleted_files_total` (counter)
- `rustfs_log_cleaner_freed_bytes_total` (counter)
- `rustfs_log_cleaner_compress_duration_seconds` (histogram)
- `rustfs_log_cleaner_steal_success_rate` (gauge)
- `rustfs_log_cleaner_rotation_total` (counter)
- `rustfs_log_cleaner_rotation_failures_total` (counter)
- `rustfs_log_cleaner_rotation_duration_seconds` (histogram)
- `rustfs_log_cleaner_active_file_size_bytes` (gauge)

These values can be wired into dashboards and alert rules for cleanup health.
## Retention order

For regular logs, the cleaner evaluates candidates in this order:

1. `keep_files` — keep the newest matching generations;
2. `max_total_size_bytes`;
3. `max_single_file_size_bytes`.

## Configuration

| Env Var | Meaning |
|---|---|
|---|---|
| `RUSTFS_OBS_LOG_COMPRESSION_ALGORITHM` | `zstd` or `gzip` |
| `RUSTFS_OBS_LOG_PARALLEL_COMPRESS` | Enable work-stealing compression |
| `RUSTFS_OBS_LOG_PARALLEL_WORKERS` | Worker count for the parallel compressor |
| `RUSTFS_OBS_LOG_ZSTD_COMPRESSION_LEVEL` | Zstd level (1-21) |
| `RUSTFS_OBS_LOG_ZSTD_FALLBACK_TO_GZIP` | Fall back to gzip on zstd failure |
| `RUSTFS_OBS_LOG_ZSTD_WORKERS` | zstdmt worker threads per compression task |
| `RUSTFS_OBS_LOG_DRY_RUN` | Dry-run mode |
| `RUSTFS_OBS_LOG_COMPRESSED_FILE_RETENTION_DAYS` | Retention window for `*.gz` / `*.zst` archives |
| `RUSTFS_OBS_LOG_DELETE_EMPTY_FILES` | Remove zero-byte regular log files during scanning |
| `RUSTFS_OBS_LOG_MIN_FILE_AGE_SECONDS` | Minimum age before a regular log is eligible |
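For example, a cautious first deployment might be configured as follows. The concrete values and the `true`/`false` literal format are illustrative assumptions, not documented defaults:

```shell
# Illustrative values only -- tune for your environment.
export RUSTFS_OBS_LOG_COMPRESSION_ALGORITHM=zstd
export RUSTFS_OBS_LOG_PARALLEL_COMPRESS=true
export RUSTFS_OBS_LOG_PARALLEL_WORKERS=4
export RUSTFS_OBS_LOG_ZSTD_COMPRESSION_LEVEL=8
export RUSTFS_OBS_LOG_ZSTD_FALLBACK_TO_GZIP=true
export RUSTFS_OBS_LOG_COMPRESSED_FILE_RETENTION_DAYS=30
# Start in dry-run mode to observe what would be deleted before enabling it.
export RUSTFS_OBS_LOG_DRY_RUN=true
```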
## Example

```rust
use rustfs_obs::LogCleaner;
use rustfs_obs::types::{CompressionAlgorithm, FileMatchMode};
use std::path::PathBuf;

let cleaner = LogCleaner::builder(
    PathBuf::from("/var/log/rustfs"),
    "rustfs.log".to_string(),
    "rustfs.log".to_string(),
)
.match_mode(FileMatchMode::Suffix)
.keep_files(30)
.max_total_size_bytes(2 * 1024 * 1024 * 1024) // 2 GiB total budget
.compress_old_files(true)
.compression_algorithm(CompressionAlgorithm::Zstd)
.parallel_compress(true)
.parallel_workers(6)
.zstd_compression_level(8)
.zstd_fallback_to_gzip(true)
.zstd_workers(1)
.dry_run(false)
.build();

// Run a single cleanup pass.
let _ = cleaner.cleanup();
```
## Tips

- Use `FileMatchMode::Suffix` when rotations prepend timestamps to the filename.
- Use `FileMatchMode::Prefix` when rotations append counters or timestamps after a stable base name.
- Keep `parallel_workers` modest when `zstd_workers` is greater than 1, because each compression task may already use internal codec threads.
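The two match modes can be illustrated with example rotated filenames. The filenames below are hypothetical, and plain string checks stand in for the crate's actual matcher:

```rust
fn main() {
    let base = "rustfs.log";

    // Suffix mode: the rotator prepends a timestamp, so rotated
    // files *end* with the stable base name.
    let suffix_example = "2024-05-01T12-00-00.rustfs.log";
    assert!(suffix_example.ends_with(base));

    // Prefix mode: the rotator appends a counter or timestamp, so
    // rotated files *start* with the stable base name.
    let prefix_example = "rustfs.log.1";
    assert!(prefix_example.starts_with(base));

    println!("both examples match their respective modes");
}
```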