# Benchmark Reporting Tools
This directory contains advanced tools for viewing, analyzing, and comparing Claude Flow benchmark reports with detailed metrics and file references.
## Enhanced Report Viewer (`enhanced_report_viewer.py`)

Advanced report analysis with detailed metrics and file references.

```bash
# View summary of all reports
python enhanced_report_viewer.py --summary

# View detailed report for specific benchmark
python enhanced_report_viewer.py --id <benchmark-id>

# View latest benchmark report
python enhanced_report_viewer.py --latest

# Analyze trends across multiple benchmarks
python enhanced_report_viewer.py --trends

# Use custom report directory
python enhanced_report_viewer.py --dir ./my-reports --summary
```
## Real-Time Monitor (`realtime_monitor.py`)

Monitor benchmark execution with live metrics updates.

```bash
# Monitor a simple task
python realtime_monitor.py "Create a REST API"

# Monitor with specific strategy
python realtime_monitor.py "Build authentication system" --strategy development

# Monitor with more workers
python realtime_monitor.py "Complex analysis task" --max-workers 8

# Custom output directory
python realtime_monitor.py "Test task" --output-dir ./live-reports
```
## Benchmark Comparison (`compare_benchmarks.py`)

Run and compare multiple benchmarks with detailed analysis.

```bash
# Quick comparison (3 different tasks)
python compare_benchmarks.py --preset quick

# Thorough comparison (5 configurations)
python compare_benchmarks.py --preset thorough

# Strategy comparison (6 strategies, same task)
python compare_benchmarks.py --preset strategies

# Custom output directory
python compare_benchmarks.py --preset quick --output-dir ./comparisons
```
## Report Files

Each benchmark generates multiple report files:
### `benchmark_<id>.json`

```json
{
  "benchmark_id": "uuid",
  "status": "success",
  "duration": 12.34,
  "metrics": {
    "wall_clock_time": 12.34,
    "tasks_per_second": 0.81,
    "success_rate": 1.0,
    "peak_memory_mb": 256.5,
    "average_cpu_percent": 45.2
  },
  "results": [...]
}
```
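Since this is plain JSON, it can be inspected with the standard library alone. A minimal sketch, assuming the schema above; the path is a placeholder to replace with a real benchmark id:

```python
import json
from pathlib import Path

# Placeholder path; substitute a real benchmark id.
report = json.loads(Path("reports/benchmark_<id>.json").read_text())

metrics = report["metrics"]
print(f"Benchmark {report['benchmark_id']}: {report['status']}")
print(f"Duration: {report['duration']:.2f}s")
print(f"Success rate: {metrics['success_rate']:.1%}")
print(f"Peak memory: {metrics['peak_memory_mb']:.1f} MB")
```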
### `metrics_<id>.json`

```json
{
  "summary": {...},
  "performance": {...},
  "resources": {...},
  "process_executions": {...}
}
```
### `process_report_<id>.json`

```json
{
  "summary": {...},
  "command_statistics": {...},
  "executions": [...]
}
```
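All three files share the same `<id>`, so the reports for one run can be collected together. A hedged sketch: `load_report_set` is a hypothetical helper that assumes reports live under the given directory and skips any file a run did not emit:

```python
import json
from pathlib import Path

def load_report_set(report_dir: str, benchmark_id: str) -> dict:
    """Collect the benchmark, metrics, and process reports for one run."""
    reports = {}
    for prefix in ("benchmark", "metrics", "process_report"):
        path = Path(report_dir) / f"{prefix}_{benchmark_id}.json"
        if path.exists():  # skip files this run did not emit
            reports[prefix] = json.loads(path.read_text())
    return reports

reports = load_report_set("./reports", "<benchmark-id>")
print(sorted(reports))
```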
## Typical Workflow

```bash
# 1. Run a monitored benchmark
python realtime_monitor.py "Build user authentication" --strategy development

# 2. View the detailed report
python enhanced_report_viewer.py --latest

# 3. Compare with other strategies
python compare_benchmarks.py --preset strategies

# 4. Analyze trends over time
python enhanced_report_viewer.py --trends
```
## Batch Analysis

```bash
# Run multiple benchmarks
for task in "Create API" "Add tests" "Optimize code"; do
  swarm-benchmark real swarm "$task" --output-dir ./batch-reports
done

# Analyze all results
python enhanced_report_viewer.py --dir ./batch-reports --summary
python enhanced_report_viewer.py --dir ./batch-reports --trends
```
## Troubleshooting

### No reports found

```bash
# Check report directory exists
ls -la reports/

# Ensure benchmarks are saving to correct location
swarm-benchmark real swarm "test" --output-dir ./reports
```

### Permission errors

```bash
# Make scripts executable
chmod +x *.py

# Ensure write permissions for reports
chmod 755 reports/
```
## Programmatic Access

```python
from enhanced_report_viewer import BenchmarkReport, EnhancedReportViewer

# Load and process reports programmatically
viewer = EnhancedReportViewer()
viewer.load_reports()

for report in viewer.reports:
    print(f"ID: {report.benchmark_id}")
    print(f"Success Rate: {report.success_rate:.1%}")
    print(f"Tokens: {report.total_tokens}")
```
## CI Integration

```yaml
# GitHub Actions example
- name: Run Benchmarks
  run: |
    python compare_benchmarks.py --preset quick
    python enhanced_report_viewer.py --trends
```
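A CI job can also gate on results. The script below is a hypothetical sketch: it assumes the `benchmark_<id>.json` schema shown earlier and fails the build when the most recent run's success rate falls below a threshold:

```python
import json
import sys
from pathlib import Path

THRESHOLD = 0.9  # hypothetical quality gate

# Pick the most recently written benchmark report.
latest = max(Path("reports").glob("benchmark_*.json"),
             key=lambda p: p.stat().st_mtime)
report = json.loads(latest.read_text())

rate = report["metrics"]["success_rate"]
print(f"{latest.name}: success rate {rate:.1%}")
sys.exit(0 if rate >= THRESHOLD else 1)
```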