Back to Ruflo

Claude Flow Benchmark System v2.0

v2/benchmark/README.md

3.6.301.2 KB
Original Source

Claude Flow Benchmark System v2.0

Production-ready benchmarking for Claude Flow with real command execution and authentic metrics.

Quick Start

bash
cd benchmark
pip install -e .

# Run real benchmark
python examples/real_swarm_benchmark.py "Build REST API"

Features

  • Real Execution: Actual ./claude-flow commands via subprocess
  • Stream JSON: Parses --non-interactive --output-format stream-json
  • Authentic Metrics: Real tokens, timing, and resource usage
  • No Mocks: 100% genuine execution

Supported Commands

bash
./claude-flow swarm "task" --non-interactive --output-format stream-json
./claude-flow hive-mind spawn "task" --non-interactive
./claude-flow sparc run code "task" --non-interactive

Usage

python
from swarm_benchmark import BenchmarkEngine

engine = BenchmarkEngine(use_real_executor=True)
result = await engine.run_real_benchmark("Build microservices")
print(f"Tokens: {result.metrics['total_tokens']}")

Testing

bash
pytest tests/
python examples/verify_real_integration.py

Version: 2.0.0 | Status: Production Ready | Real: Yes | Mocks: None