perf-tests/README.md
This directory contains performance/CPU integration tests for the Gemini CLI. These tests measure wall-clock time, CPU usage, and event loop responsiveness to detect regressions across key scenarios.
CPU performance is inherently noisy, especially in CI. The harness addresses this with:
# Run tests (compare against committed baselines)
npm run test:perf
# Update baselines (after intentional changes)
npm run test:perf:update-baselines
# Verbose output
VERBOSE=true npm run test:perf
# Keep test artifacts for debugging
KEEP_OUTPUT=true npm run test:perf
The PerfTestHarness class (in packages/test-utils) provides:
- performance.now(): high-resolution wall-clock timing
- process.cpuUsage(): user + system CPU microseconds (delta between start/stop)
- perf_hooks.monitorEventLoopDelay(): event loop delay histogram (p50/p95/p99/max)
Baselines are stored in baselines.json in this directory. Each scenario has:
{
  "cold-startup-time": {
    "wallClockMs": 1234.5,
    "cpuTotalUs": 567890,
    "eventLoopDelayP99Ms": 12.3,
    "timestamp": "2026-04-08T..."
  }
}
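As a rough illustration of how the three Node.js APIs listed above could produce one baseline entry, here is a minimal sketch. The `PerfSample` interface and the `measure` helper are hypothetical names chosen for this example; they are not the actual PerfTestHarness API, which may collect and aggregate samples differently.

```typescript
import { performance, monitorEventLoopDelay } from "node:perf_hooks";

// Hypothetical record shape mirroring the fields of a baselines.json entry.
interface PerfSample {
  wallClockMs: number;
  cpuTotalUs: number;
  eventLoopDelayP99Ms: number;
  timestamp: string;
}

// Hypothetical helper (not part of PerfTestHarness): measures one async workload.
async function measure(work: () => Promise<void>): Promise<PerfSample> {
  // Sample event loop delay every 10 ms while the workload runs.
  const histogram = monitorEventLoopDelay({ resolution: 10 });
  histogram.enable();
  const cpuStart = process.cpuUsage();
  const wallStart = performance.now();

  await work();

  const wallClockMs = performance.now() - wallStart;
  // Passing the earlier reading returns the delta, in microseconds.
  const cpu = process.cpuUsage(cpuStart);
  histogram.disable();

  return {
    wallClockMs,
    cpuTotalUs: cpu.user + cpu.system,
    // The histogram reports nanoseconds; convert to milliseconds.
    eventLoopDelayP99Ms: histogram.percentile(99) / 1e6,
    timestamp: new Date().toISOString(),
  };
}
```

Calling `measure(() => runSomething())` with any promise-returning function yields an object with the same four fields as the entry shown above.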
Tests fail if the measured value exceeds baseline × 1.15 (15% tolerance).
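The tolerance rule can be expressed as a one-line comparison. The sketch below is illustrative only: `TOLERANCE` and `exceedsBaseline` are hypothetical names, and the real harness may apply the check per metric or in a different form.

```typescript
// 15% tolerance, as stated above. The constant name is an assumption.
const TOLERANCE = 0.15;

// Hypothetical helper: true when a measurement is above baseline × (1 + tolerance).
function exceedsBaseline(
  measured: number,
  baseline: number,
  tolerance: number = TOLERANCE,
): boolean {
  return measured > baseline * (1 + tolerance);
}

// With a 1200 ms baseline the limit is roughly 1380 ms.
console.log(exceedsBaseline(1234.5, 1200)); // within tolerance
console.log(exceedsBaseline(1400, 1200)); // over the limit
```

Only regressions trip the check; a measurement faster than the baseline always passes.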
To recalibrate after intentional changes:
npm run test:perf:update-baselines
# then commit baselines.json
After all tests complete, the harness prints a text summary:
═══════════════════════════════════════════════════
PERFORMANCE TEST REPORT
═══════════════════════════════════════════════════
cold-startup-time: 1234.5 ms (Baseline: 1200.0 ms, Delta: +2.9%) ✅
idle-cpu-usage: 2.1 % (Baseline: 2.0 %, Delta: +5.0%) ✅
skill-loading-time: 1567.8 ms (Baseline: 1500.0 ms, Delta: +4.5%) ✅
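The Delta column in the report is the relative change from the baseline. A minimal sketch of that arithmetic follows; `formatDelta` is a hypothetical helper written for this example, and the harness's own formatting code may differ.

```typescript
// Hypothetical helper: relative change from baseline, as a signed percentage
// with one decimal place.
function formatDelta(measured: number, baseline: number): string {
  const pct = ((measured - baseline) / baseline) * 100;
  // toFixed already emits "-" for negative values; add "+" for the rest.
  const sign = pct >= 0 ? "+" : "";
  return `${sign}${pct.toFixed(1)}%`;
}

console.log(formatDelta(1234.5, 1200.0)); // "+2.9%"
console.log(formatDelta(2.1, 2.0)); // "+5.0%"
console.log(formatDelta(1567.8, 1500.0)); // "+4.5%"
```

These three calls reproduce the deltas shown in the sample report above.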
perf-tests/
├── README.md ← you are here
├── baselines.json ← committed baseline values
├── globalSetup.ts ← test environment setup
├── perf-usage.test.ts ← test scenarios
├── perf.*.responses ← fake API responses per scenario
├── tsconfig.json ← TypeScript config
└── vitest.config.ts ← vitest config (serial, isolated)
packages/test-utils/src/
├── perf-test-harness.ts ← PerfTestHarness class
└── index.ts ← re-exports
These tests are excluded from preflight and are designed to run in nightly CI:
- name: Performance regression tests
  run: npm run test:perf
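To add a new scenario:
1. Create a fake response file named perf.<scenario-name>.responses
2. Add a test case to perf-usage.test.ts using harness.runScenario()
3. Run npm run test:perf:update-baselines to establish initial baseline
4. Commit the updated baselines.json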