docs/rfc/0002-perf-runbook-results.md
Status: all three acceptance bars cleared — phase 4 (defaults flip) is perf-cleared from this runbook's perspective.
- `29bc8644` (PR #3485 merge), deployed via `SKAFFOLD_PROFILE=kind make skaffold-deploy`.
- `helm upgrade --reuse-values --set executor.functionServices.enabled=true --set router.endpointSliceCache.mode=on`.
- `test/benchmark/rfc0002-perf-runbook.sh`; benchmarks: `test/benchmark/tests/{cold-start,warm-path}`.
- `requestsPerPod=200` (one pod serves all VUs, so the measurement is router overhead, not function capacity).

| Metric | Bar | Gates off | Gates on | Delta | Verdict |
|---|---|---|---|---|---|
| Warm p99 | ≥20% lower | 91.8ms | 22.1ms | −75.9% | PASS |
| Cold-start p95 | <10% regression | 1154.7ms | 144.7ms | −87.5% (no regression) | PASS |
| Steady-state hit ratio | ≥99% | n/a | 212,846 hits / 31 misses / 0 fallbacks = 99.985% | n/a | PASS |
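
The Delta and hit-ratio figures in the table can be reproduced from the raw numbers. The sketch below is illustrative only: the function names are hypothetical, and it assumes the hit ratio is defined as hits over hits + misses + fallbacks (with zero fallbacks, the result is the same under either plausible definition).

```python
def pct_delta(before_ms: float, after_ms: float) -> float:
    """Relative change from gates-off to gates-on, in percent (negative = faster)."""
    return (after_ms - before_ms) / before_ms * 100.0


def hit_ratio_pct(hits: int, misses: int, fallbacks: int) -> float:
    """Cache hit ratio in percent. Assumed definition: hits / all lookups."""
    return hits / (hits + misses + fallbacks) * 100.0


# Figures copied from the results table.
warm_delta = pct_delta(91.8, 22.1)
cold_delta = pct_delta(1154.7, 144.7)
ratio = hit_ratio_pct(212_846, 31, 0)

# Acceptance bars as stated in the "Bar" column.
warm_pass = warm_delta <= -20.0   # p99 at least 20% lower
cold_pass = cold_delta < 10.0     # less than 10% regression
ratio_pass = ratio >= 99.0        # at least 99% hits

print(f"warm p99 delta: {warm_delta:.1f}% pass={warm_pass}")
print(f"cold p95 delta: {cold_delta:.1f}% pass={cold_pass}")
print(f"hit ratio: {ratio:.3f}% pass={ratio_pass}")
```
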
Secondary observations:
- `fission_router_endpointcache_mode{requested="on",effective="on"} 1`, `quarantines_total 0`, `tap_flush_errors_total 0`.
- `requestsPerPod=1` makes poolmgr specialize a pod per concurrent request: a pod storm that measures node saturation, not the router. The committed warm-path benchmark pins `requestsPerPod` high for exactly this reason.

`executor.functionServices.enabled=true` + `router.endpointSliceCache.mode=on` defaults; together with the multi-replica and index-scale addendum below, this evidence backed pulling the flip forward into v1.26 itself.

`endpointLB` flag, shadow-comparator removal, `EnsureCapacity` interface fold, `settle()` accounting collapse, and the concurrency-enforcement webhook warning in the phase-4 change (see `0002-implementation-plan.md` for the two as-shipped deviations).

Run against the phase-4 branch (defaults on) on the same kind setup; drivers: `rfc/0002-multireplica-check.sh`, `rfc/0002-scale-check.sh` (local), `test/benchmark/tests/scale-index/generate.sh` (committed).

- `requestsPerPod=2`: 430,231 requests, 5,547 rps, 0.007% failures, p99 12.8ms.
- `effective=on` on all three.
- `ceil(VUS/requestsPerPod)`) and far inside the documented worst case of `ideal + (replicas−1)×requestsPerPod = 19`.
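
The worst-case bound quoted above can be evaluated directly. This is a sketch under stated assumptions: three router replicas (as "all three" suggests) and `requestsPerPod=2`, which together imply an ideal of 15 pods, since 19 − (3−1)×2 = 15. The VU count is not recoverable from the text here, so `HYPOTHETICAL_VUS = 30` is only one value consistent with that ideal, and the function names are illustrative.

```python
import math


def ideal_pods(vus: int, requests_per_pod: int) -> int:
    """Pods needed with perfect global admission: ceil(VUS / requestsPerPod)."""
    return math.ceil(vus / requests_per_pod)


def worst_case_pods(vus: int, requests_per_pod: int, replicas: int) -> int:
    """Documented worst case: ideal + (replicas - 1) * requestsPerPod."""
    return ideal_pods(vus, requests_per_pod) + (replicas - 1) * requests_per_pod


REPLICAS = 3            # assumption: "all three" refers to router replicas
REQUESTS_PER_POD = 2    # from the run parameters
HYPOTHETICAL_VUS = 30   # assumption: chosen only so that the ideal is 15

ideal = ideal_pods(HYPOTHETICAL_VUS, REQUESTS_PER_POD)
worst = worst_case_pods(HYPOTHETICAL_VUS, REQUESTS_PER_POD, REPLICAS)
print(f"ideal={ideal} worst_case={worst}")
```
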
Per-replica admission under-admits rather than over-admits at this scale.

| | baseline | 1,000 fns | after 300-slice churn storm |
|---|---|---|---|
| `endpointcache_size` | 0 | 1,000 (exact) | 1,000 |
| heap inuse | 14.2MB | 18.3MB (+4.2MB ≈ 4KB/fn) | 20.3MB |
| RSS | 96.1MB | 98.7MB | 99.3MB |
| goroutines | 84 | 84 (flat) | 84 (flat) |
Linear, small, and goroutine-flat through creation and churn; extrapolates to ~42MB at 10k functions, inside the RFC's <50MB projection.
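
The 10k extrapolation is a straight linear scaling of the reported heap delta. The sketch below assumes the ~42MB figure refers to index heap growth over baseline (not total heap) and that growth stays linear beyond the 1,000 functions actually measured; both are assumptions, and the names are illustrative.

```python
HEAP_DELTA_MB = 4.2          # reported heap-inuse growth for 1,000 functions
FUNCTIONS_MEASURED = 1_000   # size of the measured index
RFC_PROJECTION_MB = 50.0     # the RFC's stated upper projection


def projected_index_heap_mb(functions: int) -> float:
    """Linear extrapolation of index heap growth (assumes constant per-function cost)."""
    per_function_mb = HEAP_DELTA_MB / FUNCTIONS_MEASURED
    return per_function_mb * functions


projected = projected_index_heap_mb(10_000)
within_projection = projected < RFC_PROJECTION_MB
print(f"projected index heap at 10k functions: ~{projected:.0f}MB, "
      f"within RFC projection: {within_projection}")
```
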