docs/multiple-namespace/performance-benchmarking.md
Companion to: prd.md · testing-plan.md
Performance is a first-class acceptance gate, not an afterthought. The whole premise of issue #3298 is a performance failure (mass restart, function timeouts, node scale-up), and PR #3476's watch-all model trades isolation for a memory/RBAC profile we must out-perform on the isolation axis without regressing latency. This doc defines what we measure, how, against which baselines, and the gates that block merge.
Every benchmark is reported against two reference points so we can prove "better, not just different":
- Baseline-Today: main with additionalFissionNamespaces (per-namespace everything, env-frozen, restart-on-add).
- Baseline-PR3476: the watch-all model from PR #3476.

The design wins if it matches or beats Baseline-Today on latency/restart and beats Baseline-PR3476 on isolation-cost (memory/RBAC blast radius) without a latency penalty.
Restart-free onboarding of a new namespace is the single most important measurement and a hard gate.
| Metric | Baseline-Today | Target (this design) | How measured |
|---|---|---|---|
| Control-plane pod restarts when adding 1 namespace | all control-plane Deployments roll | 0 | Capture pod UIDs before/after onboarding in the serial test; diff. |
| Function 5xx/timeout count during onboarding | non-zero (executor unavailable window) | 0 | Drive steady traffic at an existing-namespace function during onboard; count non-200s. |
| Re-specialization events on existing namespaces | full storm | 0 | Count pod creations in existing function namespaces during the window. |
| Node scale-up triggered | 2–3 extra nodes (reporter) | 0 | Watch Node count / cluster-autoscaler events over the window. |
| Time-to-Ready for the new tenant | n/a (restart-bound) | < 10 s p95 | FissionTenant Ready=True transition time from CR/label create. |
Measured in the serial/tenant_no_restart_test.go integration test and reproduced in a dedicated load scenario (§7).
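
The "capture pod UIDs before/after; diff" check from the table can be sketched as a set difference on UIDs. This is a minimal illustration, not the code in the integration test: the `Snapshot` type and `ReplacedPods` helper are hypothetical names, and the snapshots are assumed to have been collected elsewhere (for example by listing control-plane pods before and after onboarding).

```go
package main

import (
	"fmt"
	"sort"
)

// Snapshot maps a pod UID to a human-readable "namespace/name" label.
// (Hypothetical type for illustration; not part of the Fission codebase.)
type Snapshot map[string]string

// ReplacedPods returns the labels of pods whose UID was present before
// onboarding but is absent afterwards, i.e. pods that were deleted or
// recreated during the window. An empty result means no pod was replaced.
func ReplacedPods(before, after Snapshot) []string {
	var gone []string
	for uid, label := range before {
		if _, ok := after[uid]; !ok {
			gone = append(gone, label)
		}
	}
	sort.Strings(gone) // deterministic order for test output
	return gone
}

func main() {
	before := Snapshot{"uid-1": "fission/executor-abc", "uid-2": "fission/router-def"}
	after := Snapshot{"uid-1": "fission/executor-abc", "uid-3": "fission/router-xyz"}
	fmt.Println(ReplacedPods(before, after)) // prints [fission/router-def]
}
```

Comparing UIDs rather than names catches a pod that was recreated under the same name. It does not catch an in-place container restart, which keeps the pod UID; a real test would likely also compare container restart counts.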
The architectural claim is "Tier-A (Fission CRDs) is one cluster-wide informer, flat in N; Tier-B (workloads + Secrets/ConfigMaps) scales linearly." Per the backward-compatibility decision, workloads (pods/services/deployments/endpointslices) are in Tier-B too — so the linear term is larger than a secrets-only split, a deliberate trade for zero cluster-wide read of any core resource. Prove it with a sweep at N = {1, 10, 50, 100, 250} tenant namespaces, each holding a small fixed function set.
| Dimension | Expectation | Method |
|---|---|---|
| Executor RSS / Go heap | Sub-linear; Tier-A (CRDs) flat, Tier-B (workload + Secret/ConfigMap informers) the linear term | CI pprof heap capture (kind-ci observability) at each N; plot heap vs N. |
| Router RSS / heap | CRD/HTTPTrigger watch flat (Tier-A); the RFC-0002 EndpointSlice watch is now per-namespace Tier-B, so a bounded per-tenant increment | pprof heap; compare to Baseline-Today (also per-namespace). |
| Goroutine count | Flat for Tier-A; bounded per-tenant increment for each Tier-B informer (more informers/tenant than a secrets-only split) | pprof goroutine profile; assert no unbounded growth. |
| API server LIST/WATCH connections | Tier-A: O(1) cluster-wide CRD watches; Tier-B: O(N × workload+secret+cm types) | apiserver apiserver_longrunning_requests / watch count; compare to Baseline-Today O(N-per-type) and Baseline-PR3476 O(1)-but-cluster-wide-secrets. |
| Informer initial-sync time on N tenants | Bounded; Tier-A one sync, Tier-B parallel per-tenant | Manager WaitForCacheSync duration. |
Explicit honesty in the report: Tier-B is O(N) across more types than a secrets-only split — that is the deliberate price of not granting any cluster-wide core-resource read. We quantify the per-tenant cost (MiB/tenant and watch-connections/tenant) so operators can size, and compare it to Baseline-PR3476's "O(1) informers but every secret + workload in the cluster resident" — cheaper in connections but unbounded and insecure in data resident. This is the one axis where the design consciously pays more than the watch-all alternative, in exchange for the isolation guarantee; the sweep makes that cost explicit rather than hidden.
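
One way to turn the sweep into a per-tenant number is an ordinary least-squares fit of heap against N: the slope approximates the Tier-B cost in MiB/tenant and the intercept approximates the flat Tier-A floor. The sketch below assumes the heap samples have already been extracted from the pprof captures. The `Sample` type, the `FitLinear` helper, and the numbers in `main` are illustrative placeholders, not measurements.

```go
package main

import "fmt"

// Sample is one sweep point: tenant count and measured heap in MiB.
// (Hypothetical type for illustration.)
type Sample struct {
	Tenants int
	HeapMiB float64
}

// FitLinear returns the least-squares intercept and slope of
// heap = intercept + slope*N. ok is false when the fit is undefined
// (fewer than two samples, or all samples at the same N).
func FitLinear(samples []Sample) (intercept, slope float64, ok bool) {
	if len(samples) < 2 {
		return 0, 0, false
	}
	n := float64(len(samples))
	var sumX, sumY, sumXY, sumXX float64
	for _, s := range samples {
		x := float64(s.Tenants)
		sumX += x
		sumY += s.HeapMiB
		sumXY += x * s.HeapMiB
		sumXX += x * x
	}
	den := n*sumXX - sumX*sumX
	if den == 0 {
		return 0, 0, false
	}
	slope = (n*sumXY - sumX*sumY) / den
	intercept = (sumY - slope*sumX) / n
	return intercept, slope, true
}

func main() {
	// Placeholder values only; real input comes from the N = {1, 10, 50, 100, 250} sweep.
	sweep := []Sample{{1, 40.5}, {10, 45}, {50, 65}, {100, 90}, {250, 165}}
	if base, perTenant, ok := FitLinear(sweep); ok {
		fmt.Printf("flat floor: %.2f MiB, per-tenant cost: %.2f MiB/tenant\n", base, perTenant)
	}
}
```

The same fit applied to watch-connection counts would give watch-connections/tenant. A poor fit (large residuals) would itself be a finding, since it would suggest growth that is not linear in N.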
Go benchmarks in pkg/auth/hmac (go test -bench):
- BenchmarkDeriveServiceKeyNS: HKDF per-namespace derivation cost (expected ~µs, one-time per key; cached after first use). Confirm it is not on the per-request hot path.
- BenchmarkVerify_CandidateSet: verify latency with a 1-key vs the worst-case 4-key candidate set (ns + nsOld + master + masterOld during migration). Each candidate is one crypto/hmac.Equal; assert the worst case adds < a few µs and is constant-time.
- BenchmarkSign_NSScoped vs BenchmarkSign_MasterScoped: confirm the storagesvc \n<namespace> canonical suffix adds negligible cost and master-scoped signing is unchanged.

Gate: HMAC changes must not move storagesvc or fetcher-specialize latency outside the agreed budget (§8).
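
For orientation, candidate-set verification of the kind these benchmarks would exercise might look like the sketch below. It is an assumption-laden illustration, not the code in pkg/auth/hmac: HMAC-SHA256 is assumed as the MAC, the helpers `canonical`, `sign`, and `verifyAny` are hypothetical names, and the keys are plain byte strings rather than HKDF-derived values.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"fmt"
)

// canonical appends the "\n<namespace>" suffix to the payload in a fresh
// slice, so the caller's payload buffer is never modified.
func canonical(payload []byte, namespace string) []byte {
	out := make([]byte, 0, len(payload)+1+len(namespace))
	out = append(out, payload...)
	out = append(out, '\n')
	return append(out, namespace...)
}

// sign returns HMAC-SHA256(key, msg). SHA-256 is an assumption here.
func sign(key, msg []byte) []byte {
	m := hmac.New(sha256.New, key)
	m.Write(msg)
	return m.Sum(nil)
}

// verifyAny checks mac against every candidate key. It does not return
// early, so the number of HMAC computations is the same whichever key
// (if any) matches; each comparison uses the constant-time hmac.Equal.
func verifyAny(candidates [][]byte, msg, mac []byte) bool {
	matched := false
	for _, key := range candidates {
		if hmac.Equal(sign(key, msg), mac) {
			matched = true
		}
	}
	return matched
}

func main() {
	// Worst-case migration set: ns, nsOld, master, masterOld (placeholder keys).
	keys := [][]byte{[]byte("ns-key"), []byte("ns-key-old"), []byte("master"), []byte("master-old")}
	msg := canonical([]byte("GET /archive/42"), "tenant-a")
	mac := sign(keys[3], msg)

	fmt.Println(verifyAny(keys, msg, mac))     // true: masterOld is in the set
	fmt.Println(verifyAny(keys[:1], msg, mac)) // false: only the ns key is tried
}
```

With this shape the worst case costs four HMAC computations instead of one, which is the delta the 1-key vs 4-key benchmark is meant to bound.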
The poolmgr cold-start path (synchronous getServiceForFunction, ~100ms budget) must stay byte-identical with the tenancy gates on or off — RFC-0002 already guarantees this and the tenancy work must not break it.
| Metric | Method | Gate |
|---|---|---|
| Poolmgr cold-start p50/p99 | Load test: first-invocation latency across many fresh functions, gates on vs off | p99 within the agreed budget of Baseline-Today (hard) |
| Warm-path (router-admitted) p99 | Steady traffic through the slice-fed index | No regression vs Baseline-Today |
| Specialization latency | executor→fetcher /specialize round trip (now ns-key-signed) | Within budget; the HMAC ns-key is precomputed, not per-request HKDF |
Reuse the existing load harness and the cpuburn fixture pattern (from RFC-0006 load tests) so latency is measured under realistic CPU pressure, not idle.
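
The p99 gates in the table above reduce to comparing a percentile of the candidate run against the same percentile of the baseline run plus an allowed budget. A minimal sketch of that comparison follows; it assumes nearest-rank percentiles and raw latency samples exported by the load harness. `Millis`, `Percentile`, and `WithinBudget` are hypothetical helpers, and the budget value is a placeholder since the agreed budget is defined elsewhere.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Millis builds a slice of durations from millisecond values (fixture helper).
func Millis(vals ...int) []time.Duration {
	out := make([]time.Duration, len(vals))
	for i, v := range vals {
		out[i] = time.Duration(v) * time.Millisecond
	}
	return out
}

// Percentile returns the nearest-rank percentile p (1..100) of samples.
// It returns 0 for an empty input and does not modify the caller's slice.
func Percentile(samples []time.Duration, p int) time.Duration {
	if len(samples) == 0 {
		return 0
	}
	sorted := append([]time.Duration(nil), samples...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	rank := (len(sorted)*p + 99) / 100 // integer ceil(n*p/100)
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

// WithinBudget reports whether the candidate p99 is no more than the
// baseline p99 plus the allowed budget.
func WithinBudget(baseline, candidate []time.Duration, budget time.Duration) bool {
	return Percentile(candidate, 99) <= Percentile(baseline, 99)+budget
}

func main() {
	// Placeholder samples; real input is the cold-start latency distribution.
	baseline := Millis(80, 85, 90, 95, 100)
	candidate := Millis(82, 86, 91, 97, 104)
	fmt.Println(Percentile(baseline, 99), Percentile(candidate, 99),
		WithinBudget(baseline, candidate, 5*time.Millisecond))
}
```

Nearest-rank is chosen only because it is simple and needs no interpolation; if the harness reports histogram-based quantiles from Prometheus instead, the gate should compare like with like.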
The tenant-lifecycle controller is new and on the onboarding critical path.
- BenchmarkTenantReconcile (fake client): reconcile cost per tenant (RBAC + SA + secret derivation + status write).
- escalate/bind RBAC writes are batched/idempotent (no hot-loop re-create).

Run on kind-ci (or a perf cluster) with the observability profile so pprof + Prometheus are captured:
Hard gates (block merge):
- fission_router_route_resync_drift_total stays 0 (existing CI bar; the dynamic watch must not introduce route drift).

Soft gates (review, justify, document):

- go test -bench=. -benchmem ./pkg/auth/hmac/... and the tenant-controller package; track with benchstat against the baseline commit.
- debug-github-ci skill's pprof-analysis path (leak-vs-baseline classification, before/after deltas). Tenant-controller and executor are the profile targets.
- fission_tenant_reconcile_duration_seconds, fission_tenant_ready_total, fission_tenant_watch_caches{tier} (count of active Tier-B caches), fission_internal_auth_failures_total{service,reason} (the auth-failure counter the internal-auth design doc flagged as a future add, now justified). These drive the scaling and reconcile dashboards.

| Axis | Baseline-Today | Baseline-PR3476 (watch-all) | This design |
|---|---|---|---|
| Add-namespace restart | all control-plane pods | 0 | 0 |
| RBAC blast radius | per-namespace Roles | cluster-wide (all secrets) | per-namespace Roles (cluster-wide only on opt-in) |
| Secrets resident in control-plane memory | per-watched-ns | every secret in cluster | only onboarded-tenant secrets (Tier-B) |
| Internal auth default | on | off | on |
| Cross-tenant impersonation | master copied everywhere → trivial | master copied everywhere → trivial | cryptographically prevented (per-ns keys) |
| API server watch connections | O(N per type) | O(1) but cluster-wide | O(1) Tier-A (CRDs) + O(N) Tier-B (workloads + secrets/cms) — the deliberate cost of zero cluster-wide core read |
| Cold-start p99 | baseline | baseline | == baseline (RFC-0002 invariant) |
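The intent is visible at a glance: equal-or-better on every performance axis except the Tier-B watch-connection and per-tenant memory cost, which is O(N) by design and is the deliberate price of isolation, and better than at least one baseline on every security axis.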