Multi-Namespace Tenancy

Working folder for the multi-namespace tenancy effort. These are living planning and implementation-tracking docs; they stay here until the PRD is fully implemented, and are then condensed into permanent docs (CLAUDE.md / docs/).

Problem in one line: onboarding a Fission namespace today restarts the whole control plane (issue #3298), and the current multi-namespace model leaks the master internal-auth secret into every tenant namespace.

Approach in one line: declarative FissionTenant CRD + label, dynamic per-namespace watching (zero restart), and per-namespace derived HMAC keys (master never leaves the control plane) — least-privilege preserved throughout.

Status: functionally complete. A single chart value, tenancy.mode: static | dynamic | cluster, selects the posture; the critical-path implementation, the full per-namespace key security story, and all three modes have shipped (see Tenancy modes and Phase status).


Documents

| Doc | What it covers |
| --- | --- |
| prd.md | The PRD: context, goals, three-pillar design, phased delivery, critical files, and concrete recommendations for PR #3476. |
| testing-plan.md | Per-phase unit + integration test matrix, framework changes, CI wiring, coverage targets. |
| performance-benchmarking.md | Metrics, scaling sweeps, baselines (today vs PR #3476), and the hard/soft regression gates. |
| backward-compatibility.md | Upgrade/rolling-restart compatibility review, issues by severity with mitigations, and the safe-upgrade runbook. |

Add implementation notes, decision logs, and sub-task trackers to this folder as work proceeds.


Decisions locked in

| Fork | Decision |
| --- | --- |
| v1 scope | Full vision, phased; disruption fix ships first |
| Onboarding API | FissionTenant CRD and `fission.io/enabled=true` label, both |
| HMAC secret isolation | Per-namespace derived keys; master never in a tenant namespace |
| Cross-namespace invocation | Admission + NetworkPolicy; no runtime podIP guard |
| Dynamic-watch RBAC (review) | Only Fission CRDs go cluster-wide; all core/workload resources per-namespace dynamic |
| HMAC migration (review) | Version-aware signing: control plane signs each pod with the key it expects; no upgrade 401s |
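
The version-aware signing decision can be pictured with a small sketch. It is illustrative Python, not Fission's Go code. It assumes, hypothetically, that the control plane can tell per pod whether that pod still verifies with the master key or already with its namespace's derived key. The `key_version` marker and both function names are invented for the example.

```python
import hashlib
import hmac

# Hypothetical key-version markers; the source does not say how Fission records them.
LEGACY_MASTER = "master"
DERIVED = "derived"


def sign_for_pod(payload: bytes, key_version: str,
                 master_key: bytes, derived_key: bytes) -> str:
    """Sign with whichever key the target pod is expected to verify against."""
    if key_version == LEGACY_MASTER:
        key = master_key
    elif key_version == DERIVED:
        key = derived_key
    else:
        raise ValueError(f"unknown key version: {key_version!r}")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def pod_accepts(payload: bytes, signature: str, pod_key: bytes) -> bool:
    """Pod-side check: recompute the MAC and compare in constant time."""
    expected = hmac.new(pod_key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

During a rolling upgrade, a pod that has not yet received its derived key keeps accepting requests signed under `LEGACY_MASTER`, which is one way the "no upgrade 401s" property could hold.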

Tenancy modes

Set tenancy.mode in the Helm values (default static). It replaced the older tenancy.dynamicNamespaces + tenantController.enabled booleans (removed in #3502).

| Mode | Where Fission runs | Onboarding | Control-plane reads | Use when |
| --- | --- | --- | --- | --- |
| static (default) | the env-seeded set (defaultNamespace + additionalFissionNamespaces) | install-time only | per-namespace Roles, scoped caches | single namespace, or a fixed known set; behaves exactly like pre-tenancy Fission |
| dynamic | any namespace onboarded at runtime | `fission tenant enable <ns>` or the `fission.io/enabled=true` label; no control-plane restart | per-namespace Roles + per-namespace derived HMAC keys; tenant Secrets/ConfigMaps never in a cluster-wide cache | untrusted multi-tenant clusters (the recommended isolating posture) |
| cluster | any namespace, automatically | the controller auto-onboards every namespace (no CR/label needed) | executor/buildermgr read Secrets/ConfigMaps and manage workloads cluster-wide | single-tenant / trusted clusters that value simplicity over isolation |

Least-privilege holds in every mode: function pods (fetcher/builder) always get a narrow per-namespace RoleBinding and per-namespace derived HMAC key — even in cluster mode the controller provisions those per namespace; only the control plane goes cluster-wide. ⚠️ cluster mode trade-off: a compromised executor/buildermgr can read any namespace's Secrets — use it only on trusted clusters.

Opting a namespace out of cluster mode: label it fission.io/enabled=false. The controller skips it (and offboards it, tearing down its Fission RBAC/keys, if it was already auto-onboarded). Removing the label re-onboards it. The fission.io/enabled label is thus a symmetric override: true opts in (dynamic mode), false opts out (cluster mode), absent = the mode default.
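
The override rules above can be condensed into a single predicate. The following is a minimal sketch in illustrative Python, not the controller's actual logic. The function name and parameters are hypothetical. Three points are assumptions the text does not settle: that static mode ignores the label entirely, that dynamic mode also honors the env-seeded set, and that `false` in dynamic mode behaves like an absent label.

```python
from typing import Optional


def namespace_enabled(mode: str, label: Optional[str],
                      in_seeded_set: bool = False,
                      has_tenant_cr: bool = False) -> bool:
    """Decide whether Fission should operate in a namespace.

    mode          -- the tenancy.mode chart value: static, dynamic, or cluster
    label         -- value of the fission.io/enabled label, or None if absent
    in_seeded_set -- namespace is in the install-time, env-seeded set
    has_tenant_cr -- a FissionTenant CR exists for the namespace
    """
    if mode == "static":
        # Install-time only: runtime labels and CRs change nothing (assumption).
        return in_seeded_set
    if mode == "dynamic":
        # Opt-in: a CR or the label set to "true" onboards the namespace.
        return in_seeded_set or has_tenant_cr or label == "true"
    if mode == "cluster":
        # Opt-out: everything is onboarded unless the label is "false".
        return label != "false"
    raise ValueError(f"unknown tenancy.mode: {mode!r}")
```

For example, `namespace_enabled("cluster", "false")` is `False`, while the same namespace with the label removed evaluates to `True`, matching the re-onboarding behavior described above.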


Phase status

All phases shipped as separate PRs off main (not the original feat/multi-namespace-tenancy branch).

| Phase | Summary | Status |
| --- | --- | --- |
| 0 | Thread-safe NamespaceResolver (setter + change feed) | ✅ Shipped (#3497) |
| 1 | FissionTenant CRD + `--tenantController` + CLI | ✅ Shipped (#3497) |
| 2 | Helm migration Job; fixes #3298 (zero restart) | ✅ Shipped (#3497) |
| 3 | Tier-A cluster-wide cache + membership predicate + cross-process resolver-sync | ✅ Shipped (#3497) |
| 4 | Executor Tier-A cache + per-namespace fetcher/builder/workload RBAC provisioning | ✅ Shipped (#3497) |
| 5 | Per-namespace derived HMAC keys (fetcher, storagesvc, builder) + master-drop | ✅ Shipped (#3497) |
| 7 | Archive ns-prefix (storagesvc archive-content isolation) | ✅ Shipped (#3500) |
|  | RBAC unification: single-source fetcher/builder rules across Go ↔ Helm + admission policy | ✅ Shipped (#3501) |
| 6 | Opt-in `tenancy.mode: cluster` (watch-all) + config converged to the tenancy.mode enum | ✅ Shipped (#3502) |
| 4b | Tier-B dynamic per-namespace Secret/ConfigMap caches (RFC-0004 recycle in runtime-onboarded namespaces, dynamic mode only) | ⏳ Remaining; see implementation-status.md |

Phases 0–2 close issue #3298 with zero security regression; phases 3–5 deliver true multi-tenant isolation; phases 6–7 add the trusted-cluster opt-in and archive-content isolation. The remaining items (the dynamic-mode Tier-B recycle refinement plus optional hardening) are tracked in implementation-status.md; none of them blocks any mode from working.
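
Phase 0's "setter + change feed" pattern is what lets consumers react to onboarding without a restart. The sketch below shows the general shape in illustrative Python. Fission's real resolver is Go, and the class layout, method names, and the (added, removed) event format here are assumptions made for the example.

```python
import queue
import threading
from typing import FrozenSet, Iterable, List


class NamespaceResolver:
    """Thread-safe namespace set with a change feed for subscribers."""

    def __init__(self, initial: Iterable[str] = ()) -> None:
        self._lock = threading.Lock()
        self._namespaces: FrozenSet[str] = frozenset(initial)
        self._subscribers: List[queue.Queue] = []

    def namespaces(self) -> FrozenSet[str]:
        # Returning an immutable snapshot means readers never see a half-update.
        with self._lock:
            return self._namespaces

    def subscribe(self) -> queue.Queue:
        """Return a queue that receives (added, removed) frozensets."""
        feed: queue.Queue = queue.Queue()
        with self._lock:
            self._subscribers.append(feed)
        return feed

    def set_namespaces(self, new: Iterable[str]) -> None:
        new_set = frozenset(new)
        with self._lock:
            old = self._namespaces
            if new_set == old:
                return  # no-op updates produce no events
            self._namespaces = new_set
            event = (new_set - old, old - new_set)
            # Unbounded queues never block, so publishing under the lock
            # keeps events in the same order as the updates.
            for feed in self._subscribers:
                feed.put(event)
```

A watcher would call `subscribe()` once, then start or stop per-namespace informers as events arrive instead of re-reading configuration at startup.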


Glossary

  • Resource namespace — where Function/Package/Environment/Trigger CRs live (watched by controllers).
  • Function/Builder namespace — where function pods / builder pods run for a tenant (per-tenant mapping; defaults to the resource namespace).
  • Tier-A watch — one cluster-wide, label/predicate-filtered cache for Fission CRDs + Fission-labeled workloads (low sensitivity, free dynamic discovery).
  • Tier-B watch — per-namespace dynamic cluster.Cluster for Secrets/ConfigMaps (tenant data — never cluster-wide).
  • Master vs derived key — the master HMAC secret stays in the control-plane release namespace; tenants receive only HKDF(master, service:namespace) derived keys.
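
To make the `HKDF(master, service:namespace)` entry concrete, here is a minimal sketch of HKDF-SHA256 (RFC 5869) using only the Python standard library. It is not Fission's implementation. The choice of SHA-256, the empty salt, the 32-byte output, and the use of `service:namespace` as the HKDF info string are all assumptions, and `derive_tenant_key` is a hypothetical name.

```python
import hashlib
import hmac

HASH_LEN = hashlib.sha256().digest_size  # 32 bytes


def hkdf_sha256(master: bytes, info: bytes, length: int = 32,
                salt: bytes = b"") -> bytes:
    """HKDF per RFC 5869: extract a pseudorandom key, then expand it."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output is too long for HKDF-SHA256")
    # Extract: an empty salt is replaced by HASH_LEN zero bytes, per the RFC.
    prk = hmac.new(salt or b"\x00" * HASH_LEN, master, hashlib.sha256).digest()
    # Expand: chain HMAC blocks T(1), T(2), ... until enough output exists.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_tenant_key(master: bytes, service: str, namespace: str) -> bytes:
    # Assumed info layout: "service:namespace", e.g. "fetcher:team-a".
    return hkdf_sha256(master, f"{service}:{namespace}".encode("utf-8"))
```

Derivation is one-way: a tenant holding `derive_tenant_key(master, "fetcher", "team-a")` cannot recover the master or compute another namespace's key. Kubernetes namespace names cannot contain a colon, so the `service:namespace` string stays unambiguous as long as service names avoid colons as well.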