V3 Phase 14 S8 — V2 Scenario Port Classification

sw-block/design/v3-phase-14-s8-v2-scenario-classification.md

Date: 2026-04-20

Status: draft (S8 scenario classification)

Purpose: classify every V2 testrunner scenario (weed/storage/blockvol/testrunner/scenarios/) as P14-runnable / P15-blocked / deferred, producing the table required by v3-phase-14-s8-assignment.md §3.C

1. Classification Vocabulary

| Class | Meaning | P14 S8 action |
|---|---|---|
| RUNNABLE-P14 | Shape maps onto S4-S7 internal surfaces. Runnable with V3 binary + no external frontend. | Keep scenario YAML shape; adapt actions to V3 route; run as L3 classification only (S8 does not need to execute L3 itself). |
| BLOCKED-FRONTEND | Requires iSCSI / NVMe / CSI / real data path. P15 Frontend track gate. | Preserve scenario reference; hand off to P15 with required precondition. |
| BLOCKED-OPS | Requires operator CLI / HTTP / admin workflow. P15 Ops track gate. | Hand off to P15 Ops. |
| BLOCKED-HA | Requires multi-master / leader election / distributed authority. Outside P14/P15 bounded claim. | Mark as out-of-scope for both P14 and current P15 bounds. Document only. |
| BLOCKED-PERF | Benchmark / soak / perf-baseline; not a correctness gate. | Defer to release-hardening track; document only. |
| PORT-MECHANISM | Testrunner machinery itself (not a scenario): action vocabulary, artifact collection, scenario YAML shape. | Port machinery without V2 authority semantics (see v3-phase-14-s6-s8-v2-port-plan.md §5). |
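
The vocabulary above can be held as a small lookup so that tooling rejects unknown class names early. This is a minimal sketch, not part of the V2 or V3 tree: the `Class` type, the constant names, and `S8Action` are hypothetical, and the action strings are shortened paraphrases of the "P14 S8 action" column.

```go
package main

import "fmt"

// Class is one of the classification labels from the vocabulary table.
// The Go identifiers are illustrative; only the string values come from the doc.
type Class string

const (
	RunnableP14     Class = "RUNNABLE-P14"
	BlockedFrontend Class = "BLOCKED-FRONTEND"
	BlockedOps      Class = "BLOCKED-OPS"
	BlockedHA       Class = "BLOCKED-HA"
	BlockedPerf     Class = "BLOCKED-PERF"
	PortMechanism   Class = "PORT-MECHANISM"
)

// s8Actions paraphrases the "P14 S8 action" column (shortened for the sketch).
var s8Actions = map[Class]string{
	RunnableP14:     "keep YAML shape; adapt actions to V3 route",
	BlockedFrontend: "preserve reference; hand off to P15 with precondition",
	BlockedOps:      "hand off to P15 Ops",
	BlockedHA:       "out of scope for P14 and current P15; document only",
	BlockedPerf:     "defer to release hardening; document only",
	PortMechanism:   "port machinery without V2 authority semantics",
}

// S8Action returns the S8 action for a class, or an error for an unknown label.
func S8Action(c Class) (string, error) {
	action, ok := s8Actions[c]
	if !ok {
		return "", fmt.Errorf("unknown class %q", c)
	}
	return action, nil
}

func main() {
	for _, c := range []Class{RunnableP14, BlockedFrontend, BlockedOps, BlockedHA, BlockedPerf, PortMechanism} {
		action, _ := S8Action(c)
		fmt.Printf("%-17s %s\n", c, action)
	}
}
```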

2. Testrunner Machinery (port decision)

weed/storage/blockvol/testrunner/ is a YAML-driven runner with an action vocabulary, artifact collection, and report generation. For the V3 acceptance pack, P14 S8 carries scenario SHAPES forward; actual V3 testrunner integration is a P15 Cluster Validation track deliverable.

| Component | Classification | S8 action |
|---|---|---|
| `testrunner/*.go` (engine / parser / registry / reporter / metrics) | PORT-MECHANISM | Port to V3 when V3 has a stable CLI surface. S8 does not run the runner directly. |
| `testrunner/actions/*.go` (37 registered actions) | PORT-MECHANISM | Port per-action as V3 surfaces appear. S8 ports NONE directly (no V3 CLI surface exists yet outside sparrow hidden smoke flags). |
| `testrunner/scenarios/*.yaml` | scenario-by-scenario below | |

S8 decision: do not port testrunner machinery into the V3 tree as part of S8. The scenario SHAPES below are what S8 classifies; the runner itself is a P15 Cluster Validation deliverable.

3. Public Scenarios Table

Path: weed/storage/blockvol/testrunner/scenarios/public/.

| # | Scenario | P14 route it touches | Class | Notes |
|---|---|---|---|---|
| 1 | smoke-block-api.yaml | block create / write / read / verify | BLOCKED-FRONTEND | Requires V3 block API surface + real data path. No V3 equivalent exists. P15 Frontend track. |
| 2 | smoke-iscsi.yaml | kernel iSCSI sanity | BLOCKED-FRONTEND | Requires iSCSI target + kernel initiator. P15 Frontend. |
| 3 | smoke-kv.yaml | KV layer smoke | BLOCKED-FRONTEND | KV path is out of P14 scope. P15 Frontend. |
| 4 | e2e-block.yaml / e2e-block-auto.yaml | end-to-end block I/O with V3 backend | BLOCKED-FRONTEND | Same preconditions as #1. |
| 5 | e2e-kv.yaml / e2e-kv-auto.yaml / e2e-combined-auto.yaml | KV + block combined | BLOCKED-FRONTEND | Same as #3. |
| 6 | ha-restart-recovery.yaml | restart-and-reload closed loop | RUNNABLE-P14 | Shape maps directly onto S7 restart route. L1 / L2 subprocess smoke already proves the route in-process; scenario YAML would just drive the same route under scenario harness. Keep shape; port when V3 testrunner lands. Covered by Claim 3 / Claim 10 of the evidence matrix at L0/L1/L2. |
| 7 | ha-failover.yaml | failover under live I/O | BLOCKED-FRONTEND | Shape touches S4-S7 (observation → controller → reassign → adapter), but "under live I/O" requires the data path. Classification: control-plane portions are RUNNABLE-P14 (covered by Claim 14 at L1); I/O-continuity portions are BLOCKED-FRONTEND. Split at P15 port time. |
| 8 | ha-full-lifecycle.yaml | bind → heal → failover → rebuild → restart full cycle | BLOCKED-FRONTEND | Contains rebuild + I/O. Rebuild is P14-internal but has no V3 equivalent outside V2 bridge; I/O is frontend. Defer whole scenario to P15. |
| 9 | ha-io-continuity.yaml | zero data loss across failover | BLOCKED-FRONTEND | Entirely data-path. P15 Frontend. |
| 10 | ha-rebuild.yaml | full-extent rebuild via transport | BLOCKED-FRONTEND | Rebuild transport is V2-side; V3 adapter has no rebuild surface in current slice. Deferred. |
| 11 | crash-recovery.yaml | process kill + restart + verify | RUNNABLE-P14 | Control-plane portion identical to #6. Data-verify portion is BLOCKED-FRONTEND. Split at port time. |
| 12 | diag-restart-recovery.yaml | diagnostics on restart | RUNNABLE-P14 (control-plane subset) | S7 subprocess smoke emits structured JSON (Bootstrap.ReloadedRecords, ReloadSkips, no-backward-mint); maps onto this shape. Operator dashboards / full diag bundle is P15 Diagnostics. |
| 13 | fault-partition.yaml | network partition + recovery | RUNNABLE-P14 (control-plane) | Control-plane partition → stale observation / convergence stuck is covered by Claims 8, 11 at L1. Real netem + data-path partition is BLOCKED-FRONTEND. |
| 14 | fault-netem.yaml | generic network fault injection | BLOCKED-FRONTEND | Needs data path under load. P15. |
| 15 | fault-disk-full.yaml | ENOSPC on primary | BLOCKED-FRONTEND | Needs write path. P15. |
| 16 | consistency-epoch.yaml | epoch monotonicity across failover | RUNNABLE-P14 | Pure control-plane claim, already covered at L0 by TestDurableAuthority_PublisherAdvancesFromReloaded and at L1 by `TestS7_*` restart tests. L3 version would add cross-process multi-primary validation. |
| 17 | consistency-lease.yaml | lease-based guard | BLOCKED-OPS | Lease semantics are V2-specific; V3 uses publisher-owned Epoch. Would need a reshaped V3 scenario. Defer. |
| 18 | lease-expiry-write-gate.yaml | write blocked on lease expiry | BLOCKED-FRONTEND | Write path + lease. P15. |
| 19 | lease-renewal-under-io.yaml | lease renewal during I/O | BLOCKED-FRONTEND | Same. |
| 20 | cp11b3-auto-failover.yaml | automatic failover trigger | RUNNABLE-P14 (control-plane) | Control-plane covered at L1 by TestTopologyControllerToPublisher_E2E_MultiVolumePlacementAndFailover. |
| 21 | cp11b3-manual-promote.yaml | operator-initiated promote | BLOCKED-OPS | Requires operator CLI + admin API. P15 Ops. Also, manual promote is V2 semantics; V3 does not have a direct equivalent by design (S6-S8 V2 port plan §2 rejects V2 promote ownership). |
| 22 | cp11b3-fast-reconnect.yaml | reconnect skips unnecessary failover | RUNNABLE-P14 (control-plane) | S6 normal-lag handling covers this: TestConvergence_NormalLag_OldObservationDoesNotSupersede. Full scenario requires real reconnect transport, so it is BLOCKED-FRONTEND for the transport layer and RUNNABLE-P14 for the control-plane. |

Public scenarios summary:

  • RUNNABLE-P14: 6 scenarios (ha-restart-recovery, crash-recovery control plane, diag-restart-recovery, fault-partition control plane, consistency-epoch, cp11b3-auto-failover control plane), all covered by Claims 3/10/11/12/14 at L0/L1/L2.
  • BLOCKED-FRONTEND: 13 scenarios
  • BLOCKED-OPS: 2 scenarios (consistency-lease, cp11b3-manual-promote)
  • Mixed (split required at port time): ha-failover (tagged BLOCKED-FRONTEND in the table and included in the 13 above) and cp11b3-fast-reconnect (tagged RUNNABLE-P14 (control-plane) in the table but not included in the 6 above)
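
As a cross-check on these counts, the Class column of the public table can be tallied mechanically. The sketch below is illustrative only: `row`, `publicRows`, and `tally` are hypothetical names, multi-file rows are abbreviated to their first file, and qualifiers such as "(control-plane)" are dropped. Tallying by the Class column alone yields 7 RUNNABLE-P14 rows rather than 6, because cp11b3-fast-reconnect carries that tag in the table while the summary lists it under Mixed.

```go
package main

import "fmt"

// row is one line of the public scenarios table, reduced to name + Class column.
type row struct {
	scenario string
	class    string
}

// publicRows mirrors rows 1-22 of the public table (qualifiers dropped).
var publicRows = []row{
	{"smoke-block-api", "BLOCKED-FRONTEND"},
	{"smoke-iscsi", "BLOCKED-FRONTEND"},
	{"smoke-kv", "BLOCKED-FRONTEND"},
	{"e2e-block", "BLOCKED-FRONTEND"},
	{"e2e-kv", "BLOCKED-FRONTEND"},
	{"ha-restart-recovery", "RUNNABLE-P14"},
	{"ha-failover", "BLOCKED-FRONTEND"},
	{"ha-full-lifecycle", "BLOCKED-FRONTEND"},
	{"ha-io-continuity", "BLOCKED-FRONTEND"},
	{"ha-rebuild", "BLOCKED-FRONTEND"},
	{"crash-recovery", "RUNNABLE-P14"},
	{"diag-restart-recovery", "RUNNABLE-P14"},
	{"fault-partition", "RUNNABLE-P14"},
	{"fault-netem", "BLOCKED-FRONTEND"},
	{"fault-disk-full", "BLOCKED-FRONTEND"},
	{"consistency-epoch", "RUNNABLE-P14"},
	{"consistency-lease", "BLOCKED-OPS"},
	{"lease-expiry-write-gate", "BLOCKED-FRONTEND"},
	{"lease-renewal-under-io", "BLOCKED-FRONTEND"},
	{"cp11b3-auto-failover", "RUNNABLE-P14"},
	{"cp11b3-manual-promote", "BLOCKED-OPS"},
	{"cp11b3-fast-reconnect", "RUNNABLE-P14"},
}

// tally counts rows per class label.
func tally(rows []row) map[string]int {
	counts := make(map[string]int)
	for _, r := range rows {
		counts[r.class]++
	}
	return counts
}

func main() {
	counts := tally(publicRows)
	for _, c := range []string{"RUNNABLE-P14", "BLOCKED-FRONTEND", "BLOCKED-OPS"} {
		fmt.Printf("%s: %d\n", c, counts[c])
	}
}
```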

4. Internal Scenarios Table (selected, by category)

Path: weed/storage/blockvol/testrunner/scenarios/internal/. 50+ files; classifying by category rather than per-file.

| Category | Example files | Class | Notes |
|---|---|---|---|
| Recovery baselines | recovery-baseline-restart.yaml / recovery-baseline-failover.yaml / recovery-baseline-partition.yaml / recovery-baseline-rebuild.yaml | RUNNABLE-P14 (control-plane) | Restart/failover/partition portions map onto S4-S7; rebuild portions need V3 rebuild surface (not in S8). |
| Coordination dev-cycle | coord-dev-cycle.yaml / coord-ha-failover.yaml / coord-smoke-iscsi.yaml | BLOCKED-FRONTEND | iSCSI / end-to-end workflows. |
| CP103 performance matrix | `cp103-*.yaml` | BLOCKED-PERF | Performance, not correctness. Defer to release hardening. |
| CP85 chaos | cp85-chaos-partition.yaml / cp85-chaos-primary-kill-loop.yaml / cp85-chaos-replica-kill-loop.yaml / cp85-role-flap.yaml / cp85-session-storm.yaml | Mixed | Control-plane portions RUNNABLE-P14 as property tests (kill-loop restart, role-flap convergence). I/O portions BLOCKED-FRONTEND. |
| CP85 metrics / observability | cp85-metrics-verify.yaml | BLOCKED-OPS | Operator metrics pipeline. P15 Diagnostics. |
| CP85 soak | cp85-soak-24h.yaml / cp84-soak-4h.yaml | BLOCKED-PERF | Long-running stability. Release hardening. |
| CP85 database / filesystem | cp85-db-ext4-fsck.yaml / cp85-db-sqlite-crash.yaml / cp85-expand-failover.yaml | BLOCKED-FRONTEND | Real filesystem + DB workload. |
| Snapshot | cp11a4-snapshot-export-import.yaml / cp83-snapshot-expand.yaml / cp85-snapshot-stress.yaml | BLOCKED-FRONTEND | Snapshot API is P15 Frontend / Ops. |
| EC / Erasure | `ec3-*.yaml` / `ec5-*.yaml` | BLOCKED-FRONTEND | Data path erasure. |
| HA extensions | ha-failover-during-rebuild.yaml / ha-multi-client-failover.yaml | BLOCKED-FRONTEND | Multi-client data path. |
| Benchmark | `benchmark-*.yaml` / bench-validated.yaml / baseline-full-roce.yaml / fsync-only-test.yaml | BLOCKED-PERF | Performance. |
| DM / stripe | dm-stripe-two-server.yaml | BLOCKED-FRONTEND | Device mapper. |
| Operator lifecycle | op-upgrade-rollback.yaml / op-csi-lifecycle.yaml / op-failure-injection.yaml | BLOCKED-OPS | P15 Ops + Migration. |
| Real-workload validation | cp13-8-real-workload-validation.yaml | BLOCKED-FRONTEND + BLOCKED-PERF | Full stack. |

Internal scenarios summary:

  • Directly useful now as P14-internal control-plane L3 shape: ~8 scenarios (recovery-baseline × 3, cp85 chaos × 3, consistency-epoch at public, diag-restart-recovery at public). All already covered at L0/L1/L2 by the evidence matrix; L3 runnable is P15 Cluster Validation.
  • BLOCKED-FRONTEND majority: ~30 scenarios tied to I/O / iSCSI / NVMe / snapshot / DB workloads.
  • BLOCKED-OPS: ~8 scenarios tied to operator workflows.
  • BLOCKED-PERF: ~10 scenarios tied to perf / soak.
  • BLOCKED-HA: none of the current set explicitly requires multi-master, but cp85-role-flap and any future "multi-master" test would be BLOCKED-HA.
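
Because the internal set is classified by category rather than per file, a filename-pattern lookup is one way to apply the table to a directory listing. The following is a hedged sketch: `rule`, `internalRules`, and `classify` are hypothetical, only patterns and exact names that appear in the table are included, and files that match nothing are reported as unclassified rather than guessed.

```go
package main

import (
	"fmt"
	"path"
)

// rule maps a filename pattern (path.Match syntax) to a class label.
type rule struct {
	pattern string
	class   string
}

// internalRules covers only globs and exact names quoted in the internal table.
var internalRules = []rule{
	{"cp103-*.yaml", "BLOCKED-PERF"},
	{"benchmark-*.yaml", "BLOCKED-PERF"},
	{"ec3-*.yaml", "BLOCKED-FRONTEND"},
	{"ec5-*.yaml", "BLOCKED-FRONTEND"},
	{"cp85-metrics-verify.yaml", "BLOCKED-OPS"},
	{"dm-stripe-two-server.yaml", "BLOCKED-FRONTEND"},
}

// classify returns the class of the first matching rule.
// The second result is false when no rule matches.
func classify(file string) (string, bool) {
	for _, r := range internalRules {
		if ok, err := path.Match(r.pattern, file); err == nil && ok {
			return r.class, true
		}
	}
	return "", false
}

func main() {
	// "cp103-example.yaml" is a made-up filename used only to exercise the glob.
	for _, f := range []string{"cp103-example.yaml", "dm-stripe-two-server.yaml", "not-in-table.yaml"} {
		if class, ok := classify(f); ok {
			fmt.Printf("%s -> %s\n", f, class)
		} else {
			fmt.Printf("%s -> unclassified\n", f)
		}
	}
}
```

First-match-wins keeps the lookup predictable; categories tagged Mixed in the table are deliberately left out because they need a per-scenario split rather than a single label.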

5. Port Shape (V3)

When V3 eventually ships a testrunner integration (P15 Cluster Validation), the port order should be:

  1. RUNNABLE-P14 scenarios first — port scenario YAML shape (not V2 actions) against V3 sparrow binary + testrunner-wrapped smoke flags. Keep the structured-JSON stdout shape from S7 smoke; wrap it in testrunner action vocabulary.
  2. Mixed scenarios second — split each into (control-plane sub-scenario, data-path sub-scenario). Port the control-plane half first.
  3. BLOCKED-FRONTEND last — arrives when P15 Frontend ships iSCSI / NVMe / CSI.
  4. BLOCKED-OPS as needed — arrives with P15 Ops admin surface.
  5. BLOCKED-PERF and BLOCKED-HA — release-hardening and explicit non-goals respectively.

6. Closure

Per S8 assignment §3.C, this table is sufficient for S8. Actual scenario PORTING is P15 Cluster Validation work and is not an S8 deliverable. S8 produces the classification + the residual map; P15 executes.

Evidence that the RUNNABLE-P14 scenarios' underlying claims are proven today is in v3-phase-14-s8-evidence-matrix.md §3 — each RUNNABLE-P14 scenario above has at least one PROVEN row backing it at L0/L1/L2.