docs/plans/2026-05-28-slate-rich-text-benchmark-slice.md
Objective:
Implement the next Slate v2 vs Slate rich-text benchmark slice by renaming the
visible Slate baseline from legacy-slate to slate, adding a generated
headless Slate-v2-vs-Slate transform/navigation comparison artifact, ingesting it
into benchmarks/editor Evidence Kit rows, refreshing rich-text.html, and
verifying the served artifact.
Goal plan: docs/plans/2026-05-28-slate-rich-text-benchmark-slice.md
Template: docs/plans/templates/major-task.md
Primary template: docs/plans/templates/major-task.md
Applied packs:
Major source:
- .tmp/slate-v2 + benchmarks/editor
- rich-text.html, and keeps the old baseline visible as slate.
Major lane:
- .tmp/slate-v2/scripts/benchmarks, .tmp/slate-v2/package.json, benchmarks/editor/src, benchmarks/editor/benchmarks, generated benchmarks/editor/docs/perf
Completion threshold:
- bench:core:rich-text-operations:compare:local command writes .tmp/slate-v2/tmp/slate-rich-text-operations-compare-benchmark.json.
- benchmarks/editor ingests that artifact under slate-core-rich-text-operations-compare.
- slate labels and no legacy-slate labels.
- http://127.0.0.1:8765/rich-text.html returns 200 and serves JSON with the new category.
- node .agents/rules/autogoal/scripts/check-complete.mjs docs/plans/2026-05-28-slate-rich-text-benchmark-slice.md passes.
Verification surface:
- .tmp/slate-v2 benchmark command and artifact.
- benchmarks/editor package check, generated Evidence Kit result, generated ugly HTML table, served JSON smoke.
Constraints:
Boundaries:
- .tmp/slate-v2 benchmark/test corpus and benchmarks/editor Evidence Kit ingestion.
- http://127.0.0.1:8765/rich-text.html.
Blocked condition:
- plate-editor-evidence tmux static server and rerun the curl smoke.
Major state:
Current verdict:
Completion rule:
- update_goal(status: complete) is legal after the completion checker passes.
Start Gates:
| Gate | Applies | Evidence |
|---|---|---|
| major-task loaded | yes | Used for benchmark architecture/execution lane |
| Active goal checked or created | yes | Active goal created for this benchmark slice |
| Source of truth read before analysis | yes | Read .tmp/slate-v2 compare/current benchmark scripts and benchmarks/editor/src/index.mjs |
| Major lane selected | yes | Benchmark implementation |
| Decision criteria stated | yes | Completion threshold above |
| Existing repo patterns / prior decisions checked | yes | Reused repo-compare.mjs, stats.mjs, compare artifact schema, and Evidence Kit normalizers |
| Helper stack selected | yes | Existing compare harness plus Evidence Kit renderer |
| External research decision recorded | yes | N/A: local repos and generated artifacts were enough |
| Implementation expectation recorded | yes | Implementation expected and completed |
| Workspace authority selected | yes | .tmp/slate-v2 owns benchmark command; benchmarks/editor owns ingestion/docs |
| Branch / PR expectation decided | yes | N/A: no PR requested |
| Browser pack selected | yes | Static route smoke selected |
| Browser route / app surface identified | yes | http://127.0.0.1:8765/rich-text.html |
| Browser tool decision recorded | yes | Browser tool not exposed by tool search; used served HTTP/JSON smoke |
| Console/network caveat policy recorded | yes | Static curl/JSON proof only; no console inspection available |
Work Checklist:
Completion Gates:
| Gate | Applies | Required action | Evidence |
|---|---|---|---|
| Named verification threshold | yes | Run benchmark and ingestion/docs checks | Commands listed in Verification evidence |
| Current-state source audit | yes | Map current owner, boundaries, constraints, and affected surfaces | Findings and source reads recorded here |
| Decision criteria closure | yes | Mark each criterion satisfied, narrowed, rejected, or blocked with evidence | Completion threshold satisfied |
| Options / tradeoffs / rejection record | yes | Record viable options, chosen recommendation, and why alternatives lose | Decisions and tradeoffs below |
| Review / pressure pass | yes | Run self-review against noisy benchmark risk | Merge lane removed; labels and generated artifacts checked |
| Review findings closure | yes | Fix accepted findings and record closure proof | Updated fuzz contract, research source, generated docs/scope |
| External-source audit | no | N/A | Local repos and artifacts were sufficient |
| Implementation gates | yes | Close source, generated docs, and package-script gates | npm run check and benchmark command passed |
| Final handoff contract | yes | Record recommendation, evidence, caveats, residual risk, and next owner | Final handoff contract below |
| Final lint | yes | Run scoped formatter/lint equivalents | bunx biome check ... --fix; npx biome check ... --fix; npm run check |
| Goal plan complete | yes | Run completion checker | Recorded after this edit |
| Browser interaction proof | partial | Exercise route or record tool waiver | Browser tool unavailable; HTTP 200 + JSON parse proof recorded |
| Browser console/network check | no | Record why not applicable | No browser console tool exposed; static route network smoke passed |
| Browser final proof artifact | yes | Record route proof or exact caveat | curl HTTP 200 and served JSON category/label proof |
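
The served JSON category/label proof in the last gate can be sketched as a small check. This is a minimal illustration, not the script that was run: the row shape (`category`, `label`), the `summarizeRows` helper, and the `slate-v2` label are assumptions, and the real rich-text-data.json schema may differ.

```javascript
// Hypothetical row shape: { category: string, label: string }.
// Adjust the accessors to the real rich-text-data.json schema.
function summarizeRows(rows, category) {
  // Rows that belong to the category under test.
  const inCategory = rows.filter((row) => row.category === category);
  // Any label, in any category, that still mentions the old baseline name.
  const legacyLabels = rows
    .map((row) => row.label)
    .filter((label) => /legacy-slate/i.test(label));
  return { count: inCategory.length, legacyLabels };
}

// Illustrative sample data only; not taken from the generated artifact.
const sample = [
  { category: "slate-core-rich-text-operations-compare", label: "slate" },
  { category: "slate-core-rich-text-operations-compare", label: "slate-v2" },
  { category: "other", label: "slate:chunk-on" },
];

const summary = summarizeRows(sample, "slate-core-rich-text-operations-compare");
console.log(summary);
```

Against the real data, the same function would be expected to report the category row count and an empty `legacyLabels` list if the rename is complete.
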
Phase / pass table:
| Phase | Status | Evidence | Next |
|---|---|---|---|
| Intake and source read | complete | Read existing compare/current benchmark patterns | current-state map |
| Current-state map | complete | Identified Evidence Kit normalizers and visible label points | implementation |
| Options and recommendation | complete | Chose headless Slate-v2-vs-Slate core slice first | verification |
| Review / pressure pass | complete | Removed unstable merge lane and stale labels | closeout |
| Implementation or plan artifact | complete | Added compare script, package script, ingestion, regenerated docs | verification |
| Verification | complete | Commands below passed | closeout |
| Closeout | complete | This file records final state | final response |
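
For orientation, a compare row with current, legacy, and deltaMeanMs fields could be assembled roughly as follows. This is a sketch under stated assumptions: the `buildCompareRow` helper, the `meanMs` field, the `insertText` workload name, and the sign convention (current minus legacy) are all hypothetical, and the stats field is omitted.

```javascript
// Arithmetic mean of a non-empty list of timings in milliseconds.
function mean(samples) {
  return samples.reduce((sum, value) => sum + value, 0) / samples.length;
}

// Hypothetical row builder.
// Assumption: deltaMeanMs = current mean - legacy mean,
// so a negative value means the current side was faster.
function buildCompareRow(name, currentSamples, legacySamples) {
  const current = { meanMs: mean(currentSamples) };
  const legacy = { meanMs: mean(legacySamples) };
  return { name, current, legacy, deltaMeanMs: current.meanMs - legacy.meanMs };
}

// Illustrative timings only; not measured results.
const row = buildCompareRow("insertText", [2, 4], [3, 5]);
console.log(row.deltaMeanMs); // -1
```
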
Findings:
- legacy-slate leaked through source, fuzz contracts, generated docs, and research source metadata.
Decisions and tradeoffs:
- .tmp/slate-v2 compare harness so the artifact has the same current, legacy, deltaMeanMs, and stats shape as existing Evidence Kit ingestion.
- legacy where they are file/lane history, but rename visible library ids to slate.
- mergeNodes/positions failure inside the compare fixture. Treating that as a performance number would be fake evidence.
Facts:
- .tmp/slate-v2/tmp/slate-rich-text-operations-compare-benchmark.json.
- rich-text-data.json reports 30 rows for slate-core-rich-text-operations-compare: 15 Slate v2 rows and 15 Slate baseline rows.
- rich-text-data.json has slate, slate:baseline, slate:chunk-on, and slate:chunk-off; it has no legacy-slate labels.
Inference:
Recommendation:
- .tmp/slate-v2/playwright/integration/examples before widening to ProseMirror/Lexical/Plate/Tiptap.
Implementation notes:
- .tmp/slate-v2/scripts/benchmarks/core/compare/rich-text-operations.mjs.
- .tmp/slate-v2 package script bench:core:rich-text-operations:compare:local.
- slate-core-rich-text-operations-compare artifact ingestion.
- core-rich-text-operations-compare workload coverage row.
- legacy-slate to slate.
- legacy-slate-package.json to slate-package.json.
Review fixes:
- .tmp/slate-v2 layout by falling back from ../slate to ../../../slate.
- Editor.nodes to NodeApi.nodes to match existing compare-harness API shape.
- slate.
Error attempts:
| Error / failed attempt | Count | Next different move | Resolution |
|---|---|---|---|
| Default ../slate resolved to .tmp/slate | 1 | Add repo-layout fallback | Fixed |
| Rich fixture used nested list at text point [0,0] | 1 | Move list pattern off point anchors | Fixed |
| Repeated merge lane hit v2 positions failure | 3 | Remove merge from this compare slice | Resolved by exclusion and caveat |
| Editor.nodes unavailable in compare import shape | 1 | Use NodeApi.nodes fallback | Fixed |
| Fuzz expected legacy-slate labels | 1 | Update contract to slate | Fixed |
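
The two fallbacks in the table above (repo layout and the nodes API) can be sketched as follows. This is an illustration only, not the code in rich-text-operations.mjs: `resolveFirstExisting` and `pickNodesApi` are hypothetical helpers, the existence-based selection is an assumption about how the fallback decides, and the disk and import shapes are simulated.

```javascript
// Return the first candidate path that the supplied predicate accepts.
// In a real script the predicate might wrap a filesystem existence check.
function resolveFirstExisting(candidates, exists) {
  const found = candidates.find((candidate) => exists(candidate));
  if (found === undefined) {
    throw new Error(`None of the candidate paths exist: ${candidates.join(", ")}`);
  }
  return found;
}

// Pick the first namespace that exposes a callable `nodes`.
function pickNodesApi(namespaces) {
  const owner = namespaces.find((ns) => ns && typeof ns.nodes === "function");
  if (!owner) throw new Error("No namespace exposes nodes()");
  return owner.nodes.bind(owner);
}

// Simulated layout: only the deeper path exists.
const fakeDisk = new Set(["../../../slate"]);
const repoPath = resolveFirstExisting(
  ["../slate", "../../../slate"],
  (candidate) => fakeDisk.has(candidate),
);

// Simulated import shape: Editor lacks nodes, NodeApi provides it.
const Editor = {};
const NodeApi = { nodes: () => ["node"] };
const nodes = pickNodesApi([Editor, NodeApi]);

console.log(repoPath, nodes());
```

Injecting the existence predicate keeps the path logic testable without touching the real filesystem.
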
Verification evidence:
- cwd=.tmp/slate-v2: node --check scripts/benchmarks/core/compare/rich-text-operations.mjs passed.
- cwd=.tmp/slate-v2: bunx biome check package.json scripts/benchmarks/core/compare/rich-text-operations.mjs --fix passed.
- cwd=.tmp/slate-v2: bun run bench:core:rich-text-operations:compare:local passed and wrote the artifact.
- cwd=benchmarks/editor: npx biome check src/index.mjs benchmarks/render-rich-text-viewer.mjs --fix passed.
- cwd=benchmarks/editor: npm run research:editor-frameworks:fetch passed.
- cwd=benchmarks/editor: npm run check passed.
- cwd=plate-2: rg -n "legacy-slate|Legacy Slate" benchmarks/editor returned no matches.
- cwd=plate-2: local data check reported richOps=30 and workload coverage rows 66.
- cwd=plate-2: served JSON check returned richOps=30, no legacyLabels, and Slate labels slate, slate:baseline, slate:chunk-on, slate:chunk-off.
- cwd=plate-2: curl -I --max-time 2 http://127.0.0.1:8765/rich-text.html returned HTTP/1.0 200 OK.
Final handoff contract:
Timeline:
Reboot status:
| Question | Answer |
|---|---|
| Where am I? | Closeout complete |
| Where am I going? | Final response |
| What is the goal? | Add Slate-v2-vs-Slate rich-text operations benchmark slice |
| What have I learned? | Headless v2 vs Slate comparison is useful but merge/browser replay still need separate lanes |
| What have I done? | Implemented benchmark, ingestion, visualization, rename, generated artifacts, verification |
Open risks: