Show runtime adapter comparison

docs/plans/2026-05-28-show-runtime-adapter-comparison.md

Objective: Make the rich-text benchmark dashboard show Lexical and ProseMirror runtime adapter rows as one side-by-side comparison table instead of two isolated single-library sections.

Goal plan: docs/plans/2026-05-28-show-runtime-adapter-comparison.md

Template: docs/plans/templates/task.md

Primary template: docs/plans/templates/task.md

Applied packs:

  • autogoal
  • task

Task source:

  • type: user browser feedback
  • id / link: current in-app browser at http://127.0.0.1:8765/rich-text.html
  • title: Lexical benchmark value is unclear because no comparison is visible
  • acceptance criteria: rich-text-data.json contains a runtime-adapter-compare group with both prosemirror:runtime and lexical:runtime columns for the seven shared runtime fixtures

Completion threshold:

  • Runtime adapter rows from lexical-runtime-adapter and prosemirror-runtime-adapter render under a single runtime-adapter-compare group.
  • The generated data has 7 rows in that group, and the group covers both runtime libraries.
  • npm run docs:perf:check passes.
  • Served rich-text-data.json exposes runtime-adapter-compare.
  • node .agents/rules/autogoal/scripts/check-complete.mjs docs/plans/2026-05-28-show-runtime-adapter-comparison.md passes.

Verification surface:

  • /Users/zbeyens/git/plate-2/benchmarks/editor
  • benchmarks/editor/benchmarks/render-rich-text-viewer.mjs
  • benchmarks/editor/docs/perf/rich-text-data.json
  • npm run docs:perf:check
  • static served data at http://127.0.0.1:8765/rich-text-data.json

Constraints:

  • Keep raw Evidence Kit result rows unchanged.
  • Fix the dashboard grouping layer, not the benchmark producers.
  • Preserve existing Slate v2 internals split.
  • Do not create PRs, comments, commits, or pushes.

Boundaries:

  • Source of truth: generated benchmarks/results/rich-text-editors-latest.json plus dashboard renderer.
  • Allowed edit scope: rich-text viewer renderer, generated perf docs/data, this goal plan.
  • Browser surface: static docs served from http://127.0.0.1:8765.
  • Tracker sync: N/A, no tracker item.
  • Non-goals: adding new benchmark rows, changing measurements, or making Slate rows directly comparable to Lexical/ProseMirror.

Blocked condition:

  • Work would stop only if the generated rows did not share fixture names. They do share names, so the renderer can group them.

Task state:

  • task_type: visualization fix
  • task_complexity: normal
  • current_phase: closeout
  • current_phase_status: complete
  • next_phase: final response
  • goal_status: active until final update_goal call

Current verdict:

  • verdict: accepted
  • confidence: high
  • next owner: user
  • reason: dashboard data now exposes the missing side-by-side runtime table

Completion rule:

  • Do not call update_goal(status: complete) until every completion threshold above is satisfied, final handoff evidence is recorded, and the autogoal check-complete command passes.

Start Gates:

| Gate | Applies | Evidence |
| --- | --- | --- |
| Skill analysis before edits | yes | user feedback mapped to renderer grouping bug |
| Active goal checked or created | yes | active goal created for visible runtime adapter comparison |
| Source of truth read before edits | yes | generated data showed separate lexical-runtime-adapter and prosemirror-runtime-adapter groups |
| Tracker comments and attachments read | no | N/A: no tracker item |
| Video transcript evidence required | no | N/A: no video |
| docs/solutions checked for non-trivial existing-code work | no | N/A: narrow generated dashboard fix |
| TDD decision before behavior change or bug fix | yes | no new test file; generated data assertion and docs check cover this renderer path |
| Branch decision for code-changing task | yes | no branch/PR requested |
| Release artifact decision | yes | N/A: benchmark docs package only |
| Browser tool decision for browser surface | yes | Browser plugin tool not exposed; verified served JSON and generated docs |
| PR expectation decision | yes | no PR requested |
| Tracker sync expectation decided | yes | N/A: no tracker |

Work Checklist:

  • Objective includes outcome, completion threshold, verification surface, constraints, boundaries, and blocked condition.
  • Task source classified with source type, id/link, title, task type, acceptance criteria, caveats, likely files/routes/packages, browser surface, and root-cause layer.
  • Required video or screen-recording evidence is cached/read as normalized <video-transcripts> XML, or marked N/A with reason.
  • Nearby repo instructions and implementation patterns read before edits.
  • Implementation fixes the right ownership boundary, or the narrower choice is recorded with reason.
  • Release artifact requirement recorded: changeset, registry changelog, or N/A with reason.
  • Final handoff shape decided: explain value, fix, proof, and caveat.
  • Branch handling recorded for code-changing work: no branch/PR requested.
  • Local-env-rot retry policy recorded: N/A, no surprising repo-wide failure.
  • Workspace authority recorded: every proof command names the cwd/tool that owns the changed behavior.
  • High-risk note recorded: N/A, generated dashboard grouping only.
  • Review/autoreview target selected from actual diff state: N/A, scoped renderer fix with generated proof.
  • Agent-native review decision recorded: N/A, no agent/tooling surfaces.

Completion Gates:

| Gate | Applies | Required action | Evidence |
| --- | --- | --- | --- |
| Named verification threshold | yes | Prove shared runtime group exists | data assertion recorded below |
| Bug reproduced before fix | yes | Show separate single-library groups | pre-fix data had separate lexical-runtime-adapter and prosemirror-runtime-adapter groups |
| Targeted behavior verification | yes | Run generated data assertion | runtime-adapter-compare has 7 rows and both runtime libraries |
| TypeScript or typed config changed | no | N/A | JS renderer only |
| Package exports or file layout changed | no | N/A | no exports/layout change |
| Package manifests, lockfile, or install graph changed | no | N/A | no package metadata change in this slice |
| Agent rules or skills changed | no | N/A | no agent surfaces |
| Workspace authority proof | yes | Run verification in owning package | commands run from /Users/zbeyens/git/plate-2/benchmarks/editor |
| Browser surface changed | yes | Verify served data or browser proof caveat | served rich-text-data.json returns 200 and includes runtime-adapter-compare; Browser plugin unavailable |
| Browser final proof | yes | Record caveat | static HTML shell fetches JSON, so string exists in data file, not HTML source |
| CI-controlled template output changed | no | N/A | no templates |
| Package behavior or public API changed | no | N/A | no published package/API |
| Registry-only component work changed | no | N/A | no registry component |
| Docs or content changed | yes | Verify generated docs | npm run docs:perf:check passed |
| High-risk mini gate | no | N/A | no public API/runtime behavior change |
| Agent-native review for agent/tooling changes | no | N/A | no agent/tooling change |
| Local install corruption suspected | no | N/A | no env-rot signal |
| Autoreview for non-trivial implementation changes | no | N/A | narrow renderer grouping fix |
| PR create or update | no | N/A | no PR requested |
| PR proof image hosting | no | N/A | no PR |
| Tracker sync-back | no | N/A | no tracker |
| Final handoff contract | yes | Fill final handoff fields | see below |
| Final lint | yes | Run scoped equivalent | pnpm exec biome check ... --fix passed |
| Goal plan complete | yes | Run node .agents/rules/autogoal/scripts/check-complete.mjs docs/plans/2026-05-28-show-runtime-adapter-comparison.md | recorded below |

Phase / pass table:

| Phase | Status | Evidence | Next |
| --- | --- | --- | --- |
| Intake and source read | complete | renderer and generated data inspected | implementation |
| Implementation | complete | runtime adapter categories remapped to one comparison group | verification |
| Verification | complete | docs check, data assertion, served JSON smoke | closeout |
| PR / tracker sync | complete | N/A, no PR/tracker requested | final response |
| Closeout | complete | goal plan completed | final response |

Findings:

  • Lexical rows existed, but the renderer grouped by source category. Because Lexical and ProseMirror used different categories, the dashboard showed two single-library sections instead of a comparison.
  • The shared fixture names already existed, so the right fix was to normalize those adapter categories in the viewer data model.
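
The grouping problem described above can be illustrated with a minimal sketch. The row shape (`category`, `library`, `fixture`), the fixture name `fixture-a`, and the `groupBy` helper are assumptions made for illustration; the real Evidence Kit rows and renderer code may look different.

```javascript
// Hypothetical rows: only the category and library names come from the plan.
const sampleRows = [
  { category: 'prosemirror-runtime-adapter', library: 'prosemirror:runtime', fixture: 'fixture-a' },
  { category: 'lexical-runtime-adapter', library: 'lexical:runtime', fixture: 'fixture-a' },
];

// Collect items into a Map keyed by whatever keyOf returns.
function groupBy(items, keyOf) {
  const groups = new Map();
  for (const item of items) {
    const key = keyOf(item);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(item);
  }
  return groups;
}

// Grouping by source category yields two groups with one library each,
// so no side-by-side table can be built from either group.
const byCategory = groupBy(sampleRows, (row) => row.category);

// The shared fixture name is what makes a comparison possible:
// one group that holds both libraries.
const byFixture = groupBy(sampleRows, (row) => row.fixture);

console.log(byCategory.size, byFixture.size); // prints "2 1"
```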

Decisions and tradeoffs:

  • Chosen: map lexical-runtime-adapter and prosemirror-runtime-adapter to runtime-adapter-compare in render-rich-text-viewer.mjs.
  • Rejected: changing raw benchmark result categories, because Evidence Kit artifact identity should stay target-owned.
  • Deferred: direct Slate vs Lexical/ProseMirror normalized runtime comparison. That needs a Slate adapter or normalized fixtures, not a display-only trick.

Implementation notes:

  • Added runtimeAdapterCategories.
  • formatRow now rewrites those categories only for viewer grouping.
  • Library sort order now ranks prosemirror:runtime and lexical:runtime.

Review fixes:

  • Corrected static smoke expectation: the HTML shell does not contain fetched data strings; the served JSON does.
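
The corrected expectation can be sketched as a small check that reports where a marker string actually appears. The `locateMarker` helper and the two stand-in strings are hypothetical; they are not the real contents of rich-text.html or rich-text-data.json.

```javascript
// Reports whether a marker appears in the HTML shell, the JSON payload, or both.
function locateMarker(marker, { htmlSource, jsonSource }) {
  return {
    inHtml: htmlSource.includes(marker),
    inJson: jsonSource.includes(marker),
  };
}

// Illustrative stand-ins for the served files (not the real contents).
const htmlShell = '<script>fetch("rich-text-data.json")</script>';
const jsonPayload = JSON.stringify({ groups: [{ id: 'runtime-adapter-compare' }] });

const smoke = locateMarker('runtime-adapter-compare', {
  htmlSource: htmlShell,
  jsonSource: jsonPayload,
});

// The shell only references the data file, so the marker is found in the JSON alone.
console.log(smoke); // prints { inHtml: false, inJson: true }
```

Against a live server, the two strings could be read with the global `fetch` available in Node 18 and later, for example `await (await fetch(url)).text()`.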

Error attempts:

| Error / failed attempt | Count | Next different move | Resolution |
| --- | --- | --- | --- |
| Checked rich-text.html source for fetched data string | 1 | Verify generated and served rich-text-data.json | fixed verification claim |

Verification evidence:

  • npm run docs:rich-text from /Users/zbeyens/git/plate-2/benchmarks/editor: success.
  • Data assertion from /Users/zbeyens/git/plate-2: runtime-adapter-compare has 7 rows with prosemirror:runtime and lexical:runtime.
  • npm run docs:perf:check from /Users/zbeyens/git/plate-2/benchmarks/editor: success.
  • pnpm exec biome check benchmarks/editor/benchmarks/render-rich-text-viewer.mjs docs/plans/2026-05-28-show-runtime-adapter-comparison.md --fix from /Users/zbeyens/git/plate-2: success.
  • node --check benchmarks/render-rich-text-viewer.mjs from /Users/zbeyens/git/plate-2/benchmarks/editor: success.
  • Served data smoke from /Users/zbeyens/git/plate-2: http://127.0.0.1:8765/rich-text-data.json returned 200 and includes runtime-adapter-compare.
  • node .agents/rules/autogoal/scripts/check-complete.mjs docs/plans/2026-05-28-show-runtime-adapter-comparison.md from /Users/zbeyens/git/plate-2: success.
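
The data assertion itself is not recorded in this plan. A minimal sketch of such a check might look like the following, assuming a viewer data shape of `{ groups: [{ id, columns, rows }] }`. That shape and the `checkRuntimeCompareGroup` name are hypothetical; only the group id, the two library names, and the row count of 7 come from the completion threshold.

```javascript
// Assumed viewer data shape (hypothetical, not confirmed by the plan):
// { groups: [{ id: string, columns: string[], rows: object[] }] }
const EXPECTED_GROUP = 'runtime-adapter-compare';
const EXPECTED_COLUMNS = ['prosemirror:runtime', 'lexical:runtime'];
const EXPECTED_ROW_COUNT = 7;

// Returns a list of problems; an empty list means the threshold is met.
function checkRuntimeCompareGroup(data) {
  const groups = Array.isArray(data?.groups) ? data.groups : [];
  const group = groups.find((candidate) => candidate.id === EXPECTED_GROUP);
  if (!group) return [`missing group ${EXPECTED_GROUP}`];

  const problems = [];
  const columns = group.columns ?? [];
  for (const column of EXPECTED_COLUMNS) {
    if (!columns.includes(column)) problems.push(`missing column ${column}`);
  }
  const rowCount = (group.rows ?? []).length;
  if (rowCount !== EXPECTED_ROW_COUNT) {
    problems.push(`expected ${EXPECTED_ROW_COUNT} rows, found ${rowCount}`);
  }
  return problems;
}
```

Returning a list of problems instead of throwing on the first one lets a single run report a missing column and a wrong row count together.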

Final handoff contract:

  • PR line: N/A, no PR requested.
  • Issue / tracker line: N/A, no tracker.
  • Confidence line: high.
  • Flow table:
    • Reproduced: generated viewer data had separate single-library runtime adapter groups.
    • Verified: generated viewer data now has one shared runtime adapter group.
  • Browser check: served data is updated; reload rich-text.html to fetch it.
  • Outcome: visible Lexical vs ProseMirror runtime comparison table.
  • Caveat: this is Lexical vs ProseMirror headless runtime comparison, not a direct Slate-vs-Lexical runtime comparison.
  • Design:
    • Chosen boundary: viewer data normalization.
    • Why not quick patch: raw artifact names are still useful; display grouping is the broken layer.
    • Why not broader change: direct Slate normalization is a separate benchmark design problem.
  • Verified: docs check, data assertion, syntax check, served JSON smoke, autogoal check.

Final handoff / sync:

  • PR: N/A.
  • Issue / tracker: N/A.
  • Browser proof: served JSON proof; Browser plugin unavailable in tool search.
  • Caveats: reload the page to refetch rich-text-data.json.

Timeline:

  • 2026-05-28T20:17:35Z Task goal plan created.
  • 2026-05-28T20:18:00Z Runtime adapter categories grouped in viewer data.
  • 2026-05-28T20:19:00Z Docs and served data verified.

Reboot status:

| Question | Answer |
| --- | --- |
| Where am I? | Closeout |
| Where am I going? | Final response |
| What is the goal? | Show Lexical/ProseMirror runtime adapter comparison visibly |
| What have I learned? | The benchmark rows existed; the dashboard grouping hid the comparison |
| What have I done? | Added shared runtime-adapter comparison group and regenerated docs/data |

Open risks:

  • Direct Slate vs Lexical/ProseMirror runtime comparison still needs normalized Slate-side fixtures or a Slate adapter.