Slate v2 Benchmark Candidate Map

docs/slate-issues/benchmark-candidate-map.md

Purpose

This file is the benchmark handoff layer.

It exists so a maintainer can choose a performance issue and start a reproducible workload lane without rereading the full GitHub thread.

Live-state note:

  • research snapshot: 682 open issues
  • post-Batch-A live repo state: 628 open issues
  • Batch A queue execution status: 54/54 closed

Issue #6038

  • package: slate
  • benchmark readiness: ready-now
  • public benchmark seam: packages/slate/test/perf/set-nodes-bench.js

Why This Is Benchmark-Ready

This issue is already framed as a benchmark-driven engine problem. The workload class, success criteria, and active implementation seam are all explicit.

Minimal Workload Shape

  • repeated tree updates over a large editor state
  • exact-path set_node batches
  • mixed structural batches where planner and draft behavior matter
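The exact-path batch shape above can be sketched as a fixture generator. This is a minimal sketch, not the contents of set-nodes-bench.js: the helper names, the flat paragraph document, and the `align` property are assumptions for illustration. The operation object follows the usual `set_node` layout (`path`, `properties`, `newProperties`).

```typescript
// Plain-object stand-ins for editor nodes and set_node operations.
// The `align` property is hypothetical; any settable node property would do.
type TextLeaf = { text: string };
type BlockNode = { type: string; children: TextLeaf[]; align?: string };
type SetNodeOp = {
  type: 'set_node';
  path: number[];
  properties: Record<string, unknown>;
  newProperties: Record<string, unknown>;
};

// Build a flat document of `blockCount` paragraphs.
function makeFlatDocument(blockCount: number): BlockNode[] {
  const blocks: BlockNode[] = [];
  for (let i = 0; i < blockCount; i++) {
    blocks.push({ type: 'paragraph', children: [{ text: `block ${i}` }] });
  }
  return blocks;
}

// Build one exact-path set_node operation for every `stride`-th top-level block.
function makeSetNodeBatch(blockCount: number, stride: number): SetNodeOp[] {
  if (stride < 1) throw new Error('stride must be at least 1');
  const ops: SetNodeOp[] = [];
  for (let i = 0; i < blockCount; i += stride) {
    ops.push({
      type: 'set_node',
      path: [i],
      properties: {},
      newProperties: { align: 'center' },
    });
  }
  return ops;
}
```

Keeping the document generator and the batch generator separate makes it easier to split lanes by operation family later, since only the batch generator changes.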

Primary Metric

  • end-to-end batch runtime against the replay baseline

Secondary Metric

  • semantic parity between Transforms.applyBatch(...) and manual Editor.withBatch(...)
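A parity check can stay independent of either API by taking both paths as callbacks. The sketch below assumes the harness can wrap Transforms.applyBatch(...) and the manual Editor.withBatch(...) path as functions from a document to a document. Both wrappers and the helper names are hypothetical. Nothing here calls the real APIs.

```typescript
// A function that applies some batch to a document and returns the result.
type ApplyFn<T> = (initial: T) => T;

// Deep-copy through JSON so neither path can mutate the other's input.
function cloneValue<T>(value: T): T {
  return JSON.parse(JSON.stringify(value)) as T;
}

// Run both paths on independent copies and compare the serialized results.
// JSON comparison is key-order sensitive, which is acceptable for a sketch
// where both paths build nodes the same way.
function haveSemanticParity<T>(
  initial: T,
  applyViaBatchApi: ApplyFn<T>,
  applyViaManualBatch: ApplyFn<T>,
): boolean {
  const fromBatchApi = applyViaBatchApi(cloneValue(initial));
  const fromManualBatch = applyViaManualBatch(cloneValue(initial));
  return JSON.stringify(fromBatchApi) === JSON.stringify(fromManualBatch);
}
```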

Notes

Do not turn this into one giant benchmark lane. Keep it split by operation family and mixed-batch shape.


Issue #5992

  • package: slate
  • benchmark readiness: ready-with-minor-setup
  • public benchmark seam: huge-document cut benchmark

Current Proof

The workload now has a stable harness lane in .tmp/slate-v2/scripts/benchmarks/core/current/clipboard-large-payload.mjs. The issue-size gate is:

```bash
SLATE_CLIPBOARD_BENCH_HUGE_CUT_BLOCKS=50000 SLATE_CLIPBOARD_BENCH_ISSUE_TARGETS=1 bun ./scripts/benchmarks/slate/5945-large-plaintext-paste.mjs
```

The current proof supports "Improves #5992", not "Fixes #5992". An exact whole-child range delete lowers to one replace_children operation and meets the accepted warm interaction benchmark target, and browser stress covers a 5,000-block huge-document cut row. Exact closure still needs maintainer acceptance that the benchmark plus browser stress matches the original repro.

Minimal Workload Shape

  • large document, likely tens of thousands of blocks
  • selection spanning a small range inside that large document
  • cut operation through the public editor surface

Primary Metric

  • cut latency as document size increases
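Latency as a function of document size can be captured with a small size sweep. This is a sketch under stated assumptions, not the existing harness: `setup` and `run` are hypothetical callbacks (build the document, then perform the cut), and the clock is injectable so a real lane can pass a higher-resolution timer than the `Date.now` default.

```typescript
type Clock = () => number;

// Median of a non-empty list of samples.
function median(samples: number[]): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// For each size, rebuild the state before every run (a cut mutates the
// document) and time only the `run` step, not the setup.
function sweepBySize<S>(
  sizes: number[],
  runsPerSize: number,
  setup: (size: number) => S,
  run: (state: S) => void,
  now: Clock = () => Date.now(),
): { size: number; medianMs: number }[] {
  return sizes.map((size) => {
    const samples: number[] = [];
    for (let i = 0; i < runsPerSize; i++) {
      const state = setup(size);
      const start = now();
      run(state);
      samples.push(now() - start);
    }
    return { size, medianMs: median(samples) };
  });
}
```

Reporting the median per size keeps one slow outlier from dominating a row, and excluding setup from the timed span keeps fixture generation cost out of the cut number.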

Secondary Metric

  • any visible selection or normalization amplification during the cut path

Current Blocker

The harness exists, the warm interaction target is green, and a browser huge-document cut row exists. Exact closure now depends on whether maintainers accept the cold snapshot allocation row as startup/editor-preparation cost and the 5,000-block browser stress row as sufficient user-path coverage.

Best Next Step

Build one narrow benchmark lane first:

  • fixed large-document generator
  • fixed two-node selection
  • cut through the same public transform path users hit

Issue #5945

  • package: slate
  • benchmark readiness: ready-now
  • public benchmark seam: large plaintext paste benchmark

Why This Is Benchmark-Ready

The issue already has a reproducible workload and contributor profiling that points at the expensive seams.

Minimal Workload Shape

  • generate large plain text with many newline splits
  • paste into the plaintext example or equivalent public insert-data path
  • measure the full ingest cost
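The first step, a large plain-text payload with many newline splits, is easy to make deterministic. A minimal sketch, with a hypothetical helper name and made-up word content. Feeding the string into the plaintext example or the insert-data path is left to the harness.

```typescript
// Build `lineCount` identical lines of `wordsPerLine` words, joined by "\n".
// Every newline becomes one split point during paste.
function makePlainTextPayload(lineCount: number, wordsPerLine: number): string {
  const words: string[] = [];
  for (let i = 0; i < wordsPerLine; i++) {
    words.push(`word${i}`);
  }
  const line = words.join(' ');
  const lines: string[] = [];
  for (let i = 0; i < lineCount; i++) {
    lines.push(line);
  }
  return lines.join('\n');
}
```

A fixed generator like this keeps payload size and split count comparable across runs, which matters more here than realistic text.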

Primary Metric

  • paste latency for large plain-text payloads

Secondary Metric

  • time spent in normalization and editor validation during ingest

Notes

Do not benchmark “paste is slow” as one opaque blob. Break out the normalization-heavy path and the editor-validation path if the harness can expose both.


Issue #4483

  • package: slate-react
  • benchmark readiness: ready-now
  • public benchmark seam: dynamic decorations rerender cost

Why This Is Benchmark-Ready

The issue already has a concrete workload, a performance claim with before/after numbers, and a narrow renderer seam.

Minimal Workload Shape

  • moderately-sized document with dynamic decorations driven by external state
  • edits confined to one logical region that force redecorating elsewhere
  • compare global decorate churn versus local per-node decoration rendering

Primary Metric

  • end-to-end edit latency when decoration inputs change

Secondary Metric

  • rerender breadth across unrelated elements

Notes

This is not a generic “decorations are slow” complaint. It is specifically about the global invalidation model in slate-react.


Issue #3656

  • package: slate-react
  • benchmark readiness: ready-with-minor-setup
  • public benchmark seam: leaf rerender breadth inside one block element

Why This Is Benchmark-Ready

The issue has a concrete workload and the thread already frames the core complaint as rerender breadth, not vague slowness.

Minimal Workload Shape

  • one block containing many leaves with distinct marks or properties
  • edit one leaf repeatedly through the public editor surface
  • measure rerender spread across sibling leaves in the same block
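A many-leaf block fixture could look like the sketch below. The helper name and the three marks are assumptions. Marks rotate so that no two adjacent leaves share identical properties, since adjacent text nodes with matching properties are normally merged during normalization and the leaf count would collapse.

```typescript
type MarkedLeaf = { text: string; bold?: boolean; italic?: boolean; code?: boolean };
type LeafBlock = { type: 'paragraph'; children: MarkedLeaf[] };

// Build one paragraph with `leafCount` leaves whose marks rotate
// bold -> italic -> code, so neighbours always differ.
function makeManyLeafBlock(leafCount: number): LeafBlock {
  const children: MarkedLeaf[] = [];
  for (let i = 0; i < leafCount; i++) {
    const leaf: MarkedLeaf = { text: `leaf${i} ` };
    if (i % 3 === 0) leaf.bold = true;
    if (i % 3 === 1) leaf.italic = true;
    if (i % 3 === 2) leaf.code = true;
    children.push(leaf);
  }
  return { type: 'paragraph', children };
}
```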

Primary Metric

  • rerender count or render breadth per keystroke inside the block

Secondary Metric

  • end-to-end typing latency as leaf count grows within one block

Current Blocker

The harness still needs one lane that isolates leaf breadth inside a single block instead of whole-editor rerender pressure.

Best Next Step

Add one narrow slate-react perf lane for many-leaf blocks and a single edited leaf.



Issue #3430

  • package: slate-react
  • benchmark readiness: ready-now
  • public benchmark seam: single-paragraph many-inline typing benchmark

Why This Is Benchmark-Ready

The issue already frames a concrete workload: one paragraph, lots of inline nodes, then normal typing, paste, and backspace falling off a cliff.

Minimal Workload Shape

  • one paragraph containing many inline nodes
  • repeated typing or backspace inside that paragraph
  • measure rerender spread and end-to-end latency as inline count grows
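The single-paragraph, many-inline document can be generated as follows. This is a sketch: the `link` element type, its `url` property, and the helper name are assumptions, and the editor under test would still need to treat `link` as inline. Each inline is surrounded by text nodes so it is never the first or last child of the paragraph.

```typescript
type PlainLeaf = { text: string };
type InlineNode = { type: 'link'; url: string; children: PlainLeaf[] };
type InlineParagraph = { type: 'paragraph'; children: (PlainLeaf | InlineNode)[] };

// Build one paragraph: a leading text node, then `inlineCount` pairs of
// (inline link, trailing text node).
function makeInlineHeavyParagraph(inlineCount: number): InlineParagraph {
  const children: (PlainLeaf | InlineNode)[] = [{ text: 'start ' }];
  for (let i = 0; i < inlineCount; i++) {
    children.push({
      type: 'link',
      url: `https://example.com/${i}`,
      children: [{ text: `link ${i}` }],
    });
    children.push({ text: ' ' });
  }
  return { type: 'paragraph', children };
}
```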

Primary Metric

  • typing latency per keystroke inside the heavily inline paragraph

Secondary Metric

  • rerender breadth across leaves in the same paragraph

Current Blocker

The harness still needs a narrow lane for one inline-heavy paragraph instead of whole-editor rerender pressure.

Best Next Step

Add one slate-react perf lane for a single paragraph with many inlines that edits one inline repeatedly.

Pilot Note

The pilot already proves benchmark extraction is a separate artifact, not a footnote under TDD. Performance issues want workload capture, metrics, and harness seams, not fake red-test prose.

The 25-issue expansion did not add new benchmark-worthy issues beyond the existing large-document and batch-engine lanes. That is useful signal too. Most of the new batch was correctness, input-method, or ecosystem triage, not hidden perf work.

The 76-issue mark still says the same thing. Even the stronger runtime-design issues in the new batch, like #5697, are architecture and correctness pressure first, not benchmark-first workload reports.

The 101-issue mark still did not add a genuine new benchmark issue. That is good discipline: not every painful bug deserves a perf harness, and this batch was overwhelmingly runtime semantics, browser integration, and API shape.

The 251-issue mark still did not produce a cleaner new benchmark target than the existing large-document, selection, and batch-engine lanes. This batch was mostly DOM ownership, mobile input, plugin seam, and API-shape pressure rather than hidden perf work.

The 301-issue mark still does not surface a cleaner new benchmark target than the existing large-document, selection, and batch-engine lanes. This 50-issue tranche was mostly DOM bridge, mobile input, typing, docs/example debt, and plugin/runtime seam pressure rather than fresh hidden perf work.

The 351-issue mark still does not produce a cleaner benchmark target than the existing large-document, selection, and batch-engine lanes. This tranche was dominated by runtime-boundary, clipboard-strategy, mobile input, and docs/example pressure rather than fresh performance reports.

The 401-issue mark finally adds one clean new benchmark lane: dynamic decorations in slate-react from #4483. Most of the rest of that tranche still leaned runtime-boundary, mobile input, Shadow DOM, and docs/example pressure rather than hidden perf work.

The 451-issue mark adds one more real renderer-performance lane: rerender breadth in #4210, with #4141 as the depth-sensitive variant of the same problem. The rest of this tranche is still dominated by selection, composition, plugin-surface, and example/process noise.

The 501-issue mark adds one older but still legitimate large-document clipboard lane from #4056. Most of this tranche still reinforced IME, focus, iframe, readonly, and docs/process pressure rather than surfacing a pile of new perf work.

The 551-issue mark adds one real history-memory benchmark lane from #3752. Most of this tranche still leaned history semantics, cross-window runtime ownership, IME, and docs/example pressure rather than a flood of fresh perf reports.

The 601-issue mark adds one older but still useful slate-react renderer lane from #3656: leaf rerender breadth inside a single block. Most of the rest of this tranche still reinforced focus ownership, Android/input debt, history semantics, and structural delete failures rather than surfacing a pile of new perf work.


Issue #4056

  • package: cross-package
  • benchmark readiness: ready-with-minor-setup
  • public benchmark seam: large text paste/copy into a populated editor

Why This Is Benchmark-Ready

The workload is concrete and user-visible, and the thread keeps pointing back to large-document ingest cost rather than one weird payload.

Minimal Workload Shape

  • large plain-text or rich-text payload
  • paste or cut/copy path through the public editor surface
  • editor state large enough for normalization and rerender cost to matter

Primary Metric

  • end-to-end paste/cut latency as document size grows

Secondary Metric

  • normalization and rerender amplification during ingest

Current Blocker

The package benchmark lane now covers populated-editor copy and middle paste at 10,000-block scale. Exact browser closure still needs the historical full-book/user-path repro.

Best Next Step

Add the browser/user-path reproduction before upgrading the claim beyond Improves.


Issue #3752

  • package: slate-history
  • benchmark readiness: ready-now
  • public benchmark seam: history memory retention under edit churn

Why This Is Benchmark-Ready

The issue has a concrete reproduction path, a measurable memory symptom, and a strong hint about where retained references are coming from.

Minimal Workload Shape

  • rich text editor with slate-history enabled
  • repeated edit churn followed by undo stack growth
  • memory inspection focused on detached DOM nodes or retained editor-linked objects

Primary Metric

  • retained memory or detached node count after repeated edit churn

Secondary Metric

  • whether cloning or stripping retained operation payloads changes retention significantly
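The two variants in the secondary metric, cloning versus stripping, can be prepared as plain transforms over the retained operations. The sketch below assumes a simplified operation shape with an optional `node` payload, and both helper names are hypothetical. The stripped variant is for measurement only, because an operation without its payload can no longer be undone.

```typescript
type RetainedOp = {
  type: string;
  path: number[];
  node?: { type: string; children: { text: string }[] };
};

// Variant 1: deep-copy each operation so it shares no references
// with the live editor tree.
function cloneOperations(ops: RetainedOp[]): RetainedOp[] {
  return ops.map((op) => JSON.parse(JSON.stringify(op)) as RetainedOp);
}

// Variant 2: drop the node payload entirely and keep only type and path.
function stripNodePayloads(ops: RetainedOp[]): RetainedOp[] {
  return ops.map(({ type, path }) => ({ type, path: [...path] }));
}
```

Comparing retained memory across the untouched, cloned, and stripped stacks should show whether retention comes from shared references or from payload size itself.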

Notes

This is a memory-retention lane, not a latency lane. Treat it like a leak benchmark, not a typing benchmark.


Issue #5216

  • package: slate-dom and slate-react
  • benchmark readiness: ready-with-minor-setup
  • public benchmark seam: Safari long-paragraph backward selection

Why This Is Benchmark-Ready

The workload is tight: one browser, one document shape, one user-visible lag path.

Minimal Workload Shape

  • long paragraph in Safari, well over 300 words
  • backward text selection where focus trails anchor
  • repeated drag selection over the same paragraph
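The document shape and the backward selection can be pinned down as data before any Safari automation exists. A sketch with hypothetical helper names: the range uses the usual anchor/focus point layout, and "backward" here means the focus offset is smaller than the anchor offset inside the same text node.

```typescript
type DocPoint = { path: number[]; offset: number };
type DocRange = { anchor: DocPoint; focus: DocPoint };
type LongParagraph = { type: 'paragraph'; children: { text: string }[] };

// Build one paragraph holding `wordCount` space-separated words.
function makeLongParagraph(wordCount: number): LongParagraph {
  const words: string[] = [];
  for (let i = 0; i < wordCount; i++) {
    words.push(`word${i}`);
  }
  return { type: 'paragraph', children: [{ text: words.join(' ') }] };
}

// Anchor at the end of the text, focus `span` characters earlier,
// so the focus trails the anchor.
function makeBackwardRange(textLength: number, span: number): DocRange {
  return {
    anchor: { path: [0, 0], offset: textLength },
    focus: { path: [0, 0], offset: Math.max(0, textLength - span) },
  };
}

// True when both points sit in the same text node and focus precedes anchor.
function isBackwardInSameText(range: DocRange): boolean {
  const samePath = range.anchor.path.join(',') === range.focus.path.join(',');
  return samePath && range.focus.offset < range.anchor.offset;
}
```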

Primary Metric

  • selection latency and visible lag under backward selection

Secondary Metric

  • whether the lag scales with paragraph length

Current Blocker

This still needs a stable Safari harness, not just a screen recording.

Best Next Step

Build one browser-scoped lane first instead of pretending this should be cross-browser by default.


Issue #5131

  • package: slate-react
  • benchmark readiness: ready-now
  • public benchmark seam: selection-driven rerender count

Why This Is Benchmark-Ready

This is a clean subscription-granularity question: how much work does useSlate do when only selection changes?

Minimal Workload Shape

  • editor subtree using useSlate
  • rapid selection changes without content edits
  • render count instrumentation around the subscribed subtree
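Render count instrumentation does not need to be framework-specific. The counter below is a sketch with hypothetical names. In a real lane the `record` call could sit in the body of the component that uses useSlate, or in a React Profiler `onRender` callback.

```typescript
// Counts how often each instrumented component id has rendered.
class RenderCounter {
  private counts = new Map<string, number>();

  // Call once per render of the component identified by `id`.
  record(id: string): void {
    this.counts.set(id, (this.counts.get(id) ?? 0) + 1);
  }

  // Total renders across all ids since the last reset.
  total(): number {
    let sum = 0;
    this.counts.forEach((count) => {
      sum += count;
    });
    return sum;
  }

  // Number of distinct ids that rendered at least once since the last reset.
  breadth(): number {
    return this.counts.size;
  }

  // Clear all counts, typically right before each selection change.
  reset(): void {
    this.counts.clear();
  }
}
```

Resetting before each selection change and reading `total()` afterwards gives rerenders per selection change. `breadth()` gives the number of distinct components touched, which is the number the wider rerender-breadth lanes care about.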

Primary Metric

  • rerender count per selection change

Secondary Metric

  • commit time amplification versus a more selective hook shape

Notes

This is not about micro-optimizing hooks in the abstract. It is about whether slate-react subscriptions are too broad.


Issue #4210

  • package: slate-react
  • benchmark readiness: ready-now
  • public benchmark seam: selection/edit rerender breadth benchmark

Why This Is Benchmark-Ready

The issue is already a clean renderer invalidation complaint with a public repro path and an obvious measurement target.

Minimal Workload Shape

  • moderately sized editor tree with many rendered elements
  • tiny edits and pure selection changes through the public editor surface
  • render instrumentation or React profiling around rerender breadth

Primary Metric

  • rerender breadth per edit or selection change

Secondary Metric

  • commit time amplification versus a more selective subscription/runtime shape

Notes

This is the same family as the later nested-block rerender issues, so it should become one reusable renderer benchmark lane, not five nearly identical ones.


Issue #4141

  • package: slate-react
  • benchmark readiness: ready-with-minor-setup
  • public benchmark seam: nested-block rerender breadth benchmark

Why This Is Benchmark-Ready

The pain is concrete, but the benchmark should be framed as a depth-sensitive variant of the broader rerender-breadth lane instead of its own silo.

Minimal Workload Shape

  • deeply nested block tree
  • edit confined to one low-level text node
  • measure rerender breadth up the ancestor chain
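A depth-parameterized fixture keeps this variant inside the shared rerender lane. The sketch below uses hypothetical names and a made-up `container` element type. It returns the tree together with the path to the single low-level text node, so the harness knows which ancestors an edit could touch.

```typescript
type DeepLeaf = { text: string };
type DeepBlock = { type: string; children: (DeepBlock | DeepLeaf)[] };

// Build `depth` element levels (depth >= 1): a paragraph wrapped in
// depth - 1 containers. `leafPath` is relative to the returned root.
function makeNestedTree(depth: number): { root: DeepBlock; leafPath: number[] } {
  let node: DeepBlock = { type: 'paragraph', children: [{ text: 'leaf' }] };
  const leafPath: number[] = [0];
  for (let level = 1; level < depth; level++) {
    node = { type: 'container', children: [node] };
    leafPath.unshift(0);
  }
  return { root: node, leafPath };
}

// Walk `path` down from `root`; throws if the path descends past a text leaf.
function nodeAt(root: DeepBlock, path: number[]): DeepBlock | DeepLeaf {
  let current: DeepBlock | DeepLeaf = root;
  for (const index of path) {
    if (!('children' in current)) {
      throw new Error('path descends past a text leaf');
    }
    current = current.children[index];
  }
  return current;
}
```

With this shape, `leafPath.length` equals the number of element ancestors above the text node, which is the upper bound for the "rerendered ancestors" metric at that depth.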

Primary Metric

  • number of rerendered ancestors and siblings per low-level edit

Secondary Metric

  • end-to-end edit latency as nesting depth increases

Current Blocker

It still needs a stable public harness instead of screenshots from React devtools alone.

Best Next Step

Add one depth-aware variant to the rerender benchmark lane instead of inventing a separate perf harness.

The 651-issue mark adds one real older slate-react perf lane from #3430: single-paragraph many-inline editing where render breadth and typing latency collapse together. The rest of this tranche mostly reinforced focus ownership, placeholder behavior, decoration invalidation, Android input, and extension-surface pressure instead of surfacing a pile of new benchmark work.


Issue #2195

  • package: slate
  • benchmark readiness: ready-now
  • public benchmark seam: large paste dirty-path tracking benchmark

Why This Is Benchmark-Ready

The issue already names the hot path and ties it to a reproducible large-paste workload.

Minimal Workload Shape

  • large plain-text or fragment paste that creates many inserted text nodes
  • normalization path that recalculates dirty entries repeatedly
  • measurement isolated to the public insert path, not synthetic internals

Primary Metric

  • paste latency under heavy dirty-path churn

Secondary Metric

  • time spent recalculating dirty entries during normalization

Notes

This should stay narrow. The useful comparison is dirty-path tracking cost, not “paste is slow” as one opaque blob.


Issue #2051

  • package: slate-react
  • benchmark readiness: ready-now
  • public benchmark seam: simple typing rerender breadth benchmark

Why This Is Benchmark-Ready

The issue directly frames a measurable runtime problem: simple native edits should not force broad rerender work.

Minimal Workload Shape

  • simple character insertion and deletion in a document with many leaves or decorated spans
  • comparison of rerender breadth across unrelated leaves or siblings
  • public typing path, not synthetic direct DOM mutation

Primary Metric

  • end-to-end edit latency for simple typing

Secondary Metric

  • rerender breadth across leaves or custom rendering islands

Notes

This lane is about runtime invalidation breadth. Do not mix it with IME correctness or spellcheck behavior.


Issue #790

  • package: slate-react
  • benchmark readiness: ready-with-minor-setup
  • public benchmark seam: large-document virtualization and initial render benchmark

Why This Is Not Ready Yet

The issue has an obvious workload and user pain, but it still needs a stable harness shape for windowing or deferred render comparisons.

Minimal Workload Shape

  • very large multi-block document with realistic element rendering
  • initial mount plus first interaction or first scroll
  • optional comparison between full render and staged or windowed render strategies

Primary Metric

  • initial render latency for very large documents

Secondary Metric

  • DOM node count and first-edit latency after mount

Current Blocker

The benchmark still needs a stable huge-document fixture and a fair comparison seam. Windowing strategies that break DOM lookup are not useful baselines.