docs/performance/editor-performance-master-plan.md
Keep Plate performance brutally honest over time.
This plan owns the full editor-performance program:
- nodeId
- /dev/editor-perf benchmark harness

It replaces the earlier split perf-plan docs so the work stops scattering across half a dozen files.
Do not merge the separate Slate batching track into this document.
That work stays in:
That is a different lane with a different owner.
Slate is the standing reference floor on equivalent workloads.
Plate does not need to imitate Slate’s architecture, but every extra millisecond above Slate needs a reason. If the cost buys real framework value, budget it tightly. If it does not, kill it.
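A minimal sketch of how the "every extra millisecond needs a reason" rule could be checked per lane. The `LaneResult` shape, the `overheadVsSlate` helper, and the 50 ms budget are hypothetical and are not part of the benchmark harness; the timings are the `49_mount-10k-list-markdown` numbers recorded later in this plan.

```typescript
// Hypothetical helper: not part of the Plate benchmark harness.
interface LaneResult {
  lane: string;
  plateMs: number;
  slateMs: number;
}

interface LaneOverhead {
  lane: string;
  deltaMs: number; // positive means Plate is slower than Slate
  deltaPct: number; // delta relative to the Slate time
  withinBudget: boolean;
}

function overheadVsSlate(result: LaneResult, budgetMs: number): LaneOverhead {
  const deltaMs = result.plateMs - result.slateMs;
  const deltaPct = (deltaMs / result.slateMs) * 100;
  return {
    lane: result.lane,
    deltaMs,
    deltaPct,
    withinBudget: deltaMs <= budgetMs,
  };
}

const listMarkdown = overheadVsSlate(
  { lane: "49_mount-10k-list-markdown", plateMs: 890.4, slateMs: 630.1 },
  50 // assumed budget, for illustration only
);

console.log(listMarkdown.lane, listMarkdown.deltaMs.toFixed(2), listMarkdown.withinBudget);
```

A lane that fails the budget check is not automatically a bug; it is a lane that still owes an explanation.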
Default optimization order:
Do not rip out jotai-x, zustand-x, plugin composition, or the framework
model just to win one screenshot. That is fake progress.
Execution order from here:
Do not run a plugin census on top of unresolved core tax. That just blames the wrong layer.
2026-04-03)

- CodePlugin was the last real core-plugin embarrassment and got a major cut from the hard-affinity redesign:
  - 386.96 ms -> 248.19 ms
  - 334.55 ms -> 264.41 ms
  - 392.68 ms -> 295.71 ms
- basic-nodes plugins split like this:
  - KbdPlugin, SubscriptPlugin, SuperscriptPlugin
  - HighlightPlugin, StrikethroughPlugin
  - HighlightPlugin
  - StrikethroughPlugin

2026-04-04)

The new standalone benchmark lab under benchmarks/editor surfaced an important distinction:

- apps/www
- 10k markdown mount lane

Current local-built standalone result:

- 03_mount-10k: 736.30 ms
- 03_mount-10k: 437.60 ms

That does not invalidate the earlier public harness. It means the public harness was narrower than the richer markdown profile.

The latest standalone decomposition says the gap is concentrated in richer mount surfaces, not generic core mount:

- strikethrough worst

See:
Latest exact mark finding:
- BasicMarksPlugin
- pipeRenderLeaf(...) / pluginRenderLeaf(...)
- pipeRenderLeaf(...) and pipeRenderText(...) by active mark so inactive mark renderers stop running on every leaf/text node
- pipeRenderLeaf(...), which reduced the main bundle lane again:
  - 48_mount-10k-marks-basic: about 1387 ms -> 1310 ms
  - 86_mount-10k-bold-basic: about 673 ms -> 597 ms
  - 90_mount-10k-bold-single: about 439 ms -> 428 ms
- Object.keys(...).flatMap(...).sort(...) churn from the shared mark pipes without making plain leaves pay the full simple-mark loop:
  - 48_mount-10k-marks-basic in the 1245-1289 ms range versus the older 1310 ms baseline
  - 90_mount-10k-bold-single moved from about 428 ms to 400-425 ms
  - 91_mount-10k-italic-single moved from about 427 ms to 388-423 ms
  - 93_mount-10k-strikethrough-single moved from about 482 ms to 440-450 ms

Latest exact list finding:
- ListPlugin is
- list-core: flattened list payload with no ListPlugin
- list-only: flattened list payload with only ListPlugin
- list-markdown: full markdown bundle
- list-only rendered one `<ul>` per item
- list-only renders paragraph elements as [role="listitem"]
- `<ul>` per logical list
- belowNodes wrappers
- 49_mount-10k-list-markdown: Plate 890.40 ms, Slate 630.10 ms
- 97_mount-10k-list-only: Plate 848.70 ms, Slate 671.70 ms
- nodeId initialization, not construction cost.
- zustand-x creation cost is real but small.
- jotai-x had redundant sync work worth trimming, but it was not the whole bottleneck.
- useElement() and usePath() were reading through the per-node atom store even when they only needed nearest-node context.
- nodeId init was catastrophically wrong when it used one Slate transform per missing id.
- /dev/editor-perf were drifting until they shared one config seam.
- ElementProvider / useElement path cuts where context was unnecessary
- getRenderNodeProps(...) plain-node fast path
- nodeId initial-value rewrite moved to a pure value path
- nodeId live normalization moved to batch updates
- data-block-id stopped paying a mounted-store gate
- jotai-x sync hydration stopped doing redundant mount-time work
- useElement() and usePath() now prefer a cheap chained React context from ElementProvider; the atom-store path remains for useElementSelector() and exported store consumers

The remaining provider-backed red zone was the exported element-hook surface, not generic rich rendering in the abstract.
Fresh 5,000 blockquote core-mount numbers before the context-first hook fix:
- 434.65 ms
- 485.81 ms
- 317.58 ms
- 367.46 ms

Take:
The compatibility-preserving fix was to keep ElementProvider and the exported
store surface, but let useElement() and usePath() read a cheap chained React
context first.
Same-batch rerun after that fix:
- 472.92 ms
- 490.41 ms

That shrank the within-batch hook-consumer gap from 51.16 ms to 17.49 ms.
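The gap figures follow directly from the two timing pairs above. A small worked check, assuming the first number in each pair is the reference lane and the second is the hook-consumer lane (the lane labels are not preserved here, so that pairing is an assumption):

```typescript
// Assumption: in each pair the smaller number is the reference lane and the
// larger one is the hook-consumer lane. `gapMs` is an illustrative helper.
const gapMs = (referenceMs: number, hookConsumerMs: number): number =>
  Number((hookConsumerMs - referenceMs).toFixed(2));

const gapBefore = gapMs(434.65, 485.81); // before the context-first hook fix
const gapAfter = gapMs(472.92, 490.41); // same-batch rerun after the fix
const gapReduction = Number((gapBefore - gapAfter).toFixed(2));

console.log({ gapBefore, gapAfter, gapReduction });
```

Only within-batch gaps are compared here; the absolute timings of the two batches differ and should not be read against each other.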
Summary artifacts:
- editor-perf-5000-hook-consumer-context-summary.json and editor-perf-5000-hook-consumer-context-after.json were older compact summaries that were not retained after the raw-artifact move

The next red seam was not useElement() or usePath() anymore. It was the selector/store consumer path.
Two facts came out of that slice:
- ElementProvider was relying on the generic createAtomProvider(...) hydration path for element, entry, and path, even though it already owned those live props
- useElementSelector() was paying an extra derived-atom layer through selectAtom(...) before it ever hit the store

The first fix was correctness-first:

- ElementProvider now seeds its own per-node store immediately and syncs later prop changes in a layout effect
- ElementProvider, elementStore, useElementStore(), and useElementSelector() intact for compatibility
- entry = null on first read in the focused tests

The second fix was the real selector perf cut:
- useElementSelector() now uses useEntryValue(...) directly instead of selectAtom(...) + useStoreAtomValue(...)

Fresh 5,000 blockquote selector numbers:

- 459.72 ms
- 469.84 ms
- useEntryValue(...) rewrite:
  - editable-element-plugin-render-node-selector-5000-blockquote-after-direct-entry.json
  - 449.81 ms
  - 326.85 ms
  - 383.86 ms

Take:

- 459.72 ms -> 449.81 ms
- 122.96 ms over plain context
- 65.95 ms over raw Jotai
- useElement() / usePath() anymore

The next follow-up finally killed the fake extra provider tax:
- ElementProvider no longer wraps every node in the redundant ElementStoreProvider layer
- 384.33 ms
- 449.81 ms -> 384.33 ms
- 384.33 ms vs 383.86 ms

That changed the conclusion:

One more idea got tested and rejected:

- 397.81 ms

Updated selector/store take:

- 326.85 ms
- 383.86 ms
- 384.33 ms

That architecture change landed next:

- ElementProvider no longer creates a Jotai store on the hot path
  - element
  - entry
  - path
- useElementSelector() reads that runtime store directly

The next honest red lane is table selection, not mount.
The new /dev/table-perf runner can now measure:
Real selection numbers on plain unmerged tables showed the problem scales badly:
- 20x20, select 5x5 (25 cells): 55.50 ms
- 40x40, select 10x10 (100 cells): 224.51 ms
- 60x60, select 15x15 (225 cells): 454.39 ms

Two obvious ideas were tested and rejected:

- useSelectedCells() inside useTableSelectionDom.ts

Neither moved the real lane enough to keep.
The kept win is in getTableGridByRange.ts: unmerged tables no longer pay the merge-aware selection-grid path.
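To make the shape of that cut concrete, here is a rough sketch of an unmerged-table fast path. It is not the code in getTableGridByRange.ts: the `Cell`, `Table`, and `CellRange` types and every function name below are hypothetical, and the real implementation works on Slate nodes and paths rather than plain arrays.

```typescript
// Hypothetical shapes for illustration only.
type Cell = { id: string; colSpan?: number; rowSpan?: number };
type Table = Cell[][];

interface CellRange {
  startRow: number;
  startCol: number;
  endRow: number;
  endCol: number;
}

// A table is "unmerged" when no cell spans more than one row or column.
function isUnmerged(table: Table): boolean {
  return table.every((row) =>
    row.every((cell) => (cell.colSpan ?? 1) === 1 && (cell.rowSpan ?? 1) === 1)
  );
}

// Fast path: without merges, array indices already are grid coordinates,
// so the selection is a plain rectangular slice.
function selectUnmerged(table: Table, range: CellRange): Cell[][] {
  const top = Math.min(range.startRow, range.endRow);
  const bottom = Math.max(range.startRow, range.endRow);
  const left = Math.min(range.startCol, range.endCol);
  const right = Math.max(range.startCol, range.endCol);
  return table.slice(top, bottom + 1).map((row) => row.slice(left, right + 1));
}

// Only tables that actually contain merged cells pay for the merge-aware resolver.
function getGridByRange(
  table: Table,
  range: CellRange,
  mergeAware: (table: Table, range: CellRange) => Cell[][]
): Cell[][] {
  return isUnmerged(table) ? selectUnmerged(table, range) : mergeAware(table, range);
}

const table: Table = Array.from({ length: 3 }, (_, r) =>
  Array.from({ length: 3 }, (_, c) => ({ id: `${r}:${c}` }))
);

let mergeAwareCalls = 0;
const grid = getGridByRange(
  table,
  { startRow: 1, startCol: 1, endRow: 0, endCol: 0 },
  (t) => {
    mergeAwareCalls += 1;
    return t;
  }
);
console.log(grid.map((row) => row.map((cell) => cell.id)), mergeAwareCalls);
```

The sketch rescans the table on every call to decide which path to take; a real implementation would presumably want that answer cached or derived more cheaply.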
Kept artifacts:
- table-perf-selection-60x60-15x15-summary.json
  - 454.39 ms
- table-perf-selection-60x60-15x15-current-summary.json
  - 419.67 ms

Take:

- useElementStore() still works, but it now lazily materializes a bridged Jotai store only when someone actually asks for it

Focused artifacts:

- 385.05 ms
- 317.90 ms

Interpretation:

- 384.33 ms -> 385.05 ms
- 368.10 ms
- 317.90 ms
- -50.20 ms

Take:
- useElement, usePath, useElementSelector, useElementStore, and elementStore

There is now a dedicated store-alternatives benchmark in:
It compares:
- jotai-x provider
- jotai-x store API
- jotai-x direct keyed hook

Across:
Headline numbers at 1,000 nodes:
- 1.44 ms
- 8.39 ms
- jotai-x provider: 12.38 ms
- jotai-x store API: 11.93 ms
- jotai-x direct keyed hook: 12.82 ms
- 2.08 ms
- 12.84 ms
- jotai-x provider: 18.94 ms
- jotai-x store API: 22.80 ms
- jotai-x direct keyed hook: 23.66 ms
- 2.18 ms
- 15.52 ms
- jotai-x provider: 20.45 ms
- jotai-x store API: 25.05 ms
- jotai-x direct keyed hook: 23.74 ms
- 4.28 ms
- 32.85 ms
- jotai-x provider: 39.73 ms
- jotai-x store API: 39.25 ms
- jotai-x direct keyed hook: 42.61 ms

Take:

- jotai-x is a real extra tax over raw Jotai in every lane

Two library cuts shipped with that benchmark work:

- createAtomStore(...).useXValue() now bypasses selectAtom(...) entirely when no selector/equalityFn is provided
- selectAtom(...) atom instead of recreating it on every render

Those are worth keeping. The benchmark says the remaining wall is bigger than one dumb helper branch.
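The two cuts share one idea: skip the derived layer when nobody asked for one, and build it once when somebody did. A framework-free sketch of that idea follows. `TinyStore` and its methods are invented for illustration and are not the jotai-x API; `derivedCreated` only exists to make the reuse observable.

```typescript
type Selector<T, S> = (value: T) => S;

// Hypothetical stand-in store; not jotai-x and not Jotai.
class TinyStore<T> {
  // One derived reader per selector function, reused across reads.
  private derived = new Map<Selector<T, unknown>, () => unknown>();
  derivedCreated = 0;

  constructor(private value: T) {}

  set(next: T): void {
    this.value = next;
  }

  read(): T;
  read<S>(selector: Selector<T, S>): S;
  read<S>(selector?: Selector<T, S>): T | S {
    // Cut 1: no selector means no derived layer at all.
    if (!selector) return this.value;

    // Cut 2: build the derived reader once per selector, then reuse it.
    let reader = this.derived.get(selector);
    if (!reader) {
      this.derivedCreated += 1;
      reader = () => selector(this.value);
      this.derived.set(selector, reader);
    }
    return reader() as S;
  }
}

const store = new TinyStore({ type: "blockquote", depth: 2 });
const selectDepth = (value: { type: string; depth: number }) => value.depth;

const raw = store.read(); // bypass path
const first = store.read(selectDepth); // creates the derived reader
store.set({ type: "blockquote", depth: 3 });
const second = store.read(selectDepth); // reuses it

console.log(raw.type, first, second, store.derivedCreated);
```

Reuse depends on the selector having a stable identity. An inline arrow function recreated on every call would defeat the cache, which is the same trap the per-render recreation fell into.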
Implementation note:
- packages/jotai-x/node_modules, which makes the harness lie or crash

Use /docs/examples/huge-document for:

- Plate + Slate, Plate only, and Slate only

Do not use it as the source of truth for benchmark numbers.
Use /dev/editor-perf for:
The Huge Document page exposes Open in benchmark mode, which deep-links into
/dev/editor-perf with the current shared config.
That is the intended workflow:
- /dev/editor-perf

These workloads exist now and should remain the base family:

- huge-mixed-block
- huge-paragraph
- huge-heading
- huge-blockquote
- huge-dense-text
- huge-dense-inline-props
- huge-paragraph-fallback
- 1,000 blocks: smoke and seam work
- 5,000 blocks: primary comparison lane
- 10,000 blocks: stress lane

Chunked is realistic. No chunking is the honesty test.
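For orientation, the workload, size, and chunking axes above can be written down as one lane matrix. This is a hypothetical enumeration, not the harness config: the `Lane` shape is invented, and the plan does not say that every combination is actually run.

```typescript
// Workload names and block counts come from the lists above; the Lane shape
// and the full cross product are assumptions for illustration.
const workloads = [
  "huge-mixed-block",
  "huge-paragraph",
  "huge-heading",
  "huge-blockquote",
  "huge-dense-text",
  "huge-dense-inline-props",
  "huge-paragraph-fallback",
] as const;

const blockCounts = [1000, 5000, 10000] as const;
const chunkingModes = [true, false] as const;

interface Lane {
  workload: (typeof workloads)[number];
  blocks: (typeof blockCounts)[number];
  chunked: boolean;
}

const lanes: Lane[] = workloads.flatMap((workload) =>
  blockCounts.flatMap((blocks) =>
    chunkingModes.map((chunked) => ({ workload, blocks, chunked }))
  )
);

// The "honesty test" subset: the primary comparison size with chunking off.
const honestyLanes = lanes.filter((lane) => lane.blocks === 5000 && !lane.chunked);

console.log(lanes.length, honestyLanes.length);
```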
Layer 0 answers one question continuously:
The Layer 0 baseline family is:
- nodeId
- nodeId seeded

Layer 0 also includes the activated-core family:

- nodeId seeded
- nodeId unseeded init

The dense text-heavy lanes stay in the program, but they are not baseline gate material anymore:

- pnpm --filter ./apps/www perf:editor:layer0-smoke
- pnpm --filter ./apps/www perf:editor:layer0
- pnpm --filter ./apps/www perf:editor:stress-core

Current smoke artifacts:

- editor-perf-layer0-smoke-summary.json (historical compact summary not retained)
- .tmp/editor-perf-layer0-smoke.json (local, gitignored)

Current full-run artifacts:

- editor-perf-layer0-summary.json (historical compact summary not retained)
- .tmp/editor-perf-layer0.json (local, gitignored)

From the retained smoke snapshot notes:
- 1k chunked:
  - 65.85 ms
  - 57.67 ms
  - nodeId: 61.43 ms
  - nodeId seeded: 66.08 ms
  - 59.67 ms
- 1k chunked:
  - 59.05 ms
  - 58.99 ms
- 1k chunked:
  - 59.79 ms
  - 76.21 ms
- 1k chunked:
  - 62.91 ms
  - 63.55 ms
- nodeId init 5k:
  - 8.24 ms
  - 4.85 ms
  - 4.11 ms
- 5k chunked is basically parity:
  - 311.94 ms
  - 312.30 ms
- 5k chunked is green:
  - 347.07 ms
  - 336.26 ms
- 5k chunked is the only small red core-activated lane left:
  - 326.40 ms
  - 331.57 ms
- 5k chunked is green after the fallback-path cut:
  - 347.05 ms
  - 317.38 ms
- 5k no-chunk is green:
  - 419.17 ms
  - 385.24 ms
- nodeId init is no longer the giant cliff it used to be
- 5k: plain context 254.92 ms, raw Jotai 308.17 ms, ElementProvider 351.46 ms
- 5k: plain context 349.40 ms, raw Jotai 372.01 ms, ElementProvider 412.93 ms
- render.as plugin path after all; it was the unknown-element renderElement fallback still forcing useNodePath even though RenderElementProps.path is optional
- 5k blockquote rerun from 425.99 ms to 301.66 ms, and the clean full Layer 0 rerun kept the lane green at 317.38 ms
- pluginRenderElement with precomputed paths:
  - 504.70 ms -> 484.80 ms
  - -19.90 ms (-3.94%)
- getEditorPlugin(...) context on every node for a fixed plugin render path
- jotai-x cut was dumb but real:
  - createAtomProvider(...) was calling createStore() eagerly inside useState(...)
  - useState(() => createStore()) moved the clean sequential direct rich lane:
    - 484.80 ms -> 453.39 ms
    - -31.41 ms (-6.48%)
- 351.46 ms
- 353.56 ms
- 504.70 ms -> 516.94 ms
- protocolTimeout to the benchmark timeout
- /dev/editor-perf must not SSR default query-param state and hydrate a different workload on the client

Use Layer 3 for lanes that are worth measuring but too heavy or too volatile to serve as the always-on baseline gate.
Current core-stress workloads:
- 5k chunked
- 5k chunked

Layer 1 is live for the first cheap core batch.
Current harness coverage:
- BlockquotePlugin
- HeadingPlugin
- BoldPlugin
- ItalicPlugin
- UnderlinePlugin

Each plugin now runs:
Current artifacts:
- editor-perf-layer1-core-plugins-smoke-summary.json (historical compact summary not retained)
- editor-perf-layer1-core-plugins-summary.json (historical compact summary not retained)
- editor-perf-5000-code-dissection-summary.json (historical compact summary not retained)
- editor-perf-5000-code-hard-affinity-fast-path-summary.json (historical compact summary not retained)

Every shipped plugin gets:
The plugin is loaded but the document does not activate it.
This answers:
The document actually contains the nodes/marks/decorations/overlays that the plugin handles.
This answers:
Assign each plugin to a primary class:
The class decides:
- layer-1-core-plugins batch is now healthy on the real Plate dev server at http://localhost:3011/dev/editor-perf
- localhost:3001 hang was a dead-server assumption, not a benchmark seam
- BlockquotePlugin is green in the refreshed 5k batch
  - -19.69 ms
  - -14.49 ms
- HeadingPlugin is green in the refreshed 5k batch
  - -4.66 ms
  - -17.33 ms
- BoldPlugin, ItalicPlugin, and UnderlinePlugin are still the live cheap mark family seam, but the text-path cut changed the shape
- pipeRenderText(...) split:
  - render.as text plugins no longer pay a per-plugin hook/function call stack inside the outer text pipe
  - isDecoration: false mark path directly
- 5k one-off reruns:
  - BoldPlugin: inactive +6.19 ms, activated +15.00 ms
  - ItalicPlugin: inactive -0.51 ms, activated +14.11 ms
  - UnderlinePlugin: inactive +6.49 ms, activated +17.10 ms
- 5k batch:
  - BoldPlugin: inactive +4.46 ms, activated +13.67 ms
  - ItalicPlugin: inactive +3.17 ms, activated +15.71 ms
  - UnderlinePlugin: inactive +7.36 ms, activated +19.44 ms
- 5k dissection before any more underline-specific surgery:
  - huge-underline:
    - Editable + underline direct renderers: 254.70 ms
    - Editable + underline plugin leaf direct: 251.01 ms
    - Editable + underline leaf/text pipes: 267.41 ms
  - pluginRenderLeaf(underline) is basically already at the lower bound
  - 12.71 ms above the direct `<u>` lower bound and 16.40 ms above the isolated plugin-leaf lane
  - editor-perf-5000-underline-dissection-summary.json (historical compact summary not retained)
- 5k Layer 1 CodePlugin rerun:
  - -4.29 ms
  - +154.06 ms
- 5k chunked huge-code dissection on the correct core-mount path:
  - Editable + code direct renderers: 257.29 ms
  - Editable + code plugin leaf direct: 367.82 ms
  - Editable + code leaf/text pipes: 392.68 ms
  - CodePlugin is the next real core target
  - 110.54 ms above the direct `<code>` lower bound
  - 24.85 ms
  - pipeRenderText(...) cleanup
  - editor-perf-5000-code-dissection-summary.json (historical compact summary not retained)
- pluginRenderLeaf(...):
  - render.as leaves with affinity: 'hard' now skip getRenderNodeProps(...) and go straight to PlateLeaf
- 5k chunked huge-code rerun:
  - Editable + code PlateLeaf direct: 337.39 ms
  - Editable + code plugin leaf direct: 334.55 ms
  - Editable + code plugin leaf direct: 367.82 ms
  - pluginRenderLeaf(code) is now basically at the PlateLeaf floor
  - 77.26 ms above the direct `<code>` lower bound
  - pluginRenderLeaf(...) surgery; it is deciding whether the hard-edge DOM shape is worth redesigning at all
  - editor-perf-5000-code-hard-affinity-fast-path-summary.json (historical compact summary not retained)
That changes the next move. Stop treating UnderlinePlugin as a unique target.
The next cheap-mark cut is no longer generic pluginRenderLeaf(...). That part
of the active CodePlugin path is basically green. The remaining question is
whether the hard-affinity leaf body itself is worth a riskier redesign.

After the single-plugin census:
Bundle lanes exist to catch interaction cliffs between plugins that look cheap in isolation.
Layer 3 exists to catch lies from smaller lanes:
- 5k no chunk
- 10k init
- 10k mount

Keep the benchmark harness narrow:
Huge Document should mirror Slate’s control surface and workload semantics, but remain honest about engine isolation.
That means:
- Mounted editors control so the docs surface can be honest when needed

The next provider-rich core fix is not a free-for-all rewrite.
Today, core still exports:
- ElementProvider
- useElementStore
- elementStore
- useElement
- usePath
- useElementSelector

That means a lighter element-context architecture needs one of two shapes:
Do not pretend this is “just internal” if the export surface says otherwise.
nodeId shape

Keep the split explicit:

- nodeId.normalize() is editor-operation work

Do not collapse those two paths into one abstraction just because they both touch ids.
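A minimal sketch of the value-path half of that split: ids are seeded by a pure pass over the initial value, with no editor and no transforms involved. The node types, `seedNodeIds`, and the `createId` parameter are hypothetical, and seeding nested elements as well as top-level blocks is an assumption, not a statement about what the nodeId plugin does.

```typescript
// Hypothetical, simplified node shapes.
type TextNode = { text: string };
type ElementNode = {
  id?: string;
  type: string;
  children: Array<ElementNode | TextNode>;
};
type Value = ElementNode[];

// Returns a new element with an id; existing ids are kept, inputs are not mutated.
function seedNode(node: ElementNode, createId: () => string): ElementNode {
  return {
    ...node,
    id: node.id ?? createId(),
    children: node.children.map((child) =>
      "text" in child ? child : seedNode(child, createId)
    ),
  };
}

// Pure Value -> Value pass: one traversal, zero editor operations.
function seedNodeIds(value: Value, createId: () => string): Value {
  return value.map((node) => seedNode(node, createId));
}

let counter = 0;
const nextId = () => `id-${++counter}`;

const initialValue: Value = [
  { type: "p", children: [{ text: "first" }] },
  { type: "p", id: "keep", children: [{ text: "second" }] },
];

const seeded = seedNodeIds(initialValue, nextId);
console.log(seeded.map((node) => node.id), initialValue[0].id);
```

Live normalization is the other half and stays editor-operation work; nothing in this sketch applies to it.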
Initial-value hooks are value transforms, not half-imperative side-effect seams.
That means:
- transformInitialValue is Value -> Value

The raw JSON pile can stay. It is useful archaeology.
The main artifacts that matter right now:
- docs/plans/ were not retained after the raw-artifact move
- editor-perf-5000-render-as-summary.json
- editor-perf-5000-bold-leaf-wrapper-summary.json
- editor-perf-5000-nodeid-mounted-gate-summary.json
- editor-perf-layer0-smoke-summary.json
- editor-perf-layer1-core-plugins-smoke-summary.json
- editor-perf-layer1-core-plugins-summary.json

The early harness proved:

- nodeId

The render-path dissection proved:

- renderElement fallback was also paying path-lookup tax it did not need, because RenderElementProps.path is optional
- ElementProvider and getRenderNodeProps(...) mattered in richer paths
- PlateElement was not free, but it was not the whole wall

The store-tech split proved:

- zustand-x were not meaningfully better than the provider-per-node pattern
- jotai-x still had some dumb mount-time waste, but not enough to change that conclusion
- useElement, usePath, and useElementSelector semantics while removing the per-node Jotai store from the hot path

nodeId diagnosis

The nodeId investigation proved:

- Plate core + nodeId stopped trailing Slate on the main huge-doc lane

The Huge Document work proved:

- ComponentPreview was the wrong shell for a page this heavy
- +19 ms activated in the clean batch
- HrPlugin
- CodePlugin or StrikethroughPlugin

Completed:
Still open:
PR automation should run:
Do not run the whole matrix on every PR unless you enjoy wasting compute and ignoring flaky noise.
Start provisional, then freeze after the first real census.
Current stance:
- nodeId should stay close to core
- nodeId init gets its own init budget, not hidden inside mount

The comparison hierarchy is:
If you skip step 2, you blame the wrong thing.
This master plan is a docs consolidation. It does not change runtime behavior.
Relevant runtime verification already exists in the generated artifacts and the recent smoke/full runs:
- editor-perf-layer0-smoke-summary.json
- editor-perf-layer1-core-plugins-smoke-summary.json
- editor-perf-layer1-core-plugins-summary.json