Slate v2 Issue Intelligence Master Plan

Goal

Read every open Slate issue into a structured research ledger, cluster the real pain patterns, score them, and turn that into a package-by-package v2 architecture plan.

This is explicitly a multi-pass program. The whole point is to avoid a lazy one-shot synthesis where the loudest issues or our favorite ideas dominate the result.

The output must also be valuable for future maintainer triage:

deciding whether an issue is valid
deciding whether an issue is stale, duplicate, or invalid
deciding what kind of maintainer reply would be useful
deciding whether an issue belongs in the v2 roadmap at all

The output must also be directly useful for later TDD:

writing the first red test without reopening the issue thread
identifying the public seam the behavior should be tested through
capturing the minimal repro shape in behavior terms instead of implementation-detail sludge

The output must also be directly useful for later benchmark work:

identifying which performance issues deserve a reproducible benchmark lane
capturing the minimal workload shape without rereading the issue thread
separating benchmark candidates from ordinary correctness tests

Decision

Do not start by designing Slate v2 from taste.

Do not start by reading issue comments at random.

Do not start by clustering from labels alone.

Start with a frozen snapshot of every open issue, classify them under a strict rubric, then derive architecture requirements from the ranked clusters.

North Star

The output is not:

a generic issue summary
a vibe-based list of “what users want”
a random brainstorm about editors

The output is:

a complete open-issue ledger
a scored issue-theme map
a package impact matrix
a v2 requirements document grounded in actual pain
a roadmap for slate-v2, slate-react-v2, and supporting packages

Scope

Primary source:

every open GitHub issue in ianstormtaylor/slate

Secondary source, only after the open-issue pass is done:

high-signal recently closed issues for recurrence
existing local Slate v2 thinking in architecture-contract.md
local learned patterns in docs/solutions/

Out of scope for the first pass:

feature popularity voting
implementation
migration strategy details

Anti-Lazy Rules

These are mandatory.

No architecture recommendation before 100% of open issues are triaged into the ledger.
No issue is allowed to remain “skimmed only”. Every open issue gets a ledger row.
Every issue gets a primary cluster. No floating “misc” pile until the end.
“Unknown” is allowed for root cause. Bullshit certainty is not.
Comments are read for every issue:
- no exceptions in the full program
- every open issue must have its full current thread read before it is marked fully triaged
- pilot batches may stop after a bounded issue count, but must not leave body-only partial rows behind
Closed issues do not create themes by themselves. They only strengthen recurrence for open themes.
We rank by weighted architectural relevance, not raw count.
We separate facts, inference, and recommendation in every deliverable.

Artifacts

Create and maintain these in order:

docs/slate-issues/open-issues-ledger.md
docs/slate-issues/open-issues-dossiers.md
docs/slate-issues/test-candidate-map.md
docs/slate-issues/benchmark-candidate-map.md
docs/slate-issues/issue-clusters.md
docs/slate-issues/package-impact-matrix.md
docs/slate-issues/requirements-from-issues.md
docs/slate-issues/roadmap-from-issues.md

This file is the control plan for the whole program.

Live Triage Status

The original research corpus is the frozen 682-issue snapshot taken on 2026-04-02. Keep it as historical evidence, not as current open-issue accounting.

Gitcrawl rebuild update (2026-05-04):

live open mirror: 630 issues and 29 PRs, 659 open threads total
comments/reviews hydrated: 1856
embeddings: 659
clusters: 617 total, 28 multi-member, 589 singleton
frozen rows no longer live-open: 54, all closed on GitHub
live issues missing from the frozen corpus: #6051, #6053

Use gitcrawl-live-open-ledger.md for current open issue accounting and gitcrawl-rebuild-report.md for gaps. Keep open-issues-ledger.md as the frozen historical corpus until it is fully archived or replaced.
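
The reconciliation counts above can be checked with a plain set difference between frozen and live issue numbers. This is a minimal sketch, not gitcrawl's actual logic: the names `reconcile` and `expected_live_issues` are hypothetical, and the issue numbers in the toy call are made up.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set


@dataclass(frozen=True)
class Reconciliation:
    no_longer_open: FrozenSet[int]       # frozen rows absent from the live mirror
    missing_from_frozen: FrozenSet[int]  # live issues the frozen corpus never captured


def reconcile(frozen: Set[int], live: Set[int]) -> Reconciliation:
    """Diff frozen snapshot issue numbers against live open issue numbers."""
    return Reconciliation(
        no_longer_open=frozenset(frozen - live),
        missing_from_frozen=frozenset(live - frozen),
    )


def expected_live_issues(frozen_rows: int, no_longer_open: int, missing: int) -> int:
    """Live open issues = frozen rows - rows no longer open + issues added since."""
    return frozen_rows - no_longer_open + missing


# Toy issue numbers, not real Slate issues.
toy = reconcile(frozen={1, 2, 3, 4, 5}, live={1, 3, 5, 6})
print(sorted(toy.no_longer_open))       # [2, 4]
print(sorted(toy.missing_from_frozen))  # [6]

# Sanity check on the counts reported in this section.
print(expected_live_issues(682, 54, 2))  # 630
print(630 + 29)                          # 659 open threads (issues + PRs)
print(28 + 589)                          # 617 clusters (multi-member + singleton)
```

If the three printed totals ever stop matching the rebuild report, that is a signal to rerun the mirror before trusting either ledger.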

Pilot Calibration

The first 3-issue pilot exposed a few things worth locking in before the full pass:

maintainer-authored architecture issues are real tracker input and need to fit the same ledger, not get waved away as “not user pain”
comment threads are not optional if the output is supposed to help future maintainer triage; #6022 became materially more precise only after reading the added repro wrapper and apply logs
the ledger needs a stricter enum for reply posture instead of mushy prose
the ledger will eventually need a known-duplicate-target field, not just a duplicate-risk score

Pilot Schema Change Log

After batch 1

Changed:

reply usefulness becomes reply posture
added known duplicate target
compact TDD metadata is expected in the ledger, with detailed test shape in the test-candidate map

Why:

freeform reply prose will rot at scale
duplicate risk without a concrete target is too vague for maintainer triage
TDD extraction should be visible in triage, not hidden in a separate later artifact

Backfill:

backfill batch 1 immediately before processing batch 2

After batch 2

No new schema change yet, but one real pressure point showed up:

an issue can be invalid for the current Slate contract and still be a valuable v2 capability target

Open question for later batches:

whether TDD readiness alone is enough
or whether we need a second field that distinguishes:
- current-contract test candidate
- v2-target test candidate

After batch 3

No schema change yet, but two more pressure points are real now:

performance issues fit the same research program, but they want benchmark extraction more than red-test extraction
some issues stay open even after the thread effectively resolves them; validity plus maintainer action must keep carrying that signal unless the pattern becomes common enough to deserve a dedicated field

After batch 4

Changed:

add linked artifact tracking to the issue schema

Why:

linked PRs, commits, and related issues often carry the best technical context
they can contain fix attempts, maintainer reasoning, or duplicate consolidation that should not be rediscovered later

Backfill:

backfill obvious linked artifacts lazily as batches are revisited or when they materially affect triage

After batch 5

Changed:

add benchmark extraction to the implementation layer for performance issues

Why:

performance issues were being forced into a TDD-only shape that did not fit the actual work
benchmark-ready workload capture is a real deliverable, not an awkward footnote

Backfill:

backfill obvious current performance issues immediately

After batch 6

No schema change yet.

What got validated:

benchmark extraction deserves its own artifact and works cleanly in practice
current fields are already enough to separate:
- Slate-owned input/runtime issues
- browser-owned input issues
- consumer-side resolved support threads
- v2-interesting layout requests that are weak current tickets

Open pressure for later:

linked-artifact backfill will matter more as older issues start linking to more failed PRs and related threads

After batch 7

No schema change yet.

What got validated:

the dossier format still holds at 50+ issues if the per-issue writeups stay sharp instead of bloated
the issue set now clearly separates into:
- real Slate runtime bugs
- browser-owned input bugs
- consumer-side resolved support threads
- example/plugin ergonomics requests
- repo/tooling maintenance noise
benchmark extraction did not need expansion for this batch, which is useful negative signal

Open pressure for later:

if collaboration issues keep showing up, the ledger may need a slightly richer field for remote-op model assumptions versus local runtime assumptions
if more ecosystem/support threads appear, we may want a stricter enum that separates invalid, stale, and out-of-scope more cleanly

After batch 8

No schema change yet.

What got validated:

the format still holds at 75+ issues without collapsing into unreadable sludge
there is now a very clear separation between:
- real runtime adapter pressure
- official-example bugs
- mobile/input-method failures
- support threads caused by consumer misuse
- repo-only maintenance/security noise
React/runtime pressure is recurring in a meaningful way, but still mostly as transaction, identity, selection, and subscription-model pain rather than “make the core React-shaped”
benchmark extraction still does not need expansion; this batch was mostly correctness and adapter semantics

Open pressure for later:

package impact may eventually want a stricter distinction between site/examples, docs-only, and real package runtime ownership

After batch 9

No schema change yet.

What got validated:

the format still holds at 100+ issues without collapsing into nonsense
there is now a very clear split between:
- current Slate runtime bugs
- browser/input-method contract failures
- docs/example/support noise
- ecosystem adapter demand
- old repo/tooling debris
React/runtime pressure keeps showing up, but mostly as identity, focus, selection, and event-contract pain rather than “make the core React-shaped”
benchmark extraction still did not need expansion, which is exactly the kind of restraint we want

Open pressure for later:

dossier and test-map range splitting is now done; keep the ledger monolithic unless grep quality actually degrades
if more adapter/framework requests pile up, we may want a stricter distinction between ecosystem demand, out-of-scope, and v2-interesting but not core-owned

After batch 10

No schema change yet.

What got validated:

older open issues add real signal in three areas: TypeScript API design, collaboration/history semantics, and slate-react rerender pressure
mobile/input issues stay one of the strongest clusters even this far down the queue, which means they are not just recent churn
stale process and support noise also grows with age, so reply posture keeps mattering as much as architecture classification

Open pressure for later:

if more old browser-owned issues look resolved in-thread, we may eventually want a dedicated resolved-in-thread field instead of overloading stale-candidate
the typing/API cluster is now big enough that it may deserve its own first-pass scoring lens instead of living under generic API ergonomics

After batch 11

No schema change yet.

What got validated:

the mobile/input cluster is not a recent anomaly; it stays strong all the way through much older issues
the React/runtime pressure is also persistent, especially around selection subscriptions, hidden/show lifecycles, autofocus, and external rerenders
the typing/API cluster is now undeniably real: operation guards, hook return types, createEditor, PropsMerge, useSlate, and controlled-value ergonomics keep surfacing in different forms
older issues add a lot more stale docs/support/process noise, which justifies keeping maintainer action and reply posture as first-class fields

Open pressure for later:

the typing/API cluster likely deserves a dedicated first-pass score instead of living under generic API ergonomics
if more old issues resolve in-thread or upstream, we may want a resolved-in-thread field instead of continuing to overload stale-candidate

After batch 13

No schema change yet.

What got validated:

legacy issues still reinforce the same mobile/IME/input-method pressure instead of fading into noise
shadow DOM, nested editor, DOM ownership, and click-hit-testing bugs stay real this far down the queue
plugin seam pressure also stays real: insertText suppression, beforeInsertText, scrollSelectionIntoView, and Android readOnly lifecycle bugs all point at weak runtime boundaries
old issues add even more example/docs/support sludge, which keeps justifying the maintainer-triage fields instead of a pure architecture-only ledger

Open pressure for later:

the package impact matrix is even more justified now, because the same pain keeps spanning slate, slate-react, and slate-dom
if the next batches keep surfacing DOM-boundary bugs, we may want a stricter split between browser-owned behavior, adapter-owned behavior, and core-owned behavior

After batches 14 and 15

No schema change yet.

What got validated:

runtime-boundary pain keeps dominating older issues too: focus after insert, inline-void cursor traps, selections crossing editor ownership lines, and nested editor seams all keep resurfacing
mobile and IME debt stays real in older issues too, including Android backspace, browser-specific composition, Windows emoji insertion, and Safari dead-key behavior
docs, examples, and typing debt also keep compounding, which means that cluster is not just recent churn and should not be mistaken for rejection of the core model
linked PRs and related issues keep adding useful context in older threads, so artifact tracking is carrying its weight

Open pressure for later:

the package impact matrix is even more justified now, because these issues keep cutting across slate, slate-react, slate-dom, examples, and docs
if the next batches look similar, it may be worth codifying a cleaner split between core bug, runtime adapter bug, example/docs debt, and ecosystem/support noise

After first cluster pass

No schema change yet.

What got validated:

the first 201 issues are enough to produce a stable macro-theme map instead of just per-issue notes
the dominant signal is runtime-boundary pain, not rejection of the Slate data model:
- selection/focus/DOM bridge
- mobile/IME/input semantics
- slate-react runtime identity and subscription behavior
docs/support/repo noise is large enough that it would absolutely distort roadmap work without the maintainer-triage fields
performance issues are low-count but still high-leverage, which means scoring cannot follow raw issue volume alone

Open pressure for later:

the next artifact should be package impact, not more freeform clustering
theme counts are useful, but package ownership and v2 requirement extraction are what will actually constrain architecture decisions

After package impact matrix

No schema change yet.

What got clarified:

slate-react-v2 and slate-dom-v2 should carry most runtime-boundary work by default
slate-v2 still owns the highest-leverage architectural work, but mostly as engine semantics: transactions, operations, normalization, identity, and history-friendly commit boundaries
low direct slate-dom issue counts are misleading because many DOM and selection bugs were correctly triaged as cross-package
docs/examples/support noise needs an explicit non-v2 lane or it will keep contaminating package-level roadmap calls

Open pressure for later:

the requirements doc should now be written package-first, not theme-first
if later batches keep reinforcing the same split, the roadmap should explicitly separate:
- slate-v2 engine work
- slate-react-v2 runtime work
- slate-dom-v2 bridge work
- docs/examples debt

After requirements extraction

No schema change yet.

What got locked in:

the corpus supports a very specific v2 shape:
- data-model-first
- op-first externally
- transaction-first internally
- React-optimized runtime
the requirements are now package-first instead of theme-first, which is the right shape for actual roadmap decisions
docs/examples/support noise is explicitly separated from v2 architecture requirements, which should stop it from poisoning later prioritization

Open pressure for later:

the roadmap doc should sequence work by dependency order, not by rhetorical importance
the first roadmap cut should decide what has to exist in slate-v2 before slate-react-v2 can become real instead of listing all packages symmetrically

After full open-issue coverage

No schema change yet.

What got locked in:

the open-issue pass is complete against the 2026-04-02 snapshot:
- GitHub open issues: 682
- ledger rows: 682
- dossier sections: 682
- test-map sections: 682
post-snapshot triage moved the live repo state:
- Batch A executed cleanly
- Batch A issues still open: 0
- live GitHub open issues after Batch A: 628
the caching model held up:
- one canonical ledger
- range-split dossiers
- range-split test map
- one benchmark map
reading the oldest still-open issues did not flip the architecture story; it strengthened it

What got clarified:

runtime-boundary pain is ancient, not recent churn
mobile/IME debt is chronic, not browser-week noise
docs/example/support sludge is big enough to poison any roadmap that does not explicitly de-weight it

Open pressure for later:

no more reading pressure for the open set
only revisit individual issues if:
- GitHub state changes
- linked artifacts matter
- a top-ranked cluster needs deeper recurrence work from recently closed issues

After full-corpus rescore

No schema change yet.

What got locked in:

the 682-issue rescore did not change the center of gravity
top weighted themes are now explicit:
- Mobile, IME, And Input Semantics: 21.37
- Performance And Scalability: 19.58
- React Runtime, Identity, And Subscription Model: 17.41
- Selection, Focus, And DOM Bridge: 17.04
raw count and priority score diverge in useful ways, especially for performance

What got clarified:

the corpus still does not justify replacing Slate’s JSON model
the corpus absolutely does justify replacing Slate’s execution and runtime model
performance must stay benchmark-scored, not popularity-scored

After full-corpus package impact refresh

No schema change yet.

What got locked in:

package ownership is now grounded in the full corpus:
- runtime-boundary ownership: 407
- core-engine ownership: 113
- maintainer-noise: 162
direct package pressure is now clear:
- cross-package: 267
- slate-react: 136
- slate: 100

What got clarified:

slate-react-v2 and slate-dom-v2 should carry most runtime-boundary work
slate-v2 owns the engine semantics that make that runtime sane
low direct slate-dom counts are still misleading because the DOM bridge mostly shows up as cross-package

After full-corpus requirements refresh

No schema change yet.

What got locked in:

the requirements doc now cites the full 682-issue corpus instead of the old pilot pass
the v2 shape is now strongly evidenced instead of merely plausible:
- data-model-first
- op-first externally
- transaction-first internally
- React-optimized runtime

What got clarified:

the package-first requirement split is stable
the non-goals are now clearer too, especially:
- no React-shaped core
- no browser-quirk dumping ground in slate-v2
- no docs/support noise masquerading as architecture pressure

After roadmap extraction

No schema change yet.

What got locked in:

the roadmap now exists as a dependency-first build order, not a theme list
the correct phase order is:
1. lock contract and harnesses
2. build slate-v2
3. build slate-dom-v2
4. build slate-react-v2
5. add history and clipboard boundaries
6. kill the chronic runtime clusters
7. benchmark hardening
8. docs/examples/migration surfaces

What got clarified:

starting with slate-react-v2 would just recreate cleanup-crew architecture
starting with docs/examples or migration would be theater
the smallest serious proof cut is:
- slate-v2 transaction core
- slate-dom-v2 selection bridge foundation
- slate-react-v2 snapshot subscriptions
- one IME lane
- one selection lane
- one rerender-breadth lane

After core foundation spec

No schema change yet.

What got locked in:

the first concrete implementation-spec artifact now exists:
- Part II. Core Foundation Spec
Phase 0 and Phase 1 are now expressed as:
- package shape
- core primitives
- invariants
- first red-test lanes
- first benchmark lanes
- explicit deferrals

What got clarified:

slate-v2 should start alone as a prototype package
slate-dom-v2 and slate-react-v2 should not be scaffolded yet
the first serious implementation cut is now precise enough to start work without reopening the full issue corpus

Format Evolution Rule

Do not let the first schema become a prison.

The point of the pilot and early batches is to improve the research format while the pain is still visible. If the ledger, dossier, cluster file, or package matrix format turns out to be too weak, too vague, or too annoying to use, change it.

Hard Rules

Optimize for the best final research artifact, not loyalty to the first draft.
Evolve the format only at clear batch boundaries, not randomly in the middle of reviewing one issue.
When the format changes, update this plan with:
- what changed
- why it changed
- whether prior rows must be backfilled immediately or can be backfilled in the next cleanup pass
If a new field materially improves triage quality, add it even if it creates backfill work.
If a field is producing mush instead of signal, tighten it or delete it.
Never keep a bad schema just to avoid rework. That is fake efficiency.

Allowed Mid-Run Improvements

tighten freeform fields into enums
split one artifact into two if scale demands it
add evidence fields when ambiguity stays too high
add duplicate-target tracking when duplicate risk is too fuzzy
refine package-impact ownership when cross-package classification is too blunt
promote useful dossier sections into first-class ledger columns
add or tighten test-extraction fields when a dossier is not sufficient to write the red test later
add or tighten benchmark-extraction fields when a performance issue needs a workload lane instead of a behavior test

Guardrail Against Chaos

Format evolution is allowed. Silent drift is not.

Every schema change must be written down here before the run continues at scale.

Execution Model

The research owner is the main thread.

Do not delegate final synthesis, cluster naming, score weighting, or v2 architecture recommendations. Those are the highest-context decisions and should not be split across weaker or partial contexts.

Subagents are allowed only for bounded extraction work:

issue inventory
first-pass ledger triage
recurrence lookups
repo package-boundary grounding

Subagents are not allowed for:

final taxonomy decisions
cluster merges and splits
score normalization
package ownership decisions
roadmap recommendations

Subagent Rules

Prefer the main thread by default.
Only use subagents when the work is embarrassingly parallel and structurally bounded.
If there is any risk of weaker-model drift, do not delegate.
If there is any risk of context fragmentation, do not delegate.
If subagents are used, they must write structured outputs only:
- inventory rows
- triage rows
- factual repo notes
Every delegated output must be merged into the canonical artifacts before the next pass starts.
No subagent is allowed to invent new rubric fields, rename clusters, or score priorities.

Model Parity Rule

If subagents are used, they must run on the same frontier-capable model tier as the main thread, not on a cheaper, weaker fallback.

If that is not available, stay single-threaded.

This research is exactly the kind of work where a weaker model quietly produces fake certainty and poisons the later synthesis.

Context-Safety Rule

The canonical source of truth is always the on-disk artifact, not subagent memory.

That means:

one canonical ledger
one canonical cluster file
one canonical package matrix
one canonical requirements file

Subagents may propose rows. The main thread accepts or rewrites them.

Recommended Delegation Shape

If delegation is used at all, keep it to this shape:

one subagent for issue inventory and metadata freeze
at most two subagents for first-pass triage, split by issue-number ranges
one subagent for repo/package grounding

Everything else stays in the main thread.

Caching And Reuse

The cache is not model memory. The cache is the research artifacts.

That means we should never need to fully reread all 600+ issues for a later pass unless the tracker itself changed dramatically.

Canonical Cache Layers

1. Ledger

open-issues-ledger.md stores one compact canonical row per issue:

issue number
title
labels
createdAt
updatedAt
comment count
linked artifacts
rubric fields
primary cluster
confidence
last reviewed at
thread-read status

This is the navigation layer.

2. Dossiers

open-issues-dossiers.md stores the maintainer-grade summary for each fully reviewed issue:

one-paragraph issue summary
one-paragraph thread summary
linked artifacts summary
current repro status
workaround status
whether the issue still looks valid
whether it looks duplicate / invalid / stale / underspecified
maintainer action suggestion
possible future reply direction
v2 relevance note
red-test extraction note
benchmark extraction note

This is the reusable triage layer.

3. Test Candidate Map

test-candidate-map.md stores the implementation-facing TDD extraction for issues that are valid enough to reproduce:

issue number
target package
public test seam
minimal repro setup
minimal UI or operation sequence
expected failing assertion
TDD readiness
blocker note when not ready

This is the reusable implementation layer.

4. Benchmark Candidate Map

benchmark-candidate-map.md stores the implementation-facing benchmark extraction for performance issues:

issue number
target package
benchmark readiness
benchmark seam
minimal workload
primary metric
blocker note when not ready

This is the reusable performance-implementation layer.

5. Clusters

issue-clusters.md stores theme-level synthesis so later passes can work from groups instead of reopening individual issues by default.

Re-read Policy

An issue should only be reopened in a later pass if one of these is true:

updatedAt changed since last review
the issue was marked low-confidence
the issue belongs to a top-ranked cluster under active design discussion
the dossier says the thread was underspecified or contradictory

Otherwise, use the dossier and ledger as the source of truth.
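
The four re-read triggers above can be expressed as one small predicate over a ledger row. This is a sketch under assumptions: the `LedgerRow` field names, the `"low"` confidence value, and the cluster name are illustrative and may not match the real ledger columns.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class LedgerRow:
    # Illustrative field names; the real ledger columns may be spelled differently.
    number: int
    updated_at: str               # updatedAt recorded at the last review
    confidence: str               # assumed scale: "low", "medium", "high"
    primary_cluster: str
    thread_flagged: bool = False  # dossier says underspecified or contradictory


def reread_reasons(row: LedgerRow, current_updated_at: str,
                   active_clusters: Set[str]) -> List[str]:
    """Return every re-read trigger that applies; an empty list means use the cache."""
    reasons = []
    if current_updated_at != row.updated_at:
        reasons.append("updatedAt changed since last review")
    if row.confidence == "low":
        reasons.append("row is marked low-confidence")
    if row.primary_cluster in active_clusters:
        reasons.append("cluster is under active design discussion")
    if row.thread_flagged:
        reasons.append("dossier flags the thread as underspecified or contradictory")
    return reasons


# Toy row, not a real ledger entry.
row = LedgerRow(number=1, updated_at="2026-04-01T00:00:00Z",
                confidence="high", primary_cluster="mobile-ime")
print(reread_reasons(row, "2026-04-01T00:00:00Z", set()))  # []
print(reread_reasons(row, "2026-05-01T00:00:00Z", {"mobile-ime"}))
```

Returning the reasons rather than a bare boolean keeps the decision auditable: the reason strings can be written next to the new review date.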

Why Comments Must Be Read

If the goal includes future maintainer triage, comments are not optional.

Issue bodies alone miss:

reproduction clarifications
maintainer rebuttals
user-confirmed workarounds
hidden duplicates
“cannot reproduce anymore” signals
whether the issue is actually a support request, docs gap, ecosystem misuse, or a real engine flaw

That means:

body-only is enough for inventory
body-only is not enough for final triage
full issue threads are required before an issue is marked fully classified

Pass Structure

Pass 0: Frame The Research

Goal:

lock the methodology before reading issues deeply

Tasks:

define the issue rubric
define the scoring model
define cluster rules
define package mapping rules
define recurrence handling

Deliverable:

this plan

Exit criteria:

rubric is explicit enough that two independent passes would classify the same issue the same way

Pass 1: Inventory Every Open Issue

Goal:

freeze the full open-issue set

Tasks:

fetch every open issue with minimal fields
record snapshot date
record total issue count
assign each issue a stable row in the ledger

Fetch fields:

number
title
labels
state
createdAt
updatedAt
author
body excerpt
comments count
reactions

Deliverable:

docs/slate-issues/open-issues-ledger.md with one row per open issue

Exit criteria:

open issue count in GitHub equals ledger row count

Pass 2: First-Pass Triage

Goal:

classify every open issue under a strict rubric without prematurely designing solutions
read the full body and full current comment thread for each issue in the active batch

Per-issue fields:

primary subsystem
secondary subsystem, if needed
issue type
user pain
recurrence signal
workaround quality
v2 relevance
scope class
likely root-cause layer
package impact
confidence
issue validity status
duplicate risk
known duplicate target
linked artifacts
maintainer action suggestion
reply posture
TDD readiness
public test seam
minimal red-test shape
benchmark readiness
benchmark seam

Allowed issue validity values:

valid
likely-valid
unclear
likely-invalid
duplicate-candidate
stale-candidate

Allowed maintainer action suggestion values:

keep-open
ask-for-repro
ask-for-scope-clarification
mark-duplicate
close-invalid
close-stale
v2-roadmap
fix-current-architecture

Allowed reply posture values:

none
acknowledge
ask-clarifying-question
request-repro
share-status

Allowed TDD readiness values:

ready-now
ready-with-minor-setup
blocked-on-repro
not-a-test-candidate

Allowed benchmark readiness values:

ready-now
ready-with-minor-setup
blocked-on-repro
not-a-benchmark-candidate

Allowed subsystem values:

core-model
operations
normalization
selection
history
rendering
dom-bridge
react-runtime
performance
api-ergonomics
typing
plugins
collaboration
serialization
mobile-ime
docs

Allowed issue type values:

bug
performance
architectural-limit
api-gap
dx-friction
docs-gap
feature-request

Allowed user pain values:

blocker
severe
moderate
minor

Allowed recurrence values:

isolated
recurring
endemic
unknown

Allowed workaround values:

none
poor
acceptable
strong

Allowed v2 relevance values:

direct
indirect
none

Allowed scope class values:

core-only
react-only
dom-only
history-only
cross-package
ecosystem

Allowed root-cause layer values:

mutable-engine-model
operation-semantics
normalization-design
path-identity
dom-selection-bridge
react-subscription-model
history-model
api-surface
typing-model
unknown

Allowed package impact values:

slate
slate-react
slate-dom
slate-history
slate-hyperscript
docs-only
cross-package
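
Strict enums only help if rows are checked against them. The sketch below validates a row against a subset of the allowed values listed above; the snake_case keys and the `validate_row` helper are hypothetical, and the remaining enum fields would be added the same way.

```python
from typing import Dict, List, Set

# Allowed values copied from the lists above; key names are illustrative.
ALLOWED: Dict[str, Set[str]] = {
    "issue_validity": {
        "valid", "likely-valid", "unclear",
        "likely-invalid", "duplicate-candidate", "stale-candidate",
    },
    "maintainer_action": {
        "keep-open", "ask-for-repro", "ask-for-scope-clarification",
        "mark-duplicate", "close-invalid", "close-stale",
        "v2-roadmap", "fix-current-architecture",
    },
    "reply_posture": {
        "none", "acknowledge", "ask-clarifying-question",
        "request-repro", "share-status",
    },
    "tdd_readiness": {
        "ready-now", "ready-with-minor-setup",
        "blocked-on-repro", "not-a-test-candidate",
    },
    "benchmark_readiness": {
        "ready-now", "ready-with-minor-setup",
        "blocked-on-repro", "not-a-benchmark-candidate",
    },
}


def validate_row(row: Dict[str, str]) -> List[str]:
    """Return one message per field whose value is missing or outside its enum."""
    errors = []
    for field, allowed in ALLOWED.items():
        value = row.get(field)
        if value not in allowed:
            errors.append(f"{field}: {value!r} is not an allowed value")
    return errors


# Toy rows, not real ledger entries.
good = {
    "issue_validity": "likely-valid",
    "maintainer_action": "ask-for-repro",
    "reply_posture": "request-repro",
    "tdd_readiness": "blocked-on-repro",
    "benchmark_readiness": "not-a-benchmark-candidate",
}
bad = dict(good, reply_posture="maybe later")

print(validate_row(good))  # []
print(validate_row(bad))   # one error, for reply_posture
```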

Exit criteria:

every open issue has a full rubric row
every issue in the active batch has a full thread-read mark
every valid issue in the active batch has either:
- a TDD extraction seed
- or an explicit reason it is not yet test-ready
every performance issue in the active batch has either:
- a benchmark extraction seed
- or an explicit reason it is not a benchmark candidate yet
no architecture recommendation has been written yet
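
The per-batch exit criteria above are mechanical enough to check before a batch is closed. This is a rough sketch with assumed conventions: rows are plain dicts, the key names are hypothetical, only the `valid` status counts as "valid", and a "performance issue" means issue type `performance`.

```python
from typing import Dict, List, Tuple


def batch_exit_gaps(rows: List[Dict]) -> List[Tuple[int, str]]:
    """List (issue number, gap) pairs that block closing the active batch."""
    gaps = []
    for row in rows:
        number = row["number"]
        if not row.get("thread_read"):
            gaps.append((number, "thread not fully read"))
        # Assumption: only the exact status "valid" triggers the TDD requirement.
        if row.get("issue_validity") == "valid":
            if not (row.get("tdd_seed") or row.get("tdd_blocker")):
                gaps.append((number, "no TDD seed and no not-ready reason"))
        # Assumption: performance issues are identified by issue type.
        if row.get("issue_type") == "performance":
            if not (row.get("benchmark_seed") or row.get("benchmark_blocker")):
                gaps.append((number, "no benchmark seed and no not-ready reason"))
    return gaps


# Toy rows, not real ledger entries.
batch = [
    {"number": 1, "thread_read": True, "issue_validity": "valid",
     "issue_type": "bug", "tdd_seed": "typing after a void node moves the cursor"},
    {"number": 2, "thread_read": True, "issue_validity": "valid",
     "issue_type": "performance", "tdd_blocker": "no repro document attached"},
    {"number": 3, "thread_read": False, "issue_validity": "unclear",
     "issue_type": "bug"},
]

for number, gap in batch_exit_gaps(batch):
    print(number, gap)
```

In the toy batch, issue 2 is flagged for the missing benchmark note and issue 3 for the unread thread; issue 1 passes.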

Pass 2.5: TDD Extraction

Goal:

make the research directly useful for red-green-refactor work later

Tasks:

identify the public seam each valid issue should be tested through
reduce the issue to the smallest behavior-first reproduction shape
write the expected failing assertion in public behavior terms
mark whether the issue is ready for a red test immediately

Rules:

optimize for integration-style tests through public interfaces
do not describe tests in implementation-detail terms
do not assume a future maintainer will reread the original issue thread
if an issue is not test-ready, say exactly what missing repro detail blocks it

Deliverable:

docs/slate-issues/test-candidate-map.md

Exit criteria:

a maintainer can pick a ready-now issue and start the red test without reopening GitHub

Pass 2.6: Benchmark Extraction

Goal:

make performance issues directly useful for later profiling and benchmark work

Tasks:

identify the benchmark seam each performance issue should use
reduce the issue to the smallest meaningful workload shape
name the primary metric that should move
mark whether the issue is benchmark-ready immediately

Rules:

optimize for reproducible workload lanes, not hand-wavy “Slate feels slow” summaries
separate benchmark candidates from correctness-test candidates
do not assume a future maintainer will reread the original issue thread
if the issue is not benchmark-ready, say exactly what workload detail is still missing

Deliverable:

docs/slate-issues/benchmark-candidate-map.md

Exit criteria:

a maintainer can pick a ready-now performance issue and start a benchmark lane without reopening GitHub

Pass 3: Ambiguity And Foundation Review

Goal:

resolve the issues where the first pass is too fuzzy to trust

Reread full issue bodies and comment threads only for:

low-confidence triage rows
likely foundational constraints
top-severity issues
issues that appear to cross multiple clusters

This pass is no longer the first time comments are read.

It is the pass where ambiguous threads are reread carefully and dossier summaries are tightened.

Tasks:

promote or demote confidence
tighten root-cause classification
add short evidence notes where needed

Exit criteria:

no top-ranked issue remains low-confidence unless the issue itself is underspecified

Pass 4: Cluster By Systemic Pain

Goal:

turn issue rows into actual themes

Cluster rules:

cluster by failing system assumption, not by label wording
every issue gets one primary cluster
one optional secondary cluster only if truly cross-cutting
enhancement requests stay separate from defect clusters unless they clearly describe the same constraint

Example cluster shapes:

transactionality and operation timing
normalization unpredictability
path identity and stale references
DOM selection synchronization
React rerender and subscription pressure
mobile and IME input
plugin override fragility
serialization and direct value replacement
history grouping and undo semantics
large-document performance
API discoverability and typing

Deliverable:

docs/slate-issues/issue-clusters.md

Exit criteria:

3 to 10 clusters
no giant junk-drawer cluster
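
The cluster rules and exit criteria above can be checked mechanically. The sketch below is one possible check, under stated assumptions: the type and function names are hypothetical, and the 40% ceiling used to detect a "junk-drawer" cluster is an arbitrary illustrative threshold that the plan does not specify.

```typescript
// Hypothetical assignment row: one primary cluster, one optional secondary.
interface ClusterAssignment {
  issue: number;
  primary: string;
  secondary?: string;
}

// Returns a list of rule violations; an empty list means the checks pass.
// maxShare is an assumed threshold, not a number from the plan.
function checkClusters(rows: ClusterAssignment[], maxShare: number = 0.4): string[] {
  const problems: string[] = [];
  const sizes: Record<string, number> = {};

  for (const row of rows) {
    sizes[row.primary] = (sizes[row.primary] || 0) + 1;
    if (row.secondary !== undefined && row.secondary === row.primary) {
      problems.push("issue " + row.issue + ": secondary cluster repeats the primary");
    }
  }

  const names = Object.keys(sizes);
  if (names.length < 3 || names.length > 10) {
    problems.push("expected 3 to 10 clusters, found " + names.length);
  }

  names.forEach((name) => {
    const size = sizes[name] || 0;
    if (size / rows.length > maxShare) {
      problems.push('cluster "' + name + '" holds ' + size + " of " + rows.length + " issues");
    }
  });

  return problems;
}

// Placeholder rows using cluster names from the example shapes above.
// The issue numbers are invented for illustration.
const balancedRows: ClusterAssignment[] = [
  { issue: 1, primary: "normalization unpredictability" },
  { issue: 2, primary: "normalization unpredictability" },
  { issue: 3, primary: "mobile and IME input" },
  { issue: 4, primary: "mobile and IME input" },
  { issue: 5, primary: "large-document performance" },
  { issue: 6, primary: "large-document performance", secondary: "mobile and IME input" },
];

const balancedProblems = checkClusters(balancedRows);
```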

Pass 5: Score And Rank

Goal:

rank themes by architectural importance, not noise

Score each cluster 1 to 5 on:

pain
recurrence
architectural depth
breadth across packages
v2 leverage

Weighted formula:

priority = (pain * recurrence) + architectural_depth + breadth + v2_leverage

Interpretation:

high issue count with low architectural depth should not dominate
smaller clusters that point at a foundational bad assumption can rank very high
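
The weighted formula can be written as a small function. This is a sketch with hypothetical type and field names, and the two sample clusters use invented scores chosen only to illustrate the interpretation above. With every input in the 1 to 5 range, the result runs from 4 to 40.

```typescript
// Hypothetical score record: each field is an integer from 1 to 5.
interface ClusterScores {
  pain: number;
  recurrence: number;
  architecturalDepth: number;
  breadth: number;
  v2Leverage: number;
}

function assertScore(name: string, value: number): void {
  if (Math.floor(value) !== value || value < 1 || value > 5) {
    throw new RangeError(name + " must be an integer from 1 to 5, got " + value);
  }
}

// priority = (pain * recurrence) + architectural_depth + breadth + v2_leverage
function clusterPriority(s: ClusterScores): number {
  assertScore("pain", s.pain);
  assertScore("recurrence", s.recurrence);
  assertScore("architecturalDepth", s.architecturalDepth);
  assertScore("breadth", s.breadth);
  assertScore("v2Leverage", s.v2Leverage);
  return s.pain * s.recurrence + s.architecturalDepth + s.breadth + s.v2Leverage;
}

// Invented scores: a frequently reported but shallow cluster.
const noisyCluster: ClusterScores = {
  pain: 2,
  recurrence: 4,
  architecturalDepth: 1,
  breadth: 1,
  v2Leverage: 1,
};

// Invented scores: a less frequent cluster that points at a foundational assumption.
const foundationalCluster: ClusterScores = {
  pain: 4,
  recurrence: 2,
  architecturalDepth: 5,
  breadth: 4,
  v2Leverage: 5,
};

const noisyPriority = clusterPriority(noisyCluster); // 2 * 4 + 1 + 1 + 1 = 11
const foundationalPriority = clusterPriority(foundationalCluster); // 4 * 2 + 5 + 4 + 5 = 22
```

Both sample clusters have the same pain-times-recurrence product, so the difference in rank comes entirely from depth, breadth, and leverage.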

Deliverable:

ranked clusters section in docs/slate-issues/issue-clusters.md

Exit criteria:

every cluster has a numeric score and written rationale

Pass 6: Map Clusters To Packages

Goal:

convert issue themes into a package-by-package v2 map

Packages to evaluate:

slate-v2
slate-react-v2
slate-dom-v2
slate-history-v2
slate-hyperscript-v2

Per package:

core responsibility
issue clusters it must solve directly
issue clusters it should not own
must-have v2 principles
defer-to-later items

Deliverable:

docs/slate-issues/package-impact-matrix.md

Exit criteria:

every top-ranked cluster has an owning package or explicit shared ownership
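
The exit criterion can be expressed as a lookup over the matrix. In the sketch below the type names and the helper are hypothetical, and the sample ownership entries are illustrative placeholders, not conclusions of this plan. A cluster listed under more than one package counts as explicit shared ownership.

```typescript
type PackageName =
  | "slate-v2"
  | "slate-react-v2"
  | "slate-dom-v2"
  | "slate-history-v2"
  | "slate-hyperscript-v2";

// Hypothetical matrix entry: the clusters a package must solve directly.
interface PackageEntry {
  pkg: PackageName;
  ownsClusters: string[];
}

// Returns the top-ranked clusters that no package entry claims.
function unownedClusters(topClusters: string[], matrix: PackageEntry[]): string[] {
  return topClusters.filter(
    (cluster) => !matrix.some((entry) => entry.ownsClusters.indexOf(cluster) !== -1)
  );
}

// Illustrative placeholder data only. Cluster names come from the example
// cluster shapes; the ownership shown here is not a finding.
const sampleTopClusters: string[] = [
  "DOM selection synchronization",
  "history grouping and undo semantics",
  "mobile and IME input",
];

const sampleMatrix: PackageEntry[] = [
  { pkg: "slate-dom-v2", ownsClusters: ["DOM selection synchronization"] },
  { pkg: "slate-react-v2", ownsClusters: ["DOM selection synchronization"] },
  { pkg: "slate-history-v2", ownsClusters: ["history grouping and undo semantics"] },
];

const stillUnowned = unownedClusters(sampleTopClusters, sampleMatrix);
```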

Pass 7: Derive v2 Requirements

Goal:

translate ranked issue clusters into architecture requirements

Requirement format:

problem signal
failing current assumption
v2 requirement
affected packages
expected user-visible improvement
evidence issues

Example:

problem signal: repeated stale-path and move semantics issues
failing assumption: path position is sufficient identity
v2 requirement: stable node identity separate from path location
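
The requirement format maps naturally onto a record with six fields. The sketch below uses hypothetical type and field names. The sample record carries only the three fields the example above supplies, and the remaining three are deliberately left empty rather than guessed, so the helper reports them as missing.

```typescript
// Hypothetical record mirroring the six-part requirement format.
interface Requirement {
  problemSignal: string;
  failingAssumption: string;
  v2Requirement: string;
  affectedPackages: string[];
  expectedImprovement: string;
  evidenceIssues: number[];
}

// Lists the fields that are still empty, in format order.
function missingRequirementFields(req: Requirement): string[] {
  const missing: string[] = [];
  if (req.problemSignal.trim() === "") missing.push("problemSignal");
  if (req.failingAssumption.trim() === "") missing.push("failingAssumption");
  if (req.v2Requirement.trim() === "") missing.push("v2Requirement");
  if (req.affectedPackages.length === 0) missing.push("affectedPackages");
  if (req.expectedImprovement.trim() === "") missing.push("expectedImprovement");
  if (req.evidenceIssues.length === 0) missing.push("evidenceIssues");
  return missing;
}

// The three filled fields are taken from the example above. The other
// three are left empty on purpose because the example does not state them.
const stalePathRequirement: Requirement = {
  problemSignal: "repeated stale-path and move semantics issues",
  failingAssumption: "path position is sufficient identity",
  v2Requirement: "stable node identity separate from path location",
  affectedPackages: [],
  expectedImprovement: "",
  evidenceIssues: [],
};

const stalePathGaps = missingRequirementFields(stalePathRequirement);
```

A check like this makes it visible when a requirement has a clear problem statement but no evidence issues attached yet.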

Deliverable:

docs/slate-issues/requirements-from-issues.md

Exit criteria:

top clusters are all represented by explicit requirements

Pass 8: Build The Roadmap

Goal:

turn requirements into a staged v2 program

Stages:

proof-of-concept core
React-first runtime proof
selection and DOM bridge proof
history model proof
package expansion
migration strategy exploration

Each stage needs:

success criteria
benchmark or correctness proof
non-goals
blockers

Deliverable:

docs/slate-issues/roadmap-from-issues.md

Exit criteria:

roadmap ties directly back to the ranked issue clusters, not just opinion

Scoring Rubric

Pain

1. cosmetic or narrow inconvenience
2. meaningful annoyance with workaround
3. repeated workflow damage
4. severe limitation or frequent defect
5. blocker or trust-destroying behavior

Recurrence

1. isolated
2. occasional
3. recurring
4. very recurring
5. endemic / keeps reappearing in multiple forms

Architectural Depth

1. leaf bug
2. local abstraction issue
3. subsystem issue
4. foundational runtime constraint
5. invalidates a core model assumption

Breadth

1. one package, one surface
2. one package, multiple surfaces
3. two packages
4. cross-package
5. ecosystem-wide pressure

v2 Leverage

1. barely relevant to v2
2. nice-to-have
3. moderately shaped by v2
4. strongly shaped by v2
5. should directly drive v2 architecture

After batches 16 and 17

No schema change yet.

What got reinforced:

runtime-boundary pain still dominates older issues
clipboard and copy-paste semantics pressure is stronger now because #4716 and #4542 are clearly architecture debates, not ordinary bugs
controlled-vs-uncontrolled slate-react pressure remains one of the strongest package-level signals because of #4612
docs, examples, and support noise keep showing up and still need to stay isolated from v2 architecture work
no new benchmark-worthy cluster emerged from this tranche

After batches 18 and 19

No schema change yet.

What got reinforced:

IME and mobile input debt is still everywhere, including Android headings, suggestion duplication, Japanese and Chinese composition, and AndroidEditable itself
Shadow DOM is not a niche edge case; it keeps surfacing as a real DOM-bridge ownership problem
framework-decoupling and data-model interoperability pressure are real, but still architecture pressure, not near-term bug work
slate-react runtime ownership keeps showing up through focus transfer, render callback churn, shared object identity, and blur-selection behavior
this tranche produced one legitimate new benchmark candidate: dynamic decorations in #4483

Review Gates

Before moving from one pass to the next:

re-read the plan
update the ledger/cluster artifact
log any rubric changes explicitly
do not silently reclassify old rows without recording why

Package-Specific Questions To Answer

`slate-v2`

Should the core be transaction-first?
Should snapshots be immutable and versioned?
Should stable node identity replace path-only identity?
Which current operation semantics are fundamentally bad, not just slow?

`slate-react-v2`

Can the renderer move to selector subscriptions over immutable snapshots?
Which open issues are really React invalidation problems disguised as Slate bugs?
What must be synchronous versus deferred?

`slate-dom-v2`

Which DOM issues are bridge problems versus core engine problems?
How should selection reconciliation work with committed snapshots?

`slate-history-v2`

Which undo/redo issues are symptoms of per-op mutation instead of a real history model?
Should transactions become native history units?

`slate-hyperscript-v2`

Does it need real v2 redesign, or just compatibility once the core stabilizes?

Risks

overfitting v2 to noisy feature requests
letting label taxonomy drive architecture
counting duplicates as independent requirements
reading comments too early and drowning in local detail
deciding on package boundaries before issue clustering is stable

Recommendation

Start with Pass 0 through Pass 2 only.

Do not try to read, cluster, score, and roadmap in one marathon. That is how the whole thing turns into “I kind of remember some issues about selection.”

The first milestone is simple:

every open issue captured
every open issue triaged
no architecture conclusions yet

That is the first point where the research becomes trustworthy.

Pilot Requirement

Do not jump straight to all 600+ issues.

First run a pilot batch of 25 to 50 open issues and use it to validate:

the ledger schema
the dossier format
the triage rubric
the maintainer-action fields
the re-read policy

Only after that pilot should the schema be locked for the full run.

After batches 20 and 21

No schema change yet.

What got reinforced:

delete and caret-positioning debt stays real across old issues, especially around empty blocks, marked leaves, and structural wraps
composition and browser-input debt was already big, and this tranche keeps confirming it with Cmd+A/delete, Safari autocorrect, and multibyte input cases
slate-react rerender breadth is now clearly benchmark-worthy from #4210, with nested-block depth in #4141 as the same family rather than a new family
plugin-surface and hook-shape pressure keeps appearing, but mostly as v2/runtime design signal instead of must-fix current bugs
old issues add even more example/support/process sludge, which keeps justifying the maintainer-action fields

After batches 22 and 23

No schema change yet.

What got validated:

old open issues keep reinforcing the same runtime-boundary story instead of opening a totally new architecture front
IME, placeholder, and browser-text-assistance bugs are older and deeper than the recent issue pool alone suggested
readonly, iframe, nested-contenteditable, and static-renderer requests are real package-boundary pressure, not noise
docs/process churn stays noisy enough that the maintainer-triage fields keep earning their keep

Open pressure for later:

if older issues keep producing more renderer-runtime complaints like focus drift and event ownership, the package impact matrix may need a slightly stronger split between slate-react runtime debt and slate-dom bridge debt

After batches 24 and 25

No schema change yet.

What got validated:

older issues add real history and collaboration pressure, not just more runtime noise
cross-window, iframe, and portal ownership bugs were already there years ago, which strengthens the package-boundary case for slate-react-v2 plus slate-dom-v2
IME and composition gating bugs are still ancient debt, especially around empty state, composition, and keydown interaction
older examples/docs/process threads continue to create a lot of sludge, which keeps justifying the triage fields and duplicate handling

Open pressure for later:

if the next older tranche keeps surfacing history/collaboration edge cases, the requirements doc may need a slightly sharper distinction between local transaction semantics and remote operation semantics

After batches 26 and 27

No schema change yet.

What got validated:

old issues keep reinforcing the same center of gravity: focus ownership, IME/input semantics, structural delete behavior, and history restore debt
cross-window and iframe ownership bugs were already strong before the later issue pool, which keeps supporting the slate-react-v2 plus slate-dom-v2 split
history and collaboration pressure stayed real in older issues too, especially around partial selection ops, move_node undo, and grouped state restore
Android demand remains strong, but a lot of those issues are still clearly outside the supported current contract and should stay separated from present-tense bug counts
this tranche adds one legitimate old slate-react perf lane from #3656, but it mostly reinforced already-known runtime and input clusters instead of creating a new architecture front

Open pressure for later:

if the next older tranche keeps surfacing plugin hook confusion around paste, focus, and external stores, the requirements doc may need a slightly sharper split between public hook-surface design and renderer ownership design

After batches 28 and 29

No schema change yet.

What got validated:

old issues keep reinforcing placeholder, decoration, and inline-boundary runtime debt instead of merely repeating generic selection bugs
plugin and render-composition complaints around renderElement, plugin events, and editor hook surfaces are older and more principled than recent noise made them look
Android/input debt and upstream browser event gaps were already shaping Slate in late 2019, so that pressure is foundational, not incidental
this tranche adds one legitimate older slate-react perf lane from #3430, but most of the signal still points at runtime semantics and extension-surface pressure rather than raw throughput alone

Open pressure for later:

if the next older tranche keeps surfacing API-surface and inline data-model tensions together, the roadmap may need a tighter split between core data-model purity and runtime escape hatches

After batch 30

No schema change yet.

What got validated:

the oldest still-open issues still reinforce the same runtime and engine seams instead of opening a totally different architecture front
zero-width sentinels, render-time mark splitting, selection normalization, operation granularity, and dirty tracking are all deep-rooted pressure, not recent fashion
the final tranche adds clear old roadmap pressure around clipboard transfer typing, large-document rendering, and operation composition
the artifact format held through the full corpus without needing another schema rewrite, which means the cache is good enough to reuse for later triage and TDD

Open pressure for later:

the corpus is complete for the current snapshot, so the next step should be rescoring and roadmap extraction, not more blind reading unless the snapshot is refreshed

After The Full-Corpus Extraction Stack

The issue-intelligence program is complete enough to hand off into real v2 planning.

Stable outputs now exist for:

full-corpus clusters
package ownership
requirements
roadmap
core foundation spec
cohesive program plan

That means the next lane is no longer “read more issues.”

It is:

use cohesive-program-plan.md as the connective control doc
treat Part II. Core Foundation Spec as the first implementation-spec artifact
start packages/slate-v2 only after the Phase 0 proof gates are accepted