docs/plans/2026-05-04-slate-v2-clawsweeper-corpus-recluster-ralplan.md
Run ClawSweeper on the full live issue corpus, but do not brute-force 630
issues one by one.
The best shape is:
refresh gitcrawl mirror
+ rebuild machine clusters
+ create a human architecture cluster overlay
+ process clusters by value/risk
+ append one fork-local dossier section per reviewed issue
+ sync only exact claims into PR docs
Machine clusters are candidate discovery. Human architecture clusters are the truth we use for Slate v2 planning.
docs/slate-v2/ledgers/fork-issue-dossier.md.
Fixes #... claims without exact repro proof.
From docs/slate-issues/gitcrawl-rebuild-report.md:
| Fact | Value |
|---|---|
| Live open issues | 630 |
| Live open PRs included in gitcrawl | 29 |
| Live open threads total | 659 |
| Hydrated comments/reviews | 1856 |
| Gitcrawl clusters | 617 |
| Multi-member gitcrawl clusters | 28 |
| Singleton clusters | 589 |
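The totals in this table can be cross-checked with a few lines of arithmetic. This is a minimal sketch with the counts copied from the table; the variable names are illustrative and not part of any tool.

```python
# Counts copied from the gitcrawl rebuild report table.
open_issues = 630
open_prs = 29
open_threads = 659
clusters = 617
multi_member = 28
singletons = 589

# Issues plus PRs should account for every open thread.
assert open_issues + open_prs == open_threads

# Multi-member plus singleton clusters should account for every cluster.
assert multi_member + singletons == clusters

# Share of clusters that hold exactly one thread.
singleton_share = singletons / clusters
print(f"singleton share: {singleton_share:.1%}")
```

If either assertion fails after a future refresh, the report table and the cluster output have drifted apart and should be reconciled before clustering work continues.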
The current fork dossier has 10 issue sections. That means the next work is
not “finish the document polish.” The next work is the corpus pipeline.
Principles:
Decision drivers:
589 / 617 gitcrawl clusters are singletons, so machine clustering alone is
too conservative.
Chosen option:
Rejected options:
Use three layers.
Refresh and rebuild:
GITHUB_TOKEN="$(gh auth token)" gitcrawl refresh ianstormtaylor/slate --include-comments --state open --json
gitcrawl cluster ianstormtaylor/slate --threshold 0.80 --min-size 1 --max-cluster-size 40 --k 16 --cross-kind-threshold 0.93 --json
gitcrawl clusters ianstormtaylor/slate --json
Store outputs under .tmp/gitcrawl/ with timestamped filenames and summarize
them into docs/slate-issues/gitcrawl-rebuild-report.md.
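One way to produce the timestamped filenames is a small path helper. This is a sketch under assumptions: the `snapshot_path` name, the `<kind>-<UTC timestamp>.json` pattern, and the `refresh` label are hypothetical choices, not something the plan or gitcrawl prescribes.

```python
from datetime import datetime, timezone
from pathlib import Path


def snapshot_path(kind: str, now: datetime, root: str = ".tmp/gitcrawl") -> Path:
    """Build a timestamped output path under the gitcrawl scratch directory.

    The naming pattern is an assumption made for illustration.
    """
    # Normalize to UTC so filenames sort chronologically across machines.
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return Path(root) / f"{kind}-{stamp}.json"


example = snapshot_path(
    "refresh", datetime(2026, 5, 4, 14, 53, 1, tzinfo=timezone.utc)
)
print(example.as_posix())
```

The JSON printed by each `--json` command could then be redirected into the returned path, which keeps successive runs side by side for comparison.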
Use gitcrawl clusters and cluster-detail for multi-member clusters. Use
gitcrawl neighbors and gitcrawl search for singleton expansion.
Create/update:
docs/slate-issues/gitcrawl-recluster-map.md
Each human family must have:
keep, docs, stale, pr-only, split, or needs-repro
Start from these known families:
Append reviewed issues to:
docs/slate-v2/ledgers/fork-issue-dossier.md
Use one section per issue with the ClawSweeper shape:
Status
Bucket
Confidence
Issue summary
Evidence
Decision
PR-description text
Do not duplicate the long form in pr-description.md.
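The seven-field section shape can be captured as a small record so that every appended section carries the same fields in the same order. This is a sketch only: the `DossierSection` class, the `#<issue>` title line, and the `Field: value` layout are assumptions, and the placeholder values below are not real dossier data.

```python
from dataclasses import dataclass

# Field labels in the order listed for the ClawSweeper section shape.
FIELDS = (
    "Status",
    "Bucket",
    "Confidence",
    "Issue summary",
    "Evidence",
    "Decision",
    "PR-description text",
)


@dataclass
class DossierSection:
    issue: int
    status: str
    bucket: str
    confidence: str
    summary: str
    evidence: str
    decision: str
    pr_text: str

    def render(self) -> str:
        """Render one section as plain text (the layout is an assumption)."""
        values = (
            self.status, self.bucket, self.confidence, self.summary,
            self.evidence, self.decision, self.pr_text,
        )
        lines = [f"#{self.issue}"]
        lines += [f"{name}: {value}" for name, value in zip(FIELDS, values)]
        return "\n".join(lines) + "\n"


# Placeholder values only; issue 0 does not refer to a real issue.
section = DossierSection(0, "<status>", "<bucket>", "<confidence>",
                         "<summary>", "<evidence>", "<decision>", "<pr text>")
rendered = section.render()
print(rendered)
```

Keeping the long form in this one structure makes it easier to derive only the short PR-description text for pr-description.md.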
Goal:
Outputs:
.tmp/gitcrawl/*
docs/slate-issues/gitcrawl-rebuild-report.md
docs/slate-issues/gitcrawl-clusters.md
docs/slate-issues/gitcrawl-recluster-map.md
Gate:
Process keep clusters before singleton noise:
For each cluster:
gitcrawl cluster-detail
gitcrawl neighbors on representative and suspicious members
Immediate target:
#3777
Then process:
Gate:
keep remains unreviewed or unassigned to a human
family.
Because 589 singleton clusters exist, search by human family, not by issue
number.
Use this loop:
gitcrawl search ianstormtaylor/slate --query "<family phrase>" --mode hybrid --limit 50 --json
gitcrawl search issues "<family phrase>" -R ianstormtaylor/slate --state open --json number,title,state,url,updatedAt,labels --limit 50
gitcrawl neighbors ianstormtaylor/slate --number <representative> --limit 20 --json
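Because the three commands in this loop can return overlapping threads, their results need to be merged before review. The sketch below assumes each command's JSON output has already been parsed into a list of records with a `number` field; that record shape and the `merge_candidates` helper are assumptions, and the sample records are hypothetical.

```python
def merge_candidates(*result_sets):
    """Merge search results, keeping the first record seen per issue number."""
    merged = {}
    for results in result_sets:
        for record in results:
            # setdefault keeps the earliest record and ignores later duplicates.
            merged.setdefault(record["number"], record)
    # Return a stable order so repeated runs are easy to compare.
    return [merged[number] for number in sorted(merged)]


# Hypothetical records standing in for parsed JSON output.
hybrid_hits = [{"number": 102, "title": "b"}, {"number": 101, "title": "a"}]
keyword_hits = [{"number": 101, "title": "a (duplicate)"}]
neighbor_hits = [{"number": 103, "title": "c"}]

candidates = merge_candidates(hybrid_hits, keyword_hits, neighbor_hits)
print([record["number"] for record in candidates])
```

Each merged candidate would then be assigned to a human family, marked needs-repro, or skipped with a recorded reason.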
Suggested family phrases:
Cannot resolve a Slate point from DOM point
Cannot resolve a Slate node from DOM node
Android composition beforeinput
Samsung keyboard Firefox Android
inline void selection keyboard
placeholder composition
ReactEditor focus parent state
useSelected stale path
decorate async caret jump
history set_selection undo
copy paste inline void
large document paste cut performance
Gate:
needs-repro, or is explicitly skipped with a reason.
Goal:
Outputs:
Gate:
Goal:
fixes-claimed, improves-claimed, cluster-synced,
issue-reviewed, not-claimed, triage-closed, or needs-repro
Rules:
Outputs:
docs/slate-v2/ledgers/issue-coverage-matrix.md
docs/slate-v2/references/pr-description.md
docs/slate-v2/ledgers/fork-issue-dossier.md

| Dimension | Score | Evidence |
|---|---|---|
| React/runtime performance | 0.92 | cluster order prioritizes React focus/subscription/runtime families and performance rows |
| Slate-close unopinionated DX | 0.94 | keeps docs/product/ecosystem rows out of raw Slate claims |
| Plate/slate-yjs migration backbone | 0.88 | uses architecture buckets but does not yet run downstream-specific migration proof |
| Regression-proof testing | 0.90 | exact claims require unit/browser/device/benchmark proof by issue type |
| Research/evidence completeness | 0.94 | live gitcrawl, fork dossier, and current v2 proof refs are required |
| Simplicity/composability | 0.92 | one dossier, one recluster map, no upstream comment machinery |
Total: 0.92
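The total is consistent with an unweighted mean of the six dimension scores. The plan does not state how the total was computed, so the equal weighting below is an assumption that happens to reproduce the reported figure.

```python
# Scores copied from the table; equal weighting is an assumption.
scores = {
    "React/runtime performance": 0.92,
    "Slate-close unopinionated DX": 0.94,
    "Plate/slate-yjs migration backbone": 0.88,
    "Regression-proof testing": 0.90,
    "Research/evidence completeness": 0.94,
    "Simplicity/composability": 0.92,
}

total = sum(scores.values()) / len(scores)
print(f"Total: {total:.2f}")
```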
Verdict: ready for user review. Execution should start with Batch 0, then Batch
docs/slate-issues/gitcrawl-rebuild-report.md
docs/slate-issues/gitcrawl-clusters.md
docs/slate-issues/gitcrawl-recluster-map.md
docs/slate-issues/gitcrawl-live-open-ledger.md
docs/slate-v2/ledgers/fork-issue-dossier.md
docs/slate-v2/ledgers/issue-coverage-matrix.md
docs/slate-v2/references/pr-description.md
.tmp/completion-checks/
The corpus/re-clustering lane is complete only when:
keep multi-member cluster has a human-family decision
needs-human
Fixes #... claim has matching proof
pr-description.md includes only short claim/count summaries
bun run completion-check passes with this lane marked done
Start with Batch 0:
current_pass: clawsweeper-recluster-batch-0
current_pass_skill: .agents/skills/clawsweeper/SKILL.md
current_pass_owner: docs/slate-issues + docs/slate-v2/ledgers
current_pass_scope: refresh gitcrawl, rebuild clusters, create recluster map skeleton
Then move to Batch 1 high-signal multi-member clusters.
| Time | Pass | Status | Evidence | Next owner |
|---|---|---|---|---|
| 2026-05-04T14:53:01Z | clawsweeper-recluster-batch-0 | in_progress | active goal state refreshed; completion state set to pending for Batch 0 gitcrawl corpus work. | Run gitcrawl doctor/refresh/cluster |
| 2026-05-04T15:00:00Z | clawsweeper-recluster-batch-0 | complete | Live refresh synced 659 threads; cluster output reconciled to 617 clusters, 28 multi-member clusters, and 589 singleton clusters. | Batch 1: high-signal multi-member clusters |
| 2026-05-04T15:08:00Z | clawsweeper-recluster-batch-1 | in_progress | Cluster 1 reviewed; dossier sections appended for #4564, #3723, #4789, #3836, #5711, #3834, and #4984 with no exact closure claims. | Continue Batch 1 with cluster 5 |
| 2026-05-04T15:14:00Z | clawsweeper-recluster-batch-1 | in_progress | Cluster 5 reviewed; dossier sections appended for #4074, #4618, and #3429 with no exact closure claims. | Continue Batch 1 with cluster 6 |
| 2026-05-04T15:20:00Z | clawsweeper-recluster-batch-1 | in_progress | Cluster 6 reviewed; dossier sections appended for #3705, #3756, and #3921 with no exact closure claims. | Continue Batch 1 with cluster 7 |
| 2026-05-04T15:26:00Z | clawsweeper-recluster-batch-1 | in_progress | Cluster 7 reviewed; dossier sections appended for #3634, #5537, and #4961 with no exact closure claims. | Continue Batch 1 with cluster 9 |
| 2026-05-04T15:32:00Z | clawsweeper-recluster-batch-1 | complete | Clusters 9, 10, 11, 12, 13, and 14 reviewed; Android/IME, decoration, inline void, and refocus-scroll PR-linked clusters classified. | Batch 2: mixed and browser selection splits |
| 2026-05-04T15:32:00Z | clawsweeper-recluster-batch-2 | complete | Clusters 3, 19, 20, 22, and 23 reviewed; #3777 routed to input runtime, browser/mobile selection clusters split with no exact claims. | Batch 3: singleton search expansion |
| 2026-05-04T15:45:00Z | clawsweeper-multicluster-sweep | complete | Remaining multi-member clusters 2, 4, 8, 15, 16, 17, 18, 21, 24, 25, 26, 27, and 28 reviewed; all 28 have human-family decisions. | Batch 3: singleton candidate decisions |
| 2026-05-04T15:50:00Z | clawsweeper-recluster-batch-3 | complete | 34 high-signal singleton candidates reviewed, routed by architecture family, and recorded in dossier/matrix with no new exact fixes. | Batch 4/5: docs-noise and claim audit |
| 2026-05-04T15:50:00Z | clawsweeper-recluster-batch-5 | complete | Docs/stale/noise sweep and exact-claim audit complete; fixed claims remain #6013, #5605, and #5709 only. | Completion-check |
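The log's claim that all 28 multi-member clusters were reviewed can be checked against the cluster numbers listed in its rows. This sketch assumes the multi-member clusters are numbered 1 through 28, which the log implies but does not state; the dictionary keys are shortened pass names used only for illustration.

```python
# Cluster numbers copied from the Evidence column of the log rows.
reviewed = {
    "batch-1": [1, 5, 6, 7, 9, 10, 11, 12, 13, 14],
    "batch-2": [3, 19, 20, 22, 23],
    "multicluster-sweep": [2, 4, 8, 15, 16, 17, 18, 21, 24, 25, 26, 27, 28],
}

all_reviewed = [cluster for ids in reviewed.values() for cluster in ids]

# No cluster should be reviewed in more than one pass.
duplicates = len(all_reviewed) - len(set(all_reviewed))

# Assumption: multi-member clusters are numbered 1..28.
missing = set(range(1, 29)) - set(all_reviewed)

print(f"reviewed={len(all_reviewed)} duplicates={duplicates} missing={sorted(missing)}")
```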