.agents/skills/openclaw-test-heap-leaks/SKILL.md
Use this skill for test-memory investigations. Do not guess from RSS alone when heap snapshots are available. Treat snapshot-name deltas as triage evidence, not proof, until retainers or dominators support the call.
For runtime fixes (e.g., closure leaks in long-running services like the gateway), see Validating runtime fixes below. That workflow uses a dedicated harness, not the test-parallel snapshot machinery.
Reproduce the failing shape first.
`pnpm canvas:a2ui:bundle && OPENCLAW_TEST_MEMORY_TRACE=1 OPENCLAW_TEST_HEAPSNAPSHOT_INTERVAL_MS=60000 OPENCLAW_TEST_HEAPSNAPSHOT_DIR=.tmp/heapsnap OPENCLAW_TEST_WORKERS=2 OPENCLAW_TEST_MAX_OLD_SPACE_SIZE_MB=6144 pnpm test`

- Keep `OPENCLAW_TEST_MEMORY_TRACE=1` enabled so the wrapper prints per-file RSS summaries alongside the snapshots.
- Read the lane names from `[test-parallel] start ...` lines or `pnpm test --plan`. Do not assume a single `unit-fast` lane; local plans often split into `unit-fast-batch-*`.

Wait for repeated snapshots before concluding anything.
- Snapshots are written per lane, for example under `.tmp/heapsnap/unit-fast-batch-2/`.
- Use `.agents/skills/openclaw-test-heap-leaks/scripts/heapsnapshot-delta.mjs` to compare either two files directly or the earliest/latest pair per PID in one lane directory.

Classify the growth before choosing a fix.
- If growth is dominated by `Module`, `system / Context`, bytecode, descriptor arrays, or property maps, treat it as likely retained module graph growth in long-lived workers.

Fix the right layer.
- Check `test/fixtures/test-timings.unit.json` and whether `scripts/test-update-memory-hotspots.mjs` should refresh the measured hotspot manifest before hand-editing behavior overrides.
- Edit `test/fixtures/test-parallel.behavior.json` only when timing-driven peeling is insufficient.
- Use `singletonIsolated` for files that are safe alone but inflate shared worker heaps.
- If a file is missing from `test/fixtures/test-timings.unit.json`, call that out explicitly. Missing timings are a scheduling blind spot.
- For real leaks, look for missing cleanup in `afterEach`/`afterAll`, module-reset gaps, retained global state, unreleased DB handles, or listeners/timers that survive the file.

Verify with the most direct proof.
- `unit-fast` or `unit-fast-batch-*` growth can be a worker-lifetime problem rather than an application object leak.
- `scripts/test-parallel.mjs` and `scripts/test-parallel-memory.mjs` are the primary control points for wrapper diagnostics.
- The `[test-parallel] start ...` and `[test-parallel][mem] summary ...` lines tell you where to focus.
- Compare two snapshots directly: `node .agents/skills/openclaw-test-heap-leaks/scripts/heapsnapshot-delta.mjs before.heapsnapshot after.heapsnapshot`
- Compare the earliest/latest pair per PID in a lane directory: `node .agents/skills/openclaw-test-heap-leaks/scripts/heapsnapshot-delta.mjs --lane-dir .tmp/heapsnap/unit-fast-batch-2`
- Optional flags, for example: `--top 40`, `--min-kb 32`, `--pid 16133`

Read the top positive deltas first. Large positive growth in module-transform artifacts suggests lane isolation; large positive growth in runtime objects suggests a real leak. If the names alone do not settle it, open the same snapshot pair in DevTools and inspect retainers/dominators for the top rows before declaring root cause.
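The triage rule above (module-graph names versus runtime objects) can be sketched as a small helper. This is an illustration only: the row shape `{ name, deltaKb }`, the function names, and the hint labels are hypothetical and do not describe the real output of `heapsnapshot-delta.mjs`. The defaults simply mirror the `--top 40` and `--min-kb 32` values shown above. The result is a triage hint, not a root cause.

```javascript
// Hypothetical row shape: { name: string, deltaKb: number }.
// The real output format of heapsnapshot-delta.mjs is NOT assumed here.

// Name fragments that the text associates with retained module graph growth.
const MODULE_GRAPH_HINTS = [
  "Module",
  "system / Context",
  "bytecode",
  "descriptor array",
  "property map",
];

// Case-insensitive substring match against the hint list.
function isModuleGraphName(name) {
  const lower = name.toLowerCase();
  return MODULE_GRAPH_HINTS.some((hint) => lower.includes(hint.toLowerCase()));
}

// Keep the largest positive deltas, then compare how much growth falls
// into module-graph names versus everything else.
function triageDeltas(rows, { top = 40, minKb = 32 } = {}) {
  const positive = rows
    .filter((row) => row.deltaKb >= minKb)
    .sort((a, b) => b.deltaKb - a.deltaKb)
    .slice(0, top);

  let moduleKb = 0;
  let otherKb = 0;
  for (const row of positive) {
    if (isModuleGraphName(row.name)) moduleKb += row.deltaKb;
    else otherKb += row.deltaKb;
  }

  let hint = "possible-runtime-leak";
  if (positive.length === 0) hint = "no-significant-growth";
  else if (moduleKb > otherKb) hint = "likely-module-graph-growth";

  return { hint, moduleKb, otherKb, rows: positive };
}

// Usage with made-up numbers:
const sample = triageDeltas([
  { name: "system / Context", deltaKb: 900 },
  { name: "Module", deltaKb: 400 },
  { name: "Map", deltaKb: 120 },
  { name: "Array", deltaKb: 8 }, // below minKb, ignored
]);
console.log(sample.hint); // prints "likely-module-graph-growth"
```

Substring matching is deliberately loose, so a hint from this helper still needs retainer or dominator evidence before it counts as proof.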
Validating runtime fixes

The workflow above is for diagnosing Vitest worker memory growth. To validate that a runtime/closure fix actually releases captured state, use the dedicated harness:
- `pnpm leak:embedded-run` runs `scripts/embedded-run-abort-leak.ts`. It loops N aborted runs in a function-shaped scope mimicking `runEmbeddedAttempt`, writes heap snapshots, and reports a PASS/FAIL verdict on retention growth, using `FinalizationRegistry` for tracked-instance counting plus RSS delta.

Modes:

- `closure-extracted` (default): production fix shape (helper at module scope).
- `closure-inline`: pre-fix shape (closure inside the runner scope). Use as a sensitivity check: if it passes, you've broken the harness, not fixed a bug.
- `synthetic-leak`: deliberately retains via a module-level bucket. Use to confirm the harness can detect leaks before trusting a PASS on a real fix.

Snapshots land in `.tmp/embedded-run-abort-leak/`. Diff with the same script as above:
node .agents/skills/openclaw-test-heap-leaks/scripts/heapsnapshot-delta.mjs \
.tmp/embedded-run-abort-leak/baseline-*.heapsnapshot \
.tmp/embedded-run-abort-leak/batch-N-*.heapsnapshot --top 30
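To make the tracked-instance counting concrete, here is a minimal sketch of how `FinalizationRegistry` can count retained objects and feed a PASS/FAIL rule. It is not the code in `scripts/embedded-run-abort-leak.ts`: `createRetentionTracker`, `verdict`, and both thresholds are hypothetical names and values chosen for illustration.

```javascript
// Counts how many tracked instances exist and how many were finalized.
// Hypothetical helper, not the harness's actual implementation.
function createRetentionTracker() {
  let tracked = 0;
  let collected = 0;
  // The callback runs some time after a registered object is garbage collected.
  const registry = new FinalizationRegistry(() => {
    collected += 1;
  });
  return {
    track(instance) {
      tracked += 1;
      // The held value must not be the instance itself; a counter is enough.
      registry.register(instance, tracked);
    },
    stats() {
      return { tracked, collected, retained: tracked - collected };
    },
  };
}

// Hypothetical verdict rule: the thresholds are illustrative assumptions.
function verdict({ retained }, rssDeltaMb, { maxRetained = 2, maxRssDeltaMb = 64 } = {}) {
  return retained <= maxRetained && rssDeltaMb <= maxRssDeltaMb ? "PASS" : "FAIL";
}

// Usage: track one object per simulated run, then read the counters.
const tracker = createRetentionTracker();
for (let i = 0; i < 3; i += 1) {
  tracker.track({ run: i });
}
console.log(tracker.stats().tracked); // prints 3
```

Finalization is not deterministic. A harness built this way would typically run under `node --expose-gc`, call `globalThis.gc()`, and yield to the event loop before reading `stats()`, and it would treat the count as one signal alongside the RSS delta rather than as a precise measure.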
When fixing a different runtime leak, add a new harness alongside this one rather than retrofitting it. The fixture function should mimic the lexical scope of the function where the leak lives, not be a generic abort-loop.
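The difference between the two closure shapes can be sketched as follows. This is an assumption-laden illustration, not the real `runEmbeddedAttempt`: every function name here is hypothetical, and it relies on the commonly observed V8 behavior that closures created in the same function share one context object.

```javascript
// Pre-fix shape ("closure-inline"): the abort listener is created inside the
// runner scope, next to a large local that another closure captures.
function runAttemptInline(signal, payloadBytes) {
  const bigState = new Uint8Array(payloadBytes); // stands in for per-run state
  const runId = bigState.length;
  // This closure captures bigState, which places it in the function's context.
  const summarize = () => bigState.length;
  // This listener only reads runId, but it shares that same context, so a
  // long-lived signal can keep bigState reachable after the run ends.
  signal.addEventListener("abort", () => console.log("aborted", runId));
  return summarize();
}

// Fix shape ("closure-extracted"): the helper lives at module scope and
// receives only the values it needs, so its closure cannot see bigState.
function attachAbortLogger(signal, runId) {
  signal.addEventListener("abort", () => console.log("aborted", runId), { once: true });
}

function runAttemptExtracted(signal, payloadBytes) {
  const bigState = new Uint8Array(payloadBytes);
  const summarize = () => bigState.length;
  attachAbortLogger(signal, bigState.length);
  return summarize();
}

// Usage: both shapes behave the same; only what stays reachable differs.
const controller = new AbortController();
console.log(runAttemptInline(controller.signal, 1024)); // prints 1024
console.log(runAttemptExtracted(controller.signal, 2048)); // prints 2048
```

A generic abort loop would not reproduce this, because the retention depends on which variables share a scope with the listener. That is the reason the fixture should mirror the lexical scope of the leaking function.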
When using this skill, report: