.agents/skills/improve/references/audit-playbook.md
What to look for, per category. Each subagent (or direct audit pass) gets the relevant section plus the Finding format at the bottom. Adapt depth to repo size — a 2K-line CLI gets a lighter pass than a 500K-line monorepo.
A finding is only a finding with evidence. "Probably has N+1 queries somewhere" is not a finding; orders/api.ts:142 issues one query per order item inside a loop is.
The highest-trust category — real bugs found by reading, not speculation.
catch (e) { console.log(e) } on critical paths, missing error states in UI code.!) on values that can be null, optional chaining hiding a value that must exist, unchecked array indexing.default: that silently no-ops).any / as casts / @ts-ignore clusters — each one is a place the compiler was overruled.finally.Report only what's evidenced in the code. Do not generate exploit code in plans — describe the fix.
Handling rule: never copy a secret value into a finding or plan — those files get committed. Reference the file:line and credential type only ("Stripe live key at config.ts:12"), and the fix sketch always includes rotation, not just removal (a committed secret is burned even after deletion).
By-design is not a finding: standard platform conventions are intentional behavior — honoring https_proxy/NO_PROXY, reading ~/.netrc, an explicitly local dev tool shelling out to configured package managers. Flag these only when the implementation adds risk beyond the convention itself.
.env files, secrets logged or persisted in event/history stores.dangerouslySetInnerHTML / innerHTML with user data, eval/Function on dynamic input, path traversal on user-supplied filenames.npm audit, pip-audit, cargo audit) in read-only mode; flag critical/high with known exploits, not the noise floor.HttpOnly/Secure/SameSite, debug/verbose modes reachable in production config.Look for the algorithmic and architectural wins, not micro-optimizations.
find/filter inside hot loops where a Map keyed lookup belongs.The goal is not a percentage — it's which untested code is dangerous.
.env.example.CLAUDE.md/AGENTS.md — for repos where agents will execute the plans, this is high-leverage: recommend one and include its outline as a plan.Lowest default priority — only flag where absence has a concrete cost:
Forward-looking: not what's broken, but what this codebase wants to become. Grounding rule: every suggestion must cite evidence from the repo itself — a suggestion that could apply to any project in the category ("add dark mode", "add AI") is noise, not a finding. Sources of grounded direction signal:
Direction findings use the standard format with two adaptations: Impact is product/user value (who wants this and why now), and Confidence reflects how grounded the evidence is — not certainty that it's the right call. Strategy belongs to the maintainer; the advisor's job is grounded options with honest trade-offs. Effort estimates here are coarser; say so. Plans for selected direction findings are usually a design/spike plan (investigate, prototype, define the API, list open questions) rather than a build-everything plan — scope them that way.
Every finding, from every category and every subagent, comes back in this shape:
### [CATEGORY-NN] Short imperative title
- **Evidence**: `path/file.ts:123` — one-sentence description of what's there. (Repeat per location; 2–5 strongest locations, note "and ~N similar sites" if widespread.)
- **Impact**: What goes wrong / what's being paid because of this. Concrete: "every order-list render issues 1+N queries", not "suboptimal".
- **Effort**: S (hours) / M (a day-ish) / L (multi-day) — for the *fix*, including tests.
- **Risk**: What the fix could break; LOW/MED/HIGH plus one line why.
- **Confidence**: HIGH (read the code, certain) / MED (strong signal, needs verification) / LOW (smell, needs investigation). LOW-confidence findings may be reported but get an "investigate" plan, not a "fix" plan.
- **Fix sketch**: 1–3 sentences. Not the plan — just enough to judge effort honestly.
Order findings by leverage = impact ÷ effort, discounted by confidence and fix-risk. Tiebreakers: