Back to Get Shit Done

Health

get-shit-done/workflows/health.md

1.42.07.2 KB
Original Source
<purpose> Validate `.planning/` directory integrity and report actionable issues. Checks for missing files, invalid configurations, inconsistent state, and orphaned plans. Optionally repairs auto-fixable issues. </purpose>

<required_reading> Read all files referenced by the invoking prompt's execution_context before starting. </required_reading>

<process> <step name="parse_args"> **Parse arguments:**

Check if --repair, --backfill, or --context flags are present in the command arguments.

REPAIR_FLAG=""
BACKFILL_FLAG=""
CONTEXT_MODE=""
if arguments contain "--repair"; then
  REPAIR_FLAG="--repair"
fi
if arguments contain "--backfill"; then
  BACKFILL_FLAG="--backfill"
fi
if arguments contain "--context"; then
  CONTEXT_MODE="true"
fi

If CONTEXT_MODE is set, jump to the context_check step and skip the integrity validation steps. The two modes are orthogonal — context utilization has nothing to do with .planning/ directory health. </step>

<step name="context_check"> **Run only when `--context` is set.**

The model running this workflow self-reports the current session's approximate tokensUsed and the active model's contextWindow. Use the values visible in your runtime (Claude Code's /context slash command output, or the model's own session telemetry). If the runtime exposes neither, prompt the user once via AskUserQuestion for both numbers.

TEXT_MODE fallback: when text_mode is true (config or --text flag) the runtime is non-Claude (Codex, Gemini, etc.) and AskUserQuestion is not available — replace the prompt with a plain-text two-question sequence ("Approximate tokens used? Context window size?") and read the answers as plain text from the user's response.

bash
gsd-sdk query validate.context \
  --tokens-used "$TOKENS_USED" \
  --context-window "$CONTEXT_WINDOW"

The query prints a one-line status (Context utilization: NN% (state)) plus a recommendation line for the warning and critical states. Print the SDK output verbatim and end the workflow — do not mix in .planning/ health output, the two modes are independent diagnostics. </step>

<step name="run_health_check"> **Run health validation:**
bash
gsd-sdk query validate.health $REPAIR_FLAG $BACKFILL_FLAG

Parse JSON output:

  • status: "healthy" | "degraded" | "broken"
  • errors[]: Critical issues (code, message, fix, repairable)
  • warnings[]: Non-critical issues
  • info[]: Informational notes
  • repairable_count: Number of auto-fixable issues
  • repairs_performed[]: Actions taken if --repair was used </step>
<step name="format_output"> **Format and display results:**
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 GSD Health Check
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Status: HEALTHY | DEGRADED | BROKEN
Errors: N | Warnings: N | Info: N

If repairs were performed:

## Repairs Performed

- ✓ config.json: Created with defaults
- ✓ STATE.md: Regenerated from roadmap

If errors exist:

## Errors

- [E001] config.json: JSON parse error at line 5
  Fix: Run /gsd-health --repair to reset to defaults

- [E002] PROJECT.md not found
  Fix: Run /gsd-new-project to create

If warnings exist:

## Warnings

- [W002] STATE.md references phase 5, but only phases 1-3 exist
  Fix: Review STATE.md manually before changing it; repair will not overwrite an existing STATE.md

- [W005] Phase directory "1-setup" doesn't follow NN-name format
  Fix: Rename to match pattern (e.g., 01-setup)

If info exists:

## Info

- [I001] 02-implementation/02-01-PLAN.md has no SUMMARY.md
  Note: May be in progress

Footer (if repairable issues exist and --repair was NOT used):

---
N issues can be auto-repaired. Run: /gsd-health --repair
</step> <step name="offer_repair"> **If repairable issues exist and --repair was NOT used:**

Ask user if they want to run repairs:

Would you like to run /gsd-health --repair to fix N issues automatically?

If yes, re-run with --repair flag and display results. </step>

<step name="verify_repairs"> **If repairs were performed:**

Re-run health check without --repair to confirm issues are resolved:

bash
gsd-sdk query validate.health

Report final status. </step>

</process>

<error_codes>

CodeSeverityDescriptionRepairable
E001error.planning/ directory not foundNo
E002errorPROJECT.md not foundNo
E003errorROADMAP.md not foundNo
E004errorSTATE.md not foundYes
E005errorconfig.json parse errorYes
W001warningPROJECT.md missing required sectionNo
W002warningSTATE.md references invalid phaseNo
W003warningconfig.json not foundYes
W004warningconfig.json invalid field valueNo
W005warningPhase directory naming mismatchNo
W006warningPhase in ROADMAP but no directoryNo
W007warningPhase on disk but not in ROADMAPNo
W008warningconfig.json: workflow.nyquist_validation absent (defaults to enabled but agents may skip)Yes
W009warningPhase has Validation Architecture in RESEARCH.md but no VALIDATION.mdNo
W018warningMILESTONES.md missing entry for archived milestone snapshotYes (--backfill)
W019warningUnrecognized .planning/ root file — not a canonical GSD artifactNo
I001infoPlan without SUMMARY (may be in progress)No

</error_codes>

<repair_actions>

ActionEffectRisk
createConfigCreate config.json with defaultsNone
resetConfigDelete + recreate config.jsonLoses custom settings
regenerateStateCreate STATE.md from ROADMAP structure when it is missingLoses session history
addNyquistKeyAdd workflow.nyquist_validation: true to config.jsonNone — matches existing default
backfillMilestonesSynthesize missing MILESTONES.md entries from .planning/milestones/vX.Y-ROADMAP.md snapshotsNone — additive only; triggered by --backfill flag

Not repairable (too risky):

  • PROJECT.md, ROADMAP.md content
  • Phase directory renaming
  • Orphaned plan cleanup

</repair_actions>

<stale_task_cleanup> Windows-specific: Check for stale Claude Code task directories that accumulate on crash/freeze. These are left behind when subagents are force-killed and consume disk space.

When --repair is active, detect and clean up:

bash
# Check for stale task directories (older than 24 hours)
TASKS_DIR="$HOME/.claude/tasks"
if [ -d "$TASKS_DIR" ]; then
  STALE_COUNT=$( (find "$TASKS_DIR" -maxdepth 1 -type d -mtime +1 2>/dev/null || true) | wc -l )
  if [ "$STALE_COUNT" -gt 0 ]; then
    echo "⚠️  Found $STALE_COUNT stale task directories in ~/.claude/tasks/"
    echo "   These are leftover from crashed subagent sessions."
    echo "   Run: rm -rf ~/.claude/tasks/*  (safe — only affects dead sessions)"
  fi
fi

Report as info diagnostic: I002 | info | Stale subagent task directories found | Yes (--repair removes them) </stale_task_cleanup>