.taskmaster/docs/autonomous-tdd-git-workflow.md
Put the existing git and test workflows on rails: a repeatable, automated process that can run autonomously, with guardrails and a compact TUI for visibility.
Flow: for a selected task, create a branch named with the tag + task id → generate tests for the first subtask (red) using the Surgical Test Generator → implement code (green) → verify tests → commit → repeat per subtask → final verify → push → open PR against the default branch.
Build on existing rules: .cursor/rules/git_workflow.mdc, .cursor/rules/test_workflow.mdc, .claude/agents/surgical-test-generator.md, and existing CLI/core services.
Deterministic, resumable automation to execute the TDD loop per subtask with minimal human intervention.
Strong guardrails: never commit to the default branch; only commit when tests pass; enforce status transitions; persist logs/state for debuggability.
Visibility: a compact terminal UI (like lazygit) to pick tag, view tasks, and start work; right-side pane opens an executor terminal (via tmux) for agent coding.
Extensible: framework-agnostic test generation via the Surgical Test Generator; detect and use the repo’s test command for execution with coverage thresholds.
Full multi-language runner parity beyond detecting and running the project's test command.
Complex GUI; start with CLI/TUI + tmux pane. IDE/extension can hook into the same state later.
Rich executor selection UX (codex/gemini/claude) — we’ll prompt per run; defaults can come later.
One command can autonomously complete a task's subtasks via TDD and open a PR when done.
All commits made on a branch that includes the tag and task id (see Branch Naming); no commits to the default branch directly.
Every subtask iteration: failing tests added first (red), then code added to pass them (green), commit only after green.
End-to-end logs + artifacts stored in .taskmaster/reports/runs/<timestamp-or-id>/.
tm autopilot
As a developer, I can run tm autopilot <taskId> and watch a structured, safe workflow execute.
As a reviewer, I can inspect commits per subtask, and a PR summarizing the work when the task completes.
As an operator, I can see current step, active subtask, tests status, and logs in a compact CLI view and read a final run report.
# Developer starts
$ tm autopilot 42
→ Checks preflight: ✓ clean tree, ✓ npm test detected
→ Creates branch: analytics/task-42-user-metrics
→ Subtask 42.1: "Add metrics schema"
RED: generates test_metrics_schema.test.js → 3 failures
GREEN: implements schema.js → all pass
COMMIT: "feat(metrics): add metrics schema (task 42.1)"
→ Subtask 42.2: "Add collection endpoint"
RED: generates test_metrics_endpoint.test.js → 5 failures
GREEN: implements api/metrics.js → all pass
COMMIT: "feat(metrics): add collection endpoint (task 42.2)"
→ Subtask 42.3: "Add dashboard widget"
RED: generates test_metrics_widget.test.js → 4 failures
GREEN: implements components/MetricsWidget.jsx → all pass
COMMIT: "feat(metrics): add dashboard widget (task 42.3)"
→ Final: all 3 subtasks complete
✓ Run full test suite → all pass
✓ Coverage check → 85% (meets 80% threshold)
PUSH: confirms with user → pushed to origin
PR: opens #123 "Task #42 [analytics]: User metrics tracking"
✓ Task 42 complete. PR: https://github.com/org/repo/pull/123
Run report: .taskmaster/reports/runs/2025-01-15-142033/
$ tm autopilot 42
→ Subtask 42.2 GREEN phase: attempt 1 fails (2 tests still red)
→ Subtask 42.2 GREEN phase: attempt 2 fails (1 test still red)
→ Subtask 42.2 GREEN phase: attempt 3 fails (1 test still red)
⚠️ Paused: Could not achieve green state after 3 attempts
📋 State saved to: .taskmaster/reports/runs/2025-01-15-142033/
Last error: "POST /api/metrics returns 500 instead of 201"
Next steps:
- Review diff: git diff HEAD
- Inspect logs: cat .taskmaster/reports/runs/2025-01-15-142033/log.jsonl
- Check test output: cat .taskmaster/reports/runs/2025-01-15-142033/test-results/subtask-42.2-green-attempt3.json
- Resume after manual fix: tm autopilot --resume
# Developer manually fixes the issue, then:
$ tm autopilot --resume
→ Resuming subtask 42.2 GREEN phase
GREEN: all tests pass
COMMIT: "feat(metrics): add collection endpoint (task 42.2)"
→ Continuing to subtask 42.3...
$ tm autopilot 42 --dry-run
Autopilot Plan for Task #42 [analytics]: User metrics tracking
─────────────────────────────────────────────────────────────
Preflight:
✓ Working tree is clean
✓ Test command detected: npm test
✓ Tools available: git, gh, node, npm
✓ Current branch: main (will create new branch)
Branch & Tag:
→ Create branch: analytics/task-42-user-metrics
→ Set active tag: analytics
Subtasks (3 pending):
1. 42.1: Add metrics schema
- RED: generate tests in src/__tests__/schema.test.js
- GREEN: implement src/schema.js
- COMMIT: "feat(metrics): add metrics schema (task 42.1)"
2. 42.2: Add collection endpoint [depends on 42.1]
- RED: generate tests in src/api/__tests__/metrics.test.js
- GREEN: implement src/api/metrics.js
- COMMIT: "feat(metrics): add collection endpoint (task 42.2)"
3. 42.3: Add dashboard widget [depends on 42.2]
- RED: generate tests in src/components/__tests__/MetricsWidget.test.jsx
- GREEN: implement src/components/MetricsWidget.jsx
- COMMIT: "feat(metrics): add dashboard widget (task 42.3)"
Finalization:
→ Run full test suite with coverage
→ Push branch to origin (will confirm)
→ Create PR targeting main
Run without --dry-run to execute.
Pre‑flight
Verify clean working tree or confirm staging/commit policy (configurable).
Detect repo type and the project’s test command (e.g., npm test, pnpm test, pytest, go test).
Validate tools: git, gh (optional for PR), node/npm, and (if used) claude CLI.
Load TaskMaster state and selected task; if no subtasks exist, automatically run “expand” before working.
Branch & Tag Setup
Checkout default branch and update (optional), then create a branch using Branch Naming (below).
Map branch ↔ tag via existing tag management; explicitly set active tag to the branch’s tag.
Subtask Loop (for each pending/in-progress subtask in dependency order)
Select next eligible subtask using tm-core TaskService getNextTask() and subtask eligibility logic.
Red: generate or update failing tests for the subtask
Use the Surgical Test Generator system prompt (.claude/agents/surgical-test-generator.md) to produce high-signal tests following project conventions.
Run tests to confirm red; record results. If not red (already passing), skip to next subtask or escalate.
Green: implement code to pass tests
Use executor to implement changes (initial: claude CLI prompt with focused context).
Re-run tests until green or timeout/backoff policy triggers.
Commit: when green
Commit tests + code with conventional commit message. Optionally update subtask status to done.
Persist run step metadata/logs.
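The retry policy for the green phase described above might look like this sketch, where `invokeExecutor` and `runTests` are hypothetical stand-ins for the executor and test-runner adapters:

```javascript
// Sketch of the GREEN-phase retry policy: invoke the executor, re-run the
// targeted tests, and pause (instead of committing) after maxGreenAttempts.
// `invokeExecutor` and `runTests` are illustrative stand-ins.
async function greenPhase(subtaskId, { invokeExecutor, runTests, maxGreenAttempts = 3 }) {
  for (let attempt = 1; attempt <= maxGreenAttempts; attempt++) {
    await invokeExecutor(subtaskId, attempt);
    const result = await runTests(subtaskId);
    if (result.failed === 0) return { status: 'green', attempt };
  }
  // Caller persists state and surfaces the "paused" report shown earlier.
  return { status: 'paused', reason: 'max_attempts' };
}
```

The pause result is what feeds the resumable-state report shown in the failure transcript earlier in this document.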
Finalization
Run full test suite and coverage (if configured); optionally lint/format.
Commit any final adjustments.
Push branch (ask user to confirm); create PR (via gh pr create) targeting the default branch. Title format: Task #<id> [<tag>]: <title>.
Post‑Run
Update task status if desired (e.g., review).
Persist run report (JSON + markdown summary) to .taskmaster/reports/runs/<run-id>/.
Never commit to the default branch.
Commit only if all tests (targeted and suite) pass; allow override flags.
Enforce 80% coverage thresholds (lines/branches/functions/statements) by default; configurable.
Timebox model operations and retries; if not green within N attempts, pause with actionable state for resume.
Always log actions, commands, and outcomes; include dry-run mode.
Ask before branch creation, pushing, and opening a PR unless --no-confirm is set.
CLI: apps/cli provides command structure and UI components.
New command: tm autopilot (alias: task-master autopilot).
Reuse UI components under apps/cli/src/ui/components/ for headers/task details/next-task.
Core services: packages/tm-core
TaskService for selection, status, tags.
TaskExecutionService for prompt formatting and executor prep.
Executors: claude executor and ExecutorFactory to run external tools.
Proposed new: WorkflowOrchestrator to drive the autonomous loop and emit progress events.
Tag/Git utilities: scripts/modules/utils/git-utils.js and scripts/modules/task-manager/tag-management.js for branch→tag mapping and explicit tag switching.
Rules: .cursor/rules/git_workflow.mdc and .cursor/rules/test_workflow.mdc to steer behavior and ensure consistency.
Test generation prompt: .claude/agents/surgical-test-generator.md.
Orchestrator (tm-core): WorkflowOrchestrator (new)
State machine driving phases: Preflight → Branch/Tag → SubtaskIter (Red/Green/Commit) → Finalize → PR.
Exposes an evented API (progress events) that the CLI can render.
Stores run state artifacts.
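A minimal sketch of the phase transition table such a state machine could use. Phase names follow this document; the table itself is an assumption, not the actual WorkflowOrchestrator implementation:

```javascript
// Sketch: linear phase transitions, with SUBTASK_LOOP repeating until no
// eligible subtasks remain. Illustrative only.
const TRANSITIONS = {
  PREFLIGHT: 'BRANCH_SETUP',
  BRANCH_SETUP: 'SUBTASK_LOOP',
  SUBTASK_LOOP: 'FINALIZE', // taken when no eligible subtasks remain
  FINALIZE: 'PR',
  PR: 'DONE'
};

function nextPhase(current, { subtasksRemaining = 0 } = {}) {
  if (current === 'SUBTASK_LOOP' && subtasksRemaining > 0) return 'SUBTASK_LOOP';
  const next = TRANSITIONS[current];
  if (!next) throw new Error(`No transition from ${current}`);
  return next;
}
```

Keeping transitions in a plain table makes resumability cheap: the checkpoint only has to record the current phase plus the active subtask.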
Test Runner Adapter
Detects and runs tests via the project’s test command (e.g., npm test), with targeted runs where feasible.
API: runTargeted(files/pattern), runAll(), report summary (failures, duration, coverage), enforce 80% threshold by default.
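The adapter's report summary could be normalized from raw runner output like this; the input field names resemble jest/vitest JSON output but are an assumption here, not a guaranteed API:

```javascript
// Sketch: normalize a raw runner result into the summary shape used by the
// run reports in this document ({ total, passed, failed, skipped }).
// The input field names are an assumption.
function summarize(raw) {
  const passed = raw.numPassedTests ?? 0;
  const failed = raw.numFailedTests ?? 0;
  const skipped = raw.numSkippedTests ?? 0;
  return { total: passed + failed + skipped, passed, failed, skipped };
}
```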
Git/PR Adapter
Encapsulates git ops: branch create/checkout, add/commit, push.
Optional gh integration to open PR; fallback to instructions if gh unavailable.
Confirmation gates for branch creation and pushes.
Prompt/Exec Adapter
Run State + Reporting
JSONL log of steps, timestamps, commands, test results.
Markdown summary for PR description and post-run artifact.
Command: tm autopilot [taskId]
Flags: --dry-run, --no-push, --no-pr, --no-confirm, --force, --max-attempts <n>, --runner <auto|custom>, --commit-scope <scope>
Output: compact header (project, tag, branch), current phase, subtask line, last test summary, next actions.
Resume: If interrupted, tm autopilot --resume picks up from last checkpoint in run state.
Left pane: Tag selector, task list (status/priority), start/expand shortcuts; "Start" triggers the next task or a selected task.
Right pane: Executor terminal (tmux split) that runs the coding agent (claude-code/codex). Autopilot can hand over to the right pane during green.
MCP integration: use MCP tools for task queries/updates and for shell/test invocations where available.
┌─────────────────────────────────────┬──────────────────────────────────┐
│ Task Navigator (left) │ Executor Terminal (right) │
│ │ │
│ Project: my-app │ $ tm autopilot --executor-mode │
│ Branch: analytics/task-42 │ > Running subtask 42.2 GREEN... │
│ Tag: analytics │ > Implementing endpoint... │
│ │ > Tests: 3 passed, 0 failed │
│ Tasks: │ > Ready to commit │
│ → 42 [in-progress] User metrics │ │
│ → 42.1 [done] Schema │ [Live output from Claude Code] │
│ → 42.2 [active] Endpoint ◀ │ │
│ → 42.3 [pending] Dashboard │ │
│ │ │
│ [s] start [p] pause [q] quit │ │
└─────────────────────────────────────┴──────────────────────────────────┘
apps/cli/src/ui/tui/navigator.ts (new, uses blessed or ink)
tmux split-window -h running tm autopilot --executor-mode
.taskmaster/state/current-run.json + file watching or event streams
Keybindings:
s - Start selected task
p - Pause/resume current run
q - Quit (with confirmation if run active)
↑/↓ - Navigate task list
Enter - Expand/collapse subtasks
Prompts are composed in three layers:
Base rules (loaded in order from .cursor/rules/ and .claude/agents/):
git_workflow.mdc → git commit conventions, branch policy, PR guidelines
test_workflow.mdc → TDD loop requirements, coverage thresholds, test structure
surgical-test-generator.md → test generation methodology, project-specific test patterns
Task context injection:
You are implementing:
Task #42 [analytics]: User metrics tracking
Subtask 42.2: Add collection endpoint
Description:
Implement POST /api/metrics endpoint to collect user metrics events
Acceptance criteria:
- POST /api/metrics accepts { userId, eventType, timestamp }
- Validates input schema (reject missing/invalid fields)
- Persists to database
- Returns 201 on success with created record
- Returns 400 on validation errors
Dependencies:
- Subtask 42.1 (metrics schema) is complete
Current phase: RED (generate failing tests)
Test command: npm test
Test file convention: src/**/*.test.js (vitest framework detected)
Branch: analytics/task-42-user-metrics
Project language: JavaScript (Node.js)
Phase-specific instructions:
"…src/. Only modify files necessary for this subtask. Keep changes focused and reviewable."
[Contents of .cursor/rules/git_workflow.mdc]
[Contents of .cursor/rules/test_workflow.mdc]
[Contents of .claude/agents/surgical-test-generator.md]
<TASK CONTEXT>
You are implementing:
Task #42.2: Add collection endpoint
Description:
Implement POST /api/metrics endpoint to collect user metrics events
Acceptance criteria:
- POST /api/metrics accepts { userId, eventType, timestamp }
- Validates input schema (reject missing/invalid fields)
- Persists to database using MetricsSchema from subtask 42.1
- Returns 201 on success with created record
- Returns 400 on validation errors with details
Dependencies: Subtask 42.1 (metrics schema) is complete
<INSTRUCTION>
Generate failing tests for this subtask. Follow project conventions:
- Test file: src/api/__tests__/metrics.test.js
- Framework: vitest (detected from package.json)
- Test cases to cover:
* POST /api/metrics with valid payload → should return 201 (will fail: endpoint not implemented)
* POST /api/metrics with missing userId → should return 400 (will fail: validation not implemented)
* POST /api/metrics with invalid timestamp → should return 400 (will fail: validation not implemented)
* POST /api/metrics should persist to database → should save record (will fail: persistence not implemented)
Do NOT implement the endpoint code yet. Only create test file(s).
Confirm tests fail with messages like "Cannot POST /api/metrics" or "endpoint not defined".
Output format:
1. File path to create: src/api/__tests__/metrics.test.js
2. Complete test code
3. Command to run: npm test src/api/__tests__/metrics.test.js
[Contents of .cursor/rules/git_workflow.mdc]
[Contents of .cursor/rules/test_workflow.mdc]
<TASK CONTEXT>
Task #42.2: Add collection endpoint
[same context as RED phase]
<CURRENT STATE>
Tests created in RED phase:
- src/api/__tests__/metrics.test.js
- 5 tests written, all failing as expected
Test output:
FAIL src/api/__tests__/metrics.test.js
  POST /api/metrics
    ✗ should return 201 with valid payload (endpoint not found)
    ✗ should return 400 with missing userId (endpoint not found)
    ✗ should return 400 with invalid timestamp (endpoint not found)
    ✗ should persist to database (endpoint not found)
<INSTRUCTION>
Implement minimal code to make all tests pass.
Guidelines:
- Create/modify file: src/api/metrics.js
- Use existing patterns from src/api/ (e.g., src/api/users.js for reference)
- Import MetricsSchema from subtask 42.1 (src/models/schema.js)
- Implement validation, persistence, and response handling
- Follow project error handling conventions
- Keep implementation focused on this subtask only
After implementation:
1. Run tests: npm test src/api/__tests__/metrics.test.js
2. Confirm all 5 tests pass
3. Report results
Output format:
1. File(s) created/modified
2. Implementation code
3. Test command and results
See .taskmaster/config.json → prompts section for paths and load order.
{
"autopilot": {
"enabled": true,
"requireCleanWorkingTree": true,
"commitTemplate": "{type}({scope}): {msg}",
"defaultCommitType": "feat",
"maxGreenAttempts": 3,
"testTimeout": 300000
},
"test": {
"runner": "auto",
"coverageThresholds": {
"lines": 80,
"branches": 80,
"functions": 80,
"statements": 80
},
"targetedRunPattern": "**/*.test.js"
},
"git": {
"branchPattern": "{tag}/task-{id}-{slug}",
"pr": {
"enabled": true,
"base": "default",
"bodyTemplate": ".taskmaster/templates/pr-body.md"
}
},
"prompts": {
"rulesPath": ".cursor/rules",
"testGeneratorPath": ".claude/agents/surgical-test-generator.md",
"loadOrder": ["git_workflow.mdc", "test_workflow.mdc"]
}
}
enabled (boolean): Enable/disable autopilot functionality
requireCleanWorkingTree (boolean): Require clean git state before starting
commitTemplate (string): Template for commit messages (tokens: {type}, {scope}, {msg})
defaultCommitType (string): Default commit type (feat, fix, chore, etc.)
maxGreenAttempts (number): Maximum retry attempts to achieve green tests (default: 3)
testTimeout (number): Timeout in milliseconds per test run (default: 300000 = 5 min)
runner (string): Test runner detection mode ("auto" or an explicit command like "npm test")
coverageThresholds (object): Minimum coverage percentages required
lines, branches, functions, statements (number): Threshold percentages (0-100)
targetedRunPattern (string): Glob pattern for targeted subtask test runs
branchPattern (string): Branch naming pattern (tokens: {tag}, {id}, {slug})
pr.enabled (boolean): Enable automatic PR creation
pr.base (string): Target branch for PRs ("default" uses the repo default, or specify one like "main")
pr.bodyTemplate (string): Path to PR body template file (optional)
rulesPath (string): Directory containing rule files (e.g., .cursor/rules)
testGeneratorPath (string): Path to test generator prompt file
loadOrder (array): Order to load rule files from rulesPath
# Required for executor
ANTHROPIC_API_KEY=sk-ant-... # Claude API key
# Optional: for PR creation
GITHUB_TOKEN=ghp_... # GitHub personal access token
# Optional: for other executors (future)
OPENAI_API_KEY=sk-...
GOOGLE_API_KEY=...
Each autopilot run creates a timestamped directory with complete traceability:
.taskmaster/reports/runs/2025-01-15-142033/
├── manifest.json # run metadata (task id, start/end time, status)
├── log.jsonl # timestamped event stream
├── commits.txt # list of commit SHAs made during run
├── test-results/
│ ├── subtask-42.1-red.json
│ ├── subtask-42.1-green.json
│ ├── subtask-42.2-red.json
│ ├── subtask-42.2-green-attempt1.json
│ ├── subtask-42.2-green-attempt2.json
│ ├── subtask-42.2-green-attempt3.json
│ └── final-suite.json
└── pr.md # generated PR body
{
"runId": "2025-01-15-142033",
"taskId": "42",
"tag": "analytics",
"branch": "analytics/task-42-user-metrics",
"startTime": "2025-01-15T14:20:33Z",
"endTime": "2025-01-15T14:45:12Z",
"status": "completed",
"subtasksCompleted": ["42.1", "42.2", "42.3"],
"subtasksFailed": [],
"totalCommits": 3,
"prUrl": "https://github.com/org/repo/pull/123",
"finalCoverage": {
"lines": 85.3,
"branches": 82.1,
"functions": 88.9,
"statements": 85.0
}
}
Event stream in JSON Lines format for easy parsing and debugging:
{"ts":"2025-01-15T14:20:33Z","phase":"preflight","status":"ok","details":{"testCmd":"npm test","gitClean":true}}
{"ts":"2025-01-15T14:20:45Z","phase":"branch","status":"ok","branch":"analytics/task-42-user-metrics"}
{"ts":"2025-01-15T14:21:00Z","phase":"red","subtask":"42.1","status":"ok","tests":{"failed":3,"passed":0}}
{"ts":"2025-01-15T14:22:15Z","phase":"green","subtask":"42.1","status":"ok","tests":{"passed":3,"failed":0},"attempts":2}
{"ts":"2025-01-15T14:22:20Z","phase":"commit","subtask":"42.1","status":"ok","sha":"a1b2c3d","message":"feat(metrics): add metrics schema (task 42.1)"}
{"ts":"2025-01-15T14:23:00Z","phase":"red","subtask":"42.2","status":"ok","tests":{"failed":5,"passed":0}}
{"ts":"2025-01-15T14:25:30Z","phase":"green","subtask":"42.2","status":"error","tests":{"passed":3,"failed":2},"attempts":3,"error":"Max attempts reached"}
{"ts":"2025-01-15T14:25:35Z","phase":"pause","reason":"max_attempts","nextAction":"manual_review"}
Each test run stores detailed results:
{
"subtask": "42.2",
"phase": "green",
"attempt": 3,
"timestamp": "2025-01-15T14:25:30Z",
"command": "npm test src/api/__tests__/metrics.test.js",
"exitCode": 1,
"duration": 2340,
"summary": {
"total": 5,
"passed": 3,
"failed": 2,
"skipped": 0
},
"failures": [
{
"test": "POST /api/metrics should return 201 with valid payload",
"error": "Expected status 201, got 500",
"stack": "..."
}
],
"coverage": {
"lines": 78.5,
"branches": 75.0,
"functions": 80.0,
"statements": 78.5
}
}
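The coverage gate that decides whether a commit is allowed could compare these numbers against the configured thresholds; `coverageGate` is a sketch under that assumption, not an existing API:

```javascript
// Sketch: compare a run's coverage against configured thresholds and report
// which metrics fall short, so the pause message can be actionable.
function coverageGate(coverage, thresholds) {
  const short = Object.entries(thresholds)
    .filter(([metric, min]) => (coverage[metric] ?? 0) < min)
    .map(([metric]) => metric);
  return { pass: short.length === 0, short };
}
```

Returning the list of failing metrics (rather than a bare boolean) lets the run report say exactly why the gate blocked the commit.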
The autopilot system uses an orchestration model rather than direct code execution:
Orchestrator Role (tm-core WorkflowOrchestrator):
Executor Role (Claude Code/AI session via MCP):
Why This Approach?
Example Flow:
// Claude Code (via MCP) queries orchestrator
const workUnit = await orchestrator.getNextWorkUnit('42');
// => {
// phase: 'RED',
// subtask: '42.1',
// action: 'Generate failing tests for metrics schema',
// context: { title, description, dependencies, testFile: 'src/__tests__/schema.test.js' }
// }
// Claude Code executes the work (writes test file, runs tests)
// Then reports back
await orchestrator.completeWorkUnit('42', '42.1', 'RED', {
success: true,
testsCreated: ['src/__tests__/schema.test.js'],
testsFailed: 3
});
// Query again for next phase
const nextWorkUnit = await orchestrator.getNextWorkUnit('42');
// => { phase: 'GREEN', subtask: '42.1', action: 'Implement code to pass tests', ... }
Decision: Commit after each subtask's green state, not after the entire task.
Rationale:
Trade-off: More commits per task (can use squash-merge in PRs if desired)
Decision: Sequential subtask execution in Phase 1; parallel execution deferred to Phase 3.
Rationale:
Trade-off: Slower for truly independent subtasks (mitigated by keeping subtasks small and focused)
Decision: Enforce 80% coverage threshold (lines/branches/functions/statements) before allowing commits.
Rationale:
Configurable via .taskmaster/config.json if too strict.
Trade-off: May require more test generation iterations; can be lowered per project
Decision: MVP uses tmux split panes for TUI, not Electron/web-based GUI.
Rationale:
Trade-off: Less visual polish than GUI; requires tmux familiarity
Decision: Start with Claude executor only; add others in Phase 2+.
Rationale:
Trade-off: Users locked to Claude initially; can work around with manual executor selection
Model hallucination/large diffs: restrict prompt scope; enforce minimal changes; show diff previews (optional) before commit.
Flaky tests: allow retries, isolate targeted runs for speed, then full suite before commit.
Environment variability: detect runners/tools; provide fallbacks and actionable errors.
PR creation fails: still push and print manual commands; persist PR body to reuse.
Slugging rules for branch names: any length limits or normalization beyond {slug} token sanitization?
PR body standard sections beyond run report (e.g., checklist, coverage table)?
Default executor prompt fine-tuning once codex/gemini integration is available.
Where to store persistent TUI state (pane layout, last selection) in .taskmaster/state.json?
Include both the tag and the task id in the branch name to make lineage explicit.
Default pattern: <tag>/task-<id>[-slug] (e.g., master/task-12, tag-analytics/task-4-user-auth).
Configurable via .taskmaster/config.json: git.branchPattern supports tokens {tag}, {id}, {slug}.
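A sketch of rendering that pattern follows. The slugging rules here (lowercase, dash-separated, 40-character cap) are an assumption, since slug normalization is still listed as an open question in this document:

```javascript
// Sketch: render git.branchPattern with {tag}, {id}, {slug} tokens.
// Slugging rules are assumed, not specified by this document.
function slugify(title, maxLen = 40) {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')  // collapse non-alphanumerics to dashes
    .replace(/^-+|-+$/g, '')      // trim leading/trailing dashes
    .slice(0, maxLen);
}

function branchName(pattern, { tag, id, title }) {
  return pattern
    .replace('{tag}', tag)
    .replace('{id}', String(id))
    .replace('{slug}', slugify(title));
}
```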
Use the repository’s default branch (detected via git) unless overridden.
Title format: Task #<id> [<tag>]: <title>.
Functional nodes (capabilities):
Autopilot Orchestration → drives TDD loop and lifecycle
Test Generation (Surgical) → produces failing tests from subtask context
Test Execution + Coverage → runs suite, enforces thresholds
Git/Branch/PR Management → safe operations and PR creation
TUI/Terminal Integration → interactive control and visibility via tmux
MCP Integration → structured task/status/context operations
Structural nodes (code organization):
packages/tm-core:
services/workflow-orchestrator.ts (new)
services/test-runner-adapter.ts (new)
services/git-adapter.ts (new)
existing: task-service.ts, task-execution-service.ts, executors/*
apps/cli:
src/commands/autopilot.command.ts (new)
src/ui/tui/ (new tmux/TUI helpers)
scripts/modules:
.claude/agents/:
Edges (data/control flow):
Autopilot → Test Generation → Test Execution → Git Commit → loop
Autopilot → Git Adapter (branch, tag, PR)
Autopilot → TUI (event stream) → tmux pane control
Autopilot → MCP tools for task/status updates
Test Execution → Coverage gate → Autopilot decision
Topological traversal (implementation order):
Git/Test adapters (foundations)
Orchestrator skeleton + events
CLI autopilot command and dry-run
Surgical test-gen integration and execution gate
PR creation, run reports, resumability
Phase 0: Spike
Implement CLI skeleton tm autopilot with dry-run showing planned steps from a real task + subtasks.
Detect test runner (package.json) and git state; render a preflight report.
Phase 1: Core Rails (State Machine & Orchestration)
Implement WorkflowOrchestrator in tm-core as a state machine that tracks TDD phases per subtask.
Orchestrator guides the current AI session (Claude Code/MCP client) rather than executing code itself.
Add Git/Test adapters for status checks and validation (not direct execution).
WorkflowOrchestrator API:
getNextWorkUnit(taskId) → returns next phase to execute (RED/GREEN/COMMIT) with context
completeWorkUnit(taskId, subtaskId, phase, result) → records completion and advances state
getRunState(taskId) → returns current progress and resumability data
MCP integration: expose work unit endpoints so Claude Code can query "what to do next" and report back.
Branch/tag mapping via existing tag-management APIs.
Run report persisted under .taskmaster/reports/runs/ with state checkpoints for resumability.
Phase 2: PR + Resumability
Add gh PR creation with well-formed body using the run report.
Introduce resumable checkpoints and --resume flag.
Add coverage enforcement and optional lint/format step.
Phase 3: Extensibility + Guardrails
Add support for basic pytest/go test adapters.
Add safeguards: diff preview mode, manual confirm gates, aggressive minimal-change prompts.
Optional: small TUI panel and extension panel leveraging the same run state file.
Test Workflow: .cursor/rules/test_workflow.mdc
Git Workflow: .cursor/rules/git_workflow.mdc
CLI: apps/cli/src/commands/start.command.ts, apps/cli/src/ui/components/*.ts
Core Services: packages/tm-core/src/services/task-service.ts, task-execution-service.ts
Executors: packages/tm-core/src/executors/*
Git Utilities: scripts/modules/utils/git-utils.js
Tag Management: scripts/modules/task-manager/tag-management.js
Surgical Test Generator: .claude/agents/surgical-test-generator.md