doc/plans/2026-02-20-issue-run-orchestration-plan.md
We observed cascaded wakeups on a single issue (for example PAP-39) that produced multiple runs at once:
- issue_commented
- issue_comment_mentioned

Current behavior is run-centric and agent-centric. It coalesces per-agent+task in heartbeat.wakeup, but does not enforce a single active execution slot per issue across all agents.
- heartbeat_runs.context_snapshot.issueId with run status queued or running.
- checkoutRunId on issues is a work-ownership lock, not an orchestration lock.
- Wakeups are created from several places (issues, approvals, agents) and all funnel through heartbeat.wakeup.

Use an explicit issue-level orchestration lock on issues.
- executionRunId: uuid | null (FK to heartbeat_runs.id, ON DELETE SET NULL)
- executionAgentNameKey: text | null (normalized lowercase/trimmed agent name)
- executionLockedAt: timestamptz | null

executionRunId is the canonical “who currently owns orchestration for this issue” field.
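As a rough illustration of the three fields, the sketch below models them as a TypeScript shape alongside the name normalization they rely on. The names `IssueExecutionLock`, `toAgentNameKey`, and `UNLOCKED` are hypothetical and used only for this example; the real shared type may look different.

```typescript
// Hypothetical shape of the lock fields on an issue (illustration only).
interface IssueExecutionLock {
  executionRunId: string | null;        // uuid, FK to heartbeat_runs.id
  executionAgentNameKey: string | null; // normalized agent name
  executionLockedAt: Date | null;       // timestamptz
}

// Normalize an agent name so that writes and comparisons agree:
// trimmed and lowercased, as described for executionAgentNameKey.
function toAgentNameKey(name: string): string {
  return name.trim().toLowerCase();
}

// An issue with no orchestration owner has all three fields null.
const UNLOCKED: IssueExecutionLock = {
  executionRunId: null,
  executionAgentNameKey: null,
  executionLockedAt: null,
};
```

Keeping normalization in one helper means the stored key and the key computed at wakeup time cannot drift apart.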
If a wakeup is issue-scoped and issues.executionRunId points to an active run whose executionAgentNameKey matches the waking agent name key:
- coalesced with reason issue_execution_same_name

If an issue has an active execution lock held by a different agent-name key:
- the wakeup request is deferred (deferred_issue_execution)

When the active issue run finishes, promote the oldest deferred request for that issue into a queued run and transfer executionRunId to it.
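The same-name, different-name, and promotion rules above can be sketched as two pure functions. This is a minimal sketch, not the actual heartbeat.wakeup code: `decideWakeup`, `pickNextDeferred`, `WakeupDecision`, and `DeferredRequest` (including its `requestedAt` field) are hypothetical names, and the sketch assumes the caller has already looked up the status of the run referenced by executionRunId.

```typescript
type RunStatus = "queued" | "running" | "succeeded" | "failed" | "cancelled";

interface IssueLock {
  executionRunId: string | null;
  executionAgentNameKey: string | null;
}

type WakeupDecision =
  | { kind: "create_run" }
  | { kind: "coalesce"; reason: "issue_execution_same_name" }
  | { kind: "defer"; reason: "issue_execution_deferred" };

// Decide what an issue-scoped wakeup should do, given the issue lock and the
// status of the run it points at (null if there is no such run).
function decideWakeup(
  lock: IssueLock,
  lockedRunStatus: RunStatus | null,
  wakingAgentName: string,
): WakeupDecision {
  const active =
    lock.executionRunId !== null &&
    (lockedRunStatus === "queued" || lockedRunStatus === "running");
  if (!active) return { kind: "create_run" };

  const key = wakingAgentName.trim().toLowerCase();
  return key === lock.executionAgentNameKey
    ? { kind: "coalesce", reason: "issue_execution_same_name" }
    : { kind: "defer", reason: "issue_execution_deferred" };
}

// Hypothetical deferred-request record; requestedAt is epoch milliseconds.
interface DeferredRequest {
  id: string;
  agentId: string;
  requestedAt: number;
}

// Pick the oldest deferred request to promote when the active run finishes.
function pickNextDeferred(deferred: DeferredRequest[]): DeferredRequest | null {
  if (deferred.length === 0) return null;
  return deferred.reduce((oldest, r) =>
    r.requestedAt < oldest.requestedAt ? r : oldest,
  );
}
```

Separating the decision from the database work keeps the branching logic easy to unit test without a transaction.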
For issue-scoped wakeups, run creation is done only while holding a transaction lock on the issue row. This ensures only one queued/running run can become owner at a time.
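The work done while the row lock is held might look like the sketch below. It models only the body of the critical section as a pure function: the caller is assumed to already hold the issue row lock (for example through SELECT ... FOR UPDATE inside a transaction). `claimExecution`, `IssueRow`, and the `statusOf` lookup are hypothetical names introduced for this example.

```typescript
type RunStatus = "queued" | "running" | "succeeded" | "failed" | "cancelled";

interface IssueRow {
  id: string;
  executionRunId: string | null;
  executionAgentNameKey: string | null;
  executionLockedAt: Date | null;
}

// Body of the critical section. Assumes the caller holds the row lock on
// `issue`, so no other wakeup can read or write these fields concurrently.
function claimExecution(
  issue: IssueRow,
  statusOf: (runId: string) => RunStatus | undefined,
  newRunId: string,
  agentName: string,
  now: Date,
): { issue: IssueRow; claimed: boolean } {
  let current = issue;

  // Repair a stale lock: if the referenced run is not queued or running,
  // clear all three lock fields before deciding.
  if (current.executionRunId !== null) {
    const status = statusOf(current.executionRunId);
    const active = status === "queued" || status === "running";
    if (!active) {
      current = {
        ...current,
        executionRunId: null,
        executionAgentNameKey: null,
        executionLockedAt: null,
      };
    }
  }

  // Another active run still owns the issue: do not claim.
  if (current.executionRunId !== null) {
    return { issue: current, claimed: false };
  }

  // The slot is free: the new run becomes the owner.
  return {
    issue: {
      ...current,
      executionRunId: newRunId,
      executionAgentNameKey: agentName.trim().toLowerCase(),
      executionLockedAt: now,
    },
    claimed: true,
  };
}
```

Because the check and the claim happen under the same row lock, two concurrent wakeups for one issue cannot both observe a free slot.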
- Add issue columns execution_run_id, execution_agent_name_key, execution_locked_at.
- Extend the Issue type in packages/shared/src/types/issue.ts.

heartbeat.wakeup:

- In enqueueWakeup, derive issueId from context/payload as today.
- If there is no issueId, keep existing behavior.
- If issueId exists:
  - SELECT ... FOR UPDATE on issue row
  - check executionRunId (if referenced run is not queued|running, clear lock)
  - agentNameKey = agent.name.trim().toLowerCase()

On run completion (succeeded, failed, cancelled, orphan reaped):

- if the finished run is referenced by issues.executionRunId, clear issue lock
- startNextQueuedRunForAgent(promotedAgentId)

issueId in payload/context snapshot.

heartbeat.wakeup:
- @CTO mention during an active assignee run does not create a concurrent active run

agent_wakeup_requests.reason:

- issue_execution_same_name
- issue_execution_deferred
- issue_execution_promoted

Feature flag (ISSUE_EXECUTION_LOCK_ENABLED), default off.

At most one active execution run (queued|running) per issue at once.

Checkout conflict logic should be corrected independently so assignees with checkoutRunId = null can acquire checkout by current run id without false 409 loops.