`docs/design/convoy/convoy-lifecycle.md`

Making convoys actively converge on completion.
```mermaid
flowchart TB
    %% ---- Creation ----
    create_manual(["gt convoy create"])
    create_sling(["gt sling
    (auto-convoy)"])
    create_formula(["gt formula run
    (convoy type)"])
    create_manual --> convoy["Convoy bead created
    (hq-cv-*)"]
    create_sling --> convoy
    create_formula --> convoy
    convoy --> track["Track issues
    via dep add --type=tracks"]
    %% ---- Dispatch ----
    track --> dispatch["gt sling dispatches
    work to polecats"]
    dispatch --> working["Polecats execute
    tracked issues"]
    %% ---- Completion detection ----
    working --> issue_close["Issue closes
    (gt done / bd close)"]
    issue_close --> check{"All tracked
    issues closed?"}
    check -->|No| feed["Feed next ready issue
    to available polecat"]
    feed --> working
    check -->|Yes| close["Close convoy +
    send notifications"]
    close --> done(["Convoy landed"])
    %% ---- Daemon observer (event-driven + stranded scan) ----
    issue_close -.-> obs_daemon["Daemon ConvoyManager
    Event poll (5s) + CheckConvoysForIssue
    Stranded scan (30s) as safety net"]
    obs_daemon -.-> check
    %% ---- Manual overrides ----
    force_close(["gt convoy close --force"]) -.-> close
    land(["gt convoy land
    (owned convoys)"]) -.-> close
    classDef successNode fill:#27ae60,color:#fff,stroke:none
    classDef extNode fill:#34495e,color:#ecf0f1,stroke:none
    class done successNode
    class obs_daemon extNode
```
Three creation paths feed into the same lifecycle. Completion is event-driven via the daemon's ConvoyManager, which runs two goroutines:

- The event poll calls `GetAllEventsSince`, detects close events, and calls `convoy.CheckConvoysForIssue`, which both checks completion and feeds the next ready issue to a polecat.
- The stranded scan runs `gt convoy stranded --json` to catch convoys missed by the event-driven path (e.g. after crash/restart). It feeds ready work or auto-closes empty convoys.

Manual overrides (`close --force`, `land`) bypass the check entirely.
History: Witness and Refinery observers were originally planned as redundant observers but were removed (spec S-04, S-05). The daemon's multi-rig event poll + stranded scan provide sufficient coverage.
## What `gt sling` actually does

`gt sling` auto-creates a convoy for every bead it dispatches, unless `--no-convoy` is passed. The behavior differs significantly between single-bead and multi-bead (batch) sling.

Single bead:

```shell
gt sling sh-task-1 gastown
```

Unless `sh-task-1` is already tracked by an open convoy, this creates a convoy titled "Work: <issue-title>" tracking that single bead. Result: 1 bead, 1 convoy, 1 polecat.
Batch:

```shell
gt sling gt-task-1 gt-task-2 gt-task-3
```

The rig is auto-resolved from the beads' prefixes via `routes.jsonl` (`resolveRigFromBeadIDs` in `sling_batch.go`). All beads must resolve to the same rig. An explicit rig arg still works but prints a deprecation warning:

```shell
gt sling gt-task-1 gt-task-2 gt-task-3 gastown
# Deprecation: gt sling now auto-resolves the rig from bead prefixes.
# You no longer need to explicitly specify <gastown>.
```
Batch sling creates one convoy tracking all beads. Before spawning any polecats, `runBatchSling` (`sling_batch.go`) calls `createBatchConvoy` (`sling_convoy.go`), which creates a single convoy with title "Batch: N beads to <rig>" and adds `tracks` deps for all beads.

Result: 3 beads, 1 convoy, 3 polecats, all dispatched in parallel with 2-second delays between spawns.
The convoy ID and merge strategy are stored on each bead via `beadFieldUpdates`, so `gt done` can find the convoy via the fast path.

There is no upper limit on the number of beads: `gt sling <10 beads>` spawns 10 polecats sharing 1 convoy. The only throttle is `--max-concurrent` (default 0 = unlimited).
When the rig is explicit, the cross-rig guard checks each bead's prefix against the target rig. On mismatch, it errors with batch-specific suggested actions (remove the bead, sling it separately, or `--force`). When auto-resolving, `resolveRigFromBeadIDs` errors if the beads' prefixes don't all resolve to the same rig via `routes.jsonl` (including town-level `path="."` entries).

If any bead in the batch is already tracked by another convoy, batch sling errors before spawning any polecats.
Blocked beads are not filtered at dispatch: if `gt-task-2` has a `blocks` dep on `gt-task-1`, both get slung. The `isIssueBlocked` check only applies to daemon-driven convoy feeding (after a close event), not to the initial batch sling dispatch. The daemon checks `IsSlingableType` and `isIssueBlocked` before dispatching the next ready issue from the shared convoy.

Batch sling with `--no-convoy` skips convoy creation entirely:

```shell
gt sling gt-task-1 gt-task-2 gt-task-3 gastown --no-convoy
```
Convoys are passive trackers. They group work but don't drive it. The completion loop has a structural gap:

```
Create → Assign → Execute → Issues close → ??? → Convoy closes
```

The ??? is "Deacon patrol runs `gt convoy check`", a poll-based single point of failure. When Deacon is down, convoys don't close. Work completes but the loop never lands.

What exists today:

- `gt convoy status` shows progress
- `gt convoy stranded` finds unassigned work
- `gt convoy check` auto-closes completed convoys
- `gt convoy close --force` exists, but some docs still describe it as missing

Convoy completion should be event-driven. When an issue closes, check if it's tracked by a convoy:
```mermaid
flowchart TD
    close["Issue closes"] --> tracked{"Tracked by convoy?"}
    tracked -->|No| done1(["Done"])
    tracked -->|Yes| check["Run gt convoy check"]
    check --> all{"All tracked
    issues closed?"}
    all -->|No| done2(["Done"])
    all -->|Yes| notify["Close convoy +
    send notifications"]
```
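The decision step above can be sketched in Go. The `Convoy` struct and `checkConvoyForIssue` are illustrative stand-ins for the real `CheckConvoysForIssue` and bead schema, but the logic mirrors the flow: ignore untracked issues, close only when every tracked issue is closed, and make re-closing a no-op.

```go
package main

import "fmt"

// Convoy is a minimal model for the completion check; field names are
// illustrative, not the actual bead schema.
type Convoy struct {
	ID     string
	Closed bool
	Tracks []string // tracked issue IDs
}

// checkConvoyForIssue returns true if this call closed the convoy.
// Closing an already-closed convoy is a no-op, so the check is idempotent.
func checkConvoyForIssue(c *Convoy, closedIssue string, isClosed func(string) bool) bool {
	tracked := false
	for _, id := range c.Tracks {
		if id == closedIssue {
			tracked = true
			break
		}
	}
	if !tracked || c.Closed {
		return false
	}
	for _, id := range c.Tracks {
		if !isClosed(id) {
			return false // open work remains: feed next ready issue instead
		}
	}
	c.Closed = true // the real flow also sends notifications here
	return true
}

func main() {
	closed := map[string]bool{"gt-1": true, "gt-2": true}
	c := &Convoy{ID: "hq-cv-demo", Tracks: []string{"gt-1", "gt-2"}}
	fmt.Println(checkConvoyForIssue(c, "gt-2", func(id string) bool { return closed[id] }))
}
```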
Implementation: the daemon's ConvoyManager event poll detects close events via the SDK's `GetAllEventsSince` across all rig stores + hq. This catches all closes regardless of source (CLI, witness, refinery, manual).

The daemon's ConvoyManager is the sole convoy observer, running two independent goroutines:
| Loop | Trigger | What it does |
|---|---|---|
| Event poll | `GetAllEventsSince` every 5s (all rig stores + hq) | Detects close events, calls `CheckConvoysForIssue` |
| Stranded scan | `gt convoy stranded --json` every 30s | Feeds first ready issue via `gt sling`, auto-closes empty convoys |
Both loops are context-cancellable. The shared `CheckConvoysForIssue` function is idempotent — closing an already-closed convoy is a no-op.
History: The original design called for three redundant observers (Daemon, Witness, Refinery) per the "Redundant Monitoring Is Resilience" principle. Witness observers were removed (spec S-04) because convoy tracking is orthogonal to polecat lifecycle management. Refinery observers were removed (spec S-05) after S-17 found they were silently broken (wrong root path) with no visible impact, confirming single-observer coverage is sufficient.
Convoys are rig-agnostic. A convoy like `hq-cv-6vjz2` lives in the hq store and tracks issue IDs like `sh-pb6sa`, but it doesn't store which rig that issue belongs to. The rig association is resolved at dispatch time via two lookups:

1. `sh-pb6sa` → `sh-` (string parsing)
2. `sh-` looked up in `~/gt/.beads/routes.jsonl` → finds `{"prefix":"sh-","path":"gastown/.beads"}` → rig name is `gastown`

This happens in `feedFirstReady` (stranded scan path) and `feedNextReadyIssue` (event poll path) just before calling `gt sling`. Issues with prefixes that don't appear in `routes.jsonl` (or that map to `path="."` like `hq-*`) are skipped — see `isSlingableBead()`.
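A minimal sketch of the two lookups, under illustrative names (`Route`, `resolveRig`) rather than the real `resolveRigFromBeadIDs`:

```go
package main

import (
	"fmt"
	"strings"
)

// Route mirrors one line of routes.jsonl (field names assumed).
type Route struct {
	Prefix string
	Path   string
}

// resolveRig performs the two-step lookup: extract the bead's prefix, then
// match it against routes. Unknown prefixes and town-level path "." entries
// (e.g. hq-*) return ok=false, i.e. the issue is skipped as non-slingable.
func resolveRig(beadID string, routes []Route) (rig string, ok bool) {
	i := strings.Index(beadID, "-")
	if i < 0 {
		return "", false
	}
	prefix := beadID[:i+1] // "sh-pb6sa" -> "sh-"
	for _, r := range routes {
		if r.Prefix == prefix {
			if r.Path == "." {
				return "", false // town-level store: skip
			}
			// "gastown/.beads" -> rig name "gastown"
			return strings.SplitN(r.Path, "/", 2)[0], true
		}
	}
	return "", false // prefix not in routes: skip
}

func main() {
	routes := []Route{
		{Prefix: "sh-", Path: "gastown/.beads"},
		{Prefix: "hq-", Path: "."},
	}
	rig, ok := resolveRig("sh-pb6sa", routes)
	fmt.Println(rig, ok) // gastown true
}
```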
Both paths check `isRigParked` after resolving the rig name. Issues targeting parked rigs are logged and skipped rather than dispatched.
`gt convoy close` is implemented, including `--force` for abandoned convoys.

```shell
# Close a completed convoy
gt convoy close hq-cv-abc

# Force-close an abandoned convoy
gt convoy close hq-cv-xyz --reason="work done differently"

# Close with explicit notification
gt convoy close hq-cv-abc --notify mayor/
```
Use cases:
Track who requested the convoy for targeted notifications:

```shell
gt convoy create "Feature X" gt-abc --owner mayor/ --notify overseer
```
| Field | Purpose |
|---|---|
| `owner` | Who requested (gets completion notification) |
| `notify` | Additional subscribers |

If `owner` is not specified, it defaults to the creator (from `created_by`).
```
OPEN ──(all issues close)──► CLOSED
  │                             │
  │                             ▼
  │                       (add issues)
  │                             │
  └─────────────────────────────┘
         (auto-reopens)
```

Adding issues to a closed convoy reopens it automatically.
New state for abandonment:

```
OPEN ──► CLOSED (completed)
  │
  └────► ABANDONED (force-closed without completion)
```
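The proposed three-state lifecycle, including auto-reopen, could be modeled as a small transition function (names are illustrative, not existing gt code):

```go
package main

import (
	"errors"
	"fmt"
)

// State models the proposed convoy lifecycle.
type State string

const (
	Open      State = "open"
	Closed    State = "closed"
	Abandoned State = "abandoned"
)

// transition encodes the diagram above: all-issues-closed completes an open
// convoy, force-close abandons it, and adding an issue to a closed convoy
// auto-reopens it. Everything else is rejected.
func transition(s State, event string) (State, error) {
	switch {
	case s == Open && event == "all_closed":
		return Closed, nil
	case s == Open && event == "force_close":
		return Abandoned, nil
	case s == Closed && event == "add_issue":
		return Open, nil // auto-reopen
	default:
		return s, errors.New("no transition for " + string(s) + " on " + event)
	}
}

func main() {
	s, _ := transition(Open, "force_close")
	fmt.Println(s) // abandoned
}
```

Note that ABANDONED is terminal in this sketch: adding issues does not revive a force-closed convoy, which keeps the "work done differently" record intact.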
Optional `due_at` field for convoy deadline:

```shell
gt convoy create "Sprint work" gt-abc --due="2026-01-15"
```

Overdue convoys surface in `gt convoy stranded --overdue`.
`gt convoy close`

```
gt convoy close <convoy-id> [--reason=<reason>] [--notify=<agent>]
```

- `--force` closes even when tracked issues remain open
- sets the `close_reason` field

`gt convoy check`

```shell
# Check all convoys (current behavior)
gt convoy check

# Check specific convoy (new)
gt convoy check <convoy-id>

# Dry-run mode
gt convoy check --dry-run
```

`gt convoy reopen`

```shell
gt convoy reopen <convoy-id>
```

Explicit reopen for clarity (currently implicit via add).
Core convoy manager is fully implemented and tested (see spec.md stories S-01 through S-18, all DONE). Remaining future work:
| Component | File |
|---|---|
| Convoy command | `internal/cmd/convoy.go` |
| Auto-convoy (sling) | `internal/cmd/sling_convoy.go` |
| Convoy operations | `internal/convoy/operations.go` (`CheckConvoysForIssue`, `feedNextReadyIssue`) |
| Daemon manager | `internal/daemon/convoy_manager.go` |
| Formula convoy | `internal/cmd/formula.go` (`executeConvoyFormula`) |