docs/concepts/queue.md
We serialize inbound auto-reply runs (all channels) through a tiny in-process queue to prevent multiple agent runs from colliding, while still allowing safe parallelism across sessions.
runEmbeddedPiAgent enqueues by session key (lane session:<key>) to guarantee only one active run per session.main by default) so overall parallelism is capped by agents.defaults.maxConcurrent.When unset, all inbound channel surfaces use:
mode: "steer"debounceMs: 500cap: 20drop: "summarize"steer is the default because it keeps the active model turn responsive without
starting a second session run. It drains all steering messages that arrived
before the next model boundary. If the current run cannot accept steering,
OpenClaw falls back to a followup queue entry.
Inbound messages can steer the current run, wait for a followup turn, or do both:
steer: queue steering messages into the active runtime. Pi delivers all pending steering messages after the current assistant turn finishes executing its tool calls, before the next LLM call; Codex app-server receives one batched turn/steer. If the run is not actively streaming or steering is unavailable, OpenClaw falls back to a followup queue entry.queue (legacy): old one-at-a-time steering. Pi delivers one queued steering message at each model boundary; Codex app-server receives separate turn/steer requests. Prefer steer unless you need the previous serialized behavior.followup: enqueue each message for a later agent turn after the current run ends.collect: coalesce queued messages into a single followup turn after the quiet window. If messages target different channels/threads, they drain individually to preserve routing.steer-backlog (aka steer+backlog): steer now and preserve the same message for a followup turn.interrupt (legacy): abort the active run for that session, then run the newest message.Steer-backlog means you can get a followup response after the steered run, so
streaming surfaces can look like duplicates. Prefer collect/steer if you want
one response per inbound message.
For runtime-specific timing and dependency behavior, see
Steering queue. For the explicit /steer <message>
command, see Steer.
Configure globally or per channel via messages.queue:
{
messages: {
queue: {
mode: "steer",
debounceMs: 500,
cap: 20,
drop: "summarize",
byChannel: { discord: "collect" },
},
},
}
Options apply to followup, collect, and steer-backlog (and to steer or legacy queue when steering falls back to followup):
debounceMs: quiet window before draining queued followups. Bare numbers are milliseconds; units ms, s, m, h, and d are accepted by /queue options.cap: max queued messages per session. Values below 1 are ignored.drop: "summarize": default. Drop the oldest queued entries as needed, keep compact summaries, and inject them as a synthetic followup prompt.drop: "old": drop the oldest queued entries as needed, without preserving summaries.drop: "new": reject the newest message when the queue is already full.Defaults: debounceMs: 500, cap: 20, drop: summarize.
For mode selection, OpenClaw resolves:
/queue override.messages.queue.byChannel.<channel>.messages.queue.mode.steer.For options, inline or stored /queue options win over config. Then
channel-specific debounce (messages.queue.debounceMsByChannel), plugin
debounce defaults, global messages.queue options, and built-in defaults are
applied. cap and drop are global/session options, not per-channel config
keys.
/queue <mode> as a standalone command to store the mode for the current session./queue collect debounce:0.5s cap:25 drop:summarize/queue default or /queue reset clears the session override.main) is process-wide for inbound + main heartbeats; set agents.defaults.maxConcurrent to allow multiple sessions in parallel.cron, cron-nested, nested, subagent) so background jobs can run in parallel without blocking inbound replies. Isolated cron agent turns hold a cron slot while their inner agent execution uses cron-nested; both use cron.maxConcurrentRuns. Shared non-cron nested flows keep their own lane behavior. These detached runs are tracked as background tasks.processing past diagnostics.stuckSessionWarnMs with no observed reply, tool, status, block, or ACP progress are classified by current activity. Active work logs as session.long_running; active work with no recent progress logs as session.stalled; session.stuck is reserved for stale session bookkeeping with no active work, and only that path can release the affected session lane so queued work drains. Repeated session.stuck diagnostics back off while the session remains unchanged.