docs/concepts/streaming.md
OpenClaw has two separate streaming layers: block streaming and preview streaming.
There is no true token-delta streaming to channel messages today. Preview streaming is message-based (send + edits/appends).
Block streaming sends assistant output in coarse chunks as it becomes available.

    Model output
      └─ text_delta/events
           ├─ (blockStreamingBreak=text_end)
           │    └─ chunker emits blocks as buffer grows
           └─ (blockStreamingBreak=message_end)
                └─ chunker flushes at message_end
                       └─ channel send (block replies)

Legend:
- text_delta/events: model stream events (may be sparse for non-streaming models).
- chunker: EmbeddedBlockChunker applying min/max bounds + break preference.
- channel send: actual outbound messages (block replies).

Controls:
- agents.defaults.blockStreamingDefault: "on"/"off" (default off).
- *.blockStreaming (and per-account variants) to force "on"/"off" per channel.
- agents.defaults.blockStreamingBreak: "text_end" or "message_end".
- agents.defaults.blockStreamingChunk: { minChars, maxChars, breakPreference? }.
- agents.defaults.blockStreamingCoalesce: { minChars?, maxChars?, idleMs? } (merge streamed blocks before send).
- *.textChunkLimit (e.g., channels.whatsapp.textChunkLimit).
- *.chunkMode (length is the default; newline splits on blank lines, i.e. paragraph boundaries, before length chunking).
- channels.discord.maxLinesPerMessage (default 17) splits tall replies to avoid UI clipping.

Boundary semantics:
- text_end: stream blocks as soon as the chunker emits them; flush on each text_end.
- message_end: wait until the assistant message finishes, then flush the buffered output.

message_end still uses the chunker if the buffered text exceeds maxChars, so it can emit multiple chunks at the end.
MEDIA: directives are normal delivery metadata. When block streaming sends a
media block early, OpenClaw remembers that delivery for the turn. If the final
assistant payload repeats the same media URL, the final delivery strips the
duplicate media instead of sending the attachment again.
Exact duplicate final payloads are suppressed. If the final payload adds
distinct text around media that was already streamed, OpenClaw still sends the
new text while keeping the media single-delivery. This prevents duplicate voice
notes or files on channels such as Telegram when an agent emits MEDIA: during
streaming and the provider also includes it in the completed reply.
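The dedupe rule above can be sketched as a small per-turn ledger. This is only an illustration under stated assumptions: the Payload shape, the TurnMediaLedger class, and the payloadKey helper are hypothetical names, not OpenClaw's actual delivery types.

```typescript
// Hypothetical payload shape, used only for this illustration.
interface Payload {
  text: string;
  mediaUrls: string[];
}

// Stable identity for "exact duplicate" checks.
function payloadKey(p: Payload): string {
  return JSON.stringify([p.text, p.mediaUrls]);
}

// Remembers what block streaming already delivered during one turn.
class TurnMediaLedger {
  private sentMedia = new Set<string>();
  private sentPayloads = new Set<string>();

  // Call whenever block streaming sends a payload early.
  recordStreamed(p: Payload): void {
    for (const url of p.mediaUrls) this.sentMedia.add(url);
    this.sentPayloads.add(payloadKey(p));
  }

  // Returns what the final delivery should send, or null if nothing is left.
  finalize(p: Payload): Payload | null {
    // Exact duplicate of something already streamed: suppress entirely.
    if (this.sentPayloads.has(payloadKey(p))) return null;
    // Strip media that was already delivered, keep any new text.
    const mediaUrls = p.mediaUrls.filter((url) => !this.sentMedia.has(url));
    if (p.text.trim() === "" && mediaUrls.length === 0) return null;
    return { text: p.text, mediaUrls };
  }
}
```

In this sketch, a final payload that repeats an already streamed voice note but adds new text is delivered as text only, while an identical final payload is dropped.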
Block chunking is implemented by EmbeddedBlockChunker:
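- Low bound: do not emit until the buffer reaches minChars (unless forced).
- High bound: prefer a split before maxChars; if forced, split at maxChars.
- Break preference: paragraph → newline → sentence → whitespace → hard break.
- Code fences: when a split is forced at maxChars inside a fence, close + reopen the fence to keep Markdown valid.

maxChars is clamped to the channel textChunkLimit, so you can’t exceed per-channel caps.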
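The bounds and break preference can be illustrated with a pure function that picks a cut index. This is a minimal sketch, not EmbeddedBlockChunker itself: pickBreak, ChunkBounds, and the sentence-end regex are hypothetical, and code-fence handling is left out.

```typescript
// Hypothetical sketch of break selection; the real chunker may differ.
interface ChunkBounds {
  minChars: number;
  maxChars: number;
}

// Index just after the last ".", "!" or "?" that is followed by whitespace.
function lastSentenceEnd(text: string): number {
  let last = -1;
  const re = /[.!?](?=\s)/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    last = m.index + 1;
  }
  return last;
}

// Returns the index at which to cut `buffer`, or -1 to keep buffering.
// `force` models a flush, for example at message_end.
function pickBreak(buffer: string, bounds: ChunkBounds, force: boolean): number {
  const { minChars, maxChars } = bounds;
  // A forced flush of text that fits in one chunk emits everything.
  if (force && buffer.length <= maxChars) return buffer.length;
  // Low bound: hold small buffers until more text arrives.
  if (buffer.length < minChars) return -1;

  const head = buffer.slice(0, maxChars);
  // Preference order: paragraph → newline → sentence → whitespace.
  const candidates = [
    head.lastIndexOf("\n\n"),
    head.lastIndexOf("\n"),
    lastSentenceEnd(head),
    head.lastIndexOf(" "),
  ];
  for (const idx of candidates) {
    // Accept only breaks that leave at least minChars in the chunk.
    if (idx >= minChars) return idx;
  }
  // Hard break once the buffer has reached maxChars; otherwise wait.
  return buffer.length >= maxChars ? maxChars : -1;
}
```

A caller would emit buffer.slice(0, idx), keep the remainder (trimming leading whitespace), and call pickBreak again as more text arrives.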
When block streaming is enabled, OpenClaw can merge consecutive block chunks before sending them out. This reduces “single-line spam” while still providing progressive output.
- Coalescing waits for an idle gap (idleMs) before flushing.
- Buffers are capped by maxChars and will flush if they exceed it.
- minChars prevents tiny fragments from sending until enough text accumulates (final flush always sends remaining text).
- The joiner is derived from blockStreamingChunk.breakPreference (paragraph → \n\n, newline → \n, sentence → space).
- Channel overrides are available via *.blockStreamingCoalesce (including per-account configs).
- The default coalesce minChars is bumped to 1500 for Signal/Slack/Discord unless overridden.

When block streaming is enabled, you can add a randomized pause between block replies (after the first block). This makes multi-bubble responses feel more natural.
- Config: agents.defaults.humanDelay (override per agent via agents.list[].humanDelay).
- Modes: off (default), natural (800–2500ms), custom (minMs/maxMs).

The block streaming choices map to config as follows:
- Stream chunks: blockStreamingDefault: "on" + blockStreamingBreak: "text_end" (emit as you go). Non-Telegram channels also need *.blockStreaming: true.
- Stream everything at the end: blockStreamingBreak: "message_end" (flush once, possibly multiple chunks if very long).
- No block streaming: blockStreamingDefault: "off" (only final reply).

Channel note: Block streaming is off unless
*.blockStreaming is explicitly set to true. Channels can stream a live preview
(channels.<channel>.streaming) without block replies.
Config location reminder: the blockStreaming* defaults live under
agents.defaults, not the root config.
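To make the blockStreamingCoalesce settings described earlier more concrete, here is a timer-free sketch of a coalescing buffer. It is an assumption-laden illustration: BlockCoalescer, its method names, and the choice to flush the whole buffer once it passes maxChars are hypothetical, not OpenClaw's implementation. Only the option names and the joiner mapping come from this page.

```typescript
type BreakPreference = "paragraph" | "newline" | "sentence";

// Joiner mapping: paragraph → \n\n, newline → \n, sentence → space.
const JOINERS: Record<BreakPreference, string> = {
  paragraph: "\n\n",
  newline: "\n",
  sentence: " ",
};

interface CoalesceOptions {
  minChars: number;
  maxChars: number;
  idleMs: number;
  breakPreference: BreakPreference;
}

// Hypothetical coalescer; the caller supplies timestamps instead of timers.
class BlockCoalescer {
  private readonly opts: CoalesceOptions;
  private parts: string[] = [];
  private lastPushMs = 0;

  constructor(opts: CoalesceOptions) {
    this.opts = opts;
  }

  private joined(): string {
    return this.parts.join(JOINERS[this.opts.breakPreference]);
  }

  private take(): string {
    const out = this.joined();
    this.parts = [];
    return out;
  }

  // Adds a block; returns merged text if the buffer now exceeds maxChars.
  push(block: string, nowMs: number): string | null {
    this.parts.push(block);
    this.lastPushMs = nowMs;
    return this.joined().length > this.opts.maxChars ? this.take() : null;
  }

  // Call periodically; flushes after an idle gap once minChars has accumulated.
  tick(nowMs: number): string | null {
    if (this.parts.length === 0) return null;
    const idle = nowMs - this.lastPushMs >= this.opts.idleMs;
    const enough = this.joined().length >= this.opts.minChars;
    return idle && enough ? this.take() : null;
  }

  // Final flush always sends whatever remains.
  finish(): string | null {
    return this.parts.length > 0 ? this.take() : null;
  }
}
```

Passing timestamps in explicitly keeps the sketch deterministic; a real integration would presumably drive tick from a timer.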
Canonical key: channels.<channel>.streaming
Modes:
- off: disable preview streaming.
- partial: single preview that is replaced with latest text.
- block: preview updates in chunked/appended steps.
- progress: progress/status preview during generation, final answer at completion.

streaming.mode: "block" is a preview-streaming mode for edit-capable channels
such as Discord and Telegram. It does not enable channel block delivery there.
Use streaming.block.enabled or the legacy blockStreaming channel key when
you want normal block replies. Microsoft Teams is the exception: it has no
draft-preview block transport, so streaming.mode: "block" maps to Teams block
delivery instead of native partial/progress streaming.
| Channel | off | partial | block | progress |
|---|---|---|---|---|
| Telegram | ✅ | ✅ | ✅ | editable progress draft |
| Discord | ✅ | ✅ | ✅ | editable progress draft |
| Slack | ✅ | ✅ | ✅ | ✅ |
| Mattermost | ✅ | ✅ | ✅ | ✅ |
| MS Teams | ✅ | ✅ | ✅ | native progress stream |
Slack-only:
- channels.slack.streaming.nativeTransport toggles Slack native streaming API calls when channels.slack.streaming.mode="partial" (default: true).

Legacy key migration:
- streamMode and scalar/boolean streaming values are detected and migrated by doctor/config compatibility paths to streaming.mode.
- streamMode + boolean streaming auto-migrate to the streaming enum.
- streamMode auto-migrates to streaming.mode; boolean streaming auto-migrates to streaming.mode plus streaming.nativeTransport; legacy nativeStreaming auto-migrates to streaming.nativeTransport.

Telegram:
- Uses sendMessage + editMessageText preview updates across DMs and groups/topics.
- /reasoning stream can write reasoning to a transient preview that is deleted after final delivery.

Discord:
- block mode uses draft chunking (draftChunk).

Slack:
- partial can use Slack native streaming (chat.startStream/append/stop) when available.
- block uses append-style draft previews.
- progress uses status preview text, then final answer.

Mattermost:
Matrix:
Preview streaming can also include tool-progress updates — short status lines like "searching the web", "reading file", or "calling tool" — that appear in the same preview message while tools are running, ahead of the final reply. This keeps multi-step tool turns visually alive rather than silent between the first thinking preview and the final answer.
Supported surfaces:
- […] v2026.4.22; keeping them enabled preserves that released behavior.
- Tool-progress updates are skipped when preview streaming is off or when block streaming has taken over the message. On Telegram, streaming.mode: "off" is final-only: generic progress chatter is also suppressed instead of being delivered as standalone status messages, while approval prompts, media payloads, and errors still route normally.
- To hide tool-progress lines while keeping preview streaming, set streaming.preview.toolProgress to false for that channel. To keep tool-progress lines visible while hiding command/exec text, set streaming.preview.commandText to "status" or streaming.progress.commandText to "status"; the default is "raw" to preserve released behavior. This policy is shared by draft/progress channels that use OpenClaw's compact progress renderer, including Discord, Matrix, Microsoft Teams, Mattermost, Slack draft previews, and Telegram. To disable preview edits entirely, set streaming.mode to off.
- When replyToMode is not "off" and selected quote text is present, OpenClaw skips the answer preview stream for that turn, so tool-progress preview lines cannot render. Current-message replies without selected quote text still keep preview streaming. See Telegram channel docs for details.

Keep progress lines visible but hide raw command/exec text:

    {
      "channels": {
        "telegram": {
          "streaming": {
            "mode": "partial",
            "preview": {
              "toolProgress": true,
              "commandText": "status"
            }
          }
        }
      }
    }

Use the same shape under another compact progress channel key, for example channels.discord, channels.matrix, channels.msteams, channels.mattermost, or channels.slack for Slack draft previews. For progress-draft mode, put the same policy under streaming.progress:

    {
      "channels": {
        "telegram": {
          "streaming": {
            "mode": "progress",
            "progress": {
              "toolProgress": true,
              "commandText": "status"
            }
          }
        }
      }
    }

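The two config shapes above can be summarized as types plus a small resolver. This is a sketch under stated assumptions: the type names and resolveCommandText are hypothetical, and the rule that progress mode reads streaming.progress while other modes read streaming.preview is inferred from the two examples rather than taken from OpenClaw's source.

```typescript
type StreamingMode = "off" | "partial" | "block" | "progress";
type CommandText = "raw" | "status";

// Policy keys as they appear in the JSON examples above.
interface ToolProgressPolicy {
  toolProgress?: boolean;
  commandText?: CommandText;
}

// Hypothetical shape for channels.<channel>.streaming.
interface ChannelStreaming {
  mode: StreamingMode;
  preview?: ToolProgressPolicy;
  progress?: ToolProgressPolicy;
}

// Assumption: progress mode uses streaming.progress, other modes use streaming.preview.
function resolveCommandText(streaming: ChannelStreaming): CommandText {
  const policy = streaming.mode === "progress" ? streaming.progress : streaming.preview;
  // "raw" is the documented default.
  return policy?.commandText ?? "raw";
}
```

Under these assumptions, a policy placed under the wrong key for the active mode has no effect and the "raw" default applies.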