docs/reference/rich-output-protocol.md
Assistant output can carry a small set of delivery/render directives:

- MEDIA: for attachment delivery
- [[audio_as_voice]] for audio presentation hints
- [[reply_to_current]] / [[reply_to:<id>]] for reply metadata
- [embed ...] for Control UI rich rendering

Remote MEDIA: attachments must be public https: URLs. Plain http:,
loopback, link-local, private, and internal hostnames are ignored as attachment
directives; server-side media fetchers still enforce their own network guards.
Local MEDIA: attachments can use absolute paths, workspace-relative paths, or
home-relative ~/ paths. They still pass through the agent file-read policy and
media type checks before delivery.
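
The remote-URL rule above can be sketched as a pre-filter. This is a minimal illustration, not the actual implementation: the function name `is_allowed_remote_media` and the `INTERNAL_SUFFIXES` list are assumptions made for this example, and the check does not resolve DNS, which is one reason server-side fetchers still need their own network guards.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical suffixes treated as internal hostnames; the real list is not
# specified in this document.
INTERNAL_SUFFIXES = (".local", ".internal", ".localhost")


def is_allowed_remote_media(url: str) -> bool:
    """Return True only for https: URLs whose host does not look non-public."""
    try:
        parts = urlsplit(url)
        host = (parts.hostname or "").lower()
    except ValueError:
        # Malformed URL (for example a broken IPv6 literal).
        return False
    if parts.scheme != "https":
        return False
    if not host or host == "localhost" or host.endswith(INTERNAL_SUFFIXES):
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: assume single-label names are internal.
        return "." in host
    return not (ip.is_loopback or ip.is_link_local or ip.is_private)
```

A URL that fails this check would be left in the text as prose rather than treated as an attachment directive.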
<Warning>
Valid:
MEDIA:/workspace/image.png
Invalid (parsed as prose, no attachment delivered):
**MEDIA:/workspace/image.png**
`MEDIA:/workspace/image.png`
Here is your image: MEDIA:/workspace/image.png
Keep MEDIA: on its own line, in plain text, with no surrounding formatting.
</Warning>
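
The own-line rule can be illustrated with a small line scanner. This is a sketch under stated assumptions: `extract_media_lines` is a hypothetical name, and tolerating leading or trailing whitespace around the directive is a guess rather than documented behavior.

```python
def extract_media_lines(text: str) -> tuple[list[str], str]:
    """Split text into MEDIA: targets and the remaining prose.

    A line counts as a directive only when, ignoring surrounding
    whitespace, it starts with the literal prefix "MEDIA:" and has a
    non-empty target. Anything else stays in the prose untouched.
    """
    prefix = "MEDIA:"
    media: list[str] = []
    prose: list[str] = []
    for line in text.splitlines():
        stripped = line.strip()
        target = stripped[len(prefix):].strip() if stripped.startswith(prefix) else ""
        if target:
            media.append(target)
        else:
            prose.append(line)
    return media, "\n".join(prose)
```

With this sketch, the bold, inline-code, and mid-sentence forms listed above all fall through to prose because the line does not begin with the bare prefix.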
Plain Markdown image syntax stays text by default. Channels that intentionally
map Markdown image replies to media attachments opt in at their outbound
adapter; Telegram does this so a Markdown image such as `![alt](url)` can still become a media reply.
These directives are separate. MEDIA: and reply/voice tags remain delivery metadata; [embed ...] is the web-only rich render path.
Trusted tool-result media uses the same MEDIA: / [[audio_as_voice]] parser before delivery, so text tool outputs can still mark an audio attachment as a voice note.
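
As an illustration of how the reply/voice tags stay separate from the visible text, the sketch below pulls them out into a small hints record. The names `DeliveryHints` and `parse_delivery_tags` are hypothetical, and the regular expression is an assumed grammar (no whitespace inside the brackets, ids without spaces); the actual parser may be more lenient or stricter.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed tag grammar, based only on the three tags named in this document.
TAG_RE = re.compile(r"\[\[(audio_as_voice|reply_to_current|reply_to:([^\]\s]+))\]\]")


@dataclass
class DeliveryHints:
    text: str
    audio_as_voice: bool = False
    reply_to_current: bool = False
    reply_to_id: Optional[str] = None


def parse_delivery_tags(text: str) -> DeliveryHints:
    """Remove delivery tags from text and record them as metadata."""
    hints = DeliveryHints(text="")

    def consume(match: "re.Match[str]") -> str:
        tag = match.group(1)
        if tag == "audio_as_voice":
            hints.audio_as_voice = True
        elif tag == "reply_to_current":
            hints.reply_to_current = True
        else:
            hints.reply_to_id = match.group(2)
        return ""  # tags never reach the delivered text

    hints.text = TAG_RE.sub(consume, text).strip()
    return hints
```

Running the same function over trusted tool-result text is what would let a tool output mark its audio attachment as a voice note.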
When block streaming is enabled, MEDIA: remains single-delivery metadata for a
turn. If the same media URL is sent in a streamed block and repeated in the final
assistant payload, OpenClaw delivers the attachment once and strips the duplicate
from the final payload.
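
The single-delivery behavior amounts to a per-turn set of already-sent media URLs. The following is a minimal sketch with a hypothetical helper name, `strip_duplicate_media`; it assumes exact string equality is how duplicates are detected, which this document does not confirm.

```python
def strip_duplicate_media(streamed: list[str], final: list[str]) -> list[str]:
    """Return the final payload's media URLs minus those already delivered.

    `streamed` holds URLs sent in streamed blocks during the turn;
    `final` holds URLs found in the final assistant payload.
    """
    seen = set(streamed)
    remaining: list[str] = []
    for url in final:
        if url not in seen:
            seen.add(url)  # also collapses repeats within the final payload
            remaining.append(url)
    return remaining
```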
## [embed ...]

[embed ...] is the only agent-facing rich render syntax for the Control UI.
Self-closing example:
[embed ref="cv_123" title="Status" /]
Rules:
- [view ...] is no longer valid for new output.
- ref="..." or url="...".
- MEDIA: is not an embed alias and should not be used for rich embed rendering.

The normalized/stored assistant content block is a structured canvas item:
{
  "type": "canvas",
  "preview": {
    "kind": "canvas",
    "surface": "assistant_message",
    "render": "url",
    "viewId": "cv_123",
    "url": "/__openclaw__/canvas/documents/cv_123/index.html",
    "title": "Status",
    "preferredHeight": 320
  }
}
Stored/rendered rich blocks use this canvas shape directly. present_view is not recognized.
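
To connect the self-closing shortcode to the stored canvas shape, here is a rough sketch of one possible normalization. Everything beyond the field names in the JSON above is an assumption: the helper names `parse_embed` and `to_canvas_block`, the attribute regex, the mapping of `ref` to `viewId`, the document URL pattern (inferred from the single example), and the default `preferredHeight` of 320.

```python
import re
from typing import Optional

# Assumed grammar: self-closing form with double-quoted attributes only.
EMBED_RE = re.compile(r"\[embed\s+([^\]]*?)\s*/\]")
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_embed(shortcode: str) -> Optional[dict]:
    """Return the attributes of an [embed ... /] shortcode, or None."""
    match = EMBED_RE.fullmatch(shortcode.strip())
    if match is None:
        return None  # e.g. the retired [view ...] form
    return dict(ATTR_RE.findall(match.group(1)))


def to_canvas_block(attrs: dict, preferred_height: int = 320) -> dict:
    """Build a canvas content block from embed attributes (assumed mapping)."""
    ref = attrs.get("ref")
    url = attrs.get("url")
    if url is None:
        if ref is None:
            raise ValueError('embed needs ref="..." or url="..."')
        # Assumption: URL pattern inferred from the example in this document.
        url = f"/__openclaw__/canvas/documents/{ref}/index.html"
    preview = {
        "kind": "canvas",
        "surface": "assistant_message",
        "render": "url",
        "url": url,
        "preferredHeight": preferred_height,
    }
    if ref is not None:
        preview["viewId"] = ref
    if "title" in attrs:
        preview["title"] = attrs["title"]
    return {"type": "canvas", "preview": preview}
```

Under these assumptions, `to_canvas_block(parse_embed('[embed ref="cv_123" title="Status" /]'))` produces the same structure as the JSON example above.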