docs/gateway/prometheus.md
OpenClaw can expose diagnostics metrics through the official diagnostics-prometheus plugin. It listens to trusted internal diagnostics and renders a Prometheus text endpoint at:
GET /api/diagnostics/prometheus
Content type is text/plain; version=0.0.4; charset=utf-8, the standard Prometheus exposition format.
For traces, logs, OTLP push, and OpenTelemetry GenAI semantic attributes, see OpenTelemetry export.
```bash
curl -H "Authorization: Bearer $OPENCLAW_GATEWAY_TOKEN" \
http://127.0.0.1:18789/api/diagnostics/prometheus
```
| Metric | Type | Labels |
|---|---|---|
openclaw_run_completed_total | counter | channel, model, outcome, provider, trigger |
openclaw_run_duration_seconds | histogram | channel, model, outcome, provider, trigger |
openclaw_model_call_total | counter | api, error_category, model, outcome, provider, transport |
openclaw_model_call_duration_seconds | histogram | api, error_category, model, outcome, provider, transport |
openclaw_model_tokens_total | counter | agent, channel, model, provider, token_type |
openclaw_gen_ai_client_token_usage | histogram | model, provider, token_type |
openclaw_model_cost_usd_total | counter | agent, channel, model, provider |
openclaw_tool_execution_total | counter | error_category, outcome, params_kind, tool |
openclaw_tool_execution_duration_seconds | histogram | error_category, outcome, params_kind, tool |
openclaw_harness_run_total | counter | channel, error_category, harness, model, outcome, phase, plugin, provider |
openclaw_harness_run_duration_seconds | histogram | channel, error_category, harness, model, outcome, phase, plugin, provider |
openclaw_message_processed_total | counter | channel, outcome, reason |
openclaw_message_processed_duration_seconds | histogram | channel, outcome, reason |
openclaw_message_delivery_total | counter | channel, delivery_kind, error_category, outcome |
openclaw_message_delivery_duration_seconds | histogram | channel, delivery_kind, error_category, outcome |
openclaw_queue_lane_size | gauge | lane |
openclaw_queue_lane_wait_seconds | histogram | lane |
openclaw_session_state_total | counter | reason, state |
openclaw_session_queue_depth | gauge | state |
openclaw_memory_bytes | gauge | kind |
openclaw_memory_rss_bytes | histogram | none |
openclaw_memory_pressure_total | counter | level, reason |
openclaw_telemetry_exporter_total | counter | exporter, reason, signal, status |
openclaw_prometheus_series_dropped_total | counter | none |
Label values are redacted and must match OpenClaw's low-cardinality character policy. Values that fail the policy are replaced with `unknown`, `other`, or `none`, depending on the metric.
Watch this counter as a hard signal that an attribute upstream is leaking high-cardinality values. The exporter never lifts the cap automatically; if it climbs, fix the source rather than disabling the cap.
# Tokens per minute, split by provider
sum by (provider) (rate(openclaw_model_tokens_total[1m]))
# Spend (USD) over the last hour, by model
sum by (model) (increase(openclaw_model_cost_usd_total[1h]))
# 95th percentile model run duration
histogram_quantile(
0.95,
sum by (le, provider, model)
(rate(openclaw_run_duration_seconds_bucket[5m]))
)
# Queue wait time SLO (95p under 2s)
histogram_quantile(
0.95,
sum by (le, lane) (rate(openclaw_queue_lane_wait_seconds_bucket[5m]))
) < 2
# Dropped Prometheus series (cardinality alarm)
increase(openclaw_prometheus_series_dropped_total[15m]) > 0
OpenClaw supports both surfaces independently. You can run either, both, or neither.
<Tabs> <Tab title="diagnostics-prometheus"> - **Pull** model: Prometheus scrapes `/api/diagnostics/prometheus`. - No external collector required. - Authenticated through normal Gateway auth. - Surface is metrics only (no traces or logs). - Best for stacks already standardized on Prometheus + Grafana. </Tab> <Tab title="diagnostics-otel"> - **Push** model: OpenClaw sends OTLP/HTTP to a collector or OTLP-compatible backend. - Surface includes metrics, traces, and logs. - Bridges to Prometheus through an OpenTelemetry Collector (`prometheus` or `prometheusremotewrite` exporter) when you need both. - See [OpenTelemetry export](/gateway/opentelemetry) for the full catalog. </Tab> </Tabs>/healthz and /readyz probes