sgl-router (experimental) monitoring

<!-- SPDX-FileCopyrightText: Copyright (c) 2026 The SGLang Authors SPDX-License-Identifier: Apache-2.0 -->

A Grafana dashboard for the experimental router's Prometheus metrics. The router exposes them at /metrics (text/plain; version=0.0.4) on its serving port (default 30000).

Files

  • grafana-dashboard.json: importable Grafana dashboard titled "SGLang Router (experimental)" (uid sgl-router-experimental).

Metrics covered

The dashboard graphs every family the router emits:

| Metric | Type | What it shows |
| --- | --- | --- |
| sgl_router_requests_total | Counter | Dispatches by worker_url, model_id, mode, outcome |
| sgl_router_request_duration_seconds | Histogram | End-to-end request latency by model_id |
| sgl_router_ttft_seconds | Histogram | Time to first token (streaming) by model_id |
| sgl_router_responses_total | Counter | Client-visible HTTP status_code |
| sgl_router_overlap_blocks | Histogram | Cache-aware-zmq overlap blocks by model_id |
| sgl_router_active_load | Gauge | Per-worker prefill-token / decode-block load |
| sgl_router_workers | Gauge | Registered worker count by mode |
| sgl_router_worker_health | Gauge | Per-worker health (1=breaker admits, 0=open) |
| sgl_router_worker_cb_state | Gauge | Per-worker circuit breaker state (0=closed, 1=open, 2=half_open) |
| sgl_router_worker_inflight_requests | Gauge | In-flight requests per worker |
| sgl_router_stale_requests_total | Counter | Stale-request cancellations |
| sgl_router_decode_affinity_total | Counter | PD decode-affinity outcomes |
| sgl_router_sticky_total | Counter | Sticky-session selection outcomes |
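
In the Prometheus text format, each Histogram family is exposed as cumulative _bucket series carrying an le label, plus _sum and _count series. Latency panels are usually built from quantile estimates over those buckets. The sketch below is a simplified Python version of the linear interpolation that PromQL's histogram_quantile applies. The function name is hypothetical, and the bucket bounds and counts in the usage lines are made-up illustration values, not the router's actual bucket layout.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate quantile q (0..1) from cumulative (upper_bound, count) pairs.

    `buckets` must be sorted by upper bound and end with the +Inf bucket,
    mirroring the `le` label of a Prometheus histogram.
    """
    if not buckets or not 0.0 <= q <= 1.0:
        raise ValueError("need at least one bucket and 0 <= q <= 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan              # no observations yet
    rank = q * total                 # position of the quantile among observations
    prev_bound, prev_count = 0.0, 0.0  # simplification: lowest bucket starts at 0
    for upper, count in buckets:
        if count >= rank:
            if math.isinf(upper) or count == prev_count:
                # Rank falls in the +Inf bucket (or an empty one):
                # report the last known finite bound.
                return prev_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (upper - prev_bound) * fraction
        prev_bound, prev_count = upper, count
    return prev_bound


# Illustrative buckets: (le, cumulative count)
example = [(0.1, 10), (0.5, 60), (1.0, 90), (math.inf, 100)]
median = estimate_quantile(0.5, example)
```

The estimate is only as fine as the bucket boundaries, so a quantile that lands in the +Inf bucket is reported as the highest finite bound rather than a real latency.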

The sgl_router_workers / sgl_router_worker_* gauges are sampled from the live worker registry on every scrape, so a removed worker stops emitting series immediately rather than leaving a stale value.
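
When scraped values are read outside Grafana, the integer encodings of the per-worker gauges can be decoded in a few lines. A minimal sketch, assuming the gauge values have already been collected into dicts keyed by worker_url (the dict shape and helper names are hypothetical; only the 0/1/2 and 1/0 encodings come from the metrics table):

```python
from enum import IntEnum


class BreakerState(IntEnum):
    """Values of sgl_router_worker_cb_state as listed in the metrics table."""
    CLOSED = 0
    OPEN = 1
    HALF_OPEN = 2


def breaker_states(cb_state_by_worker):
    """Map each worker_url to a lowercase state name such as 'half_open'."""
    return {
        worker: BreakerState(int(value)).name.lower()
        for worker, value in cb_state_by_worker.items()
    }


def rejected_workers(health_by_worker):
    """Return the worker_urls whose sgl_router_worker_health gauge reads 0."""
    return sorted(w for w, v in health_by_worker.items() if v == 0)
```

An unknown state value raises ValueError from the enum lookup, which makes it obvious if the encoding ever changes.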

Prometheus scrape config

Point Prometheus at the router's /metrics endpoint:

```yaml
scrape_configs:
  - job_name: sgl-router
    metrics_path: /metrics
    static_configs:
      - targets:
          - '127.0.0.1:30000'   # router host:port
```
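
Before adding the scrape job, it can help to confirm that the target actually serves router metrics. The following is a minimal Python sketch using only the standard library. The default URL matches the target shown in the scrape config, the helper names are hypothetical, and parse_samples is a deliberately small parser for the text exposition format rather than a complete one (it ignores timestamps and does not unescape label values).

```python
import re
import urllib.request

# One sample line looks like: name{label="value",...} value
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return (name, labels, value) tuples parsed from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank lines and HELP/TYPE
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def router_families(samples):
    """Sorted, de-duplicated sample names that carry the sgl_router_ prefix."""
    return sorted({name for name, _, _ in samples if name.startswith("sgl_router_")})


def fetch_metrics(url="http://127.0.0.1:30000/metrics", timeout=5.0):
    """Fetch the raw exposition text from the router (adjust host and port)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Typical use against a running router:
#   print(router_families(parse_samples(fetch_metrics())))
```

Histogram families show up here under their _bucket, _sum, and _count sample names rather than the family name itself.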

Import into Grafana

  1. Dashboards → New → Import.
  2. Upload grafana-dashboard.json (or paste its contents).
  3. When prompted, select your Prometheus data source for the Datasource variable. The dashboard uses a templated data source, so it imports into any Grafana without editing the JSON.

The top bar exposes model_id and worker_url template variables (both default to All) to scope the panels.
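
To illustrate what scoping by these variables amounts to, the sketch below builds a PromQL selector from a set of selections, dropping the matcher for any variable left at All. This is an assumption about the general pattern, not a description of how the shipped dashboard writes its queries; the function names are hypothetical.

```python
ALL = "All"


def label_matcher(label, selected):
    """Return an exact-match PromQL matcher, escaping backslashes and quotes."""
    escaped = selected.replace("\\", "\\\\").replace('"', '\\"')
    return label + '="' + escaped + '"'


def scoped_selector(metric, selections):
    """Build a selector for `metric`, skipping labels whose selection is All."""
    matchers = [
        label_matcher(label, value)
        for label, value in selections.items()
        if value != ALL
    ]
    if not matchers:
        return metric
    return metric + "{" + ", ".join(matchers) + "}"


selector = scoped_selector(
    "sgl_router_requests_total",
    {"model_id": "my-model", "worker_url": ALL},  # "my-model" is a placeholder
)
```

Only pass labels that the metric actually carries: per the metrics table, several families are keyed by model_id alone, and a worker_url matcher on those would select nothing.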

Regenerating

The JSON is generated programmatically to keep the ~20 panels consistent. If the metric surface changes, update the generator and overwrite the JSON rather than hand-editing it, because hand-edits drift from the panel conventions.
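
The generator is not shown here. As an illustration of the approach only, a hypothetical Python generator could build every panel through one helper so that grid layout, data source, and target shape stay uniform. The helper names, panel titles, PromQL expressions, and the ${datasource} variable name are assumptions; only the dashboard title and uid are taken from the Files section.

```python
import json


def timeseries_panel(panel_id, title, expr, x, y, w=12, h=8):
    """Return one Grafana time series panel with a single Prometheus target."""
    return {
        "id": panel_id,
        "type": "timeseries",
        "title": title,
        # Assumed variable name; the real dashboard's may differ.
        "datasource": {"type": "prometheus", "uid": "${datasource}"},
        "gridPos": {"x": x, "y": y, "w": w, "h": h},
        "targets": [{"expr": expr, "refId": "A"}],
    }


def build_dashboard(panel_specs):
    """Lay panels out two per row on Grafana's 24-column grid."""
    panels = []
    for index, (title, expr) in enumerate(panel_specs):
        panels.append(
            timeseries_panel(
                panel_id=index + 1,
                title=title,
                expr=expr,
                x=(index % 2) * 12,   # left or right half of the row
                y=(index // 2) * 8,   # one row per pair of panels
            )
        )
    return {
        "uid": "sgl-router-experimental",
        "title": "SGLang Router (experimental)",
        "panels": panels,
    }


# Illustrative panel list; these queries are examples, not the shipped ones.
dashboard = build_dashboard([
    ("Dispatch rate by outcome",
     "sum by (outcome) (rate(sgl_router_requests_total[5m]))"),
    ("Registered workers", "sum by (mode) (sgl_router_workers)"),
])
dashboard_json = json.dumps(dashboard, indent=2)
```

Because every panel goes through the same helper, a change to a convention such as panel height is made once and applies to all panels on the next regeneration.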