Metrics collector

msb-metrics is a sibling process. It reads the microsandbox shared-memory metrics registry on a fixed interval and ships per-sandbox metrics to any OpenTelemetry-compatible backend.

Think of it the way you'd run otel-collector, prometheus-node-exporter, or fluent-bit: one process per host, lifecycle managed independently.

It's one of three ways to read sandbox metrics. For one-shot inspection from the terminal, use the msb metrics CLI command. For programmatic per-sandbox reads from application code, use Sandbox::metrics(). All three read the same shared-memory registry and can coexist; the diagram below shows how they relate.

Where it fits

mermaid
flowchart TD
    classDef store fill:#f5f5f5,stroke:#9d9d9d,color:#333
    classDef thispage fill:#bf84fe,color:#fff,stroke:#9d5fe0

    SB1[Sandbox A]
    SB2[Sandbox B]
    SB3[Sandbox N]
    SB1 -->|writes samples| SHM
    SB2 -->|writes samples| SHM
    SB3 -->|writes samples| SHM
    SHM[("Shared-memory
registry")]:::store

    SHM -->|read| CLI["msb metrics
(CLI, one-shot)"]
    SHM -->|read| SDK["Sandbox::metrics
(SDK, in-process)"]
    SHM -->|read| MSB["msb-metrics
(sidecar, continuous)"]:::thispage

    CLI --> TERM[Terminal output]
    SDK --> APP[Your application]
    MSB -->|OTLP gRPC/HTTP| BACKEND["Backend
(Grafana / Datadog / …)"]:::thispage

Three surfaces read the same shared-memory registry. This page is about the highlighted path: a continuous push to an OTel-compatible backend.

Install

msb-metrics is shipped as a standalone binary and is not bundled with the main msb installer. Download the build for your platform from the latest release and place it on your PATH.

Quick start

<Steps>
<Step title="Run msb-metrics against a local OTLP receiver">
<Tabs>
  <Tab title="gRPC (default)">
    Default port `4317`. Recommended for most local OTLP collectors and sidecars.

    ```sh
    msb-metrics otel --endpoint=http://localhost:4317
    ```
  </Tab>
  <Tab title="HTTP/Protobuf">
    Default port `4318`. Use when the backend's gRPC port isn't
    reachable, or when the gateway expects HTTP (e.g. Grafana
    Cloud's OTLP gateway over HTTPS).

    ```sh
    msb-metrics otel --endpoint=https://example.com/otlp/v1/metrics --protocol=http
    ```
  </Tab>
</Tabs>
</Step>
<Step title="Boot a sandbox">

  ```sh
  msb run alpine
  ```

</Step>
<Step title="Watch metrics flow">
  The collector polls shared memory every second, batches per-exporter, and ships over OTLP. Press Ctrl+C to drain buffers and exit cleanly.
</Step>
</Steps>

Pick your backend

End-to-end setup walkthroughs live under Recipes:

<CardGroup cols={2}>
  <Card title="Grafana Cloud" icon="cloud-arrow-up" href="/recipes/metrics-backends/grafana-cloud">
    Direct OTLP to Grafana Cloud's gateway.
  </Card>
  <Card title="Grafana Alloy" icon="route" href="/recipes/metrics-backends/grafana-alloy">
    Local Alloy as a forwarder. Recommended for production.
  </Card>
  <Card title="Prometheus" icon="fire" href="/recipes/metrics-backends/prometheus">
    Direct OTLP to Prometheus's native receiver.
  </Card>
  <Card title="otel-collector" icon="terminal" href="/recipes/metrics-backends/otel-collector">
    Local development with the OpenTelemetry Collector.
  </Card>
  <Card title="Datadog" icon="chart-line" href="/recipes/metrics-backends/datadog">
    Via the Datadog Agent's OTLP receiver.
  </Card>
</CardGroup>

Labels and per-user views

Labels you set at sandbox creation ride along to your backend as metric attributes, so you can build per-user, per-tenant, or per-environment views without naming every sandbox.

bash
msb create alpine --name agent-1 --label user.id=alice --label tenant=acme

msb-metrics reads those labels from the catalog and attaches them to every datapoint emitted for that sandbox.

Label names in PromQL

OpenTelemetry attribute keys allow dots; Prometheus label names do not, so Prometheus and Grafana Cloud normalize `.` to `_`. A label set as `user.id` is queried as `user_id`. (Metric names are normalized the same way: `microsandbox.cpu.utilization` becomes `microsandbox_cpu_utilization`.)
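
To work out the PromQL name for a label before writing a query, the dot-to-underscore rule can be sketched in shell. This is a minimal approximation: `normalize_key` is a hypothetical helper, not part of `msb` or `msb-metrics`, and it covers only the dot replacement described here. A backend may apply further normalization rules of its own.

```sh
# Hypothetical helper: approximate how a dotted OTel attribute key
# shows up as a Prometheus label name (dots become underscores).
normalize_key() {
  printf '%s\n' "$1" | tr '.' '_'
}

normalize_key user.id                       # prints: user_id
normalize_key microsandbox.cpu.utilization  # prints: microsandbox_cpu_utilization
```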

Per-user dashboard queries

Use the labels to slice the standard sandbox metrics. CPU is the simplest to start with; the same labels apply to every metric (see the metric table):

promql
# CPU per user (vCPU-seconds per second)
sum by (user_id) (microsandbox_cpu_utilization)

# Top 10 users by CPU right now
topk(10, sum by (user_id) (microsandbox_cpu_utilization))

# One user's sandboxes
microsandbox_cpu_utilization{user_id="alice"}

# Scope to a tenant as well
sum by (user_id) (microsandbox_cpu_utilization{tenant="acme"})
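
To try these queries from a terminal rather than a dashboard, one option is Prometheus's instant-query HTTP endpoint (`/api/v1/query`). The sketch below assumes a Prometheus-compatible API at `http://localhost:9090` with no authentication. `PROM_URL` and `prom_query` are illustrative names, and hosted backends such as Grafana Cloud will need a different URL and credentials.

```sh
# Assumption: a Prometheus-compatible HTTP API is reachable at PROM_URL.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Hypothetical helper: run one instant PromQL query and print the raw JSON response.
# -G sends the URL-encoded query as a GET parameter.
prom_query() {
  curl -sG "$PROM_URL/api/v1/query" --data-urlencode "query=$1"
}
```

For example, `prom_query 'topk(10, sum by (user_id) (microsandbox_cpu_utilization))'` would return the top users by CPU as JSON.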

To make a reusable dashboard, add a Grafana template variable so the panels follow a dropdown:

text
Variable:  user
Query:     label_values(microsandbox_cpu_utilization, user_id)
Panels:    microsandbox_cpu_utilization{user_id="$user"}

Labels are on by default. Pass --no-labels to turn them off, e.g. when a high-cardinality key like user.id would inflate active-series billing. To drop individual noisy keys while keeping the rest, repeat --exclude-label-key <key> (e.g. --exclude-label-key org.opencontainers.image.revision); the key stays in the catalog for msb inspect and is only withheld from metrics. See the Deep dive for the cardinality trade-offs.
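
When several keys need to be dropped, the repeated flag can be assembled from a list. The sketch below only prints the resulting command instead of running it. `build_exclude_flags` is a hypothetical helper, `build.sha` is a made-up label key, and placing the flags after the `otel` subcommand is an assumption; check the Deep dive for the exact flag placement.

```sh
# Hypothetical helper: emit one --exclude-label-key flag per key given.
build_exclude_flags() {
  flags=""
  for key in "$@"; do
    flags="$flags --exclude-label-key $key"
  done
  # Strip the single leading space left by the loop.
  printf '%s\n' "${flags# }"
}

# Illustrative keys; build.sha is invented for this example.
EXCLUDE_FLAGS=$(build_exclude_flags org.opencontainers.image.revision build.sha)

# Dry run: show the command that would be executed.
echo "msb-metrics otel --endpoint=http://localhost:4317 $EXCLUDE_FLAGS"
```

Printing first makes it easy to confirm which keys are withheld from metrics before restarting the collector.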

Reference

For flags, metric names, attribute tables, operational notes, and troubleshooting, see the Deep dive.

See also

  • Deep dive: flags, emitted metrics, attributes, operations, troubleshooting.
  • Sandbox::metrics(): read metrics for a single sandbox from application code, an alternative to shipping via OTLP.
  • msb metrics: one-shot CLI inspection of current per-sandbox metrics.