client/internal/metrics/infra/README.md
Internal documentation for the NetBird client metrics system.
Client metrics track connection performance and sync durations using InfluxDB line protocol (influxdb.go). Each event is pushed once then cleared.
Metrics collection is always active (for debug bundles). Push to the backend is opt-in (enabled with NB_METRICS_PUSH_ENABLED=true).

Daemon Layer (connect.go)
├─ Creates ClientMetrics instance once
├─ Starts/stops push lifecycle
└─ Updates AgentInfo on profile switch
│
▼
Engine Layer (engine.go)
└─ Records metrics via ClientMetrics methods
Clients do not talk to InfluxDB directly. An ingest server sits between clients and InfluxDB:
Client ──POST──▶ Ingest Server (:8087) ──▶ InfluxDB (internal)
│
├─ Validates line protocol
├─ Allowlists measurements, fields, and tags
├─ Rejects out-of-bound values
└─ Serves remote config at /config
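
The validation steps in the diagram above can be sketched roughly as follows. This is a minimal illustration, not the code in ingest/main.go: the `validatePoint` helper, the `allowedFields` table, and the `maxSeconds` bound are hypothetical, and the sketch assumes the line has already been parsed into a measurement name and numeric fields. Tag allowlisting would follow the same pattern and is omitted for brevity.

```go
package main

import (
	"errors"
	"fmt"
)

// errRejected marks points that fail validation.
var errRejected = errors.New("point rejected")

// allowedFields maps each allowlisted measurement to its permitted fields.
// The names come from this README; the table itself is a hypothetical structure.
var allowedFields = map[string]map[string]bool{
	"netbird_peer_connection": {
		"signaling_to_connection_seconds":    true,
		"connection_to_wg_handshake_seconds": true,
		"total_seconds":                      true,
	},
	"netbird_sync":  {"duration_seconds": true},
	"netbird_login": {"duration_seconds": true},
}

// maxSeconds is an assumed upper bound for duration fields (hypothetical value).
const maxSeconds = 3600.0

// validatePoint checks one parsed point against the allowlist and value bounds.
func validatePoint(measurement string, fields map[string]float64) error {
	allowed, ok := allowedFields[measurement]
	if !ok {
		return fmt.Errorf("%w: unknown measurement %q", errRejected, measurement)
	}
	if len(fields) == 0 {
		return fmt.Errorf("%w: no fields", errRejected)
	}
	for name, value := range fields {
		if !allowed[name] {
			return fmt.Errorf("%w: field %q not allowed for %s", errRejected, name, measurement)
		}
		// Written as a negated range check so NaN is rejected as well.
		if !(value >= 0 && value <= maxSeconds) {
			return fmt.Errorf("%w: field %q out of bounds: %v", errRejected, name, value)
		}
	}
	return nil
}

func main() {
	fmt.Println(validatePoint("netbird_sync", map[string]float64{"duration_seconds": 0.42}))
	fmt.Println(validatePoint("netbird_sync", map[string]float64{"duration_seconds": -1}))
	fmt.Println(validatePoint("unknown_metric", map[string]float64{"duration_seconds": 1}))
}
```

Rejecting anything outside a fixed allowlist keeps unexpected series out of InfluxDB even if a client sends arbitrary line protocol.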
X-Peer-ID header.

ingest/main.go

Measurement: netbird_peer_connection
| Field | Timestamps | Description |
|---|---|---|
| signaling_to_connection_seconds | SignalingReceived → ConnectionReady | ICE/relay negotiation time after the first signal is received from the remote peer |
| connection_to_wg_handshake_seconds | ConnectionReady → WgHandshakeSuccess | WireGuard cryptographic handshake latency once the transport layer is ready |
| total_seconds | SignalingReceived → WgHandshakeSuccess | End-to-end connection time anchored at the first received signal |
Tags:

- deployment_type: "cloud" | "selfhosted" | "unknown"
- connection_type: "ice" | "relay"
- attempt_type: "initial" | "reconnection"
- version: NetBird version string
- os: Operating system (linux, darwin, windows, android, ios, etc.)
- arch: CPU architecture (amd64, arm64, etc.)

Note: SignalingReceived is set when the first offer or answer arrives from the remote peer (in both initial and reconnection paths). It excludes the potentially unbounded wait for the remote peer to come online.
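
To make the relationship between the three timestamps and the three fields concrete, here is a minimal sketch that derives the durations and renders one point in InfluxDB line protocol. It is not the code in influxdb.go: the `connectionTimestamps` type, the `formatPeerConnection` and `sampleTimestamps` helpers, and the choice of WgHandshakeSuccess as the point timestamp are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"time"
)

// connectionTimestamps holds the three instants named in the field table.
type connectionTimestamps struct {
	SignalingReceived  time.Time
	ConnectionReady    time.Time
	WgHandshakeSuccess time.Time
}

// tagEscaper escapes the characters that are special in line protocol tag keys and values.
var tagEscaper = strings.NewReplacer(",", `\,`, "=", `\=`, " ", `\ `)

// formatPeerConnection renders one netbird_peer_connection point.
// Tags are sorted by key; the timestamp is in nanoseconds.
func formatPeerConnection(ts connectionTimestamps, tags map[string]string) string {
	keys := make([]string, 0, len(tags))
	for k := range tags {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	b.WriteString("netbird_peer_connection")
	for _, k := range keys {
		fmt.Fprintf(&b, ",%s=%s", tagEscaper.Replace(k), tagEscaper.Replace(tags[k]))
	}
	fmt.Fprintf(&b,
		" signaling_to_connection_seconds=%g,connection_to_wg_handshake_seconds=%g,total_seconds=%g %d",
		ts.ConnectionReady.Sub(ts.SignalingReceived).Seconds(),
		ts.WgHandshakeSuccess.Sub(ts.ConnectionReady).Seconds(),
		ts.WgHandshakeSuccess.Sub(ts.SignalingReceived).Seconds(),
		ts.WgHandshakeSuccess.UnixNano(), // assumed choice of point timestamp
	)
	return b.String()
}

// sampleTimestamps returns made-up instants: 1.5 s of negotiation, then a 0.25 s handshake.
func sampleTimestamps() connectionTimestamps {
	start := time.Unix(1700000000, 0)
	ready := start.Add(1500 * time.Millisecond)
	return connectionTimestamps{
		SignalingReceived:  start,
		ConnectionReady:    ready,
		WgHandshakeSuccess: ready.Add(250 * time.Millisecond),
	}
}

func main() {
	tags := map[string]string{
		"deployment_type": "selfhosted",
		"connection_type": "ice",
		"attempt_type":    "initial",
	}
	fmt.Println(formatPeerConnection(sampleTimestamps(), tags))
}
```

Because all three fields are derived from the same three instants, total_seconds equals the sum of the other two fields for any given point.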
Measurement: netbird_sync
| Field | Description |
|---|---|
| duration_seconds | Time to process a sync message from management server |
Tags:

- deployment_type: "cloud" | "selfhosted" | "unknown"
- version: NetBird version string
- os: Operating system (linux, darwin, windows, android, ios, etc.)
- arch: CPU architecture (amd64, arm64, etc.)

Measurement: netbird_login
| Field | Description |
|---|---|
| duration_seconds | Time to complete the login/auth exchange with management server |
Tags:

- deployment_type: "cloud" | "selfhosted" | "unknown"
- result: "success" | "failure"
- version: NetBird version string
- os: Operating system (linux, darwin, windows, android, ios, etc.)
- arch: CPU architecture (amd64, arm64, etc.)

The InfluxDB backend limits in-memory sample storage to prevent unbounded growth when pushes fail.
| Variable | Default | Description |
|---|---|---|
| NB_METRICS_PUSH_ENABLED | false | Enable metrics push to backend |
| NB_METRICS_SERVER_URL | (from remote config) | Ingest server URL (e.g., https://ingest.netbird.io) |
| NB_METRICS_INTERVAL | (from remote config) | Push interval (e.g., "1m", "30m", "4h") |
| NB_METRICS_FORCE_SENDING | false | Skip remote config, push unconditionally |
| NB_METRICS_CONFIG_URL | https://ingest.netbird.io/config | Remote push config URL |
NB_METRICS_SERVER_URL and NB_METRICS_INTERVAL override their respective values but do not bypass remote config eligibility checks (version range). Use NB_METRICS_FORCE_SENDING=true to skip all remote config gating.
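
The interaction between the overrides and the remote config gating can be sketched as below. This is an illustration under stated assumptions, not the client's actual code: the `remoteConfig` and `envOverrides` types and the `resolvePush` function are hypothetical, and the sketch assumes that with forced sending and no URL or interval set there is nothing to push to.

```go
package main

import (
	"fmt"
	"time"
)

// remoteConfig is a hypothetical view of the fetched remote push config.
type remoteConfig struct {
	ServerURL string
	Interval  time.Duration
	Eligible  bool // whether the client version falls inside the configured range
}

// envOverrides holds values parsed from the NB_METRICS_* variables.
type envOverrides struct {
	PushEnabled  bool          // NB_METRICS_PUSH_ENABLED
	ForceSending bool          // NB_METRICS_FORCE_SENDING
	ServerURL    string        // NB_METRICS_SERVER_URL, empty when unset
	Interval     time.Duration // NB_METRICS_INTERVAL, zero when unset
}

// resolvePush decides whether to push and with which URL and interval.
func resolvePush(env envOverrides, remote *remoteConfig) (url string, interval time.Duration, push bool) {
	if !env.PushEnabled {
		return "", 0, false
	}
	if !env.ForceSending {
		// Overrides do not bypass the eligibility check.
		if remote == nil || !remote.Eligible {
			return "", 0, false
		}
		url, interval = remote.ServerURL, remote.Interval
	}
	// Environment values take precedence over the remote config values.
	if env.ServerURL != "" {
		url = env.ServerURL
	}
	if env.Interval > 0 {
		interval = env.Interval
	}
	if url == "" || interval <= 0 {
		return "", 0, false // assumption: nothing usable to push to
	}
	return url, interval, true
}

func main() {
	remote := &remoteConfig{ServerURL: "https://ingest.netbird.io", Interval: 5 * time.Minute, Eligible: false}
	env := envOverrides{PushEnabled: true, ServerURL: "http://localhost:8087", Interval: time.Minute}

	fmt.Println(resolvePush(env, remote)) // gated by eligibility

	env.ForceSending = true
	fmt.Println(resolvePush(env, remote)) // gating skipped
}
```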
| Variable | Default | Description |
|---|---|---|
| INGEST_LISTEN_ADDR | :8087 | Listen address |
| INFLUXDB_URL | http://influxdb:8086/api/v2/write?org=netbird&bucket=metrics&precision=ns | InfluxDB write endpoint |
| INFLUXDB_TOKEN | (required) | InfluxDB auth token (server-side only) |
| CONFIG_METRICS_SERVER_URL | (empty; disables /config) | server_url in the remote config JSON (the URL clients push metrics to) |
| CONFIG_VERSION_SINCE | 0.0.0 | Minimum client version to push metrics |
| CONFIG_VERSION_UNTIL | 99.99.99 | Maximum client version to push metrics |
| CONFIG_PERIOD_MINUTES | 5 | Push interval in minutes |
The ingest server serves a remote config JSON at GET /config when CONFIG_METRICS_SERVER_URL is set. Clients can use NB_METRICS_CONFIG_URL=http://<ingest>/config to fetch it.
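
A client-side version range check against CONFIG_VERSION_SINCE and CONFIG_VERSION_UNTIL might look like the following sketch. The `parseVersion` and `inVersionRange` helpers are hypothetical; the sketch assumes plain MAJOR.MINOR.PATCH versions, inclusive bounds, and that unparsable versions (for example development builds) are treated as not eligible. The real client may handle these cases differently.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion turns "1.2.3" (optionally prefixed with "v") into its three numeric parts.
func parseVersion(s string) ([3]int, bool) {
	var v [3]int
	parts := strings.Split(strings.TrimPrefix(s, "v"), ".")
	if len(parts) != 3 {
		return v, false
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return v, false
		}
		v[i] = n
	}
	return v, true
}

// compareVersions returns -1, 0, or 1 when a is lower than, equal to, or higher than b.
func compareVersions(a, b [3]int) int {
	for i := range a {
		if a[i] < b[i] {
			return -1
		}
		if a[i] > b[i] {
			return 1
		}
	}
	return 0
}

// inVersionRange reports whether since <= version <= until (bounds assumed inclusive).
func inVersionRange(version, since, until string) bool {
	v, ok1 := parseVersion(version)
	lo, ok2 := parseVersion(since)
	hi, ok3 := parseVersion(until)
	if !ok1 || !ok2 || !ok3 {
		return false
	}
	return compareVersions(v, lo) >= 0 && compareVersions(v, hi) <= 0
}

func main() {
	fmt.Println(inVersionRange("0.30.1", "0.0.0", "99.99.99"))
	fmt.Println(inVersionRange("0.9.0", "0.10.0", "99.99.99"))
	fmt.Println(inVersionRange("development", "0.0.0", "99.99.99"))
}
```

Comparing the parts numerically rather than as strings matters here: a lexical comparison would wrongly rank 0.9.0 above 0.10.0.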
For URL and Interval, the precedence is:

1. NB_METRICS_SERVER_URL / NB_METRICS_INTERVAL (environment variables)
2. Remote config fetched from NB_METRICS_CONFIG_URL

- StartPush() spawns a background goroutine with a timer
- push() → Export() → HTTP POST to ingest server
- Reset() clears pushed samples
- StopPush() cancels the context and waits for the goroutine

Samples are collected with exact timestamps, pushed once, then cleared. No data is resent.
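
The push lifecycle described above can be sketched with a ticker and a cancellable context. The `pusher` type below is a hypothetical stand-in, not ClientMetrics itself: the `send` callback stands in for Export() plus the HTTP POST, and keeping a batch in memory when a send fails is an assumption (the README only says successfully pushed samples are cleared and not resent).

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// pusher is a hypothetical stand-in for the push lifecycle.
// StartPush and StopPush are assumed to be called from a single goroutine.
type pusher struct {
	mu      sync.Mutex
	samples []string
	send    func(batch []string) error // stands in for Export() + HTTP POST
	cancel  context.CancelFunc
	done    chan struct{}
}

// Record stores one sample until the next push.
func (p *pusher) Record(line string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.samples = append(p.samples, line)
}

// push sends the current batch and clears it on success, so nothing is sent twice.
func (p *pusher) push() {
	p.mu.Lock()
	batch := append([]string(nil), p.samples...)
	p.mu.Unlock()
	if len(batch) == 0 {
		return
	}
	if err := p.send(batch); err != nil {
		return // assumption: a failed batch stays in memory for the next tick
	}
	p.mu.Lock()
	p.samples = p.samples[len(batch):] // drop only what was pushed
	p.mu.Unlock()
}

// StartPush spawns the background goroutine that pushes on every tick.
func (p *pusher) StartPush(interval time.Duration) {
	ctx, cancel := context.WithCancel(context.Background())
	done := make(chan struct{})
	p.cancel, p.done = cancel, done
	go func() {
		defer close(done)
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
				p.push()
			}
		}
	}()
}

// StopPush cancels the context and waits for the goroutine to exit.
func (p *pusher) StopPush() {
	if p.cancel == nil {
		return
	}
	p.cancel()
	<-p.done
	p.cancel, p.done = nil, nil
}

func main() {
	p := &pusher{send: func(batch []string) error {
		fmt.Printf("pushing %d sample(s)\n", len(batch))
		return nil
	}}
	p.Record("netbird_sync duration_seconds=0.42")
	p.StartPush(10 * time.Millisecond)
	time.Sleep(50 * time.Millisecond)
	p.StopPush()
}
```

Waiting on the done channel in StopPush guarantees that no push is still in flight when the caller proceeds, for example on a profile switch.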
# From this directory (client/internal/metrics/infra)
cp .env.example .env
# Edit .env to set INFLUXDB_ADMIN_PASSWORD, INFLUXDB_ADMIN_TOKEN, and GRAFANA_ADMIN_PASSWORD
docker compose up -d
This starts:
X-Peer-ID header, no secret/token auth)

export NB_METRICS_PUSH_ENABLED=true
export NB_METRICS_FORCE_SENDING=true
export NB_METRICS_SERVER_URL=http://localhost:8087
export NB_METRICS_INTERVAL=1m
cd ../../../..
go run ./client/ up
# Query via InfluxDB (using admin token from .env)
docker compose exec influxdb influx query \
'from(bucket: "metrics") |> range(start: -1h)' \
--org netbird
# Check ingest server health
curl http://localhost:8087/health