Back to Netdata

scripts.d.plugin (preview)

src/go/plugin/scripts.d/README.md

2.10.36.0 KB
Original Source

scripts.d.plugin (preview)

scripts.d.plugin runs Nagios-style check scripts inside Netdata without changing plugin output format. The active collector is nagios (single collector surface), implemented as a normal V2 collector with collector-local scheduling/state.

Status: preview. Core execution, retry/state tracking, and perfdata routing are implemented; config/docs may still evolve.

Configuration

  • Plugin-level toggles: /etc/netdata/scripts.d.conf
  • Collector jobs: /etc/netdata/scripts.d/nagios.conf

Each job is a Nagios check definition.

Example:

yaml
jobs:
  - name: ping_localhost
    plugin: "/usr/lib/nagios/plugins/check_ping"
    args: ["-H", "127.0.0.1", "-w", "100.0,20%", "-c", "200.0,40%"]
    timeout: 5s
    check_interval: 1m
    retry_interval: 30s
    max_check_attempts: 3

The plugin value must be an absolute path. If you need an interpreter, point plugin to the interpreter executable and pass the script path in args.

Time Periods

check_period is supported. Custom periods are defined with time_periods inside the same job definition.

yaml
jobs:
  - name: local_plugins
    plugin: "/usr/lib/nagios/plugins/check_dummy"
    args: ["0", "ok"]
    check_period: 24x7
    time_periods:
      - name: 24x7
        alias: Always on
        rules:
          - type: weekly
            days: [sunday, monday, tuesday, wednesday, thursday, friday, saturday]
            ranges: ["00:00-24:00"]

Writing Compatible Checks

A compatible check returns a Nagios state with its exit code and prints a status line that Netdata can parse.

  • Exit codes:
    • 0 = OK
    • 1 = WARNING
    • 2 = CRITICAL
    • 3 = UNKNOWN
  • First-line output format:
    • <summary text> | <perfdata>
  • The | separator is optional:
    • text before | is the human-readable summary
    • text after | is performance data used for auto-generated charts
  • Each performance-data item follows:
    • 'label'=value[UOM];warn;crit;min;max
  • Separate multiple metrics with spaces.
  • Common units include:
    • %, s, ms, B, KB, MB, GB, c
  • If the script prints multiple lines:
    • the first line is the summary
    • the remaining lines are kept as long output

Minimal example:

bash
#!/bin/sh
echo "CPU OK - 20% used | cpu=20%;80;90"
exit 0

Execution Model

  • Each jobs: entry becomes one V2 Nagios collector instance.
  • Script execution happens during Collect() only when the job is due.
  • update_every is the scheduling resolution.
  • If update_every is slower than check_interval or retry_interval, Netdata logs a warning and the effective cadence is limited by update_every.
  • Nagios semantics are preserved collector-side:
    • check_interval
    • retry_interval
    • max_check_attempts
    • check_period
  • If a check exceeds timeout, Netdata reports the job state as timeout.
  • If a check is due but the current time is outside check_period, Netdata does not execute it and reports the public job state as paused.
  • Non-due successful cycles replay the last cached perfdata values and threshold states so chartengine series stay alive between executions.
  • When a due run is blocked by check_period, perfdata value charts remain at their last observed values, but threshold-state charts are zeroed until the next successful execution.
  • Counter perfdata keeps counter semantics for the value series. Replayed raw totals naturally flatten to zero deltas between executions.

Defaults:

  • check_interval: 5m
  • retry_interval: 1m
  • timeout: 5s
  • max_check_attempts: 3

Metrics and Charts

Static template charts:

  • nagios.job.execution_state
  • nagios.job.perfdata_threshold_state
  • nagios.job.execution_duration
  • nagios.job.execution_cpu
  • nagios.job.execution_memory

Perfdata is routed plugin-side and materialized via autogen:

  • Unit classes: time, bytes, bits, percent, counter, generic
  • Metric identity: sanitized perfdata key (from Nagios perfdata label)
  • Unit-class changes create a new metric identity
  • Collision policy: deterministic keep-first, drop conflicting label
  • Per-job metric count is capped by the collector budget before emission
  • Each perfdata metric creates one value chart.
  • Non-counter perfdata also creates:
    • one plugin-scoped derived threshold-state chart for visualization
    • one static nagios.job.perfdata.threshold_state duplicate for alerting, labeled by perfdata_value=<class>_<metricKey>
  • Threshold-state values are:
    • no_threshold
    • ok
    • warning
    • critical
  • Counter perfdata currently does not emit a threshold-state chart.
  • Raw min, max, and raw threshold bounds are not charted.

Alerts

  • Built-in Netdata health alerts are shipped for:
    • nagios.job.execution_state
    • nagios.job.perfdata_threshold_state
  • nagios.job.execution_state is a bitset chart. It always exposes the current primary state and also exposes retry=1 while a non-OK result is still retrying.
  • nagios.job.perfdata_threshold_state is also a bitset chart. It exposes the current non-counter perfdata threshold state and also exposes retry=1 while that threshold result comes from a retrying soft run.
  • Stock alerts cover only the warning and critical states and suppress retrying soft states on both built-in alert contexts.
  • If you want alerts for unknown, timeout, paused, or custom perfdata alerting rules, use these contexts as the base for your own rules.

Logging

Checks log through the collector/job logger path. There is no separate public runtime component or scheduler telemetry surface.

Windows Note

  • On Windows, the collector runs the command named in plugin directly.
  • Use an executable path, or point plugin to an interpreter such as powershell.exe and pass the script path in args.

Tests

bash
cd src/go
go test ./plugin/scripts.d/collector/nagios/... -count=1

Build

bash
cmake -DENABLE_PLUGIN_SCRIPTS=On ..
cmake --build . --target scripts-plugin

Binary path:

  • usr/libexec/netdata/plugins.d/scripts.d.plugin

Stock config path:

  • usr/lib/netdata/conf.d/scripts.d/