src/go/plugin/scripts.d/README.md
scripts.d.plugin runs Nagios-style check scripts inside Netdata without requiring
changes to their output format. The only active collector is nagios (a single collector surface),
implemented as a normal V2 collector with collector-local scheduling and state.
Status: preview. Core execution, retry/state tracking, and perfdata routing are implemented; config/docs may still evolve.
Configuration files:
- /etc/netdata/scripts.d.conf
- /etc/netdata/scripts.d/nagios.conf

Each job is a Nagios check definition.
Example:
jobs:
  - name: ping_localhost
    plugin: "/usr/lib/nagios/plugins/check_ping"
    args: ["-H", "127.0.0.1", "-w", "100.0,20%", "-c", "200.0,40%"]
    timeout: 5s
    check_interval: 1m
    retry_interval: 30s
    max_check_attempts: 3
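Before adding a job like the one above, it can help to run the plugin by hand with the same arguments and look at its exit code. The sketch below is only an illustration: `state_name` and `run_check` are hypothetical helper names, not part of scripts.d.plugin, and the mapping covers only the standard Nagios exit codes 0 to 3.

```sh
#!/bin/sh
# state_name CODE
# Map a Nagios exit code to its state name. Codes outside 0..3 are not
# covered by this README, so they are only flagged here.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "OUT_OF_RANGE" ;;
    esac
}

# run_check COMMAND [ARGS...]
# Hypothetical helper: run a plugin command, let its output pass through,
# then print the state that its exit code maps to.
run_check() {
    rc=0
    "$@" || rc=$?
    echo "state: $(state_name "$rc") (exit code $rc)"
}

# Example, using the job above (requires check_ping to be installed):
# run_check /usr/lib/nagios/plugins/check_ping -H 127.0.0.1 -w 100.0,20% -c 200.0,40%
```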
The plugin value must be an absolute path. If you need an interpreter, point
plugin to the interpreter executable and pass the script path in args.
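A small pre-flight check can catch a relative or non-executable plugin path before the job is deployed. This is a sketch for POSIX systems only; `plugin_path_ok` is a hypothetical helper, and it does not claim to mirror whatever validation the collector itself performs.

```sh
#!/bin/sh
# plugin_path_ok PATH
# Hypothetical helper: succeed only if PATH is absolute and points to an
# executable regular file. Diagnostics go to stderr.
plugin_path_ok() {
    case "${1:-}" in
        /*) ;;  # starts with "/", so it is an absolute POSIX path
        *) echo "not an absolute path: ${1:-}" >&2; return 1 ;;
    esac
    if [ ! -f "$1" ] || [ ! -x "$1" ]; then
        echo "not an executable file: $1" >&2
        return 1
    fi
    return 0
}

# Example:
# plugin_path_ok /usr/lib/nagios/plugins/check_ping && echo "path looks usable"
```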
check_period is supported. Custom periods are defined with time_periods inside
the same job definition.
jobs:
  - name: local_plugins
    plugin: "/usr/lib/nagios/plugins/check_dummy"
    args: ["0", "ok"]
    check_period: 24x7
    time_periods:
      - name: 24x7
        alias: Always on
        rules:
          - type: weekly
            days: [sunday, monday, tuesday, wednesday, thursday, friday, saturday]
            ranges: ["00:00-24:00"]
A compatible check returns a Nagios state with its exit code and prints a status line that Netdata can parse.
Exit codes:
- 0 = OK
- 1 = WARNING
- 2 = CRITICAL
- 3 = UNKNOWN

Output format: <summary text> | <perfdata>

The | separator is optional:
- text before | is the human-readable summary
- text after | is performance data used for auto-generated charts

Perfdata format: 'label'=value[UOM];warn;crit;min;max

Units (UOM): %, s, ms, B, KB, MB, GB, c

Minimal example:
#!/bin/sh
echo "CPU OK - 20% used | cpu=20%;80;90"
exit 0
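The minimal example always reports OK. A slightly fuller sketch can show how the exit code and the perfdata thresholds line up. Here `check_usage` is a hypothetical helper, the `usage` label and the `USAGE` prefix are arbitrary choices, and the measured value is passed in as an argument rather than collected from a real system.

```sh
#!/bin/sh
# check_usage USED WARN CRIT
# Hypothetical helper: print one Nagios status line with perfdata and return
# the matching state (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).
# All three arguments are expected to be whole-number percentages.
check_usage() {
    used=${1:-} warn=${2:-} crit=${3:-}

    # Anything that is not a plain non-negative integer is reported as UNKNOWN.
    for v in "$used" "$warn" "$crit"; do
        case "$v" in
            ''|*[!0-9]*) echo "USAGE UNKNOWN - invalid input"; return 3 ;;
        esac
    done

    # Perfdata follows 'label'=value[UOM];warn;crit;min;max.
    perfdata="usage=${used}%;${warn};${crit};0;100"

    if [ "$used" -ge "$crit" ]; then
        echo "USAGE CRITICAL - ${used}% used | ${perfdata}"
        return 2
    elif [ "$used" -ge "$warn" ]; then
        echo "USAGE WARNING - ${used}% used | ${perfdata}"
        return 1
    fi
    echo "USAGE OK - ${used}% used | ${perfdata}"
    return 0
}
```

A standalone check script would end with something like `check_usage "$measured" 80 90; exit $?`, where `$measured` comes from whatever the script monitors.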
- Each jobs: entry becomes one V2 Nagios collector instance.
- The check is executed from Collect() only when the job is due.
- update_every is the scheduling resolution.
- If update_every is slower than check_interval or retry_interval, Netdata logs a warning and the effective cadence is limited by update_every.

Scheduling options:
- check_interval
- retry_interval
- max_check_attempts
- check_period

- If a check exceeds its timeout, Netdata reports the job state as timeout.
- When a job is outside its check_period, Netdata does not execute it and reports the public job state as paused.
- While a job is outside its check_period, perfdata value charts remain at their last observed values, but threshold-state charts are zeroed until the next successful execution.

Defaults:
- check_interval: 5m
- retry_interval: 1m
- timeout: 5s
- max_check_attempts: 3

Static template charts:
- nagios.job.execution_state
- nagios.job.perfdata_threshold_state
- nagios.job.execution_duration
- nagios.job.execution_cpu
- nagios.job.execution_memory

Perfdata is routed plugin-side and materialized via autogen:
- unit classes: time, bytes, bits, percent, counter, generic
- a nagios.job.perfdata.threshold_state duplicate for alerting, labeled by perfdata_value=<class>_<metricKey>
- threshold states: no_threshold, ok, warning, critical
- min, max, and raw threshold bounds are not charted.

Built-in alert contexts:
- nagios.job.execution_state
- nagios.job.perfdata_threshold_state

nagios.job.execution_state is a bitset chart. It always exposes the current
primary state and also exposes retry=1 while a non-OK result is still
retrying.

nagios.job.perfdata_threshold_state is also a bitset chart. It exposes the
current non-counter perfdata threshold state and also exposes retry=1 while
that threshold result comes from a retrying soft run.

- The built-in alerts cover warning and critical states and suppress
  retrying soft states on both built-in alert contexts.
- For unknown, timeout, paused, or custom perfdata
  alerting rules, use these contexts as the base for your own rules.

Checks log through the collector/job logger path. There is no separate public runtime component or scheduler telemetry surface.
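The retry=1 flag on the state charts can be pictured with a small model of retry tracking. The sketch below follows the classic Nagios soft/hard convention (a non-OK result stays "soft" until max_check_attempts consecutive non-OK runs have been seen). Whether scripts.d.plugin counts attempts exactly this way is an assumption, and `next_attempt` is a hypothetical helper rather than plugin code.

```sh
#!/bin/sh
# next_attempt EXIT_CODE PREVIOUS_ATTEMPT MAX_CHECK_ATTEMPTS
# Simplified model (assumption, not the plugin's actual logic).
# Prints "<attempt> <retry flag> <interval used for the next run>".
next_attempt() {
    rc=$1 attempt=$2 max=$3

    # An OK result resets the counter; nothing is being retried.
    if [ "$rc" -eq 0 ]; then
        echo "0 0 check_interval"
        return 0
    fi

    attempt=$((attempt + 1))
    if [ "$attempt" -lt "$max" ]; then
        # Soft state: still retrying, so retry=1 and the faster interval applies.
        echo "$attempt 1 retry_interval"
    else
        # Retries exhausted: the non-OK state is no longer a retrying soft state.
        echo "$max 0 check_interval"
    fi
}

# Example with max_check_attempts: 3 and three CRITICAL runs in a row:
# next_attempt 2 0 3   # first failure
# next_attempt 2 1 3   # second failure
# next_attempt 2 2 3   # third failure, retries exhausted
```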
- Native executables: point plugin directly at the executable.
- Scripts: point plugin to an interpreter such as
  powershell.exe and pass the script path in args.

cd src/go
go test ./plugin/scripts.d/collector/nagios/... -count=1
cmake -DENABLE_PLUGIN_SCRIPTS=On ..
cmake --build . --target scripts-plugin
Binary path:
usr/libexec/netdata/plugins.d/scripts.d.plugin
Stock config path:
usr/lib/netdata/conf.d/scripts.d/