S.M.A.R.T.

Plugin: go.d.plugin Module: smartctl

Overview

This collector monitors the health of storage devices by reading their S.M.A.R.T. (Self-Monitoring, Analysis, and Reporting Technology) counters. It relies on the smartctl CLI tool but does not execute the binary directly. Instead, it calls ndsudo, a Netdata helper designed to run privileged commands securely within the Netdata environment. This removes the need to configure sudo for the collector, which improves security and can simplify permission management.

Executed commands:

  • smartctl --json --scan
  • smartctl --json --all {deviceName} --device {deviceType} --nocheck {powerMode}
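
As a rough illustration of how the second template expands, the sketch below fills in the placeholders and prints the resulting command without running it. The helper name smartctl_device_cmd and the sample values (/dev/sda, sat, standby) are assumptions chosen for illustration; they are not part of the collector.

```shell
# Hypothetical helper: prints (does not run) the per-device command
# that results from filling in the placeholders of the template above.
smartctl_device_cmd() {
  device_name=$1   # e.g. /dev/sda
  device_type=$2   # e.g. sat
  power_mode=$3    # e.g. standby
  printf 'smartctl --json --all %s --device %s --nocheck %s\n' \
    "$device_name" "$device_type" "$power_mode"
}

# Sample values only; substitute a device reported by the scan command.
smartctl_device_cmd /dev/sda sat standby
```

Running the printed command by hand (with sufficient privileges) can help confirm that a device answers before the collector is involved.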

This collector is only supported on the following platforms:

  • Linux
  • BSD

This collector only supports collecting metrics from a single instance of this integration.

Default Behavior

Auto-Detection

This integration doesn't support auto-detection.

Limits

The default configuration for this integration does not impose any limits on data collection.

Performance Impact

The default configuration for this integration is not expected to impose a significant performance impact on the system.

Setup

You can configure the smartctl collector in two ways:

| Method | Best for | How to |
|--------|----------|--------|
| UI | Fast setup without editing files | Go to Nodes → Configure this node → Collectors → Jobs, search for smartctl, then click + to add a job. |
| File | If you prefer configuring via file, or need to automate deployments (e.g., with Ansible) | Edit go.d/smartctl.conf and add a job. |

:::important

UI configuration requires a paid Netdata Cloud plan.

:::

Prerequisites

Install smartmontools (v7.0+)

Install smartmontools version 7.0 or later using your distribution's package manager. Version 7.0 introduced the --json output mode, which is required for this collector to function properly.
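
One quick way to check this prerequisite is sketched below. The helper name supports_json is hypothetical, and the sketch assumes the version number is the second field on the first line of the smartctl --version output.

```shell
# Hypothetical helper: succeeds when the given smartmontools version
# (for example "7.2") is 7.0 or later, i.e. new enough for --json.
supports_json() {
  major=${1%%.*}                    # keep everything before the first dot
  case $major in
    ''|*[!0-9]*) return 2 ;;        # empty or non-numeric: cannot decide
  esac
  [ "$major" -ge 7 ]
}

if command -v smartctl >/dev/null 2>&1; then
  # Assumption: the first line looks like "smartctl <version> ...".
  version=$(smartctl --version | awk 'NR==1 {print $2}')
  if supports_json "$version"; then
    echo "smartctl $version: --json is available"
  else
    echo "smartctl $version: upgrade to 7.0 or later"
  fi
else
  echo "smartctl not found in PATH"
fi
```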

For Netdata running in a Docker container

Provide access to storage devices.

Netdata requires the SYS_RAWIO capability and access to the storage devices to run the smartctl collector inside a Docker container. Here's how you can achieve this:

  • docker run

    bash
    docker run --cap-add SYS_RAWIO --device /dev/sda:/dev/sda ...
    
  • docker-compose.yml

    yaml
    services:
      netdata:
        cap_add:
          - SYS_PTRACE
          - SYS_ADMIN
          - SYS_RAWIO # smartctl
        devices:
          - "/dev/sda:/dev/sda"
    

Multiple Devices: These examples only show mapping of one device (/dev/sda). You'll need to add additional --device options (in docker run) or entries in the devices list (in docker-compose.yml) for each storage device you want Netdata's smartctl collector to monitor.

NVMe Devices: Do not map NVMe devices using this method. Netdata uses a dedicated collector to monitor NVMe devices.
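
When several disks need to be mapped, the --device options for docker run can be generated rather than typed by hand. The following is a minimal sketch under stated assumptions: the helper name device_flags and the sample device paths are illustrative, and NVMe devices are recognized only by a /dev/nvme prefix.

```shell
# Hypothetical helper: prints one "--device X:X" option per device,
# skipping NVMe devices because a dedicated collector handles them.
device_flags() {
  flags=""
  for dev in "$@"; do
    case $dev in
      /dev/nvme*) continue ;;       # do not map NVMe devices
    esac
    flags="$flags --device $dev:$dev"
  done
  printf '%s\n' "${flags# }"        # drop the leading space
}

# Prints the command for review; it does not start a container.
echo "docker run --cap-add SYS_RAWIO $(device_flags /dev/sda /dev/sdb /dev/nvme0n1) ..."
```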

Configuration

Options

The following options can be defined globally: update_every.

<details open><summary>Config options</summary>

| Group | Option | Description | Default | Required |
|-------|--------|-------------|---------|----------|
| Collection | update_every | Netdata chart update interval (seconds). Collector may use cached data if this is less than poll_devices_every. | 10 | no |
| | timeout | smartctl binary execution timeout (seconds). | 5 | no |
| | scan_every | Device discovery interval using smartctl --scan (seconds). Set 0 to scan only once at startup. | 900 | no |
| | poll_devices_every | Device polling interval (seconds). Data is cached for this interval. | 300 | no |
| Target | device_selector | Pattern to match the 'info name' of devices as reported by smartctl --scan --json. | * | no |
| | extra_devices | Manually specify devices not auto-detected by smartctl --scan. Each entry must include both a name and a type. | [] | no |
| Performance | concurrent_scans | Number of devices to scan concurrently. Set 0 for sequential scanning (default). Helps performance when monitoring many devices. | 0 | no |
| | no_check_power_mode | Skip data collection when device is in low-power mode (avoids unnecessary spin-up). | standby | no |

<a id="option-performance-no-check-power-mode"></a>

no_check_power_mode

Valid arguments:

| Mode | Description |
|------|-------------|
| never | Check the device always. |
| sleep | Skip check if device is in SLEEP mode. |
| standby | Skip check if device is in SLEEP or STANDBY mode (prevents spin-up). |
| idle | Skip check if device is in SLEEP, STANDBY, or IDLE mode (not recommended since disks may still be spinning). |

</details>
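
To make the relationship between update_every and poll_devices_every concrete, the sketch below estimates how many chart updates fall within one device poll. It is only arithmetic on the documented defaults: the helper name updates_per_poll is hypothetical, and the collector's exact caching behavior is an assumption here.

```shell
# Defaults taken from the options table (seconds).
update_every=10
poll_devices_every=300

# Hypothetical helper: number of chart updates per device poll,
# rounded up. $1 = update_every, $2 = poll_devices_every.
updates_per_poll() {
  if [ "$1" -ge "$2" ]; then
    echo 1                          # every update triggers a fresh poll
  else
    echo $(( ($2 + $1 - 1) / $1 ))  # integer ceiling of $2 / $1
  fi
}

echo "chart updates per device poll: $(updates_per_poll "$update_every" "$poll_devices_every")"
```

With the defaults, roughly 30 consecutive chart updates would show the same cached reading, so lowering update_every alone does not make the data fresher.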

via UI

Configure the smartctl collector from the Netdata web interface:

  1. Go to Nodes.
  2. Select the node where you want the smartctl data-collection job to run and click the :gear: (Configure this node). That node will run the data collection.
  3. The Collectors → Jobs view opens by default.
  4. In the Search box, type smartctl (or scroll the list) to locate the smartctl collector.
  5. Click the + next to the smartctl collector to add a new job.
  6. Fill in the job fields, then click Test to verify the configuration and Submit to save.
    • Test runs the job with the provided settings and shows whether data can be collected.
    • If it fails, an error message appears with details (for example, connection refused, timeout, or command execution errors), so you can adjust and retest.

via File

The configuration file name for this integration is go.d/smartctl.conf.

The file format is YAML. Generally, the structure is:

yaml
update_every: 1
autodetection_retry: 0
jobs:
  - name: some_name1
  - name: some_name2

You can edit the configuration file using the edit-config script from the Netdata config directory.

bash
cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
sudo ./edit-config go.d/smartctl.conf

Examples

Custom devices poll interval

Overrides the default device polling interval, which controls how often data is collected from the devices.

<details open><summary>Config</summary>
yaml
jobs:
  - name: smartctl
    poll_devices_every: 60  # Collect S.M.A.R.T. statistics every 60 seconds

</details>

Concurrent scanning for multiple devices

This example demonstrates enabling concurrent scanning to improve performance when monitoring many devices.

<details open><summary>Config</summary>
yaml
jobs:
  - name: smartctl
    concurrent_scans: 4  # Scan up to 4 devices concurrently

</details>

Extra devices

This example demonstrates using extra_devices to manually add a storage device (/dev/sdc) not automatically detected by smartctl --scan.

<details open><summary>Config</summary>
yaml
jobs:
  - name: smartctl
    extra_devices:
      - name: /dev/sdc
        type: jmb39x-q,3

</details>
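
Because each extra_devices entry must include both a name and a type, a small generator can guard against incomplete entries when the file is produced by automation. This is a sketch only: the helper name extra_device_entry is hypothetical, and the indentation matches the example above rather than any requirement of the collector.

```shell
# Hypothetical helper: prints one extra_devices list entry and
# refuses input that lacks either the device name or the type.
extra_device_entry() {
  name=$1
  type=$2
  if [ -z "$name" ] || [ -z "$type" ]; then
    echo "extra_devices entries need both a name and a type" >&2
    return 1
  fi
  printf '      - name: %s\n        type: %s\n' "$name" "$type"
}

# Reproduces the entry from the example above.
extra_device_entry /dev/sdc jmb39x-q,3
```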

Alerts

There are no alerts configured by default for this integration.

Metrics

Metrics grouped by scope.

The scope defines the instance that the metric belongs to. An instance is uniquely identified by a set of labels.

Per controller

These metrics refer to the Storage Device.

Labels:

| Label | Description |
|-------|-------------|
| device_name | Device name |
| device_type | Device type |
| model_name | Model name |
| serial_number | Serial number |

Metrics:

| Metric | Dimensions | Unit |
|--------|------------|------|
| smartctl.device_smart_status | passed, failed | status |
| smartctl.device_ata_smart_error_log_count | error_log | logs |
| smartctl.device_power_on_time | power_on_time | seconds |
| smartctl.device_temperature | temperature | Celsius |
| smartctl.device_power_cycles_count | power | cycles |
| smartctl.device_read_errors_rate | corrected, uncorrected | errors/s |
| smartctl.device_write_errors_rate | corrected, uncorrected | errors/s |
| smartctl.device_verify_errors_rate | corrected, uncorrected | errors/s |
| smartctl.device_smart_attr_{attribute_name} | {attribute_name} | {attribute_unit} |
| smartctl.device_smart_attr_{attribute_name}_normalized | {attribute_name} | value |

Troubleshooting

Debug Mode

Important: Debug mode is not supported for data collection jobs created via the UI using the Dyncfg feature.

To troubleshoot issues with the smartctl collector, run the go.d.plugin with the debug option enabled. The output should give you clues as to why the collector isn't working.

  • Navigate to the plugins.d directory, usually at /usr/libexec/netdata/plugins.d/. If that's not the case on your system, open netdata.conf and look for the plugins setting under [directories].

    bash
    cd /usr/libexec/netdata/plugins.d/
    
  • Switch to the netdata user.

    bash
    sudo -u netdata -s
    
  • Run the go.d.plugin to debug the collector:

    bash
    ./go.d.plugin -d -m smartctl
    

    To debug a specific job:

    bash
    ./go.d.plugin -d -m smartctl -j jobName
    

Getting Logs

If you're encountering problems with the smartctl collector, follow these steps to retrieve logs and identify potential issues:

  • Run the command specific to your system (systemd, non-systemd, or Docker container).
  • Examine the output for any warnings or error messages that might indicate issues. These messages should provide clues about the root cause of the problem.

System with systemd

Use the following command to view logs generated since the last Netdata service restart:

bash
journalctl _SYSTEMD_INVOCATION_ID="$(systemctl show --value --property=InvocationID netdata)" --namespace=netdata --grep smartctl

System without systemd

Locate the collector log file, typically at /var/log/netdata/collector.log, and use grep to filter for the collector's name:

bash
grep smartctl /var/log/netdata/collector.log

Note: This method shows logs from all restarts. Focus on the latest entries for troubleshooting current issues.

Docker Container

If your Netdata runs in a Docker container named "netdata" (replace if different), use this command:

bash
docker logs netdata 2>&1 | grep smartctl
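
Any of the outputs above can be narrowed further to likely problems. The filter below is a minimal sketch: the helper name only_problems is hypothetical, and it assumes that relevant messages contain the words "error" or "warn" in some letter case, which may not match every Netdata log line.

```shell
# Hypothetical helper: keeps only lines that mention an error or a
# warning (case-insensitive). grep exits non-zero when nothing
# matches; that is treated as success so an empty result is not a failure.
only_problems() {
  grep -Ei 'error|warn' || true
}

# Example with the non-systemd log file, if it exists and is readable.
log=/var/log/netdata/collector.log
if [ -r "$log" ]; then
  grep smartctl "$log" | only_problems
fi
```

The same filter can be appended to the journalctl or docker logs pipelines shown earlier.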