docs/functions/processes.md
apps.plugin

| Operating System | Traditional Tools |
|---|---|
| Linux | top, htop, ps aux, pidstat |
| Windows | Task Manager, tasklist, Get-Process (PowerShell) |
| FreeBSD | top, ps aux, procstat |
| macOS | Activity Monitor, top, ps aux |
The Netdata processes function provides several advantages over traditional tools:
The processes function is the drill-down companion to Apps (apps.plugin) charts, providing complete visibility into how system resources are broken down by individual processes and how they are aggregated into the categories shown in Netdata dashboards.
apps.plugin intelligently groups processes into categories to avoid extreme cardinality issues (millions of potential PIDs). It identifies spawn managers (systemd, containerd, init, etc.) and groups process trees by their top-most parent - the direct children of these spawn managers. This creates a manageable set of categories with accumulated metrics from entire process trees, including exited children.
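The grouping rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the plugin's implementation: the `parents` map, the `spawn_managers` set, and the function names are all hypothetical, and the sketch ignores exited children, which the plugin also accounts for.

```python
from collections import defaultdict

def category_root(pid, parents, spawn_managers):
    """Walk up the parent chain until the parent is a spawn manager or unknown."""
    seen = set()
    current = pid
    while True:
        parent = parents.get(current)
        # Stop at a direct child of a spawn manager; also stop on broken chains or cycles.
        if parent is None or parent in spawn_managers or parent in seen:
            return current
        seen.add(current)
        current = parent

def group_by_top_parent(parents, spawn_managers, cpu):
    """Accumulate per-process CPU into the top-most parent of each process tree."""
    totals = defaultdict(float)
    for pid in parents:
        if pid in spawn_managers:
            continue  # spawn managers are not part of any tree they spawn
        totals[category_root(pid, parents, spawn_managers)] += cpu.get(pid, 0.0)
    return dict(totals)

# Hypothetical process table (PID -> PPID); PID 1 plays the spawn manager.
parents = {1: 0, 100: 1, 101: 100, 102: 101, 200: 1, 201: 200}
cpu = {100: 1.0, 101: 2.5, 102: 0.5, 200: 4.0, 201: 1.0}
totals = group_by_top_parent(parents, {1}, cpu)
```

In this example, two process trees collapse into two entries keyed by PIDs 100 and 200, no matter how many descendants each tree has. That is the cardinality reduction the paragraph describes.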
When users see that "Application X" consumes significant resources in the charts, they need to understand:
The processes function answers these questions by showing:
- Category (matching the chart instances)
- Category field

| Field | Type | Description | Filterable | Sortable | Groupable | OS Availability |
|---|---|---|---|---|---|---|
| PID | Integer | Process ID | ✓ | ✓ | ✓ | All |
| Cmd | String | Process command name | ✓ | ✓ | - | All |
| Name | String | Process friendly name (if available) | ✓ | ✓ | - | Windows only |
| CmdLine | String | Full command line with arguments (requires elevated access) | ✓ | ✓ | - | Linux, FreeBSD, macOS |
| PPID | Integer | Parent process ID | ✓ | ✓ | ✓ | All |
| Category | String | Process category from apps_groups.conf | ✓ | ✓ | ✓ | All |
| User | String | User owner of the process | ✓ | ✓ | ✓ | All |
| Uid | Integer | User ID | ✓ | ✓ | - | Linux, FreeBSD, macOS |
| Group | String | Group owner | ✓ | ✓ | ✓ | Linux, FreeBSD, macOS |
| Gid | Integer | Group ID | ✓ | ✓ | - | Linux, FreeBSD, macOS |
| CPU | Percentage | Total CPU usage (100% = 1 core) | ✓ | ✓ | - | All |
| UserCPU | Percentage | User-space CPU time | ✓ | ✓ | - | All |
| SysCPU | Percentage | Kernel-space CPU time | ✓ | ✓ | - | All |
| GuestCPU | Percentage | Guest VM CPU time (if available) | ✓ | ✓ | - | Linux only |
| CUserCPU | Percentage | Children user CPU (accumulated from exited children) | ✓ | ✓ | - | Linux, FreeBSD |
| CSysCPU | Percentage | Children system CPU (accumulated from exited children) | ✓ | ✓ | - | Linux, FreeBSD |
| CGuestCPU | Percentage | Children guest CPU (accumulated from exited children) | ✓ | ✓ | - | Linux only |
| vCtxSwitch | Rate | Voluntary context switches per second | ✓ | ✓ | - | Linux, macOS |
| iCtxSwitch | Rate | Involuntary context switches per second | ✓ | ✓ | - | Linux only |
| Memory | Percentage | Memory usage as percentage of total system RAM | ✓ | ✓ | - | All |
| Resident | MiB | Resident Set Size (physical memory) | ✓ | ✓ | - | All |
| Estimated | MiB | Estimated memory using PSS scaling (visible by default when enabled) | ✓ | ✓ | - | Linux 4.14+ (with PSS) |
| Pss | MiB | Proportional Set Size (hidden by default) | ✓ | ✓ | - | Linux 4.14+ (with PSS) |
| PssAge | Seconds | Time since last smaps sample (hidden by default) | ✓ | ✓ | - | Linux 4.14+ (with PSS) |
| SharedRatio | Percentage | Shared memory ratio from PSS (hidden by default) | ✓ | ✓ | - | Linux 4.14+ (with PSS) |
| Shared | MiB | Shared memory pages | ✓ | ✓ | - | Linux only |
| Virtual | MiB | Virtual memory size | ✓ | ✓ | - | All |
| Swap | MiB | Swap memory usage | ✓ | ✓ | - | Linux, Windows |
| PReads | KiB/s | Physical disk read rate | ✓ | ✓ | - | Linux only |
| PWrites | KiB/s | Physical disk write rate | ✓ | ✓ | - | Linux only |
| LReads | KiB/s | Logical I/O read rate (includes cache) | ✓ | ✓ | - | All |
| LWrites | KiB/s | Logical I/O write rate (includes cache) | ✓ | ✓ | - | All |
| ROps | ops/s | Read operations per second | ✓ | ✓ | - | Linux, Windows |
| WOps | ops/s | Write operations per second | ✓ | ✓ | - | Linux, Windows |
| MinFlt | pgflts/s | Minor page faults per second | ✓ | ✓ | - | All |
| MajFlt | pgflts/s | Major page faults per second | ✓ | ✓ | - | Linux, FreeBSD, macOS |
| CMinFlt | pgflts/s | Children minor faults (accumulated) | ✓ | ✓ | - | Linux, FreeBSD |
| CMajFlt | pgflts/s | Children major faults (accumulated) | ✓ | ✓ | - | Linux, FreeBSD |
| FDsLimitPercent | Percentage | File descriptors usage vs limit | ✓ | ✓ | - | Linux only |
| FDs | Count | Total open file descriptors | ✓ | ✓ | - | Linux, FreeBSD, macOS |
| Files | Count | Open regular files | ✓ | ✓ | - | Linux, FreeBSD, macOS |
| Pipes | Count | Open pipes | ✓ | ✓ | - | Linux, FreeBSD, macOS |
| Sockets | Count | Open network sockets | ✓ | ✓ | - | Linux, FreeBSD, macOS |
| iNotiFDs | Count | iNotify file descriptors | ✓ | ✓ | - | Linux only |
| EventFDs | Count | Event file descriptors | ✓ | ✓ | - | Linux only |
| TimerFDs | Count | Timer file descriptors | ✓ | ✓ | - | Linux only |
| SigFDs | Count | Signal file descriptors | ✓ | ✓ | - | Linux only |
| EvPollFDs | Count | Event poll descriptors | ✓ | ✓ | - | Linux only |
| OtherFDs | Count | Other file descriptors | ✓ | ✓ | - | Linux, FreeBSD, macOS |
| Handles | Count | Open handles (Windows compatibility) | ✓ | ✓ | - | Windows only |
| Processes | Count | Number of processes (1 for single process, >1 for multi-process apps) | ✓ | ✓ | - | All |
| Threads | Count | Number of threads | ✓ | ✓ | - | All |
| Uptime | Seconds | Process uptime | ✓ | ✓ | - | All |
On Linux 4.14+ with PSS enabled, the Estimated, Pss, PssAge, and SharedRatio fields provide more accurate memory accounting in shared-memory workloads. The plugin uses adaptive sampling that prioritizes the largest memory consumers and processes with significant memory changes, refreshing them within seconds of detection. All processes are guaranteed to be refreshed within 2× the configured PSS refresh period (default: 600 seconds). Disable with --pss 0 to remove these fields and use traditional RSS measurements.

The typical workflow for drilling down to individual processes looks like this:
- apps.plugin chart category (e.g., "web" consuming 80% CPU)
- category:web filter to see only processes in that category

The processes function provides complete visibility into how system resources are distributed across all running processes, enabling comprehensive resource accounting and analysis.
Sort by CPU, Memory, or I/O metrics descending to see which processes consume the most resources. Group by Category to understand resource allocation across application groups. This provides a complete breakdown of system resource utilization at the process level.
Filter by category:[name] to see all processes that contribute to a specific apps.plugin chart instance. Group by Cmd within a category to understand which different executables are grouped together. This reveals exactly how Netdata's intelligent grouping works and what's included in each category.
Group processes by User or Group to understand resource consumption patterns across different users and system accounts. Sort by aggregate CPU or memory within each group to identify which users are consuming the most resources. This helps with multi-tenant resource accounting and fair-share analysis.
When apps.plugin charts show high resource usage in a category, the processes function enables precise identification of the specific processes responsible.
Filter by category:[name] and sort by CPU descending to find the exact processes causing high CPU usage in a chart category. Look at both own CPU (UserCPU, SysCPU) and children CPU (CUserCPU, CSysCPU) to understand whether the load comes from the process itself or its children.
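The same filter-and-sort step can be reproduced offline on exported rows. The sketch below assumes each row is a dictionary keyed by the field names from the table above; the sample values and helper names are hypothetical.

```python
def top_cpu_in_category(rows, category, limit=5):
    """Keep rows of one category and order them by total CPU, highest first."""
    selected = [r for r in rows if r["Category"] == category]
    selected.sort(key=lambda r: r["CPU"], reverse=True)
    return selected[:limit]

def cpu_split(row):
    """Return (own CPU, CPU accumulated from exited children) for one row."""
    own = row["UserCPU"] + row["SysCPU"]
    children = row["CUserCPU"] + row["CSysCPU"]
    return own, children

# Hypothetical rows; the numbers are made up for illustration.
rows = [
    {"PID": 10, "Category": "web", "CPU": 35.0,
     "UserCPU": 4.0, "SysCPU": 1.0, "CUserCPU": 25.0, "CSysCPU": 5.0},
    {"PID": 11, "Category": "web", "CPU": 60.0,
     "UserCPU": 50.0, "SysCPU": 10.0, "CUserCPU": 0.0, "CSysCPU": 0.0},
    {"PID": 12, "Category": "sql", "CPU": 90.0,
     "UserCPU": 80.0, "SysCPU": 10.0, "CUserCPU": 0.0, "CSysCPU": 0.0},
]
top = top_cpu_in_category(rows, "web")
```

Here PID 11 does its own work, while most of the load attributed to PID 10 comes from children that have already exited. The latter is the pattern to look for with shell scripts and process spawners.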
Filter by specific categories and sort by Resident or Memory percentage to identify which processes within an application group consume the most RAM. Compare Virtual vs Resident to understand memory allocation patterns and potential over-provisioning.
On Linux 4.14+ with PSS enabled (default), use Estimated instead of Resident for more accurate memory accounting in shared-memory workloads (databases, cache servers, etc.). The Estimated field scales shared memory using PSS ratios to show true proportional memory usage. Check SharedRatio to see the scaling factor - values significantly below 100% indicate heavy shared memory usage where Resident would overstate consumption. The PssAge field shows seconds since the last PSS sample - expect low values (under 10s) for large memory consumers due to adaptive prioritization, while smaller processes may show higher ages (up to 600s by default) as they are refreshed less frequently.
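For a quick cross-check outside Netdata, the kernel exposes per-process PSS in `/proc/<pid>/smaps_rollup` (available since Linux 4.14). The sketch below parses that format and computes a simple PSS-to-RSS percentage. The helper names and the sample text are illustrative, and the ratio is an assumption made for this example, not necessarily how the plugin derives SharedRatio or Estimated.

```python
def parse_smaps_rollup(text):
    """Parse 'Name:   <number> kB' lines into a {name: KiB} dictionary."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Skip the address-range header and any line that is not '<digits> kB'.
        if sep and len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields

def pss_to_rss_percent(fields):
    """Illustrative ratio: how much of RSS remains after proportional sharing."""
    rss = fields.get("Rss", 0)
    return 100.0 * fields.get("Pss", 0) / rss if rss else 0.0

# Sample text in the smaps_rollup layout; the values are made up.
SAMPLE = """\
00100000-ff709000 ---p 00000000 00:00 0                                  [rollup]
Rss:                 884 kB
Pss:                 442 kB
Shared_Clean:        696 kB
Private_Dirty:       188 kB
"""
fields = parse_smaps_rollup(SAMPLE)
percent = pss_to_rss_percent(fields)
```

On a live system, the text would come from reading `/proc/<pid>/smaps_rollup` for a PID the current user is allowed to inspect.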
Sort by PReads + PWrites for physical I/O or LReads + LWrites for logical I/O to find processes generating the most disk activity. Filter by category to drill down from chart-level I/O metrics to specific process-level I/O patterns.
The processes function excels at identifying various types of resource leaks by correlating resource usage with process uptime.
Filter processes with Uptime > 3600 (one hour) and sort by Resident (or Estimated on Linux with PSS enabled) memory descending. Look for processes where memory consumption is disproportionately high relative to their uptime. Track specific PIDs over time to observe continuously growing memory usage patterns. On shared-memory workloads, use Estimated to avoid false positives from shared pages that aren't actually leaking. Note that PSS samples for large memory consumers are refreshed within seconds, providing near real-time leak detection.
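Tracking one PID over time can be reduced to a small check on periodic samples. The following sketch assumes you have recorded `(timestamp_seconds, resident_mib)` pairs for a single PID. The function names and the 1 MiB/hour threshold are hypothetical choices for illustration, not Netdata defaults.

```python
def growth_rate_mib_per_hour(samples):
    """Average growth between the first and last sample, in MiB per hour.

    Expects at least two samples with distinct timestamps.
    """
    (t0, m0), (t1, m1) = samples[0], samples[-1]
    return (m1 - m0) * 3600.0 / (t1 - t0)

def looks_like_leak(samples, min_rate=1.0):
    """Flag memory that never shrinks between samples and grows fast enough."""
    never_shrinks = all(b[1] >= a[1] for a, b in zip(samples, samples[1:]))
    return never_shrinks and growth_rate_mib_per_hour(samples) >= min_rate

# Hypothetical samples taken every 30 minutes.
growing = [(0, 100.0), (1800, 110.0), (3600, 120.0)]
stable = [(0, 100.0), (1800, 105.0), (3600, 100.0)]
```

The monotonic check is deliberately strict. A process that grows and then frees memory, such as a cache warming up, is not reported, which matches the advice above to look for continuously growing usage.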
Sort by FDs count or filter for FDsLimitPercent > 50 to find processes approaching their file descriptor limits. Examine the breakdown of descriptor types (Files, Sockets, Pipes, etc.) to understand what type of resources are leaking. Correlate high FD counts with process uptime to identify gradual leaks.
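The descriptor breakdown can be summarized per process by naming the dominant descriptor type. This is a minimal sketch over rows keyed by the field names in the table above. The threshold mirrors the `FDsLimitPercent > 50` filter, and the helper name and sample values are hypothetical.

```python
# Descriptor columns available on Linux, FreeBSD, and macOS per the field table.
FD_TYPES = ("Files", "Pipes", "Sockets", "OtherFDs")

def fd_suspects(rows, threshold=50.0):
    """Return (PID, dominant descriptor type) for rows above the usage threshold."""
    suspects = []
    for row in rows:
        if row["FDsLimitPercent"] > threshold:
            dominant = max(FD_TYPES, key=lambda t: row.get(t, 0))
            suspects.append((row["PID"], dominant))
    return suspects

# Hypothetical rows; the counts are made up for illustration.
rows = [
    {"PID": 10, "FDsLimitPercent": 82.0,
     "Files": 20, "Pipes": 4, "Sockets": 800, "OtherFDs": 3},
    {"PID": 11, "FDsLimitPercent": 3.0,
     "Files": 25, "Pipes": 2, "Sockets": 6, "OtherFDs": 1},
]
suspects = fd_suspects(rows)
```

A result such as `(10, "Sockets")` suggests looking at connection handling first, rather than at unclosed files.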
Sort by Sockets count to identify processes with abnormally high network connections. Compare socket counts against expected application behavior and uptime to detect connection leaks. Group by Category to see if entire application groups are affected by socket exhaustion.
Sort by Threads count and correlate with Uptime to find processes creating threads without proper cleanup. Look for processes where thread count grows continuously over time. Filter by category to identify applications with thread pool management issues.
Process uptime tracking enables detection of crashes, restarts, and abnormal process lifecycle events.
Sort by Uptime ascending to immediately see which processes have recently started or restarted. Filter by specific categories or command names to monitor critical services for unexpected restarts. Compare process start times with known maintenance windows to identify unplanned restarts.
Group processes by Cmd and look for multiple PIDs with similar names but different uptimes, indicating repeated restarts. Track specific application categories over time to identify patterns of instability. Correlate low uptimes with high child CPU accumulation to detect crash loops.
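Grouping by Cmd and comparing uptimes can be sketched as follows. The rows are assumed to be dictionaries keyed by the field names in the table above. The 300-second cutoff and the function name are hypothetical, so adjust them to what counts as recent in your environment.

```python
from collections import defaultdict

def restart_candidates(rows, max_uptime=300):
    """Return {Cmd: sorted uptimes} for commands whose youngest process is recent."""
    uptimes = defaultdict(list)
    for row in rows:
        uptimes[row["Cmd"]].append(row["Uptime"])
    return {
        cmd: sorted(values)
        for cmd, values in uptimes.items()
        if min(values) < max_uptime
    }

# Hypothetical rows; uptimes are in seconds.
rows = [
    {"PID": 4100, "Cmd": "worker", "Uptime": 12},
    {"PID": 3900, "Cmd": "worker", "Uptime": 86400},
    {"PID": 800, "Cmd": "sshd", "Uptime": 604800},
]
candidates = restart_candidates(rows)
```

A command with one very young process next to long-running siblings, like `worker` here, is worth checking against logs for a crash or a supervised restart.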
Filter by category and examine uptime distribution to understand application stability. Look for processes that should be long-running but have short uptimes. Use PPID relationships to identify parent processes that frequently spawn short-lived children.
The processes function provides critical security visibility by exposing process ownership, privileges, and behavior patterns.
Filter by category:other to find uncategorized processes that may be suspicious. Sort by User to identify processes running under unexpected accounts. Search for unusual command names or paths that don't match normal system behavior.
Filter by Uid:0 or User:root to track all processes running with root privileges. Group root processes by Cmd to understand what's running with elevated permissions. Look for unexpected processes running as root that shouldn't require privileges.
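A root-process review can be scripted against exported rows. The sketch below counts root-owned processes per Cmd and reports commands missing from an allowlist. The allowlist itself is a hypothetical addition for illustration. The processes function does not provide one, and the Uid field is only available on Linux, FreeBSD, and macOS.

```python
from collections import Counter

def root_commands(rows):
    """Count processes per Cmd among rows owned by UID 0."""
    return Counter(r["Cmd"] for r in rows if r["Uid"] == 0)

def unexpected_root(rows, expected):
    """List root-owned commands that are not on the expected allowlist."""
    return sorted(cmd for cmd in root_commands(rows) if cmd not in expected)

# Hypothetical rows and allowlist.
rows = [
    {"PID": 1, "Cmd": "systemd", "Uid": 0},
    {"PID": 700, "Cmd": "sshd", "Uid": 0},
    {"PID": 701, "Cmd": "sshd", "Uid": 0},
    {"PID": 950, "Cmd": "miner", "Uid": 0},
    {"PID": 990, "Cmd": "nginx", "Uid": 33},
]
flagged = unexpected_root(rows, expected={"systemd", "sshd"})
```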
Sort by Sockets count to identify processes with unusual network activity. Filter by specific users or categories to detect abnormal network behavior patterns. Correlate high socket counts with process names to identify potential backdoors or data exfiltration.
Use full-text search in CmdLine to find processes launched with specific parameters or scripts. Group by command line patterns to identify potentially malicious execution patterns. Filter by user and examine command lines to detect privilege abuse or policy violations.
Child Process Accumulation: Uniquely captures resources from exited children - critical for accurate measurement of shell scripts and applications that spawn many short-lived processes (even 100+ commands/second)
PSS Memory Estimation (Linux 4.14+): Provides accurate memory accounting for shared-memory workloads by using Proportional Set Size (PSS) to scale shared pages. Enabled by default with adaptive sampling to minimize overhead while ensuring rapid response to memory changes. The plugin alternates between two prioritization strategies each iteration:
Both strategies sort candidates by priority and refresh the top N processes within the configured budget. This approach ensures that the biggest memory consumers (databases, cache servers, etc.) are refreshed within seconds of significant changes, while guaranteeing that even the smallest processes are refreshed within 2× the configured PSS refresh period (default: 600 seconds). The result reflects true memory consumption rather than the inflated RSS values typical of shared-memory workloads.
Category Correlation: The Category field directly matches the instance names in apps.plugin charts, enabling drill-down from chart to process level
Intelligent Grouping: Understands spawn managers (systemd, containerd, init) and groups by top-most parent to create manageable categories
Normalized Metrics: All per-process usage is normalized to accurately match total system resource usage
Real-time Updates: Data refreshes every few seconds showing current process state
Custom Grouping: apps_groups.conf allows defining custom spawn managers and individual processes of interest
Comprehensive FD Breakdown: Detailed categorization of all file descriptor types for leak detection
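The alternating, budget-limited PSS sampling described above can be illustrated with a short sketch. The two priority keys used here (largest resident memory on even iterations, largest change since the last sample on odd iterations) are assumptions based on the description of adaptive sampling on this page, not the plugin's actual strategies. The field names and the function name are hypothetical.

```python
def pick_for_refresh(procs, iteration, budget):
    """Choose up to `budget` PIDs to re-sample, alternating the priority key."""
    if iteration % 2 == 0:
        # Assumed strategy A: the biggest memory consumers first.
        key = lambda p: p["rss"]
    else:
        # Assumed strategy B: the biggest change since the last sample first.
        key = lambda p: abs(p["rss"] - p["rss_at_last_sample"])
    ranked = sorted(procs, key=key, reverse=True)
    return [p["pid"] for p in ranked[:budget]]

# Hypothetical candidates; memory values are in MiB.
procs = [
    {"pid": 1, "rss": 900, "rss_at_last_sample": 890},
    {"pid": 2, "rss": 200, "rss_at_last_sample": 50},
    {"pid": 3, "rss": 500, "rss_at_last_sample": 480},
]
even_pick = pick_for_refresh(procs, iteration=0, budget=2)
odd_pick = pick_for_refresh(procs, iteration=1, budget=2)
```

Alternating keys lets a small but fast-growing process (PID 2 here) get sampled even though it would never rank among the largest consumers. The sketch does not model the guarantee that every process is eventually refreshed. That would need an additional age-based rule.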
- systemd-services: Aggregated view of processes grouped by systemd service
- containers-vms: Container and VM-specific process information
- network-connections: Network connections per process
- systemd-journal: Process logs and events