This guide covers Netdata's advanced streaming and replication capabilities, which allow you to build centralized observability points across your infrastructure.
Streaming and replication work together to send metrics from one Netdata Agent (the Child) to another Netdata Agent (the Parent). Streaming sends metrics in real time, while replication copies historical data as well, so the Parent's history stays complete even after connection interruptions.
:::tip
If you're new to Netdata streaming or prefer a guided approach, jump to our step-by-step guide at the end of this document. The guide walks you through setting up a basic streaming configuration and then points you to the comprehensive reference sections as needed.
For a quick reference on setting up the Parent-Child relationship, see the Configuration Examples or refer to our comprehensive Parents: Your Centralization Points documentation for more details.
:::
Before diving into configuration details, it's important to understand the key concepts behind Netdata's streaming architecture:
<details>
<summary><strong>Click to see how streaming and replication work</strong></summary>

flowchart TB
    subgraph infrastructure["Your Infrastructure"]
        direction TB
        C1("**Child 1**<br/>Collects metrics")
        C2("**Child 2**<br/>Collects metrics")
        P("**Parent**<br/>Stores all metrics")
        C1 -->|Streams real-time metrics| P
        C1 -.->|Replicates historical data| P
        C2 -->|Streams real-time metrics| P
        C2 -.->|Replicates historical data| P
    end
    U("**You**<br/>Access unified dashboard")
    P -->|Presents all data| U
    classDef child fill:#e8f5e8,stroke:#27ae60,stroke-width:2px,color:#2c3e50,rx:10,ry:10
    classDef parent fill:#f3e8ff,stroke:#9b59b6,stroke-width:2px,color:#2c3e50,rx:10,ry:10
    classDef user fill:#fff2e8,stroke:#f39c12,stroke-width:2px,color:#2c3e50,rx:10,ry:10
    classDef subgraphStyle fill:#f8f9fa,stroke:#6c757d,stroke-width:2px,color:#2c3e50,rx:15,ry:15
    class C1 child
    class C2 child
    class P parent
    class U user
    class infrastructure subgraphStyle

</details>
Netdata streaming uses a custom binary protocol over TCP, not HTTP/HTTPS. This is an important distinction: when you specify `:SSL` in the destination, Netdata adds TLS encryption as a security layer on top of the custom streaming protocol (this is not HTTPS).

| Task | Configuration | Example |
|---|---|---|
| Enable streaming on a Child | Set `enabled = yes` in the `[stream]` section | `[stream]`<br/>`enabled = yes`<br/>`destination = 192.168.1.5` |
| Configure a Parent to accept connections | Create an `[API_KEY]` section | `[API_KEY]`<br/>`type = api`<br/>`enabled = yes`<br/>`allow from = *` |
| Set up high availability | Configure multiple destinations on the Child | `[stream]`<br/>`destination = parent1:19999 parent2:19999` |
| Filter which metrics to send | Use the `send charts matching` setting (list exclusions first) | `send charts matching = !system.uptime system.*` |
Netdata's streaming capabilities are configured through two key files:

- `stream.conf` – Controls streaming behavior, including Parent and Child configurations.
- `netdata.conf` – Contains global settings that can impact streaming.

To edit these files, navigate to your Netdata configuration directory (typically `/etc/netdata`) and run:
# Edit streaming configuration
sudo ./edit-config stream.conf
# Edit global Netdata settings
sudo ./edit-config netdata.conf
stream.conf

The `stream.conf` file has three main sections:

- `[stream]` – Configures how Child nodes send metrics.
- `[API_KEY]` – Defines authentication and access control between Parents and Children.
- `[MACHINE_GUID]` – Customizes settings for specific Child nodes by their unique ID.

Each Netdata node has a unique identifier stored in:
/var/lib/netdata/registry/netdata.public.unique.id
This file is generated automatically the first time Netdata starts and remains unchanged.
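To illustrate how this identifier feeds into a `[MACHINE_GUID]` section, the sketch below reads the GUID file and prints it as a section header. The function name `machine_guid_section` is hypothetical, and the default path is simply the one quoted above; adjust it if your installation keeps the registry elsewhere.

```sh
# Hypothetical helper: print a stream.conf section header for a node's GUID.
# Usage: machine_guid_section [PATH_TO_GUID_FILE]
machine_guid_section() {
    guid_file=${1:-/var/lib/netdata/registry/netdata.public.unique.id}
    if [ ! -r "$guid_file" ]; then
        echo "cannot read $guid_file" >&2
        return 1
    fi
    # Strip whitespace (including the trailing newline) around the identifier.
    guid=$(tr -d '[:space:]' < "$guid_file")
    printf '[%s]\n' "$guid"
}
```

Run it on the Child (the file may require root to read) and paste the printed header into the Parent's `stream.conf`.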
For a production-ready streaming setup, consider the following best practices:
<details>
<summary><strong>Click to see deployment best practices</strong></summary>

flowchart TB
    A("**Recommended Strategies**")
    B("Multiple Parent Nodes")
    C("Optimized Data Retention")
    D("Secure Communications")
    E("Performance Monitoring")
    B1("Improved redundancy<br/>and resilience")
    C1("Balance storage costs<br/>and data availability")
    D1("Enable encryption<br/>and authentication")
    E1("Regular log and<br/>metric reviews")
    A --> B
    A --> C
    A --> D
    A --> E
    B --> B1
    C --> C1
    D --> D1
    E --> E1
    classDef default fill:#f9f9f9,stroke:#333,stroke-width:2px,color:#2c3e50,rx:10,ry:10
    classDef strategies fill:#e8f5e8,stroke:#27ae60,stroke-width:2px,color:#2c3e50,rx:10,ry:10
    class A default
    class B strategies
    class C strategies
    class D strategies
    class E strategies
    class B1 strategies
    class C1 strategies
    class D1 strategies
    class E1 strategies

</details>
:::tip
Setting up multiple Parent nodes creates redundancy in your monitoring infrastructure. If one Parent fails, Child nodes can automatically switch to another available Parent, which improves resilience.
Configure data retention settings based on your specific monitoring needs, balancing storage costs against data availability.
Protect your metrics data during transmission by enabling encryption and authentication.
Regularly evaluate the health of your streaming setup by reviewing logs and metrics.
By following these guidelines, you can set up a scalable and reliable Netdata streaming environment.
:::
stream.conf Detailed Reference

[stream] Section (Child Node Settings)

With these settings, you can configure how your Child nodes send metrics to Parent nodes.

| Setting | Default | Description |
|---|---|---|
| `enabled` | `no` | Enables streaming. Set to `yes` to allow this node to send metrics. |
| `destination` | (empty) | Defines one or more Parent nodes to send data to. |
| `ssl skip certificate verification` | `yes` | Accepts self-signed or expired SSL certificates. |
| `CApath` | `/etc/ssl/certs/` | Directory for trusted SSL certificates. |
| `CAfile` | `/etc/ssl/certs/cert.pem` | File containing trusted certificates. |
| `api key` | (empty) | API key used by the Child to authenticate with the Parent. |
| `timeout` | `1m` | Connection timeout duration. |
| `default port` | `19999` | Default port for streaming if not specified in `destination`. |
| `send charts matching` | `*` | Filters which charts are streamed. |
| `buffer size bytes` | `10485760` | Buffer size (10MB by default). Increase for higher latencies. |
| `reconnect delay` | `5s` | Time before retrying connection to the Parent. |
| `initial clock resync iterations` | `60` | Syncs chart clocks during startup. |
| `parent using h2o` | `no` | Set to `yes` if connecting to a Parent using the H2O web server. |
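The `buffer size bytes` value is expressed in raw bytes, which is easy to mistype. A minimal sketch for converting a size in mebibytes into that value, assuming a hypothetical helper named `mib_to_bytes`:

```sh
# Hypothetical helper: convert mebibytes to the byte count used by
# the `buffer size bytes` setting.
# Usage: mib_to_bytes SIZE_IN_MIB
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}
```

For example, `mib_to_bytes 10` prints `10485760`, the default listed in the table above.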
[API_KEY] Section (Parent Node Authentication)

Here you can define settings for authentication and access control between Parents and Children.

| Setting | Default | Description |
|---|---|---|
| `enabled` | `no` | Enables or disables this API key. |
| `type` | `api` | Defines the section as an API key configuration. |
| `allow from` | `*` | Specifies which Child nodes (IP addresses) can connect. |
| `retention` | `1h` | How long to keep Child node metrics in RAM-based storage. |
| `db` | `dbengine` | Specifies the database type for this API key. |
| `health enabled` | `auto` | Controls alerts and notifications (`auto`, `yes`, or `no`). |
| `postpone alerts on connect` | `1m` | Delay alerts for a period after the Child connects. |
| `health log retention` | `5d` | How long to keep health log events. |
| `proxy enabled` | (empty) | Enables routing metrics through a proxy. |
| `proxy destination` | (empty) | IP and port of the proxy server. |
| `proxy api key` | (empty) | API key for the proxy server. |
| `send charts matching` | `*` | Defines which charts to stream. |
| `enable compression` | `yes` | Enables or disables data compression. |
| `enable replication` | `yes` | Enables or disables data replication. |
| `replication period` | `1d` | Maximum time window replicated from each Child. |
| `replication step` | `10m` | Time interval for each replication step. |
| `is ephemeral node` | `no` | Marks the Child as ephemeral (removes it after inactivity). |
[MACHINE_GUID] Section (Per-Node Customization)

This area lets you customize settings for specific Child nodes by their unique ID.

| Setting | Default | Description |
|---|---|---|
| `enabled` | `no` | Enables or disables this specific node's configuration. |
| `type` | `machine` | Defines the section as a machine-specific configuration. |
| `allow from` | `*` | Lists IP addresses allowed to stream metrics. |
| `retention` | `1h` | Retention period for Child metrics in RAM-based storage. |
| `db` | `dbengine` | Database type for this node. |
| `health enabled` | `auto` | Controls alerts (`auto`, `yes`, or `no`). |
| `postpone alerts on connect` | `1m` | Delay alerts for a period after connection. |
| `health log retention` | `5d` | How long to keep health log events. |
| `proxy enabled` | (empty) | Routes metrics through a proxy if enabled. |
| `proxy destination` | (empty) | Proxy server IP and port. |
| `proxy api key` | (empty) | API key for the proxy. |
| `send charts matching` | `*` | Filters streamed charts. |
| `enable compression` | `yes` | Enables or disables compression. |
| `enable replication` | `yes` | Enables or disables replication. |
| `replication period` | `1d` | Maximum replication window. |
| `replication step` | `10m` | Time interval for each replication step. |
| `is ephemeral node` | `no` | Marks the node as ephemeral (removes it after inactivity). |
destination

Defines Parent nodes for streaming using the format:
[PROTOCOL:]HOST[%INTERFACE][:PORT][:SSL]

- `PROTOCOL`: `tcp`, `udp`, or `unix` (only `tcp` and `unix` are supported for Parents).
- `HOST`: an IPv4 address, an IPv6 address (in brackets `[ ]`), a hostname, or a Unix domain socket path.

Example (TCP connection with SSL to 203.0.113.0 on port 20000):
[stream]
# Send metrics securely to the Parent at 203.0.113.0:20000
destination = tcp:203.0.113.0:20000:SSL
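To make the `[PROTOCOL:]HOST[%INTERFACE][:PORT][:SSL]` layout concrete, here is a small sketch that assembles a destination string from its parts. The function `build_destination` and its argument order are hypothetical conveniences, not part of Netdata; it only concatenates the pieces in the order the format describes.

```sh
# Hypothetical helper: build a destination value from its parts.
# Usage: build_destination HOST [PORT] [PROTOCOL] [INTERFACE] [SSL]
# Pass "" to skip an optional part; any non-empty fifth argument appends :SSL.
build_destination() {
    host=$1
    port=${2:-}
    protocol=${3:-}
    interface=${4:-}
    ssl=${5:-}
    dest=$host
    if [ -n "$protocol" ]; then dest="$protocol:$dest"; fi
    if [ -n "$interface" ]; then dest="$dest%$interface"; fi
    if [ -n "$port" ]; then dest="$dest:$port"; fi
    if [ -n "$ssl" ]; then dest="$dest:SSL"; fi
    printf '%s\n' "$dest"
}
```

Calling `build_destination 203.0.113.0 20000 tcp "" yes` reproduces the value used in the example above.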
send charts matching

Controls which charts are streamed.
`*` (default) – Streams all charts.
Specific charts:
[stream]
# Only send CPU application charts and all system charts
send charts matching = apps.cpu system.*
Exclude charts using !:
[stream]
# Send all charts except CPU application charts
send charts matching = !apps.cpu *
allow from

Defines which Child nodes (by IP) can connect.
Allow a single IP:
[API_KEY]
# Only allow connections from 203.0.113.10
allow from = 203.0.113.10
Allow a range but exclude one:
[API_KEY]
# Allow all 10.*.*.* addresses except 10.1.2.3
allow from = !10.1.2.3 10.*
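Both `send charts matching` and `allow from` take a space-separated pattern list, and the order of the patterns matters. The sketch below models how such a list is evaluated, assuming Netdata's simple-pattern semantics: patterns are checked left to right, the first match decides, a leading `!` negates, and a value that matches nothing is rejected. The function `simple_pattern_allows` is a hypothetical illustration, not Netdata's implementation.

```sh
# Hypothetical model of simple-pattern evaluation (first match wins).
# Usage: simple_pattern_allows VALUE "PATTERN LIST"
# Returns 0 if VALUE is allowed, 1 otherwise.
simple_pattern_allows() {
    sp_value=$1
    sp_result=1
    # Disable pathname expansion so "*" in the list is not expanded to file names.
    case $- in
        *f*) sp_restore=0 ;;
        *)   sp_restore=1; set -f ;;
    esac
    for sp_pattern in $2; do
        case $sp_pattern in
            '!'*)
                # Negative pattern: a match rejects the value immediately.
                sp_negated=${sp_pattern#?}
                case $sp_value in
                    $sp_negated) sp_result=1; break ;;
                esac
                ;;
            *)
                # Positive pattern: a match accepts the value immediately.
                case $sp_value in
                    $sp_pattern) sp_result=0; break ;;
                esac
                ;;
        esac
    done
    if [ "$sp_restore" -eq 1 ]; then set +f; fi
    return "$sp_result"
}
```

Under these assumptions, `simple_pattern_allows 10.1.2.3 '!10.1.2.3 10.*'` fails while `simple_pattern_allows 10.9.9.9 '!10.1.2.3 10.*'` succeeds, which is why exclusions are listed before the broader pattern.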
db

Defines the database mode:

- `dbengine` – Stores recent metrics in RAM and writes older data to disk.
- `ram` – Stores metrics only in RAM (lost on restart).
- `none` – No database.

[API_KEY]
# Use disk-based database for all metrics
db = dbengine
netdata.conf Settings Affecting Streaming

The `netdata.conf` file is the primary configuration file for the Netdata Agent. The following sections can impact streaming:
This section defines global settings for the Netdata Agent.
ram or swap).
Configure the web interface settings here.
yes to disable SSL support.
Manage database settings for data storage and retention.
Parent node configuration (stream.conf):
# Generate a random UUID first: uuidgen
[11111111-2222-3333-4444-555555555555]
type = api
# Enable this API key
enabled = yes
# Allow all IPs to connect with this key
allow from = *
# Store data using dbengine for persistence
db = dbengine
Child node configuration (stream.conf):
[stream]
# Enable streaming on this node
enabled = yes
# Connect to Parent at 192.168.1.5 port 19999
destination = 192.168.1.5
# Use the same API key defined on the Parent
api key = 11111111-2222-3333-4444-555555555555
Parent nodes configuration (stream.conf on both Parents):
# Configuration for accepting metrics from Children
[11111111-2222-3333-4444-555555555555]
type = api
enabled = yes
allow from = *
db = dbengine
# Configuration for accepting metrics from other Parents
[22222222-3333-4444-5555-666666666666]
type = api
enabled = yes
# Only allow the two Parents' IPs (this file is the same on both Parents)
allow from = 192.168.1.5 192.168.1.6
db = dbengine
First Parent node's configuration for streaming to the second Parent:
[stream]
enabled = yes
destination = 192.168.1.6
api key = 22222222-3333-4444-5555-666666666666
Second Parent node's configuration for streaming to the first Parent:
[stream]
enabled = yes
destination = 192.168.1.5
api key = 22222222-3333-4444-5555-666666666666
Child node configuration:
[stream]
enabled = yes
# List both Parents for failover
destination = 192.168.1.5 192.168.1.6
api key = 11111111-2222-3333-4444-555555555555
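The failover behavior in the Child configuration above (try the listed Parents in order and use the first one that responds) can be sketched as a simple loop. This is an illustration of the idea only, not Netdata's connection logic; `first_available_parent` and the probe command you pass to it are hypothetical.

```sh
# Hypothetical sketch: pick the first destination for which a probe succeeds.
# Usage: first_available_parent PROBE_COMMAND DEST1 [DEST2 ...]
# PROBE_COMMAND is called with one destination and must return 0 if reachable.
first_available_parent() {
    probe=$1
    shift
    for dest in "$@"; do
        if "$probe" "$dest"; then
            printf '%s\n' "$dest"
            return 0
        fi
    done
    # No destination answered the probe.
    return 1
}
```

In practice the probe could be any reachability check you trust, such as a small function wrapping a TCP connection test to the Parent's port.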
If the streaming configuration is working correctly, you'll see logs similar to the following.
On the Parent side:
2017-03-09 09:38:52: netdata: INFO : STREAM [receive from [10.11.12.86]:38564]: new client connection.
2017-03-09 09:38:52: netdata: INFO : STREAM xxx [10.11.12.86]:38564: receive thread created (task id 27721)
On the Child side:
2017-03-09 09:38:28: netdata: INFO : STREAM xxx [send to box:19999]: connecting...
2017-03-09 09:38:28: netdata: INFO : STREAM xxx [send to box:19999]: established communication - sending metrics...
Both Parent and Child nodes log information in /var/log/netdata/error.log.
Symptoms:
Child logs:
netdata ERROR : STREAM_SENDER[CHILD HOSTNAME] : STREAM CHILD HOSTNAME [send to PARENT IP:PARENT PORT]: too many data pending - buffer is X bytes long, Y unsent - we have sent Z bytes in total, W on this connection. Closing connection to flush the data.
Parent logs:
netdata ERROR : STREAM_PARENT[CHILD HOSTNAME,[CHILD IP]:CHILD PORT] : read failed: end of file
What's happening: Slow network connections or high-latency links can cause the streaming buffer to fill up faster than it can be transmitted. When the buffer reaches its maximum size, Netdata closes the connection to flush the pending data, then re-establishes the connection. This can lead to data gaps or inconsistencies if it happens frequently.
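To judge whether these buffer flushes happen often enough to matter, you can count how many times the Child logged the message. A minimal sketch, assuming the log location mentioned earlier and a hypothetical helper named `count_buffer_overflows`:

```sh
# Hypothetical helper: count buffer-overflow disconnects in a Netdata log.
# Usage: count_buffer_overflows [LOG_FILE]
count_buffer_overflows() {
    log_file=${1:-/var/log/netdata/error.log}
    # grep -c prints 0 but exits non-zero when nothing matches, so mask the status.
    grep -c 'too many data pending' "$log_file" || true
}
```

A count that keeps growing between checks suggests the buffer is too small for the link, or that the link itself is too slow.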
Solutions:
Increase the buffer size in `stream.conf`: `buffer size bytes = 20971520` (20MB).

Symptoms:
Child logs:
ERROR : STREAM_SENDER[HOSTNAME] : Failed to connect to 'PARENT IP', port 'PARENT PORT' (errno 113, No route to host)
What's happening: This error indicates network connectivity problems between the Child and Parent nodes. It could be due to firewall rules, incorrect IP addresses, or the Parent node not running.
Solutions:
Test connectivity between the Child and the Parent with `ping` or `telnet`.

Symptoms:
Parent logs:
STREAM [receive from [child HOSTNAME]:child IP]: `API key 'VALUE' is not allowed`. Forbidding access.
What's happening:
The Parent node is rejecting the connection because the API key doesn't match or the Child's IP address is not allowed by the allow from setting.
Solutions:
Check that the `allow from` setting permits the Child's IP address.

Symptoms:
Child logs:
ERROR : STREAM_SENDER[CHILD HOSTNAME] : STREAM child HOSTNAME [send to PARENT HOSTNAME:PARENT PORT]: server is not replying properly (is it a netdata?).
What's happening: The Child node is connecting to the destination, but the server is not responding with the expected Netdata streaming protocol. This commonly occurs when there's a mismatch in SSL/TLS settings or when the destination is not a Netdata server.
Solutions:
Make sure the SSL/TLS settings match on both sides (add or remove `:SSL` as needed).

Symptoms:
What's happening: When the database settings between Parent and Child nodes don't match, it can cause inconsistencies in how data is stored and displayed. The most common cause is different memory modes or retention settings.
Solutions:
Check the `[db].db` settings on the Parent and the Child and make them consistent.

No, you can't stream to multiple Parents at the same time. However, you can configure multiple destinations for failover. Your Child node will connect to the first available Parent in the list.
</details>
<details>
<summary><strong>How does replication work with interrupted connections?</strong></summary>

When the connection is re-established, the Child node replicates historical data, up to the window set by `replication period`. This ensures your Parent has a complete history even after interruptions.
Streaming is bandwidth-efficient, especially with compression enabled. A moderately active node typically uses about 10–20 KB/s; the actual bandwidth depends on the number of metrics and the collection frequency you've configured.
</details>
<details>
<summary><strong>Can I filter which metrics are sent to the Parent?</strong></summary>

Yes, you can use the `send charts matching` setting to include or exclude specific metrics from streaming. This works with wildcard patterns, giving you precise control over what metrics are transferred.
You can enable SSL in the `destination` setting by adding `:SSL` at the end. Configure proper certificates using the `CAfile` and `CApath` settings for production environments to ensure your metric data is protected in transit.
Yes, you need to configure each Child node with its own streaming configuration. However, you can use configuration management tools to deploy a standard configuration across your infrastructure, making this process more efficient.
</details>

This guide walks you through setting up Netdata streaming between nodes. By following these steps in order, you'll create a basic streaming configuration that you can later customize based on your needs.
<details>
<summary><strong>Step 1: Prepare Your Environment</strong></summary>

Before configuring streaming, ensure you have:
The API key is used to authenticate the connection between Parent and Child nodes.
On the Parent node, generate a UUID to use as your API key:
uuidgen
If the command isn't available, you can use an online UUID generator or create one with:
cat /proc/sys/kernel/random/uuid
Copy the generated UUID (it should look like `11111111-2222-3333-4444-555555555555`).
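The two commands above can be combined into one small script that falls back automatically and checks that the result has the expected shape. This is a sketch; the function names `generate_api_key` and `is_uuid` are hypothetical.

```sh
# Hypothetical helper: generate an API key with whichever source is available.
generate_api_key() {
    if command -v uuidgen >/dev/null 2>&1; then
        uuidgen
    elif [ -r /proc/sys/kernel/random/uuid ]; then
        cat /proc/sys/kernel/random/uuid
    else
        echo "no UUID source available" >&2
        return 1
    fi
}

# Hypothetical helper: succeed only if the argument looks like a UUID
# (8-4-4-4-12 hexadecimal digits, case-insensitive).
is_uuid() {
    printf '%s\n' "$1" |
        grep -Eqi '^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'
}
```

For example, `key=$(generate_api_key) && is_uuid "$key" && echo "$key"` prints a key only when generation succeeded and the format is valid.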
The Parent node receives and stores metrics from Child nodes.
Open the stream configuration file for editing:
cd /etc/netdata
sudo ./edit-config stream.conf
Add a section for your API key (replace with your actual UUID):
[11111111-2222-3333-4444-555555555555]
type = api
enabled = yes
allow from = *
Save and close the file.
Restart Netdata to apply changes:
sudo systemctl restart netdata
:::tip
Deployment strategy: For critical environments, consider setting up at least two Parent nodes for redundancy. Each Parent should have enough disk space for your required retention period.
:::
</details>
<details>
<summary><strong>Step 4: Configure the Child Node</strong></summary>

The Child node streams its metrics to the Parent node.
Open the stream configuration file on the Child node:
cd /etc/netdata
sudo ./edit-config stream.conf
Find the [stream] section and update it (replace PARENT_IP with your Parent's actual IP address):
[stream]
enabled = yes
destination = PARENT_IP:19999
api key = 11111111-2222-3333-4444-555555555555
Save and close the file.
Restart Netdata on the Child node:
sudo systemctl restart netdata
:::tip
Security: For production environments, enable SSL by adding `:SSL` to your destination. This encrypts the metric data in transit.
:::
</details>
<details>
<summary><strong>Step 5: Verify the Connection</strong></summary>

Check that streaming is working properly between your nodes.
Check the Netdata logs on the Parent node:
tail -f /var/log/netdata/error.log | grep STREAM
You should see connection messages similar to:
STREAM [receive from [CHILD_IP]]: new client connection.
STREAM xxx [CHILD_IP]: receive thread created (task id xxxxx)
On the Child node, you should see:
STREAM xxx [send to PARENT_IP:19999]: connecting...
STREAM xxx [send to PARENT_IP:19999]: established communication - sending metrics...
Open the Netdata dashboard on the Parent node (http://PARENT_IP:19999) and look for the Child node's hostname in the menu.
:::tip
Performance: Monitor the connection logs for the first few hours to ensure there are no buffer overflow issues or frequent disconnections.
:::
</details>
<details>
<summary><strong>Step 6: Customize Your Setup (Optional)</strong></summary>

Now that you have a working basic setup, you can customize it based on your deployment strategy:
Add the following to the Child's [stream] section:
[stream]
# Only send system and disk metrics, but not uptime
# (the exclusion comes first because the first matching pattern wins)
send charts matching = !system.uptime system.* disk.*
On the Child node, update the destination to include SSL:
[stream]
destination = PARENT_IP:19999:SSL
If using self-signed certificates, you may need to add:
[stream]
ssl skip certificate verification = yes
Configure multiple destinations on the Child:
[stream]
destination = PARENT1_IP:19999 PARENT2_IP:19999
The Child connects to the first available Parent and automatically switches to the next one if that connection fails.
On the Parent node, you can configure retention settings to control how long metrics are stored.
:::tip
Advanced: For large-scale deployments, consider setting up Parent-to-Parent streaming to create a hierarchical architecture that balances local responsiveness with centralized monitoring.
:::
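</details>

Verify that:

- The Parent's `allow from` settings permit the Child's IP address.
- The charts you expect are not excluded by `send charts matching`.

If you're using SSL encryption:

- `:SSL` is added to the destination on the Child node.
- `ssl skip certificate verification = yes` is set if using self-signed certificates.

If streaming is slow or unstable:

- Increase the buffer size: `buffer size bytes = 20971520` (20MB).
- Reduce the number of streamed charts with `send charts matching`.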