content/influxdb3/enterprise/admin/clustering.md
Optimize performance for specific workloads in your {{% product-name %}} cluster by configuring specialized nodes in distributed deployments. Assign specific modes and thread allocations to nodes to maximize cluster efficiency.
In an {{% product-name %}} cluster, you can dedicate nodes to specific tasks:
Pass the `--mode` option when starting the node to specify its capabilities:
# Single mode
influxdb3 serve --mode=ingest
# Multiple modes
influxdb3 serve --mode=ingest,query
# All modes (default)
influxdb3 serve --mode=all
Available modes:
- `all`: All capabilities enabled (default)
- `ingest`: Data ingestion and line protocol parsing
- `query`: Query execution and data retrieval
- `compact`: Background compaction and optimization
- `process`: Data processing and transformations

Every node has two thread pools that must be properly configured:
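As the examples in this guide suggest, the two pools together should fit within the node's core budget. The following helper is hypothetical (not part of the `influxdb3` CLI); it sketches one way to split a core count between the pools given the IO share as a percentage:

```shell
# Hypothetical helper (not part of influxdb3): split a core budget between
# the IO thread pool and the DataFusion thread pool.
suggest_threads() {
  local cores=$1 io_pct=$2
  local io=$(( cores * io_pct / 100 ))
  if (( io < 2 )); then io=2; fi       # keep a minimum of 2 IO threads
  echo "--num-io-threads=$io --datafusion-num-threads=$(( cores - io ))"
}

suggest_threads 32 40   # ingest-leaning split on a 32-core node
# prints: --num-io-threads=12 --datafusion-num-threads=20
```

The percentages are workload-dependent rules of thumb; the node configurations later in this guide show concrete splits for each role.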
> [!Note]
> Even specialized nodes need both thread types. Ingest nodes use DataFusion threads to create data snapshots that convert WAL data to Parquet files, and query nodes use IO threads to handle incoming requests.
Ingest nodes handle high-volume data writes and require significant IO thread allocation for line protocol parsing.
influxdb3 \
--num-io-threads=12 \
serve \
--num-cores=32 \
--datafusion-num-threads=20 \
--exec-mem-pool-bytes=60% \
--mode=ingest \
--node-id=ingester-01
Configuration rationale:
Key metrics for ingest nodes:
# Monitor IO thread utilization
top -H -p $(pgrep influxdb3) | grep io_worker
# Check write request counts by endpoint
curl -s http://localhost:8181/metrics | grep 'http_requests_total.*write'
# Check overall HTTP request metrics
curl -s http://localhost:8181/metrics | grep 'http_requests_total'
# Monitor WAL size
du -sh /path/to/data/wal/
> [!Important]
> Scale IO threads with concurrent writers.
> If you see only 2 CPU cores at 100% on a large ingester, increase
> `--num-io-threads`. Each concurrent writer can utilize approximately one IO thread.
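One way to check whether ingest is IO-thread-bound is to count how many `influxdb3` threads are pegged. This is a sketch: the 90% threshold is arbitrary, and it assumes a Linux host where `ps -L` reports per-thread CPU usage.

```shell
# Count threads using more than 90% CPU (sketch; 90 is an arbitrary threshold).
count_busy() { awk '$1 > 90 { n++ } END { print n+0 }'; }

# Per-thread CPU usage for the influxdb3 process, piped into the counter:
if pid=$(pgrep -x influxdb3); then
  ps -L -p "$pid" -o pcpu= | count_busy
fi
```

If the count stays near the current `--num-io-threads` value while total CPU is mostly idle, raising the IO thread count is the likely fix.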
Query nodes execute complex analytical queries and need maximum DataFusion threads.
influxdb3 \
--num-io-threads=4 \
serve \
--num-cores=64 \
--datafusion-num-threads=60 \
--exec-mem-pool-bytes=90% \
--parquet-mem-cache-size=8GB \
--mode=query \
--node-id=query-01 \
--cluster-id=prod-cluster
Configuration rationale:
influxdb3 \
--num-io-threads=6 \
serve \
--num-cores=32 \
--datafusion-num-threads=26 \
--exec-mem-pool-bytes=80% \
--parquet-mem-cache-size=4GB \
--mode=query \
--node-id=query-02
You can set additional DataFusion configuration properties to further tune query nodes:
influxdb3 serve \
--datafusion-config "datafusion.execution.batch_size:16384,datafusion.execution.target_partitions:60" \
--mode=query
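The `--datafusion-config` value above is a comma-separated list of `key:value` pairs. A quick shell check of that shape can catch typos before startup (a sketch only; the server performs its own validation):

```shell
# Sketch: check that a --datafusion-config value is a comma-separated
# list of key:value pairs before passing it to influxdb3 serve.
validate_df_config() {
  local IFS=',' pair
  for pair in $1; do
    case "$pair" in
      *:*) ;;
      *) echo "malformed pair: $pair" >&2; return 1 ;;
    esac
  done
  echo ok
}

validate_df_config "datafusion.execution.batch_size:16384,datafusion.execution.target_partitions:60"
# prints: ok
```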
Compactor nodes optimize stored data through background compaction processes.
influxdb3 \
--num-io-threads=2 \
serve \
--num-cores=32 \
--datafusion-num-threads=30 \
--compaction-gen2-duration=24h \
--compaction-check-interval=5m \
--mode=compact \
--node-id=compactor-01 \
--cluster-id=prod-cluster
# Note: --compaction-row-limit option is not yet released in v3.5.0
# Uncomment when available in a future release:
# --compaction-row-limit=2000000 \
Configuration rationale:
You can adjust compaction strategies to balance performance and resource usage:
# Configure compaction strategy
--compaction-multipliers=4,8,16 \
--compaction-max-num-files-per-plan=100 \
--compaction-cleanup-wait=10m
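As an arithmetic illustration only: if each value in `--compaction-multipliers` scales the `--compaction-gen2-duration` window into a later generation (an assumption about the flag's semantics; verify against the release notes for your version), a 24h gen2 duration with multipliers `4,8,16` would produce these windows:

```shell
# Assumption: each multiplier is applied to the gen2 duration (24h here) to
# size later compaction generations; check the influxdb3 docs for exact semantics.
gen2_hours=24
for m in 4 8 16; do
  echo "multiplier $m -> $(( gen2_hours * m ))h"
done
```

Larger multipliers mean fewer, bigger compaction runs; smaller ones keep files fresher at the cost of more background work.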
Process nodes handle data transformations and processing plugins.
Setting `--plugin-dir` automatically adds `process` mode to any node, so you don't need to explicitly set `--mode=process`.
If you do set `--mode=process`, you must also set `--plugin-dir`.
influxdb3 \
--num-io-threads=4 \
serve \
--num-cores=16 \
--datafusion-num-threads=12 \
--plugin-dir=/path/to/plugins \
--node-id=hybrid-01 \
--cluster-id=prod-cluster
To create a node that only handles processing (no ingest, query, or compaction), set --mode=process:
influxdb3 \
--num-io-threads=4 \
serve \
--num-cores=16 \
--datafusion-num-threads=12 \
--plugin-dir=/path/to/plugins \
--mode=process \
--node-id=processor-01 \
--cluster-id=prod-cluster
Some deployments benefit from nodes handling multiple responsibilities.
influxdb3 \
--num-io-threads=12 \
serve \
--num-cores=48 \
--datafusion-num-threads=36 \
--exec-mem-pool-bytes=75% \
--mode=ingest,query \
--node-id=hybrid-01
influxdb3 \
--num-io-threads=4 \
serve \
--num-cores=32 \
--datafusion-num-threads=28 \
--mode=query,compact \
--node-id=qc-01
# Node 1: All-in-one primary
mode: all
cores: 32
io_threads: 8
datafusion_threads: 24
# Node 2: All-in-one secondary
mode: all
cores: 32
io_threads: 8
datafusion_threads: 24
# Node 3: All-in-one tertiary
mode: all
cores: 32
io_threads: 8
datafusion_threads: 24
# Nodes 1-2: Ingesters
mode: ingest
cores: 48
io_threads: 16
datafusion_threads: 32
# Nodes 3-4: Query nodes
mode: query
cores: 48
io_threads: 4
datafusion_threads: 44
# Nodes 5-6: Compactor + Process
mode: compact,process
cores: 32
io_threads: 4
datafusion_threads: 28
# Nodes 1-4: High-throughput ingesters
mode: ingest
cores: 96
io_threads: 20
datafusion_threads: 76
# Nodes 5-8: Query nodes
mode: query
cores: 64
io_threads: 4
datafusion_threads: 60
# Nodes 9-10: Dedicated compactors
mode: compact
cores: 32
io_threads: 2
datafusion_threads: 30
# Nodes 11-12: Process nodes
mode: process
cores: 32
io_threads: 6
datafusion_threads: 26
{{% product-name %}} uses a shared-nothing architecture where ingest nodes handle all writes. To maximize ingest performance:
Query nodes can scale horizontally since they all access the same object store:
# Add query nodes as needed
for i in {1..10}; do
influxdb3 \
--num-io-threads=4 \
serve \
--num-cores=32 \
--datafusion-num-threads=28 \
--mode=query \
--node-id=query-$i &
done
Monitor specialized nodes differently based on their role:
-- Monitor write activity through parquet file creation
SELECT
table_name,
count(*) as files_created,
sum(row_count) as total_rows,
sum(size_bytes) as total_bytes
FROM system.parquet_files
WHERE max_time > extract(epoch from now() - INTERVAL '5 minutes') * 1000000000
GROUP BY table_name;
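To turn the 5-minute `total_rows` figure from that query into a throughput number, divide by the window length (plain arithmetic, shown in `awk`):

```shell
# Convert a 5-minute row total into rows per second (300 seconds in the window).
rows_per_sec() { awk -v rows="$1" 'BEGIN { printf "%.0f\n", rows / 300 }'; }

rows_per_sec 45000000   # prints: 150000
```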
-- Monitor query performance
SELECT
count(*) as query_count,
avg(execute_duration) as avg_execute_time,
max(max_memory) as max_memory_bytes
FROM system.queries
WHERE issue_time > now() - INTERVAL '5 minutes'
AND success = true;
-- Monitor compaction progress
SELECT
event_type,
event_status,
count(*) as event_count,
avg(event_duration) as avg_duration
FROM system.compaction_events
WHERE event_time > now() - INTERVAL '1 hour'
GROUP BY event_type, event_status
ORDER BY event_count DESC;
# Check node health via HTTP endpoints
for node in ingester-01:8181 query-01:8181 compactor-01:8181; do
echo "Node: $node"
curl -s "http://$node/health"
done
# Monitor metrics from each node
for node in ingester-01:8181 query-01:8181 compactor-01:8181; do
echo "=== Metrics from $node ==="
curl -s "http://$node/metrics" | grep -E "(cpu_usage|memory_usage|http_requests_total)"
done
# Query system tables for cluster-wide monitoring
curl -X POST "http://query-01:8181/api/v3/query_sql" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_TOKEN" \
-d '{
"q": "SELECT * FROM system.queries WHERE issue_time > now() - INTERVAL '\''5 minutes'\'' ORDER BY issue_time DESC LIMIT 10",
"db": "sensors"
}'
> [!Tip]
> Extend monitoring with plugins.
> Enhance your cluster monitoring capabilities using the InfluxDB 3 processing engine. The InfluxDB 3 plugins library includes several monitoring and alerting plugins:
>
> - System metrics collection: Collect CPU, memory, disk, and network statistics
> - Threshold monitoring: Monitor metrics with configurable thresholds and alerting
> - Multi-channel notifications: Send alerts via Slack, Discord, SMS, WhatsApp, and webhooks
> - Anomaly detection: Identify unusual patterns in your data
> - Deadman checks: Detect missing data streams
>
> For complete plugin documentation and setup instructions, see Process data in InfluxDB 3 Enterprise.
Use the monitoring queries to identify the following patterns and their solutions:
Detection query:
-- Check for high failed query rate indicating parsing issues
SELECT
count(*) as total_queries,
sum(CASE WHEN success = true THEN 1 ELSE 0 END) as successful_queries,
sum(CASE WHEN success = false THEN 1 ELSE 0 END) as failed_queries
FROM system.queries
WHERE issue_time > now() - INTERVAL '5 minutes';
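The two counts from the detection query convert to a failure rate like this (arithmetic only; what rate counts as "high" is up to your alerting policy):

```shell
# Failure rate from total and failed query counts (guards against division by zero).
failure_rate() { awk -v t="$1" -v f="$2" 'BEGIN { printf "%.1f%%\n", (t ? 100 * f / t : 0) }'; }

failure_rate 200 7   # prints: 3.5%
```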
Symptoms:
Solution: Increase IO threads (see Ingest node issues)
Detection query:
-- Monitor queries with high memory usage or failures
SELECT
avg(max_memory) as avg_memory_bytes,
max(max_memory) as peak_memory_bytes,
sum(CASE WHEN success = false THEN 1 ELSE 0 END) as failed_queries
FROM system.queries
WHERE issue_time > now() - INTERVAL '5 minutes'
AND query_type = 'sql';
Symptoms:
Solution: Increase memory pool or optimize queries (see Query node issues)
Detection query:
-- Check compaction event frequency and success rate
SELECT
event_type,
count(*) as event_count,
sum(CASE WHEN event_status = 'success' THEN 1 ELSE 0 END) as successful_events
FROM system.compaction_events
WHERE event_time > now() - INTERVAL '1 hour'
GROUP BY event_type;
Symptoms:
Solution: Add compactor nodes or increase DataFusion threads (see Compactor node issues)
Problem: Low throughput despite available CPU
# Check: Are only 2 cores busy?
top -H -p $(pgrep influxdb3)
# Solution: Increase IO threads
--num-io-threads=16
Problem: Data snapshot creation affecting ingest
# Check: DataFusion threads at 100% during data snapshots to Parquet
# Solution: Reserve more DataFusion threads for snapshot operations
--datafusion-num-threads=40
Problem: Slow queries despite resources
# Check: Memory pressure
free -h
# Solution: Increase memory pool
--exec-mem-pool-bytes=90%
Problem: Poor cache hit rates
# Solution: Increase Parquet cache
--parquet-mem-cache-size=10GB
Problem: Compaction falling behind
# Check: Compaction queue length
# Solution: Add more compactor nodes or increase threads
--datafusion-num-threads=30
# Phase 1: Baseline (all nodes identical)
all nodes: --mode=all --num-io-threads=8
# Phase 2: Identify workload patterns
# Monitor which nodes handle most writes vs queries
# Phase 3: Gradual specialization
node1: --mode=ingest,query --num-io-threads=12
node2: --mode=query,compact --num-io-threads=4
# Phase 4: Full specialization
node1: --mode=ingest --num-io-threads=16
node2: --mode=query --num-io-threads=4
node3: --mode=compact --num-io-threads=2
# Set environment variables for node type
export INFLUXDB3_ENTERPRISE_MODE=ingest
export INFLUXDB3_NUM_IO_THREADS=20
export INFLUXDB3_DATAFUSION_NUM_THREADS=76
influxdb3 serve --node-id=$HOSTNAME --cluster-id=prod
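To keep per-role settings in one place, you can branch on a role variable before launching the server. This is a sketch: `NODE_ROLE` is a hypothetical variable (not read by `influxdb3`), and the thread counts echo the deployment examples above.

```shell
# Sketch: choose environment settings by role before launching influxdb3.
# NODE_ROLE is a hypothetical variable; its values mirror the --mode options.
case "$NODE_ROLE" in
  ingest)  export INFLUXDB3_ENTERPRISE_MODE=ingest  INFLUXDB3_NUM_IO_THREADS=16 ;;
  query)   export INFLUXDB3_ENTERPRISE_MODE=query   INFLUXDB3_NUM_IO_THREADS=4  ;;
  compact) export INFLUXDB3_ENTERPRISE_MODE=compact INFLUXDB3_NUM_IO_THREADS=2  ;;
  *)       export INFLUXDB3_ENTERPRISE_MODE=all     INFLUXDB3_NUM_IO_THREADS=8  ;;
esac

# Then launch as before:
# influxdb3 serve --node-id="$HOSTNAME" --cluster-id=prod
```

This keeps one startup script for every node type; only the role variable changes per host.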