Apache HertzBeat stores historical data in a time-series database. You can install and initialize any one of the supported databases, or skip installation entirely (⚠️ configuring one is strongly recommended in production environments).
It is recommended to use Greptime as metrics storage.
Apache Doris is an MPP-based real-time analytics database. In HertzBeat, Doris can be used to store:
- Metrics history data (hzb_history)
- Log data (hzb_log)

⚠️ If you do not configure a time-series database, only the last hour of historical data is retained.
If you already have a Doris cluster, skip directly to the YML configuration section.
You can deploy Doris by package or Docker. For production, follow the official deployment guide:
For HertzBeat integration, ensure at least:
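- The Doris FE MySQL port (9030), used for metadata and queries, is reachable from HertzBeat
- The Doris FE HTTP port (8030), used for Stream Load, is reachable from HertzBeat
- … ext-lib directory in HertzBeat installation directory.

Configure application.yml
Edit hertzbeat/config/application.yml.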
For Docker deployment, mount the config file from host.
For package deployment, modify hertzbeat/config/application.yml directly.
Configure warehouse.store.doris (Stream Load mode is recommended for production environments):
warehouse:
  store:
    doris:
      enabled: true
      # FE MySQL endpoint
      url: jdbc:mysql://127.0.0.1:9030
      username: root
      password:
      table-config:
        # Enable dynamic partition for automatic expiration
        enable-partition: true
        # HOUR / DAY / MONTH
        partition-time-unit: DAY
        # Number of history partitions to keep
        partition-retention-days: 30
        # Number of future partitions to pre-create
        partition-future-days: 3
        buckets: 8
        replication-num: 3
      pool-config:
        minimum-idle: 5
        maximum-pool-size: 20
        connection-timeout: 30000
      write-config:
        # Strongly recommend stream mode in production for high throughput
        write-mode: stream
        batch-size: 1000
        flush-interval: 5
        stream-load-config:
          # FE HTTP port
          http-port: ":8030"
          timeout: 60
          max-bytes-per-batch: 10485760
          # For complex networks (K8s/cross-domain): direct / public / private
          redirect-policy: ""
For production deployments, Stream Load mode is strongly recommended for high-throughput, large-scale writes. Stream Load writes directly to the Doris storage layer, which gives better throughput than JDBC mode.
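To make the Stream Load path more concrete, the following Python sketch builds the URL and headers for a JSON Stream Load request against the FE HTTP port. It is not HertzBeat's implementation: the function name, the `hzb_` label prefix, and the default values are assumptions for illustration. The endpoint path and header names follow the Doris Stream Load HTTP API.

```python
import base64
import uuid


def build_stream_load_request(fe_host, http_port, database, table,
                              username, password, timeout_seconds=60):
    """Return (url, headers) for a JSON Stream Load PUT request to a Doris FE."""
    url = f"http://{fe_host}:{http_port}/api/{database}/{table}/_stream_load"
    # Stream Load authenticates with HTTP Basic auth using the Doris account.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {token}",
        "Expect": "100-continue",
        # Each load needs a unique label; the "hzb_" prefix is hypothetical.
        "label": f"hzb_{uuid.uuid4().hex}",
        "format": "json",
        # The request body is a JSON array with one object per row.
        "strip_outer_array": "true",
        "timeout": str(timeout_seconds),
    }
    return url, headers


url, headers = build_stream_load_request(
    "127.0.0.1", 8030, "hertzbeat", "hzb_history", "root", "")
print(url)
```

The request body would be sent with HTTP PUT. The FE normally answers with a redirect to a BE node, so the client must be able to follow that redirect and reach the BE address it points to.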
Network Reachability
- HertzBeat must be able to reach the Doris FE HTTP port (8030)

Special Configuration for Complex Network Scenarios
In K8s, cross-domain, or load-balanced environments, Stream Load's redirect mechanism requires special attention:
redirect-policy:
- direct: Direct BE IP connection
- public: Use public IP (cloud environments)
- private: Use private IP (private networks)

Reference: Doris Stream Load in Complex Networks
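Before tuning redirect-policy, it can help to confirm basic TCP reachability from the HertzBeat host. The sketch below is a generic check using Python's standard library, not a HertzBeat tool; the function names are hypothetical, and the default ports are the FE ports used in this guide.

```python
import socket


def is_port_reachable(host, port, timeout_seconds=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def check_doris_endpoints(fe_host, mysql_port=9030, http_port=8030):
    """Check the two FE ports HertzBeat uses; returns {name: reachable}."""
    return {
        "fe_mysql": is_port_reachable(fe_host, mysql_port),
        "fe_http": is_port_reachable(fe_host, http_port),
    }


if __name__ == "__main__":
    # 127.0.0.1 is a placeholder; use your FE address.
    print(check_doris_endpoints("127.0.0.1"))
```

Because Stream Load redirects the client to a BE node, the same `is_port_reachable` helper can be pointed at the BE address and HTTP port that the FE hands back in your environment.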
Modify Configuration File
Edit hertzbeat/config/application.yml and change write-mode to stream:
warehouse:
  store:
    doris:
      write-config:
        write-mode: stream # Change here: from jdbc to stream
        stream-load-config:
          http-port: ":8030"
          timeout: 60
          max-bytes-per-batch: 10485760
          redirect-policy: "" # Configure if complex network
Restart HertzBeat Service
Verify Successful Switch
Check HertzBeat logs for Stream Load messages
Q: Do I need to rebuild tables after switching?
A: No. Stream Load and JDBC modes use the same table structure and are fully compatible.
Q: Will data be lost when switching from JDBC to Stream Load?
A: No. The two write modes are independent, and historical data remains unchanged.
Q: How do I roll back if Stream Load fails?
A: If a Stream Load write fails, HertzBeat automatically tries to fall back to JDBC mode for that write.
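The fallback behaviour described in this answer can be pictured with a small Python sketch. It illustrates the general try-then-fall-back pattern only; the function name, the writer callables, and the log message are hypothetical and do not reflect HertzBeat's actual code or log output.

```python
def write_with_fallback(rows, stream_writer, jdbc_writer):
    """Try the stream writer first; on failure, write the same rows via JDBC.

    Returns the name of the path that succeeded.
    """
    try:
        stream_writer(rows)
        return "stream"
    except Exception as stream_error:
        # Hypothetical message, for illustration only.
        print(f"stream write failed ({stream_error}); falling back to jdbc")
        jdbc_writer(rows)
        return "jdbc"
```

If the JDBC write also fails, the exception propagates to the caller in this sketch.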
Q: Still getting timeouts in cross-network setup with redirect-policy configured?
A: Possible causes:
- The BE endpoint selected by the redirect-policy setting is unreachable
- Try the other redirect-policy values (direct / public / private)

| Parameter | Description |
|---|---|
| enabled | Enable/disable Doris storage |
| url | Doris FE MySQL JDBC endpoint |
| table-config.enable-partition | Enable dynamic partition and automatic expiration |
| table-config.partition-time-unit | Partition granularity: HOUR / DAY / MONTH |
| table-config.partition-retention-days | Number of history partitions retained |
| table-config.partition-future-days | Number of future partitions pre-created |
| table-config.buckets | Bucket count for table distribution |
| table-config.replication-num | Replica count |
| write-config.write-mode | jdbc or stream |
| write-config.batch-size | Write batch size |
| write-config.flush-interval | Flush interval in seconds |
| stream-load-config.http-port | Doris FE HTTP port for Stream Load |
| stream-load-config.timeout | Stream Load timeout in seconds |
| stream-load-config.max-bytes-per-batch | Max bytes per stream-load batch |
| stream-load-config.redirect-policy | Redirect policy for FE->BE endpoint selection: direct / public / private |
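As a rough illustration of how batch-size and max-bytes-per-batch could interact, the Python sketch below splits rows into batches capped by both row count and serialized JSON size. This is an assumed interpretation of the two parameters, not HertzBeat's actual batching logic, and it does not model the time-based flush-interval.

```python
import json


def split_into_batches(rows, batch_size=1000, max_bytes_per_batch=10485760):
    """Group rows into batches capped by row count and by serialized JSON size."""
    batches, current, current_bytes = [], [], 0
    for row in rows:
        row_bytes = len(json.dumps(row).encode("utf-8"))
        too_many = len(current) >= batch_size
        # A single oversized row still gets a batch of its own.
        too_large = bool(current) and current_bytes + row_bytes > max_bytes_per_batch
        if too_many or too_large:
            batches.append(current)
            current, current_bytes = [], 0
        current.append(row)
        current_bytes += row_bytes
    if current:
        batches.append(current)
    return batches
```

With the defaults from the configuration above, a batch closes at 1000 rows or roughly 10 MiB of JSON, whichever comes first.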
After configuration changes, restart HertzBeat to apply Doris storage settings.
SHOW CREATE TABLE hertzbeat.hzb_history;
SHOW CREATE TABLE hertzbeat.hzb_log;
SHOW DYNAMIC PARTITION TABLES FROM hertzbeat;
SHOW PARTITIONS FROM hertzbeat.hzb_history;
SHOW PARTITIONS FROM hertzbeat.hzb_log;
Do I need to enable partition to use bucket distribution?
No. Buckets work with or without dynamic partition.
enable-partition only controls dynamic partition and automatic expiration.
Can I use Doris for both metrics and logs at the same time?
Yes. HertzBeat writes metrics into hzb_history and logs into hzb_log with the same Doris datasource configuration.
If I change partition/bucket settings in application.yml, will existing tables auto-update?
No. Existing Doris table DDL is not automatically altered. For schema-level changes, apply DDL manually or recreate tables.
Is stream load compression enabled?
Current implementation uses JSON stream load by default.
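For the question above about changing partition or bucket settings on existing tables, the manual DDL could look like the statement generated by this Python sketch. The `dynamic_partition.*` property names and the `ALTER TABLE ... SET (...)` form come from Doris; the mapping from HertzBeat's partition-retention-days and partition-future-days to `dynamic_partition.start` and `dynamic_partition.end` is an assumption, and the function name is hypothetical. Review the generated statement before running it.

```python
def dynamic_partition_alter_sql(table, retention=30, future=3, buckets=8):
    """Build an ALTER TABLE statement that updates Doris dynamic-partition properties."""
    if retention < 0 or future < 0 or buckets < 1:
        raise ValueError("retention and future must be >= 0, buckets must be >= 1")
    properties = {
        "dynamic_partition.enable": "true",
        # Assumed mapping: history partitions kept, expressed as a negative offset.
        "dynamic_partition.start": str(-retention),
        # Assumed mapping: future partitions pre-created.
        "dynamic_partition.end": str(future),
        "dynamic_partition.buckets": str(buckets),
    }
    body = ",\n".join(f'    "{key}" = "{value}"' for key, value in properties.items())
    return f"ALTER TABLE {table} SET (\n{body}\n);"


print(dynamic_partition_alter_sql("hertzbeat.hzb_history"))
```

After applying such a statement, SHOW DYNAMIC PARTITION TABLES FROM hertzbeat; can be used to confirm the new values.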