grafana/dashboards/metrics/standalone/dashboard.md

Overview

| Title | Query | Type | Description | Datasource | Unit | Legend Format |
| --- | --- | --- | --- | --- | --- | --- |
| Uptime | `time() - process_start_time_seconds` | stat | The start time of GreptimeDB. | prometheus | s | `__auto` |
| Version | `SELECT pkg_version FROM information_schema.build_info` | stat | GreptimeDB version. | mysql | -- | -- |
| Total Ingestion Rate | `sum(rate(greptime_table_operator_ingest_rows[$__rate_interval]))` | stat | Total ingestion rate. | prometheus | rowsps | `__auto` |
| Total Storage Size | `SELECT SUM(disk_size) FROM information_schema.region_statistics;` | stat | Total size of data files. | mysql | decbytes | -- |
| Total Rows | `SELECT SUM(region_rows) FROM information_schema.region_statistics;` | stat | Total number of data rows in the cluster, calculated as the sum of rows across regions. | mysql | sishort | -- |
| Deployment | `SELECT count(*) as datanode FROM information_schema.cluster_info WHERE peer_type = 'DATANODE';`<br>`SELECT count(*) as frontend FROM information_schema.cluster_info WHERE peer_type = 'FRONTEND';`<br>`SELECT count(*) as metasrv FROM information_schema.cluster_info WHERE peer_type = 'METASRV';`<br>`SELECT count(*) as flownode FROM information_schema.cluster_info WHERE peer_type = 'FLOWNODE';` | stat | The deployment topology of GreptimeDB. | mysql | -- | -- |
| Database Resources | `SELECT COUNT(*) as databases FROM information_schema.schemata WHERE schema_name NOT IN ('greptime_private', 'information_schema')`<br>`SELECT COUNT(*) as tables FROM information_schema.tables WHERE table_schema != 'information_schema'`<br>`SELECT COUNT(region_id) as regions FROM information_schema.region_peers`<br>`SELECT COUNT(*) as flows FROM information_schema.flows` | stat | The number of key resources in GreptimeDB. | mysql | -- | -- |
| Data Size | `SELECT SUM(memtable_size) * 0.42825 as WAL FROM information_schema.region_statistics;`<br>`SELECT SUM(index_size) as index FROM information_schema.region_statistics;`<br>`SELECT SUM(manifest_size) as manifest FROM information_schema.region_statistics;` | stat | The WAL/index/manifest data sizes in GreptimeDB. | mysql | decbytes | -- |
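The "Data Size" row above estimates WAL size from the in-memory memtable size using a fixed 0.42825 factor, while index and manifest sizes are summed directly. A minimal sketch of that aggregation in Python (the region rows below are made-up sample values, not real data):

```python
# Hypothetical region_statistics rows: (memtable_size, index_size, manifest_size) in bytes.
regions = [
    (64_000_000, 12_000_000, 1_000_000),
    (32_000_000,  6_000_000,   500_000),
]

# WAL size is estimated from total memtable size with the dashboard's
# fixed 0.42825 factor; index and manifest sizes are summed as-is.
wal_estimate = sum(r[0] for r in regions) * 0.42825
index_total = sum(r[1] for r in regions)
manifest_total = sum(r[2] for r in regions)
```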

Ingestion

| Title | Query | Type | Description | Datasource | Unit | Legend Format |
| --- | --- | --- | --- | --- | --- | --- |
| Total Ingestion Rate | `sum(rate(greptime_table_operator_ingest_rows{}[$__rate_interval]))` | timeseries | Total ingestion rate. Three primary protocols are covered:<br>• Prometheus remote write<br>• Greptime's gRPC API (when using the ingest SDK)<br>• Log ingestion HTTP API | prometheus | rowsps | ingestion |
| Ingestion Rate by Type | `sum(rate(greptime_servers_http_logs_ingestion_counter[$__rate_interval]))`<br>`sum(rate(greptime_servers_prometheus_remote_write_samples[$__rate_interval]))` | timeseries | Ingestion rate broken down by type, covering the same three primary protocols:<br>• Prometheus remote write<br>• Greptime's gRPC API (when using the ingest SDK)<br>• Log ingestion HTTP API | prometheus | rowsps | http-logs |
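The ingestion panels rely on PromQL `rate()`, which computes the per-second increase of a counter over the lookback window while tolerating counter resets. A rough Python sketch of that behavior (simplified: real Prometheus also extrapolates to the window boundaries, which this omits):

```python
def simple_rate(samples):
    """samples: list of (timestamp_seconds, counter_value) in time order.
    Returns per-second increase, treating a value drop as a counter reset."""
    if len(samples) < 2:
        return 0.0
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # On reset the counter restarts from ~0, so the whole new value counts.
        increase += cur - prev if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed

# 100 rows ingested over 20 seconds -> 5 rows/s
print(simple_rate([(0, 0), (10, 40), (20, 100)]))  # -> 5.0
```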

Queries

| Title | Query | Type | Description | Datasource | Unit | Legend Format |
| --- | --- | --- | --- | --- | --- | --- |
| Total Query Rate | `sum(rate(greptime_servers_mysql_query_elapsed_count{}[$__rate_interval]))`<br>`sum(rate(greptime_servers_postgres_query_elapsed_count{}[$__rate_interval]))`<br>`sum(rate(greptime_servers_http_promql_elapsed_count{}[$__rate_interval]))` | timeseries | Total rate of query API calls by protocol, collected from frontends. Three main protocols are covered:<br>• MySQL<br>• Postgres<br>• Prometheus API<br>Note that some other minor query APIs such as `/sql` are not included. | prometheus | reqps | mysql |
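The P99 panels throughout this dashboard apply `histogram_quantile` to `_bucket` series. Its core is linear interpolation within the first bucket whose cumulative count reaches the target rank; a simplified Python sketch (ignoring the `le="+Inf"` edge cases that PromQL handles):

```python
def histogram_quantile(q, buckets):
    """buckets: sorted list of (upper_bound, cumulative_count), as in
    Prometheus `le` buckets. Returns the interpolated quantile value."""
    total = buckets[-1][1]
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            # Linearly interpolate within the bucket that contains the rank.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return buckets[-1][0]

# 90 of 100 requests under 0.1s, the rest under 0.5s: p99 lands in the last bucket.
print(histogram_quantile(0.99, [(0.1, 90), (0.5, 100)]))
```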

Resources

| Title | Query | Type | Description | Datasource | Unit | Legend Format |
| --- | --- | --- | --- | --- | --- | --- |
| Datanode Memory per Instance | `sum(process_resident_memory_bytes{}) by (instance, pod)`<br>`max(greptime_memory_limit_in_bytes{})` | timeseries | Current memory usage by instance. | prometheus | bytes | [{{instance}}]-[{{pod}}] |
| Datanode CPU Usage per Instance | `sum(rate(process_cpu_seconds_total{}[$__rate_interval]) * 1000) by (instance, pod)`<br>`max(greptime_cpu_limit_in_millicores{})` | timeseries | Current CPU usage by instance, in millicores. | prometheus | none | [{{instance}}]-[{{pod}}] |
| Frontend Memory per Instance | `sum(process_resident_memory_bytes{}) by (instance, pod)`<br>`max(greptime_memory_limit_in_bytes{})` | timeseries | Current memory usage by instance. | prometheus | bytes | [{{instance}}]-[{{pod}}] |
| Frontend CPU Usage per Instance | `sum(rate(process_cpu_seconds_total{}[$__rate_interval]) * 1000) by (instance, pod)`<br>`max(greptime_cpu_limit_in_millicores{})` | timeseries | Current CPU usage by instance, in millicores. | prometheus | none | [{{instance}}]-[{{pod}}]-cpu |
| Metasrv Memory per Instance | `sum(process_resident_memory_bytes{}) by (instance, pod)`<br>`max(greptime_memory_limit_in_bytes{})` | timeseries | Current memory usage by instance. | prometheus | bytes | [{{instance}}]-[{{pod}}]-resident |
| Metasrv CPU Usage per Instance | `sum(rate(process_cpu_seconds_total{}[$__rate_interval]) * 1000) by (instance, pod)`<br>`max(greptime_cpu_limit_in_millicores{})` | timeseries | Current CPU usage by instance, in millicores. | prometheus | none | [{{instance}}]-[{{pod}}] |
| Flownode Memory per Instance | `sum(process_resident_memory_bytes{}) by (instance, pod)`<br>`max(greptime_memory_limit_in_bytes{})` | timeseries | Current memory usage by instance. | prometheus | bytes | [{{instance}}]-[{{pod}}] |
| Flownode CPU Usage per Instance | `sum(rate(process_cpu_seconds_total{}[$__rate_interval]) * 1000) by (instance, pod)`<br>`max(greptime_cpu_limit_in_millicores{})` | timeseries | Current CPU usage by instance, in millicores. | prometheus | none | [{{instance}}]-[{{pod}}] |
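The CPU queries above multiply the `process_cpu_seconds_total` rate by 1000 because one fully busy core consumes 1 CPU-second per wall-clock second, i.e. 1000 millicores. A tiny sketch of that conversion and the comparison against a limit (the limit value is hypothetical):

```python
def cpu_millicores(cpu_seconds_rate):
    # 1 core fully busy = 1 CPU-second per wall-clock second = 1000 millicores.
    return cpu_seconds_rate * 1000

limit_millicores = 2000          # hypothetical pod limit (2 cores)
usage = cpu_millicores(0.75)     # process burned 0.75 CPU-seconds per second
print(usage, f"{usage / limit_millicores:.0%} of limit")
```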

Frontend Requests

| Title | Query | Type | Description | Datasource | Unit | Legend Format |
| --- | --- | --- | --- | --- | --- | --- |
| HTTP QPS per Instance | `sum by(instance, pod, path, method, code) (rate(greptime_servers_http_requests_elapsed_count{path!~"/health\|/metrics"}[$__rate_interval]))` | timeseries | HTTP QPS per instance. | prometheus | reqps | [{{instance}}]-[{{pod}}]-[{{path}}]-[{{method}}]-[{{code}}] |
| HTTP P99 per Instance | `histogram_quantile(0.99, sum by(instance, pod, le, path, method, code) (rate(greptime_servers_http_requests_elapsed_bucket{path!~"/health\|/metrics"}[$__rate_interval])))` | timeseries | HTTP P99 per instance. | prometheus | s | [{{instance}}]-[{{pod}}]-[{{path}}]-[{{method}}]-[{{code}}]-p99 |
| gRPC QPS per Instance | `sum by(instance, pod, path, code) (rate(greptime_servers_grpc_requests_elapsed_count{}[$__rate_interval]))` | timeseries | gRPC QPS per instance. | prometheus | reqps | [{{instance}}]-[{{pod}}]-[{{path}}]-[{{code}}] |
| gRPC P99 per Instance | `histogram_quantile(0.99, sum by(instance, pod, le, path, code) (rate(greptime_servers_grpc_requests_elapsed_bucket{}[$__rate_interval])))` | timeseries | gRPC P99 per instance. | prometheus | s | [{{instance}}]-[{{pod}}]-[{{path}}]-[{{method}}]-[{{code}}]-p99 |
| MySQL QPS per Instance | `sum by(pod, instance)(rate(greptime_servers_mysql_query_elapsed_count{}[$__rate_interval]))` | timeseries | MySQL QPS per instance. | prometheus | reqps | [{{instance}}]-[{{pod}}] |
| MySQL P99 per Instance | `histogram_quantile(0.99, sum by(pod, instance, le) (rate(greptime_servers_mysql_query_elapsed_bucket{}[$__rate_interval])))` | timeseries | MySQL P99 per instance. | prometheus | s | [{{instance}}]-[{{pod}}]-p99 |
| PostgreSQL QPS per Instance | `sum by(pod, instance)(rate(greptime_servers_postgres_query_elapsed_count{}[$__rate_interval]))` | timeseries | PostgreSQL QPS per instance. | prometheus | reqps | [{{instance}}]-[{{pod}}] |
| PostgreSQL P99 per Instance | `histogram_quantile(0.99, sum by(pod, instance, le) (rate(greptime_servers_postgres_query_elapsed_bucket{}[$__rate_interval])))` | timeseries | PostgreSQL P99 per instance. | prometheus | s | [{{instance}}]-[{{pod}}]-p99 |

Frontend to Datanode

| Title | Query | Type | Description | Datasource | Unit | Legend Format |
| --- | --- | --- | --- | --- | --- | --- |
| Ingest Rows per Instance | `sum by(instance, pod)(rate(greptime_table_operator_ingest_rows{}[$__rate_interval]))` | timeseries | Ingestion rate in rows at each frontend. | prometheus | rowsps | [{{instance}}]-[{{pod}}] |
| Region Call QPS per Instance | `sum by(instance, pod, request_type) (rate(greptime_grpc_region_request_count{}[$__rate_interval]))` | timeseries | Region call QPS per instance. | prometheus | ops | [{{instance}}]-[{{pod}}]-[{{request_type}}] |
| Region Call P99 per Instance | `histogram_quantile(0.99, sum by(instance, pod, le, request_type) (rate(greptime_grpc_region_request_bucket{}[$__rate_interval])))` | timeseries | Region call P99 per instance. | prometheus | s | [{{instance}}]-[{{pod}}]-[{{request_type}}] |
| Frontend Handle Bulk Insert Elapsed Time | `sum by(instance, pod, stage) (rate(greptime_table_operator_handle_bulk_insert_sum[$__rate_interval])) / sum by(instance, pod, stage) (rate(greptime_table_operator_handle_bulk_insert_count[$__rate_interval]))`<br>`histogram_quantile(0.99, sum by(instance, pod, stage, le) (rate(greptime_table_operator_handle_bulk_insert_bucket[$__rate_interval])))` | timeseries | Per-stage time for the frontend to handle bulk insert requests. | prometheus | s | [{{instance}}]-[{{pod}}]-[{{stage}}]-AVG |
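The bulk-insert panel derives an average latency by dividing the rate of the histogram's `_sum` by the rate of its `_count`; naively averaging per-window averages would over-weight windows with few requests. A small Python sketch of why the sum/count division is the right aggregation (the sample windows are made up):

```python
# (total elapsed seconds, request count) per scrape window -- hypothetical values.
windows = [(2.0, 100), (9.0, 100), (1.0, 10)]

# Correct: total time divided by total requests, as rate(_sum)/rate(_count) does.
avg = sum(s for s, _ in windows) / sum(c for _, c in windows)

# Wrong: a plain mean of per-window averages over-weights the tiny 10-request window.
naive = sum(s / c for s, c in windows) / len(windows)
```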

Mito Engine

| Title | Query | Type | Description | Datasource | Unit | Legend Format |
| --- | --- | --- | --- | --- | --- | --- |
| Request OPS per Instance | `sum by(instance, pod, type) (rate(greptime_mito_handle_request_elapsed_count{}[$__rate_interval]))` | timeseries | Request QPS per instance. | prometheus | ops | [{{instance}}]-[{{pod}}]-[{{type}}] |
| Request P99 per Instance | `histogram_quantile(0.99, sum by(instance, pod, le, type) (rate(greptime_mito_handle_request_elapsed_bucket{}[$__rate_interval])))` | timeseries | Request P99 per instance. | prometheus | s | [{{instance}}]-[{{pod}}]-[{{type}}] |
| Write Buffer per Instance | `greptime_mito_write_buffer_bytes{}` | timeseries | Write buffer size per instance. | prometheus | decbytes | [{{instance}}]-[{{pod}}] |
| Write Rows per Instance | `sum by (instance, pod) (rate(greptime_mito_write_rows_total{}[$__rate_interval]))` | timeseries | Ingestion rate in rows. | prometheus | rowsps | [{{instance}}]-[{{pod}}] |
| Flush OPS per Instance | `sum by(instance, pod, reason) (rate(greptime_mito_flush_requests_total{}[$__rate_interval]))` | timeseries | Flush QPS per instance. | prometheus | ops | [{{instance}}]-[{{pod}}]-[{{reason}}] |
| Write Stall per Instance | `sum by(instance, pod) (greptime_mito_write_stall_total{})` | timeseries | Write stalls per instance. | prometheus | -- | [{{instance}}]-[{{pod}}] |
| Read Stage OPS per Instance | `sum by(instance, pod) (rate(greptime_mito_read_stage_elapsed_count{stage="total"}[$__rate_interval]))` | timeseries | Read stage OPS per instance. | prometheus | ops | [{{instance}}]-[{{pod}}] |
| Read Stage P99 per Instance | `histogram_quantile(0.99, sum by(instance, pod, le, stage) (rate(greptime_mito_read_stage_elapsed_bucket{}[$__rate_interval])))` | timeseries | Read stage P99 per instance. | prometheus | s | [{{instance}}]-[{{pod}}]-[{{stage}}] |
| Write Stage P99 per Instance | `histogram_quantile(0.99, sum by(instance, pod, le, stage) (rate(greptime_mito_write_stage_elapsed_bucket{}[$__rate_interval])))` | timeseries | Write stage P99 per instance. | prometheus | s | [{{instance}}]-[{{pod}}]-[{{stage}}] |
| Compaction OPS per Instance | `sum by(instance, pod) (rate(greptime_mito_compaction_total_elapsed_count{}[$__rate_interval]))` | timeseries | Compaction OPS per instance. | prometheus | ops | [{{instance}}]-[{{pod}}] |
| Compaction Elapsed Time per Instance by Stage | `histogram_quantile(0.99, sum by(instance, pod, le, stage) (rate(greptime_mito_compaction_stage_elapsed_bucket{}[$__rate_interval])))`<br>`sum by(instance, pod, stage) (rate(greptime_mito_compaction_stage_elapsed_sum{}[$__rate_interval])) / sum by(instance, pod, stage) (rate(greptime_mito_compaction_stage_elapsed_count{}[$__rate_interval]))` | timeseries | Compaction latency by stage. | prometheus | s | [{{instance}}]-[{{pod}}]-[{{stage}}]-p99 |
| Compaction P99 per Instance | `histogram_quantile(0.99, sum by(instance, pod, le, stage) (rate(greptime_mito_compaction_total_elapsed_bucket{}[$__rate_interval])))` | timeseries | Compaction P99 per instance. | prometheus | s | [{{instance}}]-[{{pod}}]-[{{stage}}]-compaction |
| WAL write size | `histogram_quantile(0.95, sum by(le, instance, pod) (rate(raft_engine_write_size_bucket[$__rate_interval])))`<br>`histogram_quantile(0.99, sum by(le, instance, pod) (rate(raft_engine_write_size_bucket[$__rate_interval])))`<br>`sum by (instance, pod)(rate(raft_engine_write_size_sum[$__rate_interval]))` | timeseries | Write-ahead log write size in bytes: p95 and p99 per instance, plus the total WAL write rate. | prometheus | bytes | [{{instance}}]-[{{pod}}]-req-size-p95 |
| Cached Bytes per Instance | `greptime_mito_cache_bytes{}` | timeseries | Cached bytes per instance. | prometheus | decbytes | [{{instance}}]-[{{pod}}]-[{{type}}] |
| Inflight Compaction | `greptime_mito_inflight_compaction_count` | timeseries | Ongoing compaction task count. | prometheus | none | [{{instance}}]-[{{pod}}] |
| WAL sync duration seconds | `histogram_quantile(0.99, sum by(le, type, node, instance, pod) (rate(raft_engine_sync_log_duration_seconds_bucket[$__rate_interval])))` | timeseries | Raft engine (local disk) log store sync latency, p99. | prometheus | s | [{{instance}}]-[{{pod}}]-p99 |
| Log Store op duration seconds | `histogram_quantile(0.99, sum by(le, logstore, optype, instance, pod) (rate(greptime_logstore_op_elapsed_bucket[$__rate_interval])))` | timeseries | Write-ahead log operation latency, p99. | prometheus | s | [{{instance}}]-[{{pod}}]-[{{logstore}}]-[{{optype}}]-p99 |
| Inflight Flush | `greptime_mito_inflight_flush_count` | timeseries | Ongoing flush task count. | prometheus | none | [{{instance}}]-[{{pod}}] |
| Compaction Input/Output Bytes | `sum by(instance, pod) (greptime_mito_compaction_input_bytes)`<br>`sum by(instance, pod) (greptime_mito_compaction_output_bytes)` | timeseries | Compaction input/output bytes. | prometheus | bytes | [{{instance}}]-[{{pod}}]-input |
| Region Worker Handle Bulk Insert Requests | `histogram_quantile(0.95, sum by(le, instance, stage, pod) (rate(greptime_region_worker_handle_write_bucket[$__rate_interval])))`<br>`sum by(instance, stage, pod) (rate(greptime_region_worker_handle_write_sum[$__rate_interval])) / sum by(instance, stage, pod) (rate(greptime_region_worker_handle_write_count[$__rate_interval]))` | timeseries | Per-stage elapsed time for region workers to handle bulk insert region requests. | prometheus | s | [{{instance}}]-[{{pod}}]-[{{stage}}]-P95 |
| Active Series and Field Builders Count | `sum by(instance, pod) (greptime_mito_memtable_active_series_count)`<br>`sum by(instance, pod) (greptime_mito_memtable_field_builder_count)` | timeseries | Active series and field builder counts in memtables. | prometheus | none | [{{instance}}]-[{{pod}}]-series |
| Region Worker Convert Requests | `histogram_quantile(0.95, sum by(le, instance, stage, pod) (rate(greptime_datanode_convert_region_request_bucket[$__rate_interval])))`<br>`sum by(le, instance, stage, pod) (rate(greptime_datanode_convert_region_request_sum[$__rate_interval])) / sum by(le, instance, stage, pod) (rate(greptime_datanode_convert_region_request_count[$__rate_interval]))` | timeseries | Per-stage elapsed time for region workers to decode requests. | prometheus | s | [{{instance}}]-[{{pod}}]-[{{stage}}]-P95 |
| Cache Miss | `sum by(instance, pod, type) (rate(greptime_mito_cache_miss{}[$__rate_interval]))` | timeseries | Local cache misses on the datanode. | prometheus | -- | [{{instance}}]-[{{pod}}]-[{{type}}] |

OpenDAL

| Title | Query | Type | Description | Datasource | Unit | Legend Format |
| --- | --- | --- | --- | --- | --- | --- |
| QPS per Instance | `sum by(instance, pod, scheme, operation) (rate(opendal_operation_duration_seconds_count{}[$__rate_interval]))` | timeseries | QPS per instance. | prometheus | ops | [{{instance}}]-[{{pod}}]-[{{scheme}}]-[{{operation}}] |
| Read QPS per Instance | `sum by(instance, pod, scheme, operation) (rate(opendal_operation_duration_seconds_count{operation=~"read\|Reader::read"}[$__rate_interval]))` | timeseries | Read QPS per instance. | prometheus | ops | [{{instance}}]-[{{pod}}]-[{{scheme}}]-[{{operation}}] |
| Read P99 per Instance | `histogram_quantile(0.99, sum by(instance, pod, le, scheme, operation) (rate(opendal_operation_duration_seconds_bucket{operation=~"read\|Reader::read"}[$__rate_interval])))` | timeseries | Read P99 per instance. | prometheus | s | [{{instance}}]-[{{pod}}]-[{{scheme}}]-[{{operation}}] |
| Write QPS per Instance | `sum by(instance, pod, scheme, operation) (rate(opendal_operation_duration_seconds_count{operation=~"write\|Writer::write\|Writer::close"}[$__rate_interval]))` | timeseries | Write QPS per instance. | prometheus | ops | [{{instance}}]-[{{pod}}]-[{{scheme}}]-[{{operation}}] |
| Write P99 per Instance | `histogram_quantile(0.99, sum by(instance, pod, le, scheme, operation) (rate(opendal_operation_duration_seconds_bucket{operation=~"Writer::write\|Writer::close\|write"}[$__rate_interval])))` | timeseries | Write P99 per instance. | prometheus | s | [{{instance}}]-[{{pod}}]-[{{scheme}}]-[{{operation}}] |
| List QPS per Instance | `sum by(instance, pod, scheme) (rate(opendal_operation_duration_seconds_count{operation="list"}[$__rate_interval]))` | timeseries | List QPS per instance. | prometheus | ops | [{{instance}}]-[{{pod}}]-[{{scheme}}] |
| List P99 per Instance | `histogram_quantile(0.99, sum by(instance, pod, le, scheme) (rate(opendal_operation_duration_seconds_bucket{operation="list"}[$__rate_interval])))` | timeseries | List P99 per instance. | prometheus | s | [{{instance}}]-[{{pod}}]-[{{scheme}}] |
| Other Requests per Instance | `sum by(instance, pod, scheme, operation) (rate(opendal_operation_duration_seconds_count{operation!~"read\|write\|list\|stat"}[$__rate_interval]))` | timeseries | Other requests per instance. | prometheus | ops | [{{instance}}]-[{{pod}}]-[{{scheme}}]-[{{operation}}] |
| Other Request P99 per Instance | `histogram_quantile(0.99, sum by(instance, pod, le, scheme, operation) (rate(opendal_operation_duration_seconds_bucket{operation!~"read\|write\|list\|Writer::write\|Writer::close\|Reader::read"}[$__rate_interval])))` | timeseries | Other request P99 per instance. | prometheus | s | [{{instance}}]-[{{pod}}]-[{{scheme}}]-[{{operation}}] |
| OpenDAL traffic | `sum by(instance, pod, scheme, operation) (rate(opendal_operation_bytes_sum{}[$__rate_interval]))` | timeseries | Total traffic in bytes by instance and operation. | prometheus | decbytes | [{{instance}}]-[{{pod}}]-[{{scheme}}]-[{{operation}}] |
| OpenDAL errors per Instance | `sum by(instance, pod, scheme, operation, error) (rate(opendal_operation_errors_total{error!="NotFound"}[$__rate_interval]))` | timeseries | OpenDAL error counts per instance. | prometheus | -- | [{{instance}}]-[{{pod}}]-[{{scheme}}]-[{{operation}}]-[{{error}}] |

Remote WAL

| Title | Query | Type | Description | Datasource | Unit | Legend Format |
| --- | --- | --- | --- | --- | --- | --- |
| Triggered region flush total | `meta_triggered_region_flush_total` | timeseries | Triggered region flush total. | prometheus | none | {{pod}}-{{topic_name}} |
| Triggered region checkpoint total | `meta_triggered_region_checkpoint_total` | timeseries | Triggered region checkpoint total. | prometheus | none | {{pod}}-{{topic_name}} |
| Topic estimated replay size | `meta_topic_estimated_replay_size` | timeseries | Topic estimated max replay size. | prometheus | bytes | {{pod}}-{{topic_name}} |
| Kafka logstore's bytes traffic | `rate(greptime_logstore_kafka_client_bytes_total[$__rate_interval])` | timeseries | Kafka logstore's bytes traffic. | prometheus | bytes | {{pod}}-{{logstore}} |

Metasrv

| Title | Query | Type | Description | Datasource | Unit | Legend Format |
| --- | --- | --- | --- | --- | --- | --- |
| Region migration datanode | `greptime_meta_region_migration_stat{datanode_type="src"}`<br>`greptime_meta_region_migration_stat{datanode_type="desc"}` | status-history | Counter of region migrations by source and destination datanode. | prometheus | -- | from-datanode-{{datanode_id}} |
| Region migration error | `greptime_meta_region_migration_error` | timeseries | Counter of region migration errors. | prometheus | none | {{pod}}-{{state}}-{{error_type}} |
| Datanode load | `greptime_datanode_load` | timeseries | Gauge of load information for each datanode, collected via heartbeat between datanode and metasrv; metasrv uses this information to schedule workloads. | prometheus | binBps | Datanode-{{datanode_id}}-writeload |
| Rate of SQL Executions (RDS) | `rate(greptime_meta_rds_pg_sql_execute_elapsed_ms_count[$__rate_interval])` | timeseries | Displays the rate of SQL executions processed by the Meta service using the RDS backend. | prometheus | none | {{pod}} {{op}} {{type}} {{result}} |
| SQL Execution Latency (RDS) | `histogram_quantile(0.90, sum by(pod, op, type, result, le) (rate(greptime_meta_rds_pg_sql_execute_elapsed_ms_bucket[$__rate_interval])))` | timeseries | Measures the response time of SQL executions via the RDS backend. | prometheus | ms | {{pod}} {{op}} {{type}} {{result}} p90 |
| Handler Execution Latency | `histogram_quantile(0.90, sum by(pod, le, name) (rate(greptime_meta_handler_execute_bucket[$__rate_interval])))` | timeseries | Shows latency of Meta handlers by pod and handler name, useful for monitoring handler performance and detecting latency spikes. | prometheus | s | {{pod}} {{name}} p90 |
| Heartbeat Packet Size | `histogram_quantile(0.9, sum by(pod, le) (greptime_meta_heartbeat_stat_memory_size_bucket))` | timeseries | Shows p90 heartbeat message sizes, helping track network usage and identify anomalies in heartbeat payloads. | prometheus | bytes | {{pod}} |
| Meta Heartbeat Receive Rate | `rate(greptime_meta_heartbeat_rate[$__rate_interval])` | timeseries | Rate of heartbeat messages received by metasrv. | prometheus | s | {{pod}} |
| Meta KV Ops Latency | `histogram_quantile(0.99, sum by(pod, le, op, target) (greptime_meta_kv_request_elapsed_bucket))` | timeseries | p99 latency of metasrv KV backend operations. | prometheus | s | {{pod}}-{{op}} p99 |
| Rate of meta KV Ops | `rate(greptime_meta_kv_request_elapsed_count[$__rate_interval])` | timeseries | Rate of metasrv KV backend operations. | prometheus | none | {{pod}}-{{op}} p99 |
| DDL Latency | `histogram_quantile(0.9, sum by(le, pod, step) (greptime_meta_procedure_create_tables_bucket))`<br>`histogram_quantile(0.9, sum by(le, pod, step) (greptime_meta_procedure_create_table_bucket))`<br>`histogram_quantile(0.9, sum by(le, pod, step) (greptime_meta_procedure_create_view_bucket))`<br>`histogram_quantile(0.9, sum by(le, pod, step) (greptime_meta_procedure_create_flow_bucket))`<br>`histogram_quantile(0.9, sum by(le, pod, step) (greptime_meta_procedure_drop_table_bucket))`<br>`histogram_quantile(0.9, sum by(le, pod, step) (greptime_meta_procedure_alter_table_bucket))` | timeseries | p90 latency of DDL procedures (create/drop/alter table, view, and flow) by step. | prometheus | s | CreateLogicalTables-{{step}} p90 |
| Reconciliation stats | `greptime_meta_reconciliation_stats` | timeseries | Reconciliation stats. | prometheus | s | {{pod}}-{{table_type}}-{{type}} |
| Reconciliation steps | `histogram_quantile(0.9, greptime_meta_reconciliation_procedure_bucket)` | timeseries | Elapsed time of reconciliation steps. | prometheus | s | {{procedure_name}}-{{step}}-P90 |

Flownode

| Title | Query | Type | Description | Datasource | Unit | Legend Format |
| --- | --- | --- | --- | --- | --- | --- |
| Flow Ingest / Output Rate | `sum by(instance, pod, direction) (rate(greptime_flow_processed_rows[$__rate_interval]))` | timeseries | Flow ingest/output rate. | prometheus | -- | [{{pod}}]-[{{instance}}]-[{{direction}}] |
| Flow Ingest Latency | `histogram_quantile(0.95, sum(rate(greptime_flow_insert_elapsed_bucket[$__rate_interval])) by (le, instance, pod))`<br>`histogram_quantile(0.99, sum(rate(greptime_flow_insert_elapsed_bucket[$__rate_interval])) by (le, instance, pod))` | timeseries | Flow ingest latency. | prometheus | -- | [{{instance}}]-[{{pod}}]-p95 |
| Flow Operation Latency | `histogram_quantile(0.95, sum(rate(greptime_flow_processing_time_bucket[$__rate_interval])) by (le, instance, pod, type))`<br>`histogram_quantile(0.99, sum(rate(greptime_flow_processing_time_bucket[$__rate_interval])) by (le, instance, pod, type))` | timeseries | Flow operation latency. | prometheus | -- | [{{instance}}]-[{{pod}}]-[{{type}}]-p95 |
| Flow Buffer Size per Instance | `greptime_flow_input_buf_size` | timeseries | Flow buffer size per instance. | prometheus | -- | [{{instance}}]-[{{pod}}] |
| Flow Processing Error per Instance | `sum by(instance, pod, code) (rate(greptime_flow_errors[$__rate_interval]))` | timeseries | Flow processing errors per instance. | prometheus | -- | [{{instance}}]-[{{pod}}]-[{{code}}] |

Trigger

| Title | Query | Type | Description | Datasource | Unit | Legend Format |
| --- | --- | --- | --- | --- | --- | --- |
| Trigger Count | `greptime_trigger_count{}` | timeseries | Total number of triggers currently defined. | prometheus | -- | `__auto` |
| Trigger Eval Elapsed | `histogram_quantile(0.99, rate(greptime_trigger_evaluate_elapsed_bucket[$__rate_interval]))`<br>`histogram_quantile(0.75, rate(greptime_trigger_evaluate_elapsed_bucket[$__rate_interval]))` | timeseries | Elapsed time for trigger evaluation, including query execution and condition evaluation. | prometheus | s | [{{instance}}]-[{{pod}}]-p99 |
| Trigger Eval Failure Rate | `rate(greptime_trigger_evaluate_failure_count[$__rate_interval])` | timeseries | Rate of failed trigger evaluations. | prometheus | none | `__auto` |
| Send Alert Elapsed | `histogram_quantile(0.99, rate(greptime_trigger_send_alert_elapsed_bucket[$__rate_interval]))`<br>`histogram_quantile(0.75, rate(greptime_trigger_send_alert_elapsed_bucket[$__rate_interval]))` | timeseries | Elapsed time to send trigger alerts to notification channels. | prometheus | s | [{{instance}}]-[{{pod}}]-[{{channel_type}}]-p99 |
| Send Alert Failure Rate | `rate(greptime_trigger_send_alert_failure_count[$__rate_interval])` | timeseries | Rate of failures when sending trigger alerts. | prometheus | none | `__auto` |
| Save Alert Elapsed | `histogram_quantile(0.99, rate(greptime_trigger_save_alert_record_elapsed_bucket[$__rate_interval]))`<br>`histogram_quantile(0.75, rate(greptime_trigger_save_alert_record_elapsed_bucket[$__rate_interval]))` | timeseries | Elapsed time to persist trigger alert records. | prometheus | s | [{{instance}}]-[{{pod}}]-[{{storage_type}}]-p99 |
| Save Alert Failure Rate | `rate(greptime_trigger_save_alert_record_failure_count[$__rate_interval])` | timeseries | Rate of failures when persisting trigger alert records. | prometheus | none | `__auto` |