grafana/README.md
Polkadot nodes collect and produce Prometheus metrics and logs. These include health, performance, and debug information such as the last finalized block, the height of the chain, and many deeper implementation details of the Polkadot/Substrate node subsystems. This information is crucial for monitoring the liveness and performance of a network and its validators.
Import the dashboard JSON files from this folder into your Grafana installation. All dashboards are grouped in
folders per category (for example, parachains). The files were created with Grafana's export functionality and
follow the data model specified here.
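For installations with many dashboards, the import can also be scripted. The following is a minimal sketch, not part of this repository: it assumes one sub-folder per category containing the exported JSON files, and it uses Grafana's HTTP API endpoint `POST /api/dashboards/db` with a bearer token. The function names `build_import_payloads` and `post_dashboard` are hypothetical. The sketch does not create Grafana folders or map data sources, so dashboards exported with data source placeholders may still need the import dialog in the UI.

```python
import json
import urllib.request
from pathlib import Path


def build_import_payloads(dashboards_root):
    """Return (category, payload) pairs for every dashboard JSON file.

    Assumption: dashboards live in <dashboards_root>/<category>/<name>.json.
    """
    payloads = []
    for path in sorted(Path(dashboards_root).glob("*/*.json")):
        dashboard = json.loads(path.read_text(encoding="utf-8"))
        # A null id lets Grafana create the dashboard, or match an
        # existing one by its uid instead of a numeric database id.
        dashboard["id"] = None
        payload = {"dashboard": dashboard, "overwrite": True}
        payloads.append((path.parent.name, payload))
    return payloads


def post_dashboard(grafana_url, api_token, payload):
    """Send one payload to Grafana and return the decoded JSON response."""
    request = urllib.request.Request(
        f"{grafana_url.rstrip('/')}/api/dashboards/db",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A caller would loop over `build_import_payloads(".")` and pass each payload to `post_dashboard` together with the Grafana base URL and a token that has permission to write dashboards; both values depend on your installation.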
We aim to keep the dashboards here in sync with the implementation, except dashboards for development and testing.
Your contributions are most welcome!
Please follow these design guidelines:
Before you continue, make sure you have Grafana set up; otherwise, follow this guide.
You might also need to set up Loki.
Alerts are currently out of scope for the dashboards, but they can be set up manually or automated (see installing and configuring Alert Manager).
This section lists the dashboards, their use cases, and the key metrics they cover.
Useful for monitoring versions and logs of validator nodes. Includes time series panels that track node warning and error log rates. These can be further investigated in Grafana Loki.
Requires Loki for log aggregation and querying.
This dashboard lets you see at a glance how fast candidates are approved, disputed, and finalized. It was originally designed for observing liveness after parachain deployment on Kusama/Polkadot, but it is also useful in production or testing generally.
It includes panels covering key subsystems of the node-side parachain implementation:
Note that this dashboard applies only to validator nodes. The Prometheus
queries assume that the instance label value contains the string validator for validator nodes, and only for them.
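To check whether your own instance names satisfy this assumption, a small helper can mirror the label match. This is a sketch only: the exact matcher used in the dashboard queries is not reproduced here, and the selector `instance=~".*validator.*"` is an assumed form. Prometheus regex matchers are fully anchored, which is why the pattern needs the leading and trailing `.*`. The helper names and the sample instance values are hypothetical.

```python
import re

# Assumed pattern; Prometheus anchors regex matchers at both ends,
# so "contains validator" has to be written as ".*validator.*".
VALIDATOR_REGEX = ".*validator.*"


def is_validator_instance(instance):
    """Return True if the instance label would match the assumed selector."""
    # fullmatch mirrors the anchored matching that Prometheus applies.
    return re.fullmatch(VALIDATOR_REGEX, instance) is not None


def validator_selector(metric):
    """Build a PromQL selector restricted to validator instances."""
    return f'{metric}{{instance=~"{VALIDATOR_REGEX}"}}'


if __name__ == "__main__":
    for name in ["polkadot-validator-1:9615", "rpc-node-0:9615"]:
        print(name, is_validator_instance(name))
    print(validator_selector("substrate_block_height"))
```

If non-validator nodes also carry the string validator in their instance label, or validators do not, the panels will include or miss data accordingly, so relabeling in the Prometheus scrape configuration may be needed.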