docs/user-guide/monitoring.md
Two different methods to install and configure Prometheus and Grafana are described in this doc.
This tutorial will show you how to install Prometheus and Grafana for scraping the metrics of the Ingress-Nginx Controller.
!!! important
This example uses emptyDir volumes for Prometheus and Grafana. This means once the pod gets terminated you will lose all the data.
The Ingress-Nginx Controller should already be deployed according to the deployment instructions here.
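The controller should be configured for exporting metrics. This requires three settings on the controller: controller.metrics.enabled=true, and the pod annotations prometheus.io/scrape="true" and prometheus.io/port="10254". The easiest way to apply them is helm upgrade: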
helm upgrade ingress-nginx ingress-nginx \
--repo https://kubernetes.github.io/ingress-nginx \
--namespace ingress-nginx \
--set controller.metrics.enabled=true \
--set-string controller.podAnnotations."prometheus\.io/scrape"="true" \
--set-string controller.podAnnotations."prometheus\.io/port"="10254"
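You can validate that the controller is configured for metrics by looking at the values of the installed release:
helm get values ingress-nginx --namespace ingress-nginx
..
controller:
  metrics:
    enabled: true
  podAnnotations:
    prometheus.io/port: "10254"
    prometheus.io/scrape: "true"
..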
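The same three settings can also be kept in a values file rather than passed as --set flags, which tends to be easier to review and version. The following is a minimal sketch: the function name metrics_values and the file name metrics-values.yaml are hypothetical, and the helm invocation is left as a comment so the snippet only prints the YAML.

```shell
# Print the Helm values that enable metrics and add the scrape annotations.
# metrics_values is a hypothetical helper name used only in this sketch.
metrics_values() {
  cat <<'EOF'
controller:
  metrics:
    enabled: true
  podAnnotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "10254"
EOF
}

metrics_values

# To apply (not run here), write the values to a file and pass it to helm:
# metrics_values > metrics-values.yaml
# helm upgrade ingress-nginx ingress-nginx \
#   --repo https://kubernetes.github.io/ingress-nginx \
#   --namespace ingress-nginx \
#   --values metrics-values.yaml
```

The printed structure mirrors what helm get values reports after the upgrade, so the two can be compared directly.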
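If you are not using helm, the manifests need the equivalent settings. The Service must expose the metrics port:
apiVersion: v1
kind: Service
..
spec:
  ports:
    - name: prometheus
      port: 10254
      targetPort: prometheus
    ..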
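The Deployment must carry the scrape annotations, pass --enable-metrics=true to the controller, and expose the same port:
apiVersion: apps/v1
kind: Deployment
..
spec:
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "10254"
    spec:
      containers:
        - name: controller
          args:
            ..
            - '--enable-metrics=true'
          ports:
            - name: prometheus
              containerPort: 10254
            ..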
Note that the kustomize bases used in this tutorial are stored in the deploy folder of the GitHub repository kubernetes/ingress-nginx.
The Prometheus server must be configured so that it can discover endpoints of services. If a Prometheus server is already running in the cluster and if it is configured in a way that it can find the ingress controller pods, no extra configuration is needed.
If there is no existing Prometheus server running, the rest of this tutorial will guide you through the steps needed to deploy a properly configured Prometheus server.
Running the following command deploys prometheus in Kubernetes:
kubectl apply --kustomize github.com/kubernetes/ingress-nginx/deploy/prometheus/
Open Prometheus dashboard in a web browser:
kubectl get svc -n ingress-nginx
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default-http-backend ClusterIP 10.103.59.201 <none> 80/TCP 3d
ingress-nginx NodePort 10.97.44.72 <none> 80:30100/TCP,443:30154/TCP,10254:32049/TCP 5h
prometheus-server NodePort 10.98.233.86 <none> 9090:32630/TCP 1m
kubectl get nodes -o wide
kubectl get nodes --selector=kubernetes.io/role!=master -o jsonpath={.items[*].status.addresses[?\(@.type==\"InternalIP\"\)].address}
10.192.0.2 10.192.0.3 10.192.0.4
Open your browser and visit the following URL: http://{node IP address}:{prometheus-svc-nodeport} to load the Prometheus Dashboard.
According to the above example, this URL will be http://10.192.0.3:32630
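Reading the NodePort off the PORT(S) column by eye is error-prone when a service maps several ports. The sketch below shows one way to extract it from kubectl get svc output with awk; nodeport_for is a hypothetical helper, and the sample text is copied from the listing above so the snippet runs without a cluster.

```shell
# Extract the NodePort mapped to a given service port from `kubectl get svc` output.
# nodeport_for is a hypothetical helper name used only in this sketch.
# Usage against a cluster: kubectl get svc -n ingress-nginx | nodeport_for prometheus-server 9090
nodeport_for() {
  awk -v svc="$1" -v port="$2" '
    $1 == svc {
      n = split($5, maps, ",")        # PORT(S) column, e.g. 9090:32630/TCP
      for (i = 1; i <= n; i++) {
        split(maps[i], p, "[:/]")     # p[1]=port, p[2]=nodePort, p[3]=protocol
        if (p[1] == port) print p[2]
      }
    }'
}

# Sample rows taken from the example output in this guide.
sample='NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ingress-nginx NodePort 10.97.44.72 <none> 80:30100/TCP,443:30154/TCP,10254:32049/TCP 5h
prometheus-server NodePort 10.98.233.86 <none> 9090:32630/TCP 1m'

node_ip=10.192.0.3
node_port=$(printf '%s\n' "$sample" | nodeport_for prometheus-server 9090)
echo "http://${node_ip}:${node_port}"
```

With the sample rows this prints http://10.192.0.3:32630, matching the URL given above.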
kubectl apply --kustomize github.com/kubernetes/ingress-nginx/deploy/grafana/
kubectl get svc -n ingress-nginx
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default-http-backend ClusterIP 10.103.59.201 <none> 80/TCP 3d
ingress-nginx NodePort 10.97.44.72 <none> 80:30100/TCP,443:30154/TCP,10254:32049/TCP 5h
prometheus-server NodePort 10.98.233.86 <none> 9090:32630/TCP 10m
grafana NodePort 10.98.233.87 <none> 3000:31086/TCP 10m
The username and password are both admin.
After logging in, you can import the Grafana dashboard from the official dashboards by following the steps below:
--metrics-per-host=false. You will lose labeling by hostname, but still have labeling by ingress.
--metrics-per-undefined-host=true --metrics-per-host=true. You will get labeling by hostname even if the hostname is not explicitly defined on an ingress. Be warned that cardinality could explode due to many hostnames and CPU usage could also increase.
kubectl -n ingress-nginx edit svc grafana
This document assumes you're using helm and using the kube-prometheus-stack package to install Prometheus and Grafana.
The Ingress-Nginx Controller should already be deployed according to the deployment instructions here.
To check whether the Ingress controller is deployed, run:
kubectl get pods -n ingress-nginx
The result should look something like:
NAME READY STATUS RESTARTS AGE
ingress-nginx-controller-7c489dc7b7-ccrf6 1/1 Running 0 19h
To check if Prometheus is already deployed, run the following command:
helm ls -A
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
ingress-nginx ingress-nginx 10 2022-01-20 18:08:55.267373 -0800 PST deployed ingress-nginx-4.0.16 1.1.1
prometheus prometheus 1 2022-01-20 16:07:25.086828 -0800 PST deployed kube-prometheus-stack-30.1.0 0.53.1
Notice that prometheus is installed in a different namespace than ingress-nginx.
If prometheus is not installed, then you can install from here.
The Ingress NGINX controller needs to be reconfigured for exporting metrics. This requires 3 additional configurations to the controller. These configurations are:
controller.metrics.enabled=true
controller.metrics.serviceMonitor.enabled=true
controller.metrics.serviceMonitor.additionalLabels.release="prometheus"
The easiest way of doing this is to use helm upgrade:
helm upgrade ingress-nginx ingress-nginx/ingress-nginx \
--namespace ingress-nginx \
--set controller.metrics.enabled=true \
--set controller.metrics.serviceMonitor.enabled=true \
--set controller.metrics.serviceMonitor.additionalLabels.release="prometheus"
Here the value of controller.metrics.serviceMonitor.additionalLabels.release must match the name of the helm release of the kube-prometheus-stack, which is prometheus in the helm ls output above.
You can validate that the controller has been successfully reconfigured to export metrics by looking at the values of the installed release, like this:
helm get values ingress-nginx --namespace ingress-nginx
controller:
  metrics:
    enabled: true
    serviceMonitor:
      additionalLabels:
        release: prometheus
      enabled: true
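Because the release label has to match the kube-prometheus-stack release name, it can help to generate the values from that name instead of typing it twice. This is a sketch under that assumption; servicemonitor_values is a hypothetical helper, and its default of prometheus is taken from the example release shown earlier.

```shell
# Print Helm values that enable metrics plus a ServiceMonitor labelled for a
# given kube-prometheus-stack release (first argument, default "prometheus").
# servicemonitor_values is a hypothetical helper name used only in this sketch.
servicemonitor_values() {
  release="${1:-prometheus}"
  cat <<EOF
controller:
  metrics:
    enabled: true
    serviceMonitor:
      enabled: true
      additionalLabels:
        release: ${release}
EOF
}

servicemonitor_values prometheus

# To apply (not run here):
# servicemonitor_values prometheus > servicemonitor-values.yaml
# helm upgrade ingress-nginx ingress-nginx/ingress-nginx \
#   --namespace ingress-nginx \
#   --values servicemonitor-values.yaml
```

If the stack was installed under another release name, pass that name as the argument so the label and the release stay in sync.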
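Because Prometheus runs in a different namespace than ingress-nginx, reconfigure the kube-prometheus-stack release to set the serviceMonitorSelectorNilUsesHelmValues flag to false. By default, Prometheus only discovers PodMonitors within its own namespace. This should be disabled by setting podMonitorSelectorNilUsesHelmValues to false.
The configurations required are:
prometheus.prometheusSpec.podMonitorSelectorNilUsesHelmValues=false
prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues=false
The easiest way of doing this is to use helm upgrade ...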
helm upgrade prometheus prometheus-community/kube-prometheus-stack \
--namespace prometheus \
--set prometheus.prometheusSpec.podMonitorSelectorNilUsesHelmValues=false \
--set prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues=false
helm get values prometheus --namespace prometheus
prometheus:
  prometheusSpec:
    podMonitorSelectorNilUsesHelmValues: false
    serviceMonitorSelectorNilUsesHelmValues: false
Port forward to Prometheus service. Find out the name of the prometheus service by using the following command:
kubectl get svc -n prometheus
The result of this command would look like:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 7h46m
prometheus-grafana ClusterIP 10.106.28.162 <none> 80/TCP 7h46m
prometheus-kube-prometheus-alertmanager ClusterIP 10.108.125.245 <none> 9093/TCP 7h46m
prometheus-kube-prometheus-operator ClusterIP 10.110.220.1 <none> 443/TCP 7h46m
prometheus-kube-prometheus-prometheus ClusterIP 10.102.72.134 <none> 9090/TCP 7h46m
prometheus-kube-state-metrics ClusterIP 10.104.231.181 <none> 8080/TCP 7h46m
prometheus-operated ClusterIP None <none> 9090/TCP 7h46m
prometheus-prometheus-node-exporter ClusterIP 10.96.247.128 <none> 9100/TCP 7h46m
prometheus-kube-prometheus-prometheus is the service we want to port forward to. We can do so using the following command:
kubectl port-forward svc/prometheus-kube-prometheus-prometheus -n prometheus 9090:9090
When you run the above command, you should see something like:
Forwarding from 127.0.0.1:9090 -> 9090
Forwarding from [::1]:9090 -> 9090
Open your browser and visit http://localhost:{port-forwarded-port}. In the above example, that is http://localhost:9090.
Port forward to Grafana service. Find out the name of the Grafana service by using the following command:
kubectl get svc -n prometheus
The result of this command would look like:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 7h46m
prometheus-grafana ClusterIP 10.106.28.162 <none> 80/TCP 7h46m
prometheus-kube-prometheus-alertmanager ClusterIP 10.108.125.245 <none> 9093/TCP 7h46m
prometheus-kube-prometheus-operator ClusterIP 10.110.220.1 <none> 443/TCP 7h46m
prometheus-kube-prometheus-prometheus ClusterIP 10.102.72.134 <none> 9090/TCP 7h46m
prometheus-kube-state-metrics ClusterIP 10.104.231.181 <none> 8080/TCP 7h46m
prometheus-operated ClusterIP None <none> 9090/TCP 7h46m
prometheus-prometheus-node-exporter ClusterIP 10.96.247.128 <none> 9100/TCP 7h46m
prometheus-grafana is the service we want to port forward to. We can do so using the following command:
kubectl port-forward svc/prometheus-grafana 3000:80 -n prometheus
When you run the above command, you should see something like:
Forwarding from 127.0.0.1:3000 -> 3000
Forwarding from [::1]:3000 -> 3000
Open your browser and visit http://localhost:{port-forwarded-port}. In the above example, that is http://localhost:3000. The default username/password is admin/prom-operator.
After logging in, you can import the Grafana dashboard from the official dashboards by following the steps below:
Prometheus metrics are exposed on port 10254.
nginx_ingress_controller_request_duration_seconds Histogram
The request processing time in seconds: the time elapsed between reading the first bytes from the client and writing the log entry after the last bytes were sent to the client (affected by client speed).
nginx var: request_time
nginx_ingress_controller_response_duration_seconds Histogram
The time spent on receiving the response from the upstream server in seconds (affected by client speed when the response is bigger than proxy buffers).
Note: this can be up to a few milliseconds larger than nginx_ingress_controller_request_duration_seconds because the two are measured differently.
nginx var: upstream_response_time
nginx_ingress_controller_header_duration_seconds Histogram
The time spent on receiving first header from the upstream server
nginx var: upstream_header_time
nginx_ingress_controller_connect_duration_seconds Histogram
The time spent on establishing a connection with the upstream server
nginx var: upstream_connect_time
nginx_ingress_controller_response_size Histogram
The response length (including the status line, headers, and response body)
nginx var: bytes_sent
nginx_ingress_controller_request_size Histogram
The request length (including request line, header, and request body)
nginx var: request_length
nginx_ingress_controller_requests Counter
The total number of client requests
nginx_ingress_controller_bytes_sent Histogram
The number of bytes sent to a client. Deprecated, use nginx_ingress_controller_response_size
nginx var: bytes_sent
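The histogram and counter metrics described above are usually consumed through PromQL rather than read raw. The sketch below stores two example queries and a small wrapper around the Prometheus HTTP API instant-query endpoint. The variable and function names are hypothetical, the base URL assumes the localhost:9090 port-forward, and the ingress and status labels are assumed to be present on your series.

```shell
# Prometheus base URL; defaults to the port-forward address (an assumption of this sketch).
PROM_URL="${PROM_URL:-http://localhost:9090}"

# 95th percentile request duration per ingress over the last 5 minutes.
# Histogram metrics expose their buckets as <name>_bucket series with an "le" label.
P95_QUERY='histogram_quantile(0.95, sum by (le, ingress) (rate(nginx_ingress_controller_request_duration_seconds_bucket[5m])))'

# Request rate per ingress and status code over the last 5 minutes.
RATE_QUERY='sum by (ingress, status) (rate(nginx_ingress_controller_requests[5m]))'

# Send an instant query to the Prometheus HTTP API and print the JSON response.
prom_query() {
  curl -sG "$PROM_URL/api/v1/query" --data-urlencode "query=$1"
}

echo "$P95_QUERY"
echo "$RATE_QUERY"

# Example (needs a reachable Prometheus, so it is not run here):
# prom_query "$P95_QUERY"
```

The same expressions can be pasted into the Prometheus expression browser or a Grafana panel.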
# HELP nginx_ingress_controller_bytes_sent The number of bytes sent to a client. DEPRECATED! Use nginx_ingress_controller_response_size
# TYPE nginx_ingress_controller_bytes_sent histogram
# HELP nginx_ingress_controller_connect_duration_seconds The time spent on establishing a connection with the upstream server
# TYPE nginx_ingress_controller_connect_duration_seconds histogram
# HELP nginx_ingress_controller_header_duration_seconds The time spent on receiving first header from the upstream server
# TYPE nginx_ingress_controller_header_duration_seconds histogram
# HELP nginx_ingress_controller_request_duration_seconds The request processing time in seconds
# TYPE nginx_ingress_controller_request_duration_seconds histogram
# HELP nginx_ingress_controller_request_size The request length (including request line, header, and request body)
# TYPE nginx_ingress_controller_request_size histogram
# HELP nginx_ingress_controller_requests The total number of client requests.
# TYPE nginx_ingress_controller_requests counter
# HELP nginx_ingress_controller_response_duration_seconds The time spent on receiving the response from the upstream server
# TYPE nginx_ingress_controller_response_duration_seconds histogram
# HELP nginx_ingress_controller_response_size The response length (including request line, header, and request body)
# TYPE nginx_ingress_controller_response_size histogram
# HELP nginx_ingress_controller_nginx_process_connections current number of client connections with state {active, reading, writing, waiting}
# TYPE nginx_ingress_controller_nginx_process_connections gauge
# HELP nginx_ingress_controller_nginx_process_connections_total total number of connections with state {accepted, handled}
# TYPE nginx_ingress_controller_nginx_process_connections_total counter
# HELP nginx_ingress_controller_nginx_process_cpu_seconds_total Cpu usage in seconds
# TYPE nginx_ingress_controller_nginx_process_cpu_seconds_total counter
# HELP nginx_ingress_controller_nginx_process_num_procs number of processes
# TYPE nginx_ingress_controller_nginx_process_num_procs gauge
# HELP nginx_ingress_controller_nginx_process_oldest_start_time_seconds start time in seconds since 1970/01/01
# TYPE nginx_ingress_controller_nginx_process_oldest_start_time_seconds gauge
# HELP nginx_ingress_controller_nginx_process_read_bytes_total number of bytes read
# TYPE nginx_ingress_controller_nginx_process_read_bytes_total counter
# HELP nginx_ingress_controller_nginx_process_requests_total total number of client requests
# TYPE nginx_ingress_controller_nginx_process_requests_total counter
# HELP nginx_ingress_controller_nginx_process_resident_memory_bytes number of bytes of memory in use
# TYPE nginx_ingress_controller_nginx_process_resident_memory_bytes gauge
# HELP nginx_ingress_controller_nginx_process_virtual_memory_bytes number of bytes of memory in use
# TYPE nginx_ingress_controller_nginx_process_virtual_memory_bytes gauge
# HELP nginx_ingress_controller_nginx_process_write_bytes_total number of bytes written
# TYPE nginx_ingress_controller_nginx_process_write_bytes_total counter
# HELP nginx_ingress_controller_build_info A metric with a constant '1' labeled with information about the build.
# TYPE nginx_ingress_controller_build_info gauge
# HELP nginx_ingress_controller_check_success Cumulative number of Ingress controller syntax check operations
# TYPE nginx_ingress_controller_check_success counter
# HELP nginx_ingress_controller_config_hash Running configuration hash actually running
# TYPE nginx_ingress_controller_config_hash gauge
# HELP nginx_ingress_controller_config_last_reload_successful Whether the last configuration reload attempt was successful
# TYPE nginx_ingress_controller_config_last_reload_successful gauge
# HELP nginx_ingress_controller_config_last_reload_successful_timestamp_seconds Timestamp of the last successful configuration reload.
# TYPE nginx_ingress_controller_config_last_reload_successful_timestamp_seconds gauge
# HELP nginx_ingress_controller_ssl_certificate_info Hold all labels associated to a certificate
# TYPE nginx_ingress_controller_ssl_certificate_info gauge
# HELP nginx_ingress_controller_success Cumulative number of Ingress controller reload operations
# TYPE nginx_ingress_controller_success counter
# HELP nginx_ingress_controller_orphan_ingress Gauge reporting status of ingress orphanity, 1 indicates orphaned ingress. 'namespace' is the string used to identify namespace of ingress, 'ingress' for ingress name and 'type' for 'no-service' or 'no-endpoint' of orphanity
# TYPE nginx_ingress_controller_orphan_ingress gauge
# HELP nginx_ingress_controller_admission_config_size The size of the tested configuration
# TYPE nginx_ingress_controller_admission_config_size gauge
# HELP nginx_ingress_controller_admission_render_duration The processing duration of ingresses rendering by the admission controller (float seconds)
# TYPE nginx_ingress_controller_admission_render_duration gauge
# HELP nginx_ingress_controller_admission_render_ingresses The length of ingresses rendered by the admission controller
# TYPE nginx_ingress_controller_admission_render_ingresses gauge
# HELP nginx_ingress_controller_admission_roundtrip_duration The complete duration of the admission controller at the time to process a new event (float seconds)
# TYPE nginx_ingress_controller_admission_roundtrip_duration gauge
# HELP nginx_ingress_controller_admission_tested_duration The processing duration of the admission controller tests (float seconds)
# TYPE nginx_ingress_controller_admission_tested_duration gauge
# HELP nginx_ingress_controller_admission_tested_ingresses The length of ingresses processed by the admission controller
# TYPE nginx_ingress_controller_admission_tested_ingresses gauge
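The HELP and TYPE lines above follow the Prometheus text exposition format, so they are easy to summarise with standard tools. The sketch below lists each metric with its type and counts metrics per type; metric_types and metric_type_counts are hypothetical helper names, and the sample lines are copied from the listing above so the snippet runs without a cluster.

```shell
# Print "<metric name><TAB><type>" for every "# TYPE" line read on stdin.
metric_types() {
  awk '$1 == "#" && $2 == "TYPE" { print $3 "\t" $4 }'
}

# Count how many metrics of each type appear on stdin.
metric_type_counts() {
  metric_types | awk -F '\t' '{ c[$2]++ } END { for (t in c) print t, c[t] }' | sort
}

# Sample lines taken from the listing in this guide.
sample='# HELP nginx_ingress_controller_requests The total number of client requests.
# TYPE nginx_ingress_controller_requests counter
# HELP nginx_ingress_controller_response_size The response length (including request line, header, and request body)
# TYPE nginx_ingress_controller_response_size histogram
# HELP nginx_ingress_controller_config_hash Running configuration hash actually running
# TYPE nginx_ingress_controller_config_hash gauge'

printf '%s\n' "$sample" | metric_type_counts
```

Against a live controller, the same filter could be fed from the metrics endpoint, for example curl -s http://localhost:10254/metrics | metric_types, assuming port 10254 of a controller pod has been forwarded to localhost.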
You can configure buckets for histogram metrics using these command line options (here are their default values):
--time-buckets=[0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10]
--length-buckets=[10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
--size-buckets=[10, 100, 1000, 10000, 100000, 1e+06, 1e+07]
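Prometheus histogram buckets are cumulative: an observation is counted in every bucket whose upper bound is greater than or equal to the observed value, plus the implicit +Inf bucket. The sketch below illustrates this with the default time buckets; buckets_for is a hypothetical helper that only does the arithmetic and does not talk to the controller.

```shell
# List the default --time-buckets upper bounds that an observation (in seconds) increments.
# Buckets are cumulative, so every bound >= the value is counted, followed by +Inf.
# buckets_for is a hypothetical helper name used only in this sketch.
buckets_for() {
  awk -v v="$1" 'BEGIN {
    n = split("0.005 0.01 0.025 0.05 0.1 0.25 0.5 1 2.5 5 10", b, " ")
    out = ""
    for (i = 1; i <= n; i++) if (v + 0 <= b[i] + 0) out = out b[i] " "
    print out "+Inf"
  }'
}

buckets_for 0.3   # a 300 ms request
```

A 0.3 second request lands in the 0.5, 1, 2.5, 5, 10 and +Inf buckets. If most requests take longer than 10 seconds, only +Inf would be incremented, which is a sign that the bucket boundaries should be adjusted.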