
# ⚠️ DEPRECATED

Further development has moved to [prometheus-community/helm-charts](https://github.com/prometheus-community/helm-charts). The chart has been renamed `kube-prometheus-stack` to more clearly reflect that it installs the kube-prometheus project stack, within which Prometheus Operator is only one component.

# prometheus-operator

Installs prometheus-operator to create/configure/manage Prometheus clusters atop Kubernetes. This chart includes multiple components and is suitable for a variety of use-cases.

The default installation is intended to suit monitoring of the Kubernetes cluster the chart is deployed onto. It closely matches the kube-prometheus project.

The installation also includes a set of default dashboards and alerting rules.

The same chart can be used to run multiple Prometheus instances in the same cluster if required. To achieve this, the other components need to be disabled: it is necessary to run only one instance of prometheus-operator, and a pair of Alertmanager pods for an HA configuration.
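For example, a second Prometheus instance could be installed alongside an existing release with a sketch like the following (untested; the release name is illustrative, and the `--set` flags correspond to the component toggles documented in the [configuration](#configuration) section below):

```console
$ helm install --name prometheus-second stable/prometheus-operator \
    --set prometheusOperator.enabled=false \
    --set alertmanager.enabled=false \
    --set grafana.enabled=false \
    --set kubeStateMetrics.enabled=false \
    --set defaultRules.create=false
```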

## TL;DR

```console
$ helm install stable/prometheus-operator
```

## Introduction

This chart bootstraps a prometheus-operator deployment on a Kubernetes cluster using the Helm package manager. The chart can be installed multiple times to create separate Prometheus instances managed by Prometheus Operator.

## Prerequisites

- Helm 2.12+ (if using Helm < 2.14, see [Helm fails to create CRDs](#helm-fails-to-create-crds) below)

## Installing the Chart

To install the chart with the release name `my-release`:

```console
$ helm install --name my-release stable/prometheus-operator
```

The command deploys prometheus-operator on the Kubernetes cluster in the default configuration. The [configuration](#configuration) section lists the parameters that can be configured during installation.
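Parameters can be overridden at install time with `--set`, or collected in a YAML file passed via `-f`. For example, a sketch using two parameters from the tables below (the values chosen are illustrative):

```console
$ helm install --name my-release stable/prometheus-operator \
    --set prometheusOperator.logLevel=debug \
    --set prometheus.prometheusSpec.retention=30d
```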

The default installation includes Prometheus Operator, Alertmanager, Grafana, and configuration for scraping Kubernetes infrastructure.

## Uninstalling the Chart

To uninstall/delete the `my-release` deployment:

```console
$ helm delete my-release
```

The command removes all the Kubernetes components associated with the chart and deletes the release.

CRDs created by this chart are not removed by default and should be manually cleaned up:

```console
kubectl delete crd prometheuses.monitoring.coreos.com
kubectl delete crd prometheusrules.monitoring.coreos.com
kubectl delete crd servicemonitors.monitoring.coreos.com
kubectl delete crd podmonitors.monitoring.coreos.com
kubectl delete crd alertmanagers.monitoring.coreos.com
kubectl delete crd thanosrulers.monitoring.coreos.com
```

## Work-Arounds for Known Issues

### Running on private GKE clusters

When Google configure the control plane for private clusters, they automatically configure VPC peering between your Kubernetes cluster’s network and a separate Google-managed project. In order to restrict what Google are able to access within your cluster, the firewall rules configured for the cluster restrict access to your Kubernetes pods. This means that in order to use the webhook component with a GKE private cluster, you must configure an additional firewall rule to allow the GKE control plane access to your webhook pod.

You can read more information on how to add firewall rules for the GKE control plane nodes in the GKE docs.
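As a sketch (not from the chart docs), such a rule might be added with `gcloud`; the network name, control-plane CIDR, node tag, and webhook port 8443 are all assumptions to verify against your cluster and the webhook service's target port:

```console
$ gcloud compute firewall-rules create gke-master-to-admission-webhook \
    --network <cluster-network> \
    --direction INGRESS \
    --action ALLOW \
    --rules tcp:8443 \
    --source-ranges <control-plane-cidr> \
    --target-tags <node-tag>
```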

Alternatively, you can disable the admission webhooks by setting `prometheusOperator.admissionWebhooks.enabled=false`.
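For example:

```console
$ helm install --name my-release stable/prometheus-operator \
    --set prometheusOperator.admissionWebhooks.enabled=false
```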

### Helm fails to create CRDs

You should upgrade to Helm 2.14+ in order to avoid this issue. However, if you are stuck with an earlier Helm release, use the following approach instead. Due to a bug in Helm, it is possible for the six CRDs created by this chart to fail to be fully deployed before Helm attempts to create resources that require them. This affects versions of Helm prior to 2.14. To work around this issue when installing the chart, make sure all six CRDs exist in the cluster first and disable their provisioning by the chart:

1. Create the CRDs:

   ```console
   kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.38/example/prometheus-operator-crd/monitoring.coreos.com_alertmanagers.yaml
   kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.38/example/prometheus-operator-crd/monitoring.coreos.com_podmonitors.yaml
   kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.38/example/prometheus-operator-crd/monitoring.coreos.com_prometheuses.yaml
   kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.38/example/prometheus-operator-crd/monitoring.coreos.com_prometheusrules.yaml
   kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.38/example/prometheus-operator-crd/monitoring.coreos.com_servicemonitors.yaml
   kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.38/example/prometheus-operator-crd/monitoring.coreos.com_thanosrulers.yaml
   ```

2. Wait for the CRDs to be created, which should only take a few seconds.

3. Install the chart, but disable CRD provisioning by setting `prometheusOperator.createCustomResource=false`:

   ```console
   $ helm install --name my-release stable/prometheus-operator --set prometheusOperator.createCustomResource=false
   ```

## Upgrading an existing Release to a new major version

A major chart version change (like v1.2.3 -> v2.0.0) indicates an incompatible breaking change requiring manual action.

### Upgrading from 8.x.x to 9.x.x

Version 9 of the helm chart removes the existing `additionalScrapeConfigsExternal` in favour of `additionalScrapeConfigsSecret`. This change lets users specify the secret name and secret key to use for the additional scrape configuration of Prometheus. This is useful for users who have prometheus-operator as a subchart and also have a template that creates the additional scrape configuration.
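A sketch of the new values (the secret name and key here are hypothetical; the secret itself must already exist and contain Prometheus scrape configuration under that key):

```console
$ helm upgrade my-release stable/prometheus-operator \
    --set prometheus.prometheusSpec.additionalScrapeConfigsSecret.enabled=true \
    --set prometheus.prometheusSpec.additionalScrapeConfigsSecret.name=my-scrape-configs \
    --set prometheus.prometheusSpec.additionalScrapeConfigsSecret.key=additional-scrape-configs.yaml
```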

### Upgrading from 7.x.x to 8.x.x

Due to new template functions being used in the rules in version 8.x.x of the chart, an upgrade to Prometheus Operator and Prometheus is necessary in order to support them. First, upgrade to the latest version of 7.x.x:

```sh
helm upgrade <your-release-name> stable/prometheus-operator --version 7.4.0
```

Then upgrade to 8.x.x:

```sh
helm upgrade <your-release-name> stable/prometheus-operator
```

The minimal recommended Prometheus version for this chart release is 2.12.x.

### Upgrading from 6.x.x to 7.x.x

Due to a change in the grafana subchart, version 7.x.x now requires Helm >= 2.12.0.

### Upgrading from 5.x.x to 6.x.x

Due to a change in the deployment labels of kube-state-metrics, the upgrade requires `helm upgrade --force` in order to re-create the deployment. If this is not done, an error will occur indicating that the deployment cannot be modified:

```console
invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app.kubernetes.io/name":"kube-state-metrics"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable
```

If this error has already been encountered, use `helm history` to determine which release last worked, then `helm rollback` to that release, and then `helm upgrade --force` to the new one.
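For example (the revision number is illustrative; pick the last working one from the history output):

```console
$ helm history my-release
$ helm rollback my-release 3
$ helm upgrade --force my-release stable/prometheus-operator
```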

### prometheus.io/scrape

The prometheus operator does not support annotation-based discovery of services; it uses the `ServiceMonitor` CRD in its place, as it provides far more configuration options. For information on how to use ServiceMonitors, please see the coreos/prometheus-operator documentation: Running Exporters.

By default, Prometheus discovers ServiceMonitors within its namespace that are labeled with the same release tag as the prometheus-operator release. Sometimes you may need to discover custom ServiceMonitors, for example to scrape data from third-party applications. An easy way of doing this, without compromising the default ServiceMonitor discovery, is to allow Prometheus to discover all ServiceMonitors within its namespace, without applying label filtering. To do so, set `prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues` to `false`.
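For reference, a minimal ServiceMonitor matching the default discovery behaviour might look like the sketch below; the application labels, port name, and scrape interval are assumptions for your environment:

```console
$ cat <<EOF | kubectl apply -f -
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app
  labels:
    release: my-release    # must match the Helm release label selected by default
spec:
  selector:
    matchLabels:
      app: my-app          # labels on the Service to scrape
  endpoints:
    - port: http-metrics   # named port on the Service
      interval: 30s
EOF
```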

## Configuration

The following tables list the configurable parameters of the prometheus-operator chart and their default values.

### General

| Parameter | Description | Default |
| --------- | ----------- | ------- |
| `additionalPrometheusRulesMap` | Map of `prometheusRule` objects to create with the key used as the name of the rule spec. If defined, this will take precedence over `additionalPrometheusRules`. See https://github.com/coreos/prometheus-operator/blob/master/Documentation/api.md#prometheusrulespec | `nil` |
| `additionalPrometheusRules` | DEPRECATED. Will be removed in a future release. Please use `additionalPrometheusRulesMap` instead. List of `prometheusRule` objects to create. See https://github.com/coreos/prometheus-operator/blob/master/Documentation/api.md#prometheusrulespec | `[]` |
| `commonLabels` | Labels to apply to all resources | `[]` |
| `defaultRules.annotations` | Annotations for default rules for monitoring the cluster | `{}` |
| `defaultRules.appNamespacesTarget` | Specify target namespaces for app alerts | `".*"` |
| `defaultRules.create` | Create default rules for monitoring the cluster | `true` |
| `defaultRules.labels` | Labels for default rules for monitoring the cluster | `{}` |
| `defaultRules.runbookUrl` | URL prefix for default rule `runbook_url` annotations | `https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#` |
| `defaultRules.rules.PrometheusOperator` | Create Prometheus Operator default rules | `true` |
| `defaultRules.rules.alertmanager` | Create default rules for Alertmanager | `true` |
| `defaultRules.rules.etcd` | Create default rules for etcd | `true` |
| `defaultRules.rules.general` | Create general default rules | `true` |
| `defaultRules.rules.k8s` | Create K8s default rules | `true` |
| `defaultRules.rules.kubeApiserver` | Create API server default rules | `true` |
| `defaultRules.rules.kubeApiserverAvailability` | Create API server availability default rules | `true` |
| `defaultRules.rules.kubeApiserverError` | Create API server error default rules | `true` |
| `defaultRules.rules.kubeApiserverSlos` | Create API server SLOs default rules | `true` |
| `defaultRules.rules.kubelet` | Create kubelet default rules | `true` |
| `defaultRules.rules.kubePrometheusGeneral` | Create general default rules | `true` |
| `defaultRules.rules.kubePrometheusNodeAlerting` | Create node alerting default rules | `true` |
| `defaultRules.rules.kubePrometheusNodeRecording` | Create node recording default rules | `true` |
| `defaultRules.rules.kubeScheduler` | Create Kubernetes scheduler default rules | `true` |
| `defaultRules.rules.kubernetesAbsent` | Create Kubernetes absent (e.g. API server down) default rules | `true` |
| `defaultRules.rules.kubernetesApps` | Create Kubernetes apps default rules | `true` |
| `defaultRules.rules.kubernetesResources` | Create Kubernetes resources default rules | `true` |
| `defaultRules.rules.kubernetesStorage` | Create Kubernetes storage default rules | `true` |
| `defaultRules.rules.kubernetesSystem` | Create Kubernetes system default rules | `true` |
| `defaultRules.rules.kubeStateMetrics` | Create kube-state-metrics default rules | `true` |
| `defaultRules.rules.network` | Create networking default rules | `true` |
| `defaultRules.rules.node` | Create node default rules | `true` |
| `defaultRules.rules.prometheus` | Create Prometheus default rules | `true` |
| `defaultRules.rules.time` | Create time default rules | `true` |
| `fullnameOverride` | Provide a name to substitute for the full names of resources | `""` |
| `global.imagePullSecrets` | Reference to one or more secrets to be used when pulling images | `[]` |
| `global.rbac.create` | Create RBAC resources | `true` |
| `global.rbac.pspEnabled` | Create pod security policy resources | `true` |
| `global.rbac.pspAnnotations` | Add annotations to the PSP configurations | `{}` |
| `kubeTargetVersionOverride` | Provide a target gitVersion of K8s, in case `.Capabilities.KubeVersion` is not available (e.g. `helm template`) | `""` |
| `nameOverride` | Provide a name in place of `prometheus-operator` | `""` |
| `namespaceOverride` | Override the deployment namespace | `""` (`Release.Namespace`) |

### Prometheus Operator

| Parameter | Description | Default |
| --------- | ----------- | ------- |
| `prometheusOperator.admissionWebhooks.enabled` | Create PrometheusRules admission webhooks. The mutating webhook will patch PrometheusRules objects indicating they were validated. The validating webhook will check the rules syntax | `true` |
| `prometheusOperator.admissionWebhooks.failurePolicy` | Failure policy for admission webhooks | `Fail` |
| `prometheusOperator.admissionWebhooks.patch.enabled` | If true, use pre- and post-install hooks to generate a CA and certificate for the prometheus operator TLS proxy, and patch the created webhooks with the CA | `true` |
| `prometheusOperator.admissionWebhooks.patch.image.pullPolicy` | Image pull policy for the webhook integration jobs | `IfNotPresent` |
| `prometheusOperator.admissionWebhooks.patch.image.repository` | Repository to use for the webhook integration jobs | `jettech/kube-webhook-certgen` |
| `prometheusOperator.admissionWebhooks.patch.image.tag` | Tag to use for the webhook integration jobs | `v1.2.1` |
| `prometheusOperator.admissionWebhooks.patch.image.sha` | SHA to use for the webhook integration jobs (optional) | `""` |
| `prometheusOperator.admissionWebhooks.patch.resources` | Resource limits for admission webhook | `{}` |
| `prometheusOperator.admissionWebhooks.patch.nodeSelector` | Node selector for running admission hook patch jobs | `nil` |
| `prometheusOperator.admissionWebhooks.patch.podAnnotations` | Annotations for the webhook job pods | `nil` |
| `prometheusOperator.admissionWebhooks.patch.priorityClassName` | Priority class for the webhook integration jobs | `nil` |
| `prometheusOperator.affinity` | Assign custom affinity rules to the prometheus operator. See https://kubernetes.io/docs/concepts/configuration/assign-pod-node/ | `{}` |
| `prometheusOperator.cleanupCustomResource` | Attempt to delete CRDs when the release is removed. This option may be useful while testing but is not recommended, as deleting the CRD definition will delete resources and prevent the operator from being able to clean up resources that it manages | `false` |
| `prometheusOperator.configReloaderCpu` | Set the prometheus config reloader side-car CPU limit. If unset, uses the prometheus-operator project default | `nil` |
| `prometheusOperator.configReloaderMemory` | Set the prometheus config reloader side-car memory limit. If unset, uses the prometheus-operator project default | `nil` |
| `prometheusOperator.configmapReloadImage.repository` | Repository for configmapReload image | `docker.io/jimmidyson/configmap-reload` |
| `prometheusOperator.configmapReloadImage.tag` | Tag for configmapReload image | `v0.3.0` |
| `prometheusOperator.configmapReloadImage.sha` | SHA for configmapReload image (optional) | `""` |
| `prometheusOperator.createCustomResource` | Create CRDs. Required if deploying anything besides the operator itself as part of the release. The operator will create/update these on startup. If your Helm version is < 2.10 you will have to either create the CRDs first or deploy the operator first, then the rest of the resources. Regardless of this value, Helm v3+ will install the CRDs if they are not already present. Use `--skip-crds` with `helm install` if you want to skip CRD creation | `true` |
| `prometheusOperator.namespaces` | Namespaces to scope the interaction of the Prometheus Operator and the apiserver (allow list). This is mutually exclusive with `denyNamespaces`. Setting this to an empty object will disable the configuration | `{}` |
| `prometheusOperator.namespaces.releaseNamespace` | Include the release namespace | `false` |
| `prometheusOperator.namespaces.additional` | Include additional namespaces besides the release namespace | `[]` |
| `prometheusOperator.manageCrds` | If true, prometheus operator will create and update its CRDs on startup (for operator < v0.39.0) | `true` |
| `prometheusOperator.denyNamespaces` | Namespaces not to scope the interaction of the Prometheus Operator (deny list). This is mutually exclusive with `namespaces` | `[]` |
| `prometheusOperator.enabled` | Deploy Prometheus Operator. Only one of these should be deployed into the cluster | `true` |
| `prometheusOperator.hyperkubeImage.pullPolicy` | Image pull policy for the hyperkube image used to perform maintenance tasks | `IfNotPresent` |
| `prometheusOperator.hyperkubeImage.repository` | Repository for the hyperkube image used to perform maintenance tasks | `k8s.gcr.io/hyperkube` |
| `prometheusOperator.hyperkubeImage.tag` | Tag for the hyperkube image used to perform maintenance tasks | `v1.16.12` |
| `prometheusOperator.hyperkubeImage.sha` | SHA for the hyperkube image used to perform maintenance tasks | `""` |
| `prometheusOperator.image.pullPolicy` | Pull policy for prometheus operator image | `IfNotPresent` |
| `prometheusOperator.image.repository` | Repository for prometheus operator image | `quay.io/coreos/prometheus-operator` |
| `prometheusOperator.image.tag` | Tag for prometheus operator image | `v0.38.1` |
| `prometheusOperator.image.sha` | SHA for prometheus operator image (optional) | `""` |
| `prometheusOperator.kubeletService.enabled` | If true, the operator will create and maintain a service for scraping kubelets | `true` |
| `prometheusOperator.kubeletService.namespace` | Namespace to deploy the kubelet service | `kube-system` |
| `prometheusOperator.logFormat` | Operator log output formatting | `"logfmt"` |
| `prometheusOperator.logLevel` | Operator log level. Possible values: "all", "debug", "info", "warn", "error", "none" | `"info"` |
| `prometheusOperator.hostNetwork` | Host network for operator pods. Required for use in managed Kubernetes clusters (such as AWS EKS) with custom CNI (such as Calico) | `false` |
| `prometheusOperator.nodeSelector` | Prometheus operator node selector. See https://kubernetes.io/docs/user-guide/node-selection/ | `{}` |
| `prometheusOperator.podAnnotations` | Annotations to add to the operator pod | `{}` |
| `prometheusOperator.podLabels` | Labels to add to the operator pod | `{}` |
| `prometheusOperator.priorityClassName` | Name of Priority Class to assign pods | `nil` |
| `prometheusOperator.prometheusConfigReloaderImage.repository` | Repository for config-reloader image | `quay.io/coreos/prometheus-config-reloader` |
| `prometheusOperator.prometheusConfigReloaderImage.tag` | Tag for config-reloader image | `v0.38.1` |
| `prometheusOperator.prometheusConfigReloaderImage.sha` | SHA for config-reloader image (optional) | `""` |
| `prometheusOperator.resources` | Resource limits for prometheus operator | `{}` |
| `prometheusOperator.securityContext` | SecurityContext for prometheus operator | `{"fsGroup": 65534, "runAsGroup": 65534, "runAsNonRoot": true, "runAsUser": 65534}` |
| `prometheusOperator.service.annotations` | Annotations to be added to the prometheus operator service | `{}` |
| `prometheusOperator.service.clusterIP` | Prometheus operator service clusterIP | `""` |
| `prometheusOperator.service.externalIPs` | List of IP addresses at which the Prometheus Operator server service is available | `[]` |
| `prometheusOperator.service.labels` | Prometheus Operator service labels | `{}` |
| `prometheusOperator.service.loadBalancerIP` | Prometheus Operator load balancer IP | `""` |
| `prometheusOperator.service.loadBalancerSourceRanges` | Prometheus Operator load balancer source ranges | `[]` |
| `prometheusOperator.service.nodePortTls` | TLS port to expose the prometheus operator service on each node | `30443` |
| `prometheusOperator.service.nodePort` | Port to expose the prometheus operator service on each node | `30080` |
| `prometheusOperator.service.type` | Prometheus operator service type | `ClusterIP` |
| `prometheusOperator.serviceAccount.create` | Create a serviceaccount for the operator | `true` |
| `prometheusOperator.serviceAccount.name` | Operator serviceAccount name | `""` |
| `prometheusOperator.serviceMonitor.interval` | Scrape interval. If not set, the Prometheus default scrape interval is used | `nil` |
| `prometheusOperator.serviceMonitor.metricRelabelings` | The `metric_relabel_configs` for scraping the operator instance | `""` |
| `prometheusOperator.serviceMonitor.relabelings` | The `relabel_configs` for scraping the operator instance | `""` |
| `prometheusOperator.serviceMonitor.selfMonitor` | Enable monitoring of the prometheus operator | `true` |
| `prometheusOperator.tlsProxy.enabled` | Enable a TLS proxy container. Only the `squareup/ghostunnel` command line arguments are currently supported, and the secret the cert is loaded from is expected to be provided by the admission webhook | `true` |
| `prometheusOperator.tlsProxy.image.repository` | Repository for the TLS proxy container | `squareup/ghostunnel` |
| `prometheusOperator.tlsProxy.image.tag` | Tag for the TLS proxy container | `v1.5.2` |
| `prometheusOperator.tlsProxy.image.sha` | SHA for the TLS proxy container (optional) | `""` |
| `prometheusOperator.tlsProxy.image.pullPolicy` | Image pull policy for the TLS proxy container | `IfNotPresent` |
| `prometheusOperator.tlsProxy.resources` | Resource requests and limits for the TLS proxy container | `{}` |
| `prometheusOperator.tolerations` | Tolerations for use with node taints. See https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/ | `[]` |

### Prometheus

| Parameter | Description | Default |
| --------- | ----------- | ------- |
| `prometheus.additionalServiceMonitors` | List of `ServiceMonitor` objects to create. See https://github.com/coreos/prometheus-operator/blob/master/Documentation/api.md#servicemonitorspec | `[]` |
| `prometheus.enabled` | Deploy prometheus | `true` |
| `prometheus.annotations` | Prometheus annotations | `{}` |
| `prometheus.ingress.annotations` | Prometheus Ingress annotations | `{}` |
| `prometheus.ingress.enabled` | If true, Prometheus Ingress will be created | `false` |
| `prometheus.ingress.hosts` | Prometheus Ingress hostnames | `[]` |
| `prometheus.ingress.labels` | Prometheus Ingress additional labels | `{}` |
| `prometheus.ingress.paths` | Prometheus Ingress paths | `[]` |
| `prometheus.ingress.tls` | Prometheus Ingress TLS configuration (YAML) | `[]` |
| `prometheus.ingressPerReplica.annotations` | Prometheus per replica Ingress annotations | `{}` |
| `prometheus.ingressPerReplica.enabled` | If true, create an Ingress for each Prometheus server replica in the StatefulSet | `false` |
| `prometheus.ingressPerReplica.hostPrefix` | | `""` |
| `prometheus.ingressPerReplica.hostDomain` | | `""` |
| `prometheus.ingressPerReplica.labels` | Prometheus per replica Ingress additional labels | `{}` |
| `prometheus.ingressPerReplica.paths` | Prometheus per replica Ingress paths | `[]` |
| `prometheus.ingressPerReplica.tlsSecretName` | Secret name containing the TLS certificate for Prometheus per replica ingress | `[]` |
| `prometheus.ingressPerReplica.tlsSecretPerReplica.enabled` | If true, create a secret for the TLS certificate for each Ingress | `false` |
| `prometheus.ingressPerReplica.tlsSecretPerReplica.prefix` | Secret name prefix | `""` |
| `prometheus.podDisruptionBudget.enabled` | If true, create a pod disruption budget for prometheus pods. The created resource cannot be modified once created; it must be deleted to perform a change | `false` |
| `prometheus.podDisruptionBudget.maxUnavailable` | Maximum number / percentage of pods that may be made unavailable | `""` |
| `prometheus.podDisruptionBudget.minAvailable` | Minimum number / percentage of pods that should remain scheduled | `1` |
| `prometheus.podSecurityPolicy.allowedCapabilities` | Prometheus Pod Security Policy allowed capabilities | `""` |
| `prometheus.prometheusSpec.additionalAlertManagerConfigs` | Allows for manual configuration of alertmanager jobs in the form specified in the official Prometheus documentation: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#alertmanager_config. AlertManager configurations specified are appended to the configurations generated by the Prometheus Operator. As AlertManager configs are appended, the user is responsible for making sure they are valid. Note that using this feature may break upgrades of Prometheus. It is advised to review Prometheus release notes to ensure that no incompatible AlertManager configs are going to break Prometheus after the upgrade | `{}` |
| `prometheus.prometheusSpec.additionalAlertRelabelConfigs` | Allows specifying additional Prometheus alert relabel configurations. Alert relabel configurations specified are appended to the configurations generated by the Prometheus Operator and must have the form specified in the official Prometheus documentation: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#alert_relabel_configs. As alert relabel configs are appended, the user is responsible for making sure they are valid. Note that using this feature may break upgrades of Prometheus. It is advised to review Prometheus release notes to ensure that no incompatible alert relabel configs are going to break Prometheus after the upgrade | `[]` |
| `prometheus.prometheusSpec.additionalScrapeConfigsSecret.enabled` | Enable additional scrape configs that are managed externally to this chart. Note that Prometheus will fail to provision if the correct secret does not exist | `false` |
| `prometheus.prometheusSpec.additionalScrapeConfigsSecret.name` | Name of the secret that Prometheus should use for the additional scrape configuration | `""` |
| `prometheus.prometheusSpec.additionalScrapeConfigsSecret.key` | Name of the key inside the secret specified under `additionalScrapeConfigsSecret.name` to be used for the additional scrape configuration | `""` |
| `prometheus.prometheusSpec.additionalScrapeConfigs` | Allows specifying additional Prometheus scrape configurations. Scrape configurations are appended to the configurations generated by the Prometheus Operator. Job configurations must have the form specified in the official Prometheus documentation: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config. As scrape configs are appended, the user is responsible for making sure they are valid. Note that using this feature may break upgrades of Prometheus. It is advised to review Prometheus release notes to ensure that no incompatible scrape configs are going to break Prometheus after the upgrade | `[]` |
| `prometheus.prometheusSpec.additionalPrometheusSecretsAnnotations` | Allows adding annotations to the kubernetes secret. This can be useful when deploying via spinnaker to disable versioning on the secret, e.g. `strategy.spinnaker.io/versioned: 'false'` | `{}` |
| `prometheus.prometheusSpec.affinity` | Assign custom affinity rules to the prometheus instance. See https://kubernetes.io/docs/concepts/configuration/assign-pod-node/ | `{}` |
| `prometheus.prometheusSpec.alertingEndpoints` | Alertmanagers to which alerts will be sent. See https://github.com/coreos/prometheus-operator/blob/master/Documentation/api.md#alertmanagerendpoints. The default configuration will connect to the alertmanager deployed as part of this release | `[]` |
| `prometheus.prometheusSpec.apiserverConfig` | Custom `kubernetes_sd_config`. See https://github.com/coreos/prometheus-operator/blob/master/Documentation/api.md#apiserverconfig. The default configuration will connect to the current Kubernetes cluster | `{}` |
| `prometheus.prometheusSpec.configMaps` | A list of ConfigMaps in the same namespace as the Prometheus object, which shall be mounted into the Prometheus Pods. The ConfigMaps are mounted into /etc/prometheus/configmaps/ | `[]` |
| `prometheus.prometheusSpec.containers` | Allows injecting additional containers. This is meant to allow adding an authentication proxy to a Prometheus pod | `[]` |
| `prometheus.prometheusSpec.initContainers` | Allows injecting specialized containers that run before app containers. This is meant to pre-configure and tune mounted volume permissions | `[]` |
| `prometheus.prometheusSpec.disableCompaction` | If true, pass `--storage.tsdb.max-block-duration=2h` to prometheus. This is already done if using Thanos | `false` |
| `prometheus.prometheusSpec.enableAdminAPI` | Enables the Prometheus administrative HTTP API, which includes functionality such as deleting time series | `false` |
| `prometheus.prometheusSpec.enforcedNamespaceLabel` | Enforces adding a namespace label of origin for each alert and metric that is user created | `""` |
| `prometheus.prometheusSpec.evaluationInterval` | Interval between consecutive evaluations | `""` |
| `prometheus.prometheusSpec.externalLabels` | The labels to add to any time series or alerts when communicating with external systems (federation, remote storage, Alertmanager) | `{}` |
| `prometheus.prometheusSpec.externalUrl` | The external URL the Prometheus instances will be available under. This is necessary to generate correct URLs, and necessary if Prometheus is not served from the root of a DNS name | `""` |
| `prometheus.prometheusSpec.image.repository` | Base image to use for a Prometheus deployment | `quay.io/prometheus/prometheus` |
| `prometheus.prometheusSpec.image.tag` | Tag of Prometheus container image to be deployed | `v2.18.2` |
| `prometheus.prometheusSpec.image.sha` | SHA of Prometheus container image to be deployed (optional) | `""` |
| `prometheus.prometheusSpec.listenLocal` | Makes the Prometheus server listen on loopback, so that it does not bind against the Pod IP | `false` |
| `prometheus.prometheusSpec.logFormat` | Log format for Prometheus to be configured with | `logfmt` |
| `prometheus.prometheusSpec.logLevel` | Log level for Prometheus to be configured with | `info` |
| `prometheus.prometheusSpec.nodeSelector` | Define which Nodes the Pods are scheduled on | `{}` |
| `prometheus.prometheusSpec.paused` | When a Prometheus deployment is paused, no actions except for deletion will be performed on the underlying objects | `false` |
| `prometheus.prometheusSpec.podAntiAffinityTopologyKey` | If anti-affinity is enabled, sets the topologyKey to use for anti-affinity. This can be changed to, for example, `failure-domain.beta.kubernetes.io/zone` | `kubernetes.io/hostname` |
| `prometheus.prometheusSpec.podAntiAffinity` | Pod anti-affinity can prevent the scheduler from placing Prometheus replicas on the same node. The default value "soft" means that the scheduler should prefer not to schedule two replica pods onto the same node, but no guarantee is provided. The value "hard" means that the scheduler is required not to schedule two replica pods onto the same node. The value "" will disable pod anti-affinity so that no anti-affinity rules will be configured | `""` |
| `prometheus.prometheusSpec.podMetadata` | Standard object’s metadata. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/api-conventions.md#metadata. Metadata labels and annotations get propagated to the prometheus pods | `{}` |
| `prometheus.prometheusSpec.priorityClassName` | Priority class assigned to the Pods | `""` |
| `prometheus.prometheusSpec.prometheusExternalLabelNameClear` | If true, the Operator won't add the external label used to denote the Prometheus instance name | `false` |
| `prometheus.prometheusSpec.prometheusExternalLabelName` | Name of the external label used to denote the Prometheus instance name | `""` |
| `prometheus.prometheusSpec.query` | QuerySpec defines the query command line flags when starting Prometheus. Not all parameters are supported by the operator; see the coreos documentation | `{}` |
| `prometheus.prometheusSpec.remoteRead` | If specified, the `remote_read` spec. This is an experimental feature; it may change in any upcoming release in a breaking way | `[]` |
| `prometheus.prometheusSpec.remoteWrite` | If specified, the `remote_write` spec. This is an experimental feature; it may change in any upcoming release in a breaking way | `[]` |
| `prometheus.prometheusSpec.remoteWriteDashboards` | Enable/disable Grafana dashboards provisioning for the prometheus remote write feature | `false` |
| `prometheus.prometheusSpec.replicaExternalLabelNameClear` | If true, the Operator won't add the external label used to denote the replica name | `false` |
| `prometheus.prometheusSpec.replicaExternalLabelName` | Name of the external label used to denote the replica name | `""` |
| `prometheus.prometheusSpec.replicas` | Number of instances to deploy for a Prometheus deployment | `1` |
| `prometheus.prometheusSpec.resources` | Define resource requests and limits for single Pods | `{}` |
| `prometheus.prometheusSpec.retentionSize` | Used storage Prometheus shall retain data for. Example: 50GiB (50 gigabytes). Can be combined with `prometheus.prometheusSpec.retention` | `""` |
| `prometheus.prometheusSpec.walCompression` | Enable compression of the write-ahead log using Snappy. This flag is only available in Prometheus >= 2.11.0 | `false` |
| `prometheus.prometheusSpec.retention` | Time duration Prometheus shall retain data for. Must match the regular expression `[0-9]+(ms\|s\|m\|h\|d\|w\|y)` (milliseconds, seconds, minutes, hours, days, weeks, years) | `10d` |
| `prometheus.prometheusSpec.routePrefix` | The route prefix Prometheus registers HTTP handlers for. This is useful if, when using ExternalURL, a proxy is rewriting HTTP routes of a request, and the actual ExternalURL is still true, but the server serves requests under a different route prefix, for example for use with `kubectl proxy` | `/` |
| `prometheus.prometheusSpec.ruleNamespaceSelector` | Namespaces to be selected for PrometheusRules discovery. If nil, select own namespace. See namespaceSelector for usage | `{}` |
| `prometheus.prometheusSpec.ruleSelectorNilUsesHelmValues` | If true, a nil or `{}` value for `prometheus.prometheusSpec.ruleSelector` will cause the prometheus resource to be created with selectors based on values in the helm deployment, which will also match the PrometheusRule resources created | `true` |
| `prometheus.prometheusSpec.ruleSelector` | A selector to select which PrometheusRules to mount for loading alerting rules from. Until (excluding) Prometheus Operator v0.24.0, Prometheus Operator will migrate any legacy rule ConfigMaps to PrometheusRule custom resources selected by RuleSelector. Make sure it does not match any ConfigMaps that you do not want to be migrated. If `{}`, select all PrometheusRules | `{}` |
| `prometheus.prometheusSpec.scrapeInterval` | Interval between consecutive scrapes | `""` |
| `prometheus.prometheusSpec.secrets` | A list of Secrets in the same namespace as the Prometheus object, which shall be mounted into the Prometheus Pods. The Secrets are mounted into /etc/prometheus/secrets/<secret-name>. Secret changes after the initial creation of a Prometheus object are not reflected in the running Pods. To change the secrets mounted into the Prometheus Pods, the object must be deleted and recreated with the new list of secrets | `[]` |
| `prometheus.prometheusSpec.securityContext` | SecurityContext holds pod-level security attributes and common container settings. This defaults to a non-root user with uid 1000 and gid 2000 in order to support migration from operator version < 0.26 | `{"runAsGroup": 2000, "runAsNonRoot": true, "runAsUser": 1000, "fsGroup": 2000}` |
| `prometheus.prometheusSpec.serviceMonitorNamespaceSelector` | Namespaces to be selected for ServiceMonitor discovery. See metav1.LabelSelector for usage | `{}` |
| `prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues` | If true, a nil or `{}` value for `prometheus.prometheusSpec.serviceMonitorSelector` will cause the prometheus resource to be created with selectors based on values in the helm deployment, which will also match the ServiceMonitors created | `true` |
| `prometheus.prometheusSpec.serviceMonitorSelector` | ServiceMonitors to be selected for target discovery. If `{}`, select all ServiceMonitors | `{}` |
| `prometheus.additionalPodMonitors` | List of `PodMonitor` objects to create. See https://github.com/coreos/prometheus-operator/blob/master/Documentation/api.md#podmonitorspec | `[]` |
| `prometheus.prometheusSpec.podMonitorSelectorNilUsesHelmValues` | If true, a nil or `{}` value for `prometheus.prometheusSpec.podMonitorSelector` will cause the prometheus resource to be created with selectors based on values in the helm deployment, which will also match the PodMonitors created | `true` |
| `prometheus.prometheusSpec.podMonitorSelector` | PodMonitors to be selected for target discovery. If `{}`, select all PodMonitors | `{}` |
| `prometheus.prometheusSpec.podMonitorNamespaceSelector` | Namespaces to be selected for PodMonitor discovery. See metav1.LabelSelector for usage | `{}` |
| `prometheus.prometheusSpec.storageSpec` | Storage spec to specify how storage shall be used | `{}` |
| `prometheus.prometheusSpec.thanos` | Thanos configuration allows configuring various aspects of a Prometheus server in a Thanos environment. This section is experimental and may change significantly without backward compatibility in any release. See https://github.com/coreos/prometheus-operator/blob/master/Documentation/api.md#thanosspec | `{}` |
| `prometheus.prometheusSpec.tolerations` | If specified, the pod's tolerations | `[]` |
| `prometheus.prometheusSpec.volumes` | Additional Volumes on the output StatefulSet definition | `[]` |
| `prometheus.prometheusSpec.volumeMounts` | Additional VolumeMounts on the output StatefulSet definition | `[]` |
| `prometheus.service.additionalPorts` | Additional Prometheus Service ports to add for NodePort service type | `[]` |
| `prometheus.service.annotations` | Prometheus Service annotations | `{}` |
| `prometheus.service.clusterIP` | Prometheus service clusterIP | `""` |
| `prometheus.service.externalIPs` | List of IP addresses at which the Prometheus server service is available | `[]` |
| `prometheus.service.labels` | Prometheus Service labels | `{}` |
| `prometheus.service.loadBalancerIP` | Prometheus load balancer IP | `""` |
| `prometheus.service.loadBalancerSourceRanges` | Prometheus load balancer source ranges | `[]` |
| `prometheus.service.nodePort` | Prometheus Service port for NodePort service type | `30090` |
| `prometheus.service.port` | Port for Prometheus Service to listen on | `9090` |
| `prometheus.service.sessionAffinity` | Prometheus Service session affinity | `""` |
| `prometheus.service.targetPort` | Prometheus Service internal port | `9090` |
| `prometheus.service.type` | Prometheus Service type | `ClusterIP` |
| `prometheus.serviceAccount.create` | Create a default serviceaccount for prometheus to use | `true` |
| `prometheus.serviceAccount.name` | Name for prometheus serviceaccount | `""` |
| `prometheus.serviceAccount.annotations` | Annotations to add to the serviceaccount | `""` |
| `prometheus.serviceMonitor.interval` | Scrape interval. If not set, the Prometheus default scrape interval is used | `""` |
| `prometheus.serviceMonitor.scheme` | HTTP scheme to use for scraping. Can be used with `tlsConfig`, for example if using istio mTLS | `""` |
| `prometheus.serviceMonitor.tlsConfig` | TLS configuration to use when scraping the endpoint, for example if using istio mTLS. Of type: *TLSConfig | `{}` |
| `prometheus.serviceMonitor.bearerTokenFile` | Bearer token used to scrape the Prometheus server | `nil` |
| `prometheus.serviceMonitor.metricRelabelings` | The `metric_relabel_configs` for scraping the prometheus instance | `""` |
| `prometheus.serviceMonitor.relabelings` | The `relabel_configs` for scraping the prometheus instance | `""` |
| `prometheus.serviceMonitor.selfMonitor` | Create a ServiceMonitor to automatically monitor the prometheus instance | `true` |
| `prometheus.servicePerReplica.annotations` | Prometheus per replica Service annotations | `{}` |
| `prometheus.servicePerReplica.enabled` | If true, create a Service for each Prometheus server replica in the StatefulSet | `false` |
| `prometheus.servicePerReplica.labels` | Prometheus per replica Service labels | `{}` |
| `prometheus.servicePerReplica.loadBalancerSourceRanges` | Prometheus per replica Service load balancer source ranges | `[]` |
| `prometheus.servicePerReplica.nodePort` | Prometheus per replica Service port for NodePort Service type | `30091` |
| `prometheus.servicePerReplica.port` | Port for Prometheus per replica Service to listen on | `9090` |
| `prometheus.servicePerReplica.targetPort` | Prometheus per replica Service internal port | `9090` |
| `prometheus.servicePerReplica.type` | Prometheus per replica Service type | `ClusterIP` |
| `prometheus.thanosIngress.enabled` | Enable Ingress for the Thanos sidecar (the ingress controller needs to support gRPC) | `false` |
| `prometheus.thanosIngress.servicePort` | Ingress service port for the Thanos sidecar | `10901` |
| `prometheus.thanosIngress.paths` | Ingress paths for the Thanos sidecar | `[]` |
| `prometheus.thanosIngress.annotations` | Ingress annotations for the Thanos sidecar | `{}` |
| `prometheus.thanosIngress.labels` | Ingress labels for the Thanos sidecar | `{}` |
| `prometheus.thanosIngress.hosts` | Ingress hosts for the Thanos sidecar | `[]` |
| `prometheus.thanosIngress.tls` | Ingress TLS for the Thanos sidecar | `[]` |

### Alertmanager

| Parameter | Description | Default |
| --------- | ----------- | ------- |
| `alertmanager.alertmanagerSpec.additionalPeers` | Allows injecting a set of additional Alertmanagers to peer with to form a highly available cluster | `[]` |
| `alertmanager.alertmanagerSpec.affinity` | Assign custom affinity rules to the alertmanager instance. See https://kubernetes.io/docs/concepts/configuration/assign-pod-node/ | `{}` |
| `alertmanager.alertmanagerSpec.configMaps` | A list of ConfigMaps in the same namespace as the Alertmanager object, which shall be mounted into the Alertmanager Pods. The ConfigMaps are mounted into /etc/alertmanager/configmaps/ | `[]` |
| `alertmanager.alertmanagerSpec.configSecret` | The name of a Kubernetes Secret in the same namespace as the Alertmanager object, which contains configuration for this Alertmanager instance. Defaults to `alertmanager-`. The secret is mounted into /etc/alertmanager/config | `""` |
| `alertmanager.alertmanagerSpec.containers` | Allows injecting additional containers. This is meant to allow adding an authentication proxy to an Alertmanager pod | `[]` |
| `alertmanager.alertmanagerSpec.externalUrl` | The external URL the Alertmanager instances will be available under. This is necessary to generate correct URLs, and necessary if Alertmanager is not served from the root of a DNS name | `""` |
| `alertmanager.alertmanagerSpec.image.repository` | Base image that is used to deploy pods, without tag | `quay.io/prometheus/alertmanager` |
| `alertmanager.alertmanagerSpec.image.tag` | Tag of Alertmanager container image to be deployed | `v0.21.0` |
| `alertmanager.alertmanagerSpec.image.sha` | SHA of Alertmanager container image to be deployed (optional) | `""` |
| `alertmanager.alertmanagerSpec.listenLocal` | Makes the Alertmanager server listen on loopback, so that it does not bind against the Pod IP. Note this is only for the Alertmanager UI, not the gossip communication | `false` |
| `alertmanager.alertmanagerSpec.logFormat` | Log format for Alertmanager to be configured with | `logfmt` |
| `alertmanager.alertmanagerSpec.logLevel` | Log level for Alertmanager to be configured with | `info` |
| `alertmanager.alertmanagerSpec.nodeSelector` | Define which Nodes the Pods are scheduled on | `{}` |
| `alertmanager.alertmanagerSpec.paused` | If set to true, all actions on the underlying managed objects are not going to be performed, except for delete actions | `false` |
| `alertmanager.alertmanagerSpec.podAntiAffinityTopologyKey` | If anti-affinity is enabled, sets the topologyKey to use for anti-affinity. This can be changed to, for example, `failure-domain.beta.kubernetes.io/zone` | `kubernetes.io/hostname` |
| `alertmanager.alertmanagerSpec.podAntiAffinity` | Pod anti-affinity can prevent the scheduler from placing Alertmanager replicas on the same node. The default value "soft" means that the scheduler should prefer not to schedule two replica pods onto the same node, but no guarantee is provided. The value "hard" means that the scheduler is required not to schedule two replica pods onto the same node. The value "" will disable pod anti-affinity so that no anti-affinity rules will be configured | `""` |
| `alertmanager.alertmanagerSpec.podMetadata` | Standard object’s metadata. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/api-conventions.md#metadata. Metadata labels and annotations get propagated to the alertmanager pods | `{}` |
| `alertmanager.alertmanagerSpec.priorityClassName` | Priority class assigned to the Pods | `""` |
| `alertmanager.alertmanagerSpec.replicas` | Size is the expected size of the alertmanager cluster. The controller will eventually make the size of the running cluster equal to the expected size | `1` |
| `alertmanager.alertmanagerSpec.resources` | Define resource requests and limits for single Pods | `{}` |
| `alertmanager.alertmanagerSpec.retention` | Time duration Alertmanager shall retain data for. Value must match the regular expression `[0-9]+(ms\|s\|m\|h)` (milliseconds, seconds, minutes, hours) | `120h` |
| `alertmanager.alertmanagerSpec.routePrefix` | The route prefix Alertmanager registers HTTP handlers for. This is useful if, when using ExternalURL, a proxy is rewriting HTTP routes of a request, and the actual ExternalURL is still true, but the server serves requests under a different route prefix, for example for use with `kubectl proxy` | `/` |
| `alertmanager.alertmanagerSpec.secrets` | A list of Secrets in the same namespace as the Alertmanager object, which shall be mounted into the Alertmanager Pods. The Secrets are mounted into /etc/alertmanager/secrets/<secret-name> | `[]` |
| `alertmanager.alertmanagerSpec.securityContext` | SecurityContext holds pod-level security attributes and common container settings. This defaults to a non-root user with uid 1000 and gid 2000 in order to support migration from operator version < 0.26 | `{"runAsGroup": 2000, "runAsNonRoot": true, "runAsUser": 1000, "fsGroup": 2000}` |
| `alertmanager.alertmanagerSpec.storage` | Storage is the definition of how storage will be used by the Alertmanager instances | `{}` |
| `alertmanager.alertmanagerSpec.tolerations` | If specified, the pod's tolerations | `[]` |
| `alertmanager.alertmanagerSpec.useExistingSecret` | Use an existing secret for configuration (all defined config from values.yaml will be ignored) | `false` |
| `alertmanager.alertmanagerSpec.volumes` | Allows configuration of additional volumes on the output StatefulSet definition. Volumes specified will be appended to other volumes that are generated as a result of StorageSpec objects | |
| `alertmanager.alertmanagerSpec.volumeMounts` | Allows configuration of additional VolumeMounts on the output StatefulSet definition. VolumeMounts specified will be appended to other VolumeMounts in the alertmanager container that are generated as a result of StorageSpec objects | |
| `alertmanager.apiVersion` | API that Prometheus will use to communicate with Alertmanager. Possible values are `v1`, `v2` | `v2` |
| `alertmanager.config` | Provide YAML to configure Alertmanager. See https://prometheus.io/docs/alerting/configuration/#configuration-file. The default provided works to suppress the Watchdog alert from `defaultRules.create` | `{"global":{"resolve_timeout":"5m"},"route":{"group_by":["job"],"group_wait":"30s","group_interval":"5m","repeat_interval":"12h","receiver":"null","routes":[{"match":{"alertname":"Watchdog"},"receiver":"null"}]},"receivers":[{"name":"null"}]}` |
| `alertmanager.enabled` | Deploy alertmanager | `true` |
| `alertmanager.ingress.annotations` | Alertmanager Ingress annotations | `{}` |
| `alertmanager.ingress.enabled` | If true, Alertmanager Ingress will be created | `false` |
| `alertmanager.ingress.hosts` | Alertmanager Ingress hostnames | `[]` |
| `alertmanager.ingress.labels` | Alertmanager Ingress additional labels | `{}` |
| `alertmanager.ingress.paths` | Alertmanager Ingress paths | `[]` |
| `alertmanager.ingress.tls` | Alertmanager Ingress TLS configuration (YAML) | `[]` |
| `alertmanager.ingressPerReplica.annotations` | Alertmanager per replica Ingress annotations | `{}` |
| `alertmanager.ingressPerReplica.enabled` | If true, create an Ingress for each Alertmanager replica in the StatefulSet | `false` |
| `alertmanager.ingressPerReplica.hostPrefix` | | `""` |
| `alertmanager.ingressPerReplica.hostDomain` | | `""` |
| `alertmanager.ingressPerReplica.labels` | Alertmanager per replica Ingress additional labels | `{}` |
| `alertmanager.ingressPerReplica.paths` | Alertmanager per replica Ingress paths | `[]` |
| `alertmanager.ingressPerReplica.tlsSecretName` | Secret name containing the TLS certificate for Alertmanager per replica ingress | `[]` |
| `alertmanager.ingressPerReplica.tlsSecretPerReplica.enabled` | If true, create a secret for the TLS certificate for each Ingress | `false` |
| `alertmanager.ingressPerReplica.tlsSecretPerReplica.prefix` | Secret name prefix | `""` |
| `alertmanager.podDisruptionBudget.enabled` | If true, create a pod disruption budget for Alertmanager pods. The created resource cannot be modified once created; it must be deleted to perform a change | `false` |
| `alertmanager.podDisruptionBudget.maxUnavailable` | Maximum number / percentage of pods that may be made unavailable | `""` |
| `alertmanager.podDisruptionBudget.minAvailable` | Minimum number / percentage of pods that should remain scheduled | `1` |
| `alertmanager.secret.annotations` | Alertmanager Secret annotations | `{}` |
| `alertmanager.service.annotations` | Alertmanager Service annotations | `{}` |
| `alertmanager.service.clusterIP` | Alertmanager service clusterIP | `""` |
| `alertmanager.service.externalIPs` | List of IP addresses at which the Alertmanager server service is available | `[]` |
| `alertmanager.service.labels` | Alertmanager Service labels | `{}` |
| `alertmanager.service.loadBalancerIP` | Alertmanager load balancer IP | `""` |
| `alertmanager.service.loadBalancerSourceRanges` | Alertmanager load balancer source ranges | `[]` |
| `alertmanager.service.nodePort` | Alertmanager Service port for NodePort service type | `30903` |
| `alertmanager.service.port` | Port for Alertmanager Service to listen on | `9093` |
| `alertmanager.service.targetPort` | Alertmanager Service internal port | `9093` |
| `alertmanager.service.type` | Alertmanager Service type | `ClusterIP` |
| `alertmanager.servicePerReplica.annotations` | Alertmanager per replica Service annotations | `{}` |
| `alertmanager.servicePerReplica.enabled` | If true, create a Service for each Alertmanager replica in the StatefulSet | `false` |
| `alertmanager.servicePerReplica.labels` | Alertmanager per replica Service labels | `{}` |
| `alertmanager.servicePerReplica.loadBalancerSourceRanges` | Alertmanager per replica Service load balancer source ranges | `[]` |
| `alertmanager.servicePerReplica.nodePort` | Alertmanager per replica Service port for NodePort Service type | `30904` |
| `alertmanager.servicePerReplica.port` | Port for Alertmanager per replica Service to listen on | `9093` |
| `alertmanager.servicePerReplica.targetPort` | Alertmanager per replica Service internal port | `9093` |
| `alertmanager.servicePerReplica.type` | Alertmanager per replica Service type | `ClusterIP` |
| `alertmanager.serviceAccount.create` | Create a serviceAccount for alertmanager | `true` |
| `alertmanager.serviceAccount.name` | Name for Alertmanager service account | `""` |
| `alertmanager.serviceAccount.annotations` | Annotations to add to the serviceaccount | `""` |
| `alertmanager.serviceMonitor.interval` | Scrape interval. If not set, the Prometheus default scrape interval is used | `nil` |
| `alertmanager.serviceMonitor.metricRelabelings` | The `metric_relabel_configs` for scraping the alertmanager instance | `""` |
| `alertmanager.serviceMonitor.relabelings` | The `relabel_configs` for scraping the alertmanager instance | `""` |
| `alertmanager.serviceMonitor.selfMonitor` | Create a ServiceMonitor to automatically monitor the alertmanager instance | `true` |
| `alertmanager.tplConfig` | Pass the Alertmanager configuration directives through Helm's templating engine. If the Alertmanager configuration contains Alertmanager templates, they'll need to be properly escaped so that they are not interpreted by Helm | `false` |
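As an illustration of `alertmanager.config`, the configuration can be supplied from a values file; the receiver name, Slack webhook URL, and channel below are hypothetical, while the routing keys mirror the default shown above:

```console
$ cat > alertmanager-values.yaml <<EOF
alertmanager:
  config:
    global:
      resolve_timeout: 5m
    route:
      group_by: ['job']
      receiver: 'slack-notifications'
    receivers:
      - name: 'slack-notifications'
        slack_configs:
          - api_url: 'https://hooks.slack.com/services/<token>'  # hypothetical webhook
            channel: '#alerts'
EOF
$ helm upgrade my-release stable/prometheus-operator -f alertmanager-values.yaml
```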

### Grafana

This is not a full list of the possible values; for a full list of configurable values, please refer to the Grafana chart.

| Parameter | Description | Default |
| --------- | ----------- | ------- |
| `grafana.additionalDataSources` | Configure additional grafana datasources (passed through tpl) | `[]` |
| `grafana.adminPassword` | Admin password to log into the grafana UI | `"prom-operator"` |
| `grafana.defaultDashboardsEnabled` | Deploy default dashboards. These are loaded using the sidecar | `true` |
| `grafana.enabled` | If true, deploy the grafana sub-chart | `true` |
| `grafana.extraConfigmapMounts` | Additional grafana server ConfigMap volume mounts | `[]` |
| `grafana.grafana.ini` | Grafana's primary configuration | `{}` |
| `grafana.image.tag` | Image tag (must be >= 5.0.0) | `6.2.5` |
| `grafana.ingress.annotations` | Ingress annotations for Grafana | `{}` |
| `grafana.ingress.enabled` | Enables Ingress for Grafana | `false` |
| `grafana.ingress.hosts` | Ingress accepted hostnames for Grafana | `[]` |
| `grafana.ingress.labels` | Custom labels for Grafana Ingress | `{}` |
| `grafana.ingress.tls` | Ingress TLS configuration for Grafana | `[]` |
| `grafana.namespaceOverride` | Override the deployment namespace of grafana | `""` (`Release.Namespace`) |
| `grafana.rbac.pspUseAppArmor` | Enforce AppArmor in created PodSecurityPolicy (requires `rbac.pspEnabled`) | `true` |
| `grafana.service.portName` | Allows customizing the Grafana service port name. Will be used by the ServiceMonitor as well | `service` |
| `grafana.serviceMonitor.metricRelabelings` | The `metric_relabel_configs` for scraping the grafana instance | `""` |
| `grafana.serviceMonitor.relabelings` | The `relabel_configs` for scraping the grafana instance | `""` |
| `grafana.serviceMonitor.selfMonitor` | Create a ServiceMonitor to automatically monitor the grafana instance | `true` |
| `grafana.sidecar.dashboards.enabled` | Enable the Grafana sidecar to automatically load dashboards with a label `{{ grafana.sidecar.dashboards.label }}=1` | `true` |
| `grafana.sidecar.dashboards.annotations` | Create annotations on dashboard ConfigMaps | `{}` |
| `grafana.sidecar.dashboards.label` | If the sidecar is enabled, ConfigMaps with this label will be loaded into Grafana as dashboards | `grafana_dashboard` |
| `grafana.sidecar.datasources.annotations` | Create annotations on datasource ConfigMaps | `{}` |
| `grafana.sidecar.datasources.createPrometheusReplicasDatasources` | Create a datasource for each Pod of the Prometheus StatefulSet, i.e. Prometheus-0, Prometheus-1 | `false` |
| `grafana.sidecar.datasources.defaultDatasourceEnabled` | Enable the Grafana Prometheus default datasource | `true` |
| `grafana.sidecar.datasources.enabled` | Enable the Grafana sidecar to automatically load datasources with a label `{{ grafana.sidecar.datasources.label }}=1` | `true` |
| `grafana.sidecar.datasources.label` | If the sidecar is enabled, ConfigMaps with this label will be loaded into Grafana as datasource configurations | `grafana_datasource` |
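To illustrate the dashboard sidecar described above, a ConfigMap carrying dashboard JSON can be labeled so the sidecar picks it up; the ConfigMap and file names here are hypothetical:

```console
$ kubectl create configmap my-dashboard --from-file=my-dashboard.json
$ kubectl label configmap my-dashboard grafana_dashboard=1
```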

### Exporters

| Parameter | Description | Default |
| --------- | ----------- | ------- |
| `coreDns.enabled` | Deploy CoreDNS scraping components. Use either this or kubeDns | `true` |
| `coreDns.service.port` | CoreDNS port | `9153` |
| `coreDns.service.selector` | CoreDNS service selector | `{"k8s-app": "kube-dns"}` |
| `coreDns.service.targetPort` | CoreDNS targetPort | `9153` |
| `coreDns.serviceMonitor.interval` | Scrape interval. If not set, the Prometheus default scrape interval is used | `nil` |
| `coreDns.serviceMonitor.metricRelabelings` | The `metric_relabel_configs` for scraping CoreDNS | `""` |
| `coreDns.serviceMonitor.relabelings` | The `relabel_configs` for scraping CoreDNS | `""` |
| `kube-state-metrics.namespaceOverride` | Override the deployment namespace of kube-state-metrics | `""` (`Release.Namespace`) |
| `kube-state-metrics.podSecurityPolicy.enabled` | Create pod security policy resource for kube-state-metrics | `true` |
| `kube-state-metrics.rbac.create` | Create RBAC components in kube-state-metrics. See `global.rbac.create` | `true` |
| `kubeApiServer.enabled` | Deploy a ServiceMonitor to scrape the Kubernetes API server | `true` |
| `kubeApiServer.relabelings` | Relabelings for the API server ServiceMonitor | `[]` |
| `kubeApiServer.serviceMonitor.interval` | Scrape interval. If not set, the Prometheus default scrape interval is used | `nil` |
| `kubeApiServer.serviceMonitor.jobLabel` | The name of the label on the target service to use as the job name in prometheus | `component` |
| `kubeApiServer.serviceMonitor.metricRelabelings` | The `metric_relabel_configs` for scraping the Kubernetes API server | `""` |
| `kubeApiServer.serviceMonitor.relabelings` | The `relabel_configs` for scraping the Kubernetes API server | `""` |
| `kubeApiServer.serviceMonitor.selector` | The service selector | `{"matchLabels":{"component":"apiserver","provider":"kubernetes"}}` |
| `kubeApiServer.tlsConfig.insecureSkipVerify` | Skip TLS certificate validation when scraping | `false` |
| `kubeApiServer.tlsConfig.serverName` | Name of the server to use when validating TLS certificate | `kubernetes` |
| `kubeControllerManager.enabled` | Deploy a service and ServiceMonitor to scrape the Kubernetes controller-manager | `true` |
| `kubeControllerManager.endpoints` | Endpoints where the controller-manager runs. Provide this if running the controller-manager outside the cluster | `[]` |
| `kubeControllerManager.service.port` | Port the controller-manager service runs on | `10252` |
| `kubeControllerManager.service.selector` | Controller-manager service selector | `{"component": "kube-controller-manager"}` |
| `kubeControllerManager.service.targetPort` | Controller-manager targetPort the service runs on | `10252` |
| `kubeControllerManager.serviceMonitor.https` | Scrape the controller-manager service over https | `false` |
| `kubeControllerManager.serviceMonitor.insecureSkipVerify` | Skip TLS certificate validation when scraping | `null` |
| `kubeControllerManager.serviceMonitor.interval` | Scrape interval. If not set, the Prometheus default scrape interval is used | `nil` |
| `kubeControllerManager.serviceMonitor.metricRelabelings` | The `metric_relabel_configs` for scraping the controller-manager | `""` |
| `kubeControllerManager.serviceMonitor.relabelings` | The `relabel_configs` for scraping the controller-manager | `""` |
| `kubeControllerManager.serviceMonitor.serverName` | Name of the server to use when validating TLS certificate | `null` |
| `kubeDns.enabled` | Deploy kubeDns scraping components. Use either this or coreDns | `false` |
| `kubeDns.service.dnsmasq.port` | Dnsmasq service port | `10054` |
| `kubeDns.service.dnsmasq.targetPort` | Dnsmasq service targetPort | `10054` |
| `kubeDns.service.skydns.port` | Skydns service port | `10055` |
| `kubeDns.service.skydns.targetPort` | Skydns service targetPort | `10055` |
| `kubeDns.service.selector` | kubeDns service selector | `{"k8s-app": "kube-dns"}` |
| `kubeDns.serviceMonitor.dnsmasqMetricRelabelings` | The `metric_relabel_configs` for scraping dnsmasq kubeDns | `""` |
| `kubeDns.serviceMonitor.dnsmasqRelabelings` | The `relabel_configs` for scraping dnsmasq kubeDns | `""` |
| `kubeDns.serviceMonitor.interval` | Scrape interval. If not set, the Prometheus default scrape interval is used | `nil` |
| `kubeDns.serviceMonitor.metricRelabelings` | The `metric_relabel_configs` for scraping kubeDns | `""` |
| `kubeDns.serviceMonitor.relabelings` | The `relabel_configs` for scraping kubeDns | `""` |
| `kubeEtcd.enabled` | Deploy components to scrape etcd | `true` |
| `kubeEtcd.endpoints` | Endpoints where etcd runs. Provide this if running etcd outside the cluster | `[]` |
| `kubeEtcd.service.port` | Etcd port | `4001` |
| `kubeEtcd.service.selector` | Selector for etcd if running inside the cluster | `{"component":"etcd"}` |
| `kubeEtcd.service.targetPort` | Etcd targetPort | `4001` |
| `kubeEtcd.serviceMonitor.caFile` | Certificate authority file to use when connecting to etcd. See `prometheus.prometheusSpec.secrets` | `""` |
| `kubeEtcd.serviceMonitor.certFile` | Client certificate file to use when connecting to etcd. See `prometheus.prometheusSpec.secrets` | `""` |
| `kubeEtcd.serviceMonitor.insecureSkipVerify` | Skip validating the etcd TLS certificate when scraping | `false` |
| `kubeEtcd.serviceMonitor.interval` | Scrape interval. If not set, the Prometheus default scrape interval is used | `nil` |
| `kubeEtcd.serviceMonitor.keyFile` | Client key file to use when connecting to etcd. See `prometheus.prometheusSpec.secrets` | `""` |
| `kubeEtcd.serviceMonitor.metricRelabelings` | The `metric_relabel_configs` for scraping etcd | `""` |
| `kubeEtcd.serviceMonitor.relabelings` | The `relabel_configs` for scraping etcd | `""` |
| `kubeEtcd.serviceMonitor.scheme` | Etcd ServiceMonitor scheme | `http` |
| `kubeEtcd.serviceMonitor.serverName` | Etcd server name to validate the certificate against when scraping | `""` |
| `kubeProxy.enabled` | Deploy a service and ServiceMonitor to scrape the Kubernetes proxy | `true` |
| `kubeProxy.endpoints` | Endpoints where the proxy runs. Provide this if running the proxy outside the cluster | `[]` |
| `kubeProxy.service.port` | Port the Kubernetes proxy service runs on | `10249` |
| `kubeProxy.service.selector` | Kubernetes proxy service selector | `{"k8s-app": "kube-proxy"}` |
| `kubeProxy.service.targetPort` | Kubernetes proxy targetPort the service runs on | `10249` |
| `kubeProxy.serviceMonitor.https` | Scrape the Kubernetes proxy service over https | `false` |
| `kubeProxy.serviceMonitor.interval` | Scrape interval. If not set, the Prometheus default scrape interval is used | `nil` |
| `kubeProxy.serviceMonitor.metricRelabelings` | The `metric_relabel_configs` for scraping the Kubernetes proxy | `""` |
| `kubeProxy.serviceMonitor.relabelings` | The `relabel_configs` for scraping the Kubernetes proxy | `""` |
| `kubeScheduler.enabled` | Deploy a service and ServiceMonitor to scrape the Kubernetes scheduler | `true` |
| `kubeScheduler.endpoints` | Endpoints where the scheduler runs. Provide this if running the scheduler outside the cluster | `[]` |
| `kubeScheduler.service.port` | Port the scheduler service runs on | `10251` |
| `kubeScheduler.service.selector` | Scheduler service selector | `{"component": "kube-scheduler"}` |
| `kubeScheduler.service.targetPort` | Scheduler targetPort the service runs on | `10251` |
| `kubeScheduler.serviceMonitor.https` | Scrape the scheduler service over https | `false` |
| `kubeScheduler.serviceMonitor.insecureSkipVerify` | Skip TLS certificate validation when scraping | `null` |
| `kubeScheduler.serviceMonitor.interval` | Scrape interval. If not set, the Prometheus default scrape interval is used | `nil` |
| `kubeScheduler.serviceMonitor.metricRelabelings` | The `metric_relabel_configs` for scraping the Kubernetes scheduler | `""` |
| `kubeScheduler.serviceMonitor.relabelings` | The `relabel_configs` for scraping the Kubernetes scheduler | `""` |
| `kubeScheduler.serviceMonitor.serverName` | Name of the server to use when validating TLS certificate | `null` |
| `kubeStateMetrics.enabled` | Deploy the kube-state-metrics chart and configure a ServiceMonitor to scrape it | `true` |
| `kubeStateMetrics.serviceMonitor.interval` | Scrape interval. If not set, the Prometheus default scrape interval is used | `nil` |
| `kubeStateMetrics.serviceMonitor.metricRelabelings` | Metric relabelings for the kube-state-metrics ServiceMonitor | `[]` |
| `kubeStateMetrics.serviceMonitor.relabelings` | The `relabel_configs` for scraping kube-state-metrics | `""` |
| `kubelet.enabled` | Deploy a ServiceMonitor to scrape the kubelet service. See also `prometheusOperator.kubeletService` | `true` |
| `kubelet.namespace` | Namespace where the kubelet is deployed. See also `prometheusOperator.kubeletService.namespace` | `kube-system` |
| `kubelet.serviceMonitor.cAdvisor` | Enable scraping /metrics/cadvisor from the kubelet's service | `true` |
| `kubelet.serviceMonitor.cAdvisorMetricRelabelings` | The `metric_relabel_configs` for scraping cAdvisor | `""` |
| `kubelet.serviceMonitor.cAdvisorRelabelings` | The `relabel_configs` for scraping cAdvisor | `[{"sourceLabels":["__metrics_path__"], "targetLabel":"metrics_path"}]` |
| `kubelet.serviceMonitor.probes` | Enable scraping /metrics/probes from the kubelet's service | `true` |
| `kubelet.serviceMonitor.probesMetricRelabelings` | The `metric_relabel_configs` for scraping kubelet probes | `""` |
| `kubelet.serviceMonitor.probesRelabelings` | The `relabel_configs` for scraping kubelet probes | `[{"sourceLabels":["__metrics_path__"], "targetLabel":"metrics_path"}]` |
| `kubelet.serviceMonitor.resource` | Enable scraping /metrics/resource/v1alpha1 from the kubelet's service | `true` |
| `kubelet.serviceMonitor.resourceMetricRelabelings` | The `metric_relabel_configs` for scraping /metrics/resource/v1alpha1 | `""` |
| `kubelet.serviceMonitor.resourceRelabelings` | The `relabel_configs` for scraping /metrics/resource/v1alpha1 | `[{"sourceLabels":["__metrics_path__"], "targetLabel":"metrics_path"}]` |
| `kubelet.serviceMonitor.https` | Enable scraping of the kubelet over HTTPS. For more information, see https://github.com/coreos/prometheus-operator/issues/926 | `true` |
| `kubelet.serviceMonitor.interval` | Scrape interval. If not set, the Prometheus default scrape interval is used | `nil` |
| `kubelet.serviceMonitor.scrapeTimeout` | Scrape timeout. If not set, the Prometheus default scrape timeout is used | `nil` |
| `kubelet.serviceMonitor.metricRelabelings` | The `metric_relabel_configs` for scraping the kubelet | `""` |
kubelet.serviceMonitor.relabelingsThe relabel_configs for scraping kubelet.[{"sourceLabels":["__metrics_path__"], "targetLabel":"metrics_path"}]
nodeExporter.enabledDeploy the prometheus-node-exporter and scrape ittrue
nodeExporter.jobLabelThe name of the label on the target service to use as the job name in prometheus. See prometheus-node-exporter.podLabels.jobLabel=node-exporter defaultjobLabel
nodeExporter.serviceMonitor.intervalScrape interval. If not set, the Prometheus default scrape interval is usednil
nodeExporter.serviceMonitor.scrapeTimeoutHow long until a scrape request times out. If not set, the Prometheus default scape timeout is usednil
nodeExporter.serviceMonitor.metricRelabelingsMetric relablings for the prometheus-node-exporter ServiceMonitor[]
nodeExporter.serviceMonitor.relabelingsThe relabel_configs for scraping the prometheus-node-exporter.``
| `prometheus-node-exporter.extraArgs` | Additional arguments for the node exporter container | `["--collector.filesystem.ignored-mount-points=^/(dev\|proc\|sys\|var/lib/docker/.+)($\|/)", "--collector.filesystem.ignored-fs-types=^(autofs\|binfmt_misc\|cgroup\|configfs\|debugfs\|devpts\|devtmpfs\|fusectl\|hugetlbfs\|mqueue\|overlay\|proc\|procfs\|pstore\|rpc_pipefs\|securityfs\|sysfs\|tracefs)$"]` |
| `prometheus-node-exporter.namespaceOverride` | Override the deployment namespace of node exporter | `""` (`Release.Namespace`) |
| `prometheus-node-exporter.podLabels` | Additional labels for pods in the DaemonSet | `{"jobLabel": "node-exporter"}` |

Specify each parameter using the --set key=value[,key=value] argument to helm install. For example,

console
$ helm install --name my-release stable/prometheus-operator --set prometheusOperator.enabled=true

Alternatively, one or more YAML files that specify the values for the above parameters can be provided while installing the chart. For example,

console
$ helm install --name my-release stable/prometheus-operator -f values1.yaml,values2.yaml

Tip: You can use the default values.yaml
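For example, a values file that points the etcd scrape at a cluster running outside Kubernetes, using the kubeEtcd parameters from the table above, might look like the following sketch. The endpoints, server name, and file paths are placeholders - the certificate files must be made available to the Prometheus pods via prometheus.prometheusSpec.secrets:

yaml
# custom-values.yaml - illustrative overrides; IPs, server name, and paths are placeholders
kubeEtcd:
  endpoints:
    - 10.0.0.10
    - 10.0.0.11
    - 10.0.0.12
  serviceMonitor:
    scheme: https
    serverName: etcd.example.internal
    # assumes a secret named etcd-client-cert listed in prometheus.prometheusSpec.secrets,
    # which prometheus-operator mounts under /etc/prometheus/secrets/<secret-name>/
    caFile: /etc/prometheus/secrets/etcd-client-cert/ca.crt
    certFile: /etc/prometheus/secrets/etcd-client-cert/client.crt
    keyFile: /etc/prometheus/secrets/etcd-client-cert/client.key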

PrometheusRules Admission Webhooks

With Prometheus Operator version 0.30+, the core Prometheus Operator pod exposes an endpoint that integrates with Kubernetes' ValidatingWebhookConfiguration feature to prevent malformed rules from being added to the cluster.

How the Chart Configures the Hooks

A validating and mutating webhook configuration requires the endpoint to which the request is sent to use TLS. It is possible to set up custom certificates for this, but in most cases a self-signed certificate is enough. Setting up this component requires some more complex orchestration when using Helm. The steps are designed to be idempotent and to allow the feature to be turned on and off without running into Helm quirks.

  1. A pre-install hook provisions a certificate into the same namespace, in a format compatible with end-user-provisioned certificates. If the certificate already exists, the hook exits.
  2. The prometheus operator pod is configured to use a TLS proxy container, which loads that certificate.
  3. Validating and Mutating webhook configurations are created in the cluster, with their failure mode set to Ignore. This allows rules to be created by the same chart at the same time, even though the webhook has not yet been fully set up - it does not have the correct CA field set.
  4. A post-install hook reads the CA from the secret created by step 1 and patches the Validating and Mutating webhook configurations (sketched below). This also allows a custom CA provisioned by some other process to be patched into the webhook configurations. The chosen failure policy is patched in as well.
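After step 4, the patched ValidatingWebhookConfiguration should look roughly like the sketch below. The object and service names are illustrative (the chart derives them from the release name), and the webhook path is an assumption about the operator's admission endpoint:

yaml
apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingWebhookConfiguration
metadata:
  name: my-release-admission                  # illustrative; derived from the release name
webhooks:
  - name: prometheusrules.monitoring.coreos.com
    failurePolicy: Fail                       # patched from Ignore to the chosen failure policy
    clientConfig:
      caBundle: <base64 CA from the pre-install secret>   # patched in by the post-install hook
      service:
        name: my-release-prometheus-oper-operator         # illustrative service name
        namespace: default
        path: /admission-prometheusrules/validate         # assumed admission endpoint path
    rules:
      - apiGroups: ["monitoring.coreos.com"]
        apiVersions: ["*"]
        operations: ["CREATE", "UPDATE"]
        resources: ["prometheusrules"]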

Alternatives

It should be possible to use jetstack/cert-manager if a more complete solution is required, but it has not been tested.

Limitations

Because the operator can only run as a single pod, a failure of this component could prevent rules from being deployed. This risk is outweighed by the benefit of having validation, so the feature is enabled by default.

Developing Prometheus Rules and Grafana Dashboards

This chart's Grafana dashboards and Prometheus rules are a copy from coreos/prometheus-operator and other sources, synced (with alterations) by scripts in the hack folder. To introduce any changes, you need to first add them to the original repository and then sync them here with the scripts.

Further Information

For more in-depth documentation of what each configuration option means, please see

Migrating from coreos/prometheus-operator chart

The multiple charts have been combined into a single chart that installs prometheus operator, prometheus, alertmanager, and grafana, as well as the multitude of exporters necessary to monitor a cluster.

There is no simple and direct migration path between the charts as the changes are extensive and intended to make the chart easier to support.

The capabilities of the old chart are all available in the new chart, including the ability to run multiple prometheus instances on a single cluster - you will need to disable the parts of the chart you do not wish to deploy.
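For example, to deploy only the operator and a Prometheus instance, the bundled components can be switched off individually - a sketch, assuming the default component flags:

console
$ helm install --name my-release stable/prometheus-operator \
    --set alertmanager.enabled=false \
    --set grafana.enabled=false \
    --set kubeStateMetrics.enabled=false \
    --set nodeExporter.enabled=false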

You can check out the tickets for this change here and here.

High-level overview of Changes

The chart has three dependencies, which can be seen in the chart's requirements file: https://github.com/helm/charts/blob/master/stable/prometheus-operator/requirements.yaml

Node-Exporter, Kube-State-Metrics

These components are loaded as dependencies into the chart. The source for both charts is found in the same repository. They are relatively simple components.

Grafana

The Grafana chart is more feature-rich than this chart - it contains a sidecar that is able to load data sources and dashboards from ConfigMaps deployed into the same cluster. For more information, check out the documentation for the chart.
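For example, assuming the sidecar's default discovery label (grafana_dashboard), a dashboard can be added by deploying a labelled ConfigMap; the name and dashboard JSON below are placeholders:

yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-dashboard                  # placeholder name
  labels:
    grafana_dashboard: "1"            # assumed default sidecar label; configurable in the Grafana chart
data:
  my-dashboard.json: |-
    {
      "title": "My Dashboard",
      "panels": []
    }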

Coreos CRDs

The CRDs are provisioned using crd-install hooks, rather than relying on a separate chart installation. If you already have these CRDs provisioned and don't want to remove them, you can disable the CRD creation by these hooks by passing prometheusOperator.createCustomResource=false (not required if using Helm v3).
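For example:

console
$ helm install --name my-release stable/prometheus-operator --set prometheusOperator.createCustomResource=false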

Kubelet Service

Because the kubelet service has a new name in the chart, make sure to clean up the old kubelet service in the kube-system namespace to prevent counting container metrics twice.
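One way to find and remove the old service - the service name below is a placeholder, so check the output of the first command:

console
$ kubectl -n kube-system get svc | grep kubelet
$ kubectl -n kube-system delete svc <old-kubelet-service>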

Persistent Volumes

If you would like to keep the data of the current persistent volumes, it should be possible to attach existing volumes to new PVCs and PVs that are created using the conventions in the new chart. For example, in order to use an existing Azure disk for a helm release called prometheus-migration the following resources can be created:

yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-prometheus-migration-prometheus-0
spec:
  accessModes:
  - ReadWriteOnce
  azureDisk:
    cachingMode: None
    diskName: pvc-prometheus-migration-prometheus-0
    diskURI: /subscriptions/f5125d82-2622-4c50-8d25-3f7ba3e9ac4b/resourceGroups/sample-migration-resource-group/providers/Microsoft.Compute/disks/pvc-prometheus-migration-prometheus-0
    fsType: ""
    kind: Managed
    readOnly: false
  capacity:
    storage: 1Gi
  persistentVolumeReclaimPolicy: Delete
  storageClassName: prometheus
  volumeMode: Filesystem
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app: prometheus
    prometheus: prometheus-migration-prometheus
  name: prometheus-prometheus-migration-prometheus-db-prometheus-prometheus-migration-prometheus-0
  namespace: monitoring
spec:
  accessModes:
  - ReadWriteOnce
  dataSource: null
  resources:
    requests:
      storage: 1Gi
  storageClassName: prometheus
  volumeMode: Filesystem
  volumeName: pvc-prometheus-migration-prometheus-0
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 1Gi

The PVC will take ownership of the PV, and when you create a release using a persistent volume claim template, it will use the existing PVCs because they match the naming convention used by the chart. Similar approaches can be used for other cloud providers.
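To confirm that the pre-created claim was picked up, check that the PVC is Bound to the expected PV after installing the release (the namespace and names follow the example above):

console
$ kubectl -n monitoring get pvc,pv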

KubeProxy

The metrics bind address of kube-proxy defaults to 127.0.0.1:10249, which Prometheus instances cannot access. If you want to collect these metrics, expose them by changing the metricsBindAddress field to 0.0.0.0:10249.

Depending on the cluster, the relevant part, config.conf, will be in the ConfigMap kube-system/kube-proxy or kube-system/kube-proxy-config. For example:

console
$ kubectl -n kube-system edit cm kube-proxy

yaml
apiVersion: v1
data:
  config.conf: |-
    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    kind: KubeProxyConfiguration
    # ...
    # metricsBindAddress: 127.0.0.1:10249
    metricsBindAddress: 0.0.0.0:10249
    # ...
  kubeconfig.conf: |-
    # ...
kind: ConfigMap
metadata:
  labels:
    app: kube-proxy
  name: kube-proxy
  namespace: kube-system
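kube-proxy reads its configuration at startup, so the change typically takes effect only once the kube-proxy pods are recreated. Assuming the selector shown in the configuration table above, they can be recycled with:

console
$ kubectl -n kube-system delete pods -l k8s-app=kube-proxy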