Pipeline Controller Metrics

The following pipeline metrics are available at controller-service on port 9090.

Metrics are exported using OpenTelemetry and can be configured via the observability ConfigMap. By default, Prometheus export is enabled. OTLP (gRPC and HTTP) export is also available for sending metrics to an OpenTelemetry Collector or compatible backend.

Core Tekton Metrics

| Name | Type | Labels/Tags | Status |
| ---- | ---- | ----------- | ------ |
| `tekton_pipelines_controller_pipelinerun_duration_seconds_[bucket, sum, count]` | Histogram/LastValue(Gauge) | `*pipeline=<pipeline_name>` <br> `*pipelinerun=<pipelinerun_name>` <br> `status=<status>` <br> `namespace=<pipelinerun-namespace>` <br> `*reason=<reason>` | experimental |
| `tekton_pipelines_controller_pipelinerun_taskrun_duration_seconds_[bucket, sum, count]` | Histogram/LastValue(Gauge) | `*pipeline=<pipeline_name>` <br> `*pipelinerun=<pipelinerun_name>` <br> `status=<status>` <br> `*task=<task_name>` <br> `*taskrun=<taskrun_name>` <br> `namespace=<pipelineruns-taskruns-namespace>` <br> `*reason=<reason>` | experimental |
| `tekton_pipelines_controller_pipelinerun_total` | Counter | `status=<status>` | experimental |
| `tekton_pipelines_controller_running_pipelineruns` | Gauge | | experimental |
| `tekton_pipelines_controller_taskrun_duration_seconds_[bucket, sum, count]` | Histogram/LastValue(Gauge) | `status=<status>` <br> `*task=<task_name>` <br> `*taskrun=<taskrun_name>` <br> `namespace=<pipelineruns-taskruns-namespace>` <br> `*reason=<reason>` | experimental |
| `tekton_pipelines_controller_taskrun_total` | Counter | `status=<status>` | experimental |
| `tekton_pipelines_controller_running_taskruns` | Gauge | | experimental |
| `tekton_pipelines_controller_running_taskruns_throttled_by_quota` | Gauge | `namespace=<pipelinerun-namespace>` | experimental |
| `tekton_pipelines_controller_running_taskruns_throttled_by_node` | Gauge | `namespace=<pipelinerun-namespace>` | experimental |
| `tekton_pipelines_controller_running_pipelineruns_waiting_on_pipeline_resolution` | Gauge | | experimental |
| `tekton_pipelines_controller_running_pipelineruns_waiting_on_task_resolution` | Gauge | | experimental |
| `tekton_pipelines_controller_running_taskruns_waiting_on_task_resolution_count` | Gauge | | experimental |
| `tekton_pipelines_controller_taskruns_pod_latency_milliseconds` | Histogram | `namespace=<namespace>` <br> `*task=<task_name>` <br> `*taskrun=<taskrun_name>` <br> (unbounded cardinality, see #9393) | experimental |

Labels/Tags marked with "*" are optional. The pipelinerun and taskrun duration metrics can be exported as either a Histogram or a LastValue (Gauge), depending on the configured duration type.

Note: All metrics now carry an otel_scope_name label identifying the instrumentation package. This label is informational and transparent to most PromQL queries.

Infrastructure Metrics

These metrics are provided by the Knative and Go runtime infrastructure. Their names changed as part of the OpenCensus to OpenTelemetry migration (see migration guide for full details).

| Name | Type | Description |
| ---- | ---- | ----------- |
| `kn_workqueue_adds_total` | Counter | Workqueue additions |
| `kn_workqueue_depth` | Gauge | Current workqueue depth |
| `kn_workqueue_queue_duration_seconds` | Histogram | Time items spend in queue |
| `kn_workqueue_process_duration_seconds` | Histogram | Time to process items |
| `kn_workqueue_retries_total` | Counter | Workqueue retries |
| `kn_workqueue_unfinished_work_seconds` | Gauge | Seconds of work in progress |
| `http_client_request_duration_seconds` | Histogram | K8s API client request duration |
| `kn_k8s_client_http_response_status_code_total` | Counter | K8s API response status codes |
| `go_goroutines` | Gauge | Number of goroutines |
| `go_memstats_alloc_bytes` | Gauge | Bytes allocated and still in use |

Configuring Metrics using config-observability ConfigMap

A sample ConfigMap has been provided as config-observability.

Metrics and tracing protocol

The metrics-protocol key controls how metrics are exported:

| Value | Description |
| ----- | ----------- |
| `prometheus` | Starts an HTTP server on port 9090 serving `/metrics` (default) |
| `grpc` | Exports via OTLP gRPC to the configured `metrics-endpoint` |
| `http/protobuf` | Exports via OTLP HTTP to the configured `metrics-endpoint` |
| `none` | Disables metrics export |

The tracing-protocol key controls distributed tracing:

| Value | Description |
| ----- | ----------- |
| `none` | Disables tracing (default) |
| `grpc` | Exports traces via OTLP gRPC to `tracing-endpoint` |
| `http/protobuf` | Exports traces via OTLP HTTP to `tracing-endpoint` |
| `stdout` | Prints traces to stdout (for debugging) |

Note: The previous OpenCensus configuration keys (metrics.backend-destination, metrics.stackdriver-project-id, etc.) are no longer supported. See the migration guide for details on upgrading.
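As a rough illustration of switching the export protocol, the sketch below builds a JSON merge patch for the `metrics-protocol` and `metrics-endpoint` keys and applies it with `kubectl patch`. The helper names `build_observability_patch` and `apply_observability_patch` are hypothetical, the `tekton-pipelines` namespace is assumed to be where the ConfigMap lives, and any endpoint you pass is specific to your own collector.

```shell
# Hypothetical helper: print a JSON merge patch that sets the two
# metrics export keys described above.
# Usage: build_observability_patch <metrics-protocol> <metrics-endpoint>
build_observability_patch() {
  printf '{"data":{"metrics-protocol":"%s","metrics-endpoint":"%s"}}\n' "$1" "$2"
}

# Hypothetical helper: apply the patch to the config-observability ConfigMap.
# Assumes the ConfigMap is in the tekton-pipelines namespace.
apply_observability_patch() {
  kubectl patch configmap config-observability \
    -n tekton-pipelines \
    --type merge \
    -p "$(build_observability_patch "$1" "$2")"
}
```

For example, `apply_observability_patch grpc otel-collector.observability.svc:4317` would point the exporter at a collector reachable under that (made-up) service name. The helpers do no escaping, so they only suit simple values without quotes.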

Tekton-specific metrics settings

By default, taskrun and pipelinerun metrics have these values:

```yaml
metrics.taskrun.level: "task"
metrics.taskrun.duration-type: "histogram"
metrics.pipelinerun.level: "pipeline"
metrics.running-pipelinerun.level: ""
metrics.pipelinerun.duration-type: "histogram"
metrics.count.enable-reason: "false"
```

The following values are available in the ConfigMap:

| ConfigMap data | Value | Description |
| -------------- | ----- | ----------- |
| `metrics.taskrun.level` | `taskrun` | Level of metrics is taskrun |
| `metrics.taskrun.level` | `task` | Level of metrics is task; the taskrun label isn't present in the metrics |
| `metrics.taskrun.level` | `namespace` | Level of metrics is namespace; the task and taskrun labels aren't present in the metrics |
| `metrics.pipelinerun.level` | `pipelinerun` | Level of metrics is pipelinerun |
| `metrics.pipelinerun.level` | `pipeline` | Level of metrics is pipeline; the pipelinerun label isn't present in the metrics |
| `metrics.pipelinerun.level` | `namespace` | Level of metrics is namespace; the pipeline and pipelinerun labels aren't present in the metrics |
| `metrics.running-pipelinerun.level` | `pipelinerun` | Level of running-pipelinerun metrics is pipelinerun |
| `metrics.running-pipelinerun.level` | `pipeline` | Level of running-pipelinerun metrics is pipeline; the pipelinerun label isn't present in the metrics |
| `metrics.running-pipelinerun.level` | `namespace` | Level of running-pipelinerun metrics is namespace; the pipeline and pipelinerun labels aren't present in the metrics |
| `metrics.running-pipelinerun.level` | `""` | Level of running-pipelinerun metrics is cluster; the namespace, pipeline, and pipelinerun labels aren't present in the metrics |
| `metrics.taskrun.duration-type` | `histogram` | `tekton_pipelines_controller_pipelinerun_taskrun_duration_seconds` and `tekton_pipelines_controller_taskrun_duration_seconds` are of type histogram |
| `metrics.taskrun.duration-type` | `lastvalue` | `tekton_pipelines_controller_pipelinerun_taskrun_duration_seconds` and `tekton_pipelines_controller_taskrun_duration_seconds` are of type gauge or lastvalue |
| `metrics.pipelinerun.duration-type` | `histogram` | `tekton_pipelines_controller_pipelinerun_duration_seconds` is of type histogram |
| `metrics.pipelinerun.duration-type` | `lastvalue` | `tekton_pipelines_controller_pipelinerun_duration_seconds` is of type gauge or lastvalue |
| `metrics.count.enable-reason` | `false` | Sets whether the reason label is included on duration metrics (`*_duration_seconds`); never affects total counters (`*_total`) |
| `metrics.taskrun.throttle.enable-namespace` | `false` | Sets whether the namespace label is included on the `tekton_pipelines_controller_running_taskruns_throttled_by_quota` metric |

Histograms aren't available when the pipelinerun or taskrun level is selected; a LastValue (Gauge) is provided instead, because a histogram over a single run would contain only one bar. TaskRun- and PipelineRun-level metrics aren't recommended: they lead to unbounded cardinality, which degrades the observability database.
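The interaction between the level and the duration type can be sketched as a small lookup. This is only an illustration of the rule as stated here, not the controller's actual code, and the function name `effective_duration_type` is hypothetical.

```shell
# Illustration only: which duration type ends up being exported for a
# given level and configured duration-type, per the rule described above.
# Usage: effective_duration_type <level> <configured-duration-type>
effective_duration_type() {
  level="$1"
  configured="$2"
  case "$level" in
    taskrun|pipelinerun)
      # Per-run series hold a single observation, so a histogram is
      # not offered and the last value (gauge) is reported instead.
      echo "lastvalue"
      ;;
    *)
      # At task, pipeline, or namespace level the configured type applies.
      echo "$configured"
      ;;
  esac
}
```

Under this reading, `effective_duration_type taskrun histogram` prints `lastvalue`, while `effective_duration_type task histogram` prints `histogram`.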

Verifying metrics

```shell
kubectl port-forward -n tekton-pipelines service/tekton-pipelines-controller 9090
```

Then check that the changes have been applied to the metrics served at http://127.0.0.1:9090/metrics
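To make that check quicker, the following sketch lists the distinct Tekton metric names present in a Prometheus text-format scrape read from standard input. The function name `tekton_metric_names` is hypothetical, and the filter assumes every Tekton controller metric starts with the `tekton_pipelines_controller_` prefix used in the tables on this page.

```shell
# Hypothetical helper: read Prometheus text-format metrics on stdin and
# print the distinct metric names that carry the Tekton controller prefix.
tekton_metric_names() {
  # Keep sample lines only (comment lines start with '#'), strip the
  # label set and value, then de-duplicate.
  grep -E '^tekton_pipelines_controller_' \
    | sed -E 's/[{ ].*$//' \
    | sort -u
}
```

With the port-forward running, `curl -s http://127.0.0.1:9090/metrics | tekton_metric_names` prints one metric name per line. Histogram series appear under their `_bucket`, `_sum`, and `_count` names.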