Introduction

This chart bootstraps a monitoring system on a Kubernetes cluster using the Helm package manager. The monitoring system can be used to monitor an Alluxio cluster running on Kubernetes.

Pre-requisites

Kubernetes

Kubernetes 1.11+ with Beta APIs enabled

Install the Chart

To install the Monitor Chart into your Kubernetes cluster:

$ helm install --namespace "alluxio" "alluxio-monitor" monitor

After the installation succeeds, you can check the status of the release in the namespace it was installed into:

$ helm status --namespace "alluxio" "alluxio-monitor"

Uninstall the Chart

To delete the release, use this command:

$ helm uninstall --namespace "alluxio" "alluxio-monitor"

With Helm 2, the equivalent command is helm delete --purge "alluxio-monitor".

Configuration

The monitoring system is built on Prometheus and Grafana. Its resource files are in the monitor/source directory. Before installing the monitor chart, review the following configuration and adjust it to your environment.

1. source/grafana/datasource.yaml

The host name in the Grafana datasource URL is [MONITORNAME]-prometheus. For example, if the monitor release is named alluxio-monitor, the host name is alluxio-monitor-prometheus:

datasources:
  - name: Prometheus
    ...
    url: http://alluxio-monitor-prometheus:9090 
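
The naming rule above can be sketched as a small shell helper that derives the URL from the release name and rewrites the url line of a datasource file. The function name set_datasource_url, the release name my-monitor, and the scratch file are illustrative assumptions and not part of the chart. Port 9090 is the Prometheus port listed in this README.

```shell
#!/bin/sh
# Hypothetical helper (not part of the chart): point the Grafana datasource
# at the Prometheus service of a given monitor release.
set_datasource_url() {
  release="$1"   # Helm release name, e.g. alluxio-monitor
  file="$2"      # path to a datasource.yaml file
  url="http://${release}-prometheus:9090"
  # Replace the value of every "url:" line, keeping its indentation.
  sed "s|^\([[:space:]]*url:\).*|\1 ${url}|" "$file" > "${file}.tmp" &&
    mv "${file}.tmp" "$file"
}

# Demo on a scratch file so the real chart sources stay untouched.
demo_ds="$(mktemp)"
printf 'datasources:\n  - name: Prometheus\n    url: http://old-name-prometheus:9090\n' > "$demo_ds"
set_datasource_url "my-monitor" "$demo_ds"
cat "$demo_ds"
```

To apply it to the chart itself, the call would be set_datasource_url "alluxio-monitor" monitor/source/grafana/datasource.yaml, assuming the file keeps a single url entry as in the excerpt above.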

2. source/prometheus/prometheus.yaml

Set the namespace of each Prometheus job to the namespace of the Alluxio cluster you want to monitor. For example, if the Alluxio cluster is installed in the alluxio namespace, edit prometheus.yaml as follows:

scrape_configs:
  - job_name: 'alluxio master'
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names:
            - alluxio
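
Because every job needs the same change, the edit can also be scripted. The following is a minimal sketch that rewrites the first entry under each names: key. The helper name set_scrape_namespace, the second job name, and the scratch file are assumptions made for illustration. It also assumes each names: list holds a single namespace, as in the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the chart): set the namespace that each
# Prometheus job scrapes, by rewriting the first item under every "names:".
set_scrape_namespace() {
  ns="$1"     # namespace of the Alluxio cluster
  file="$2"   # path to a prometheus.yaml file
  awk -v ns="$ns" '
    # A "names:" key: print it and remember that a list item should follow.
    /^[ \t]*names:[ \t]*$/ { print; in_names = 1; next }
    # The first list item after "names:": swap in the new namespace.
    in_names && /^[ \t]*- / { sub(/- .*/, "- " ns); print; in_names = 0; next }
    # Any other line is passed through unchanged.
    { in_names = 0; print }
  ' "$file" > "${file}.tmp" && mv "${file}.tmp" "$file"
}

# Demo on a scratch file with two jobs that still point at "default".
demo_prom="$(mktemp)"
cat > "$demo_prom" <<'EOF'
scrape_configs:
  - job_name: 'alluxio master'
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names:
            - default
  - job_name: 'alluxio worker'
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names:
            - default
EOF
set_scrape_namespace "alluxio" "$demo_prom"
cat "$demo_prom"
```

A text-based rewrite like this is only safe for files shaped like the excerpt. For anything more complex, editing the file by hand or with a YAML-aware tool is the safer choice.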

3. Enable the alluxio metrics

The monitor relies on the Prometheus podAnnotations defined in the metrics section of '../alluxio/values.yaml', so metrics must be enabled before the Alluxio chart is installed. Once they are enabled, the monitor can track the target Alluxio cluster.

metrics:
  enabled: true
  ...
  PrometheusMetricsServlet:
    enabled: true
  # Pod annotations for Prometheus
  podAnnotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "19999"
    prometheus.io/jobPort: "20002"
    prometheus.io/workerPort: "30000"
    prometheus.io/path: "/metrics/prometheus/"
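
As a rough way to check that metrics are actually exposed, the port and path annotations can be combined into the URL that a scraper would request. The sketch below only builds that URL. The helper name metrics_url and the pod IP 10.0.0.12 are placeholders, and it assumes every annotated port serves the same path.

```shell
#!/bin/sh
# Hypothetical helper (not part of either chart): build the URL that serves
# Prometheus-format metrics for one pod, from the annotation values above.
metrics_url() {
  host="$1"                           # pod IP or resolvable host name
  port="${2:-19999}"                  # defaults to prometheus.io/port
  path="${3:-/metrics/prometheus/}"   # defaults to prometheus.io/path
  printf 'http://%s:%s%s\n' "$host" "$port" "$path"
}

# 10.0.0.12 is a placeholder pod IP used only for illustration.
metrics_url "10.0.0.12"
metrics_url "10.0.0.12" 30000

# From a shell that can reach the pod network, the endpoint could then be
# probed with, for example:
#   curl -s "$(metrics_url "$POD_IP")" | head
```

If the request returns metric lines, the servlet is enabled and the monitor should be able to scrape the pod once its namespace is configured.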

4. Download the alluxio dashboard

Download the Alluxio dashboard from Alluxio grafana dashboard V1, then move the dashboard file to the monitor/source/grafana/dashboard directory.

Helm Chart Values

Full documentation can be found in the comments of the values.yaml file, but a high-level overview is provided here.

Common Values:

| Parameter | Description | Default |
| --- | --- | --- |
| fullnameOverride | To replace the generated name | alluxio-monitor |
| imagePullPolicy | Docker image pull policy | IfNotPresent |
| grafanaConfig.name[0] | Grafana dashboard config name | grafana-dashboard-config |
| grafanaConfig.path[0] | Grafana dashboard config path in the image container | /etc/grafana/provisioning/dashboards |
| grafanaConfig.name[1] | Grafana datasource config name | grafana-datasource-config |
| grafanaConfig.path[1] | Grafana datasource config path in the image container | /etc/grafana/provisioning/datasources |
| prometheusConfig.name | Prometheus config name | prometheus-config |
| prometheusConfig.path | Prometheus config path in the image container | /etc/prometheus |

Prometheus values:

| Parameter | Description | Default |
| --- | --- | --- |
| imageInfo.image | The prometheus docker image | prom/prometheus |
| imageInfo.tag | The prometheus image tag | latest |
| port.TCP | The prometheus default listen address | 9090 |
| args | The prometheus config args, see values.yaml for detail explanation | --config.file=/etc/prometheus/prometheus.yml --storage.tsdb.path=/prometheus --storage.tsdb.retention=72h --web.listen-address=:9090 |
| hostNetwork | Controls whether the pod may use the node network namespace | false |
| dnsPolicy | dnsPolicy will be ClusterFirstWithHostNet if hostNetwork: true and ClusterFirst if hostNetwork: false | ClusterFirst |
| resources.limits.cpu | CPU Limit | 4 |
| resources.limits.memory | Memory Limit | 4G |
| resources.requests.cpu | CPU Request | 1 |
| resources.requests.memory | Memory Request | 1G |

Grafana values:

| Parameter | Description | Default |
| --- | --- | --- |
| imageInfo.image | The grafana docker image | grafana/grafana |
| imageInfo.tag | The grafana image tag | latest |
| env.GF_AUTH_BASIC_ENABLED | Environment variable of grafana to enable basic authentication | true |
| env.GF_AUTH_ANONYMOUS_ENABLED | Environment variable of grafana to disable anonymous authentication | false |
| port.web | The grafana web port | 9090 |
| port.hostPort | The hostPort export node port to visit the grafana web | 8081 |
| hostNetwork | Controls whether the pod may use the node network namespace | false |
| dnsPolicy | dnsPolicy will be ClusterFirstWithHostNet if hostNetwork: true and ClusterFirst if hostNetwork: false | ClusterFirst |
| resources.limits.cpu | CPU Limit | 2 |
| resources.limits.memory | Memory Limit | 2G |
| resources.requests.cpu | CPU Request | 0.5 |
| resources.requests.memory | Memory Request | 1G |