⚠️ Repo Archive Notice

As of Nov 13, 2020, charts in this repo will no longer be updated. For more information, see the Helm Charts Deprecation and Archive Notice, and Update.

Apache Spark Helm Chart

Apache Spark is a fast, general-purpose cluster computing system. This chart also deploys Apache Zeppelin alongside it.

Inspired by the Helm Classic chart: https://github.com/helm/charts

DEPRECATION NOTICE

This chart is deprecated and no longer supported.

Chart Details

This chart will do the following:

  • 1 x Spark Master with port 8080 exposed on an external LoadBalancer
  • 3 x Spark Workers with a HorizontalPodAutoscaler that scales to a maximum of 10 pods when CPU usage reaches 50% of the 100m request
  • 1 x Zeppelin with port 8080 exposed on an external LoadBalancer
  • All using Kubernetes Deployments
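
To make the worker autoscaling threshold in the list above concrete, here is a small arithmetic sketch. It assumes the autoscaler compares average CPU usage per worker pod against a percentage of the requested CPU; the variable names are illustrative and not part of the chart.

```bash
# Values taken from the chart defaults: Worker.Cpu=100m, Worker.CpuTargetPercentage=50
cpu_request_m=100   # requested CPU per worker, in millicores
target_pct=50       # HPA CPU target, in percent of the request

# Average per-pod CPU usage above which the HPA would add workers
threshold_m=$(( cpu_request_m * target_pct / 100 ))
echo "scale-out threshold: ${threshold_m}m average CPU per worker pod"
```

Note that the Worker parameter table lists Worker.Autoscaling with a default of false, so this threshold only applies once autoscaling is enabled.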

Prerequisites

Installing the Chart

To install the chart with the release name my-release:

```bash
$ helm install --name my-release stable/spark
```
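
The command above uses the Helm 2 `--name` flag. As a rough sketch for a Helm 3 client, where the release name is a positional argument, the install might look like the function below. It assumes the archived stable repository is served from https://charts.helm.sh/stable; the function name `install_spark_helm3` is hypothetical, and this has not been verified against this deprecated chart.

```bash
# Hypothetical helper: install the chart with a Helm 3 client.
install_spark_helm3() {
  # Register the archived "stable" repository (URL is an assumption, see above)
  helm repo add stable https://charts.helm.sh/stable
  helm repo update
  # Helm 3 takes the release name as the first positional argument
  helm install "$1" stable/spark
}

# Usage: install_spark_helm3 my-release
```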

Configuration

The following table lists the configurable parameters of the Spark chart and their default values.

Spark Master

| Parameter | Description | Default |
| --- | --- | --- |
| Master.Name | Spark master name | spark-master |
| Master.Image | Container image name | k8s.gcr.io/spark |
| Master.ImageTag | Container image tag | 1.5.1_v3 |
| Master.Replicas | k8s deployment replicas | 1 |
| Master.Component | k8s selector key | spark-master |
| Master.Cpu | container requested cpu | 100m |
| Master.Memory | container requested memory | 512Mi |
| Master.ServicePort | k8s service port | 7077 |
| Master.ContainerPort | Container listening port | 7077 |
| Master.DaemonMemory | Master JVM Xms and Xmx option | 1g |
| Master.ServiceType | Kubernetes Service type | LoadBalancer |

Spark WebUi

| Parameter | Description | Default |
| --- | --- | --- |
| WebUi.Name | Spark webui name | spark-webui |
| WebUi.ServicePort | k8s service port | 8080 |
| WebUi.ContainerPort | Container listening port | 8080 |

Spark Worker

| Parameter | Description | Default |
| --- | --- | --- |
| Worker.Name | Spark worker name | spark-worker |
| Worker.Image | Container image name | k8s.gcr.io/spark |
| Worker.ImageTag | Container image tag | 1.5.1_v3 |
| Worker.Replicas | k8s hpa and deployment replicas | 3 |
| Worker.ReplicasMax | k8s hpa max replicas | 10 |
| Worker.Component | k8s selector key | spark-worker |
| Worker.Cpu | container requested cpu | 100m |
| Worker.Memory | container requested memory | 512Mi |
| Worker.ContainerPort | Container listening port | 7077 |
| Worker.CpuTargetPercentage | k8s hpa cpu targetPercentage | 50 |
| Worker.DaemonMemory | Worker JVM Xms and Xmx setting | 1g |
| Worker.ExecutorMemory | Worker memory available for executor | 1g |
| Worker.Autoscaling | Enable horizontal pod autoscaling | false |

Zeppelin

| Parameter | Description | Default |
| --- | --- | --- |
| Zeppelin.Name | Zeppelin name | zeppelin-controller |
| Zeppelin.Image | Container image name | apache/zeppelin |
| Zeppelin.ImageTag | Container image tag | 0.7.3 |
| Zeppelin.Replicas | k8s deployment replicas | 1 |
| Zeppelin.Component | k8s selector key | zeppelin |
| Zeppelin.Cpu | container requested cpu | 100m |
| Zeppelin.ServicePort | k8s service port | 8080 |
| Zeppelin.ContainerPort | Container listening port | 8080 |
| Zeppelin.Ingress.Enabled | if true, an ingress is created | false |
| Zeppelin.Ingress.Annotations | annotations for the ingress | {} |
| Zeppelin.Ingress.Path | the ingress path | / |
| Zeppelin.Ingress.Hosts | a list of ingress hosts | [zeppelin.example.com] |
| Zeppelin.Ingress.Tls | a list of IngressTLS items | [] |
| Zeppelin.ServiceType | Kubernetes Service type | LoadBalancer |
| Zeppelin.Persistence.Config.Enabled | Enable Persistence for configuration | false |
| Zeppelin.Persistence.Config.StorageClass | Volume storageClassName | - (no dynamic provisioning) |
| Zeppelin.Persistence.Config.Size | Configuration Persistence Size | 10G |
| Zeppelin.Persistence.Config.AccessMode | Configuration Persistence AccessMode | ReadWriteOnce |
| Zeppelin.Persistence.Notebook.Enabled | Enable Persistence for notebook | false |
| Zeppelin.Persistence.Notebook.StorageClass | Volume storageClassName | - (no dynamic provisioning) |
| Zeppelin.Persistence.Notebook.Size | Notebook Persistence Size | 10G |
| Zeppelin.Persistence.Notebook.AccessMode | Notebook Persistence AccessMode | ReadWriteOnce |
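
As an illustration of how the Zeppelin ingress parameters fit together, the following sketch writes a small override file. It assumes the dotted parameter names map to nested YAML keys, as they do with Helm's `--set`; the file name `zeppelin-ingress.yaml` is hypothetical, and the host is the chart's documented placeholder.

```bash
# Write an override file that enables the Zeppelin ingress.
# Keys mirror the dotted parameter names (Zeppelin.Ingress.*).
cat > zeppelin-ingress.yaml <<'EOF'
Zeppelin:
  Ingress:
    Enabled: true
    Path: /
    Hosts:
      - zeppelin.example.com
EOF
```

Parameters not listed in the file keep their chart defaults.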

Specify each parameter using the --set key=value[,key=value] argument to helm install.
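
For instance, several worker parameters can be overridden in one `--set` argument. The sketch below only assembles and prints the command so it can be reviewed before it is run against a cluster; the override values are illustrative, not recommendations.

```bash
# Comma-separated overrides using parameter names from the Worker table
overrides="Worker.Autoscaling=true,Worker.ReplicasMax=8,Worker.ExecutorMemory=2g"

# Assemble the install command and print it for review
cmd="helm install --name my-release --set ${overrides} stable/spark"
echo "$cmd"
```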

Alternatively, a YAML file that specifies the values for the parameters can be provided while installing the chart. For example,

```bash
$ helm install --name my-release -f values.yaml stable/spark
```

Tip: You can use the default values.yaml as a starting point.
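
A minimal sketch of a custom values file, assuming the dotted parameter names in the tables correspond to nested YAML keys. The file name `my-values.yaml` and the chosen values are illustrative only.

```bash
# Write a small override file: more workers, and persistent Zeppelin notebooks.
cat > my-values.yaml <<'EOF'
Worker:
  Replicas: 5
  ExecutorMemory: 2g
Zeppelin:
  Persistence:
    Notebook:
      Enabled: true
      Size: 10G
EOF

# The file can then be passed to the install command with -f:
#   helm install --name my-release -f my-values.yaml stable/spark
```

Keeping overrides in a file rather than in a long `--set` string makes them easier to review and to keep under version control.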