install/kubernetes/helm/meshery/HEALTHCHECKS.md
This document provides guidance on configuring Kubernetes health checks for Meshery deployments using the enhanced /healthz/live and /healthz/ready endpoints.
Meshery implements Kubernetes-compliant health check endpoints that follow best practices from the Kubernetes API server:
- /healthz/live - Liveness probe endpoint
- /healthz/ready - Readiness probe endpoint

Both endpoints support a ?verbose=1 query parameter for detailed health check information.
Liveness probe (/healthz/live)

The liveness probe checks whether the Meshery server is running and responsive. Returns:

- "ok" when the server is alive

Readiness probe (/healthz/ready)

The readiness probe checks whether Meshery is ready to accept traffic. It performs:
Returns:
Note: Extension status is informational and does not affect the readiness state.
The Meshery Helm chart includes pre-configured health checks in values.yaml:
probe:
  livenessProbe:
    enabled: true
    initialDelaySeconds: 80
    periodSeconds: 12
    failureThreshold: 4
  readinessProbe:
    enabled: true
    initialDelaySeconds: 10
    periodSeconds: 4
    failureThreshold: 4
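As a rough guide to what these defaults mean in practice, the sketch below estimates when the last tolerated probe failure would land for a container that never passes its probe. It assumes the first probe fires exactly at initialDelaySeconds and ignores probe timeouts and kubelet scheduling jitter, so treat the numbers as approximations. `probe_deadline` is a hypothetical helper name, not part of Meshery or kubectl.

```shell
# Hypothetical helper: estimate seconds from container start until the
# Nth consecutive probe failure, where N = failureThreshold.
# Assumption: the first probe runs at initialDelaySeconds and each later
# probe follows exactly periodSeconds after the previous one.
probe_deadline() {
  initial_delay=$1
  period=$2
  failure_threshold=$3
  echo $(( initial_delay + (failure_threshold - 1) * period ))
}

# Chart defaults from values.yaml
probe_deadline 80 12 4   # liveness: roughly when a restart would be triggered
probe_deadline 10 4 4    # readiness: roughly when the 4th failed probe lands
```

With the default values this prints 116 and 22, which suggests a server needing more than about two minutes to come up would want a larger liveness initialDelaySeconds or failureThreshold.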
When installing Meshery for the first time:
- Liveness Probe: Set initialDelaySeconds to allow time for the server to start and load configurations
  - 80-120 seconds for first startup
- Readiness Probe: Can have a shorter delay as it will automatically retry
  - 10-30 seconds
  - Use a shorter periodSeconds (e.g., 4-5) for faster readiness detection

Example for new installations:
probe:
  livenessProbe:
    enabled: true
    initialDelaySeconds: 120 # Allow more time for initial setup
    periodSeconds: 15
    failureThreshold: 5
  readinessProbe:
    enabled: true
    initialDelaySeconds: 20
    periodSeconds: 5
    failureThreshold: 4
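One way to apply these first-install values without editing values.yaml is to pass them as `--set` overrides. The following is a minimal sketch: the release name `meshery`, the namespace `meshery`, and the chart reference `meshery/meshery` are assumptions (the last one presumes a chart repository has already been added under the alias `meshery`), and `install_meshery_fresh` is a hypothetical wrapper name.

```shell
# Sketch: install Meshery with the first-install probe values.
# Release name, namespace, and chart reference are assumptions; adjust
# them to match your environment.
install_meshery_fresh() {
  helm install meshery meshery/meshery \
    --namespace meshery --create-namespace \
    --set probe.livenessProbe.initialDelaySeconds=120 \
    --set probe.livenessProbe.periodSeconds=15 \
    --set probe.livenessProbe.failureThreshold=5 \
    --set probe.readinessProbe.initialDelaySeconds=20 \
    --set probe.readinessProbe.periodSeconds=5 \
    --set probe.readinessProbe.failureThreshold=4
}
```

Calling `install_meshery_fresh` runs the install. Keeping the overrides in a separate values file and passing it with `-f` is an equally reasonable choice when the settings need to be version-controlled.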
When upgrading an existing Meshery deployment:
Recommended settings for upgrades:
probe:
  livenessProbe:
    enabled: true
    initialDelaySeconds: 60 # Shorter than initial install
    periodSeconds: 12
    failureThreshold: 4
  readinessProbe:
    enabled: true
    initialDelaySeconds: 15 # Allow time for capability reload
    periodSeconds: 5
    failureThreshold: 6 # Higher threshold during upgrades
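The upgrade settings could be applied and verified in one step, as in the sketch below. It assumes a release named `meshery` in the `meshery` namespace, a chart reference of `meshery/meshery`, and a five-minute rollout timeout chosen purely for illustration; `upgrade_meshery` is a hypothetical wrapper name.

```shell
# Sketch: apply the upgrade-time probe values, then block until the new
# pods report ready or the timeout expires.
# Release, chart, namespace, and timeout are assumptions.
upgrade_meshery() {
  helm upgrade meshery meshery/meshery \
    --namespace meshery \
    --set probe.livenessProbe.initialDelaySeconds=60 \
    --set probe.livenessProbe.periodSeconds=12 \
    --set probe.livenessProbe.failureThreshold=4 \
    --set probe.readinessProbe.initialDelaySeconds=15 \
    --set probe.readinessProbe.periodSeconds=5 \
    --set probe.readinessProbe.failureThreshold=6 &&
    kubectl rollout status deployment/meshery \
      --namespace meshery --timeout=5m
}
```

Because `kubectl rollout status` only succeeds once the replacement pods pass their readiness probe, a non-zero exit from `upgrade_meshery` is a quick signal that the probe settings or the new version need a closer look.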
During troubleshooting, you can manually check health status with verbose output:
# Check liveness with details
kubectl exec -n meshery deployment/meshery -- curl -s "http://localhost:8080/healthz/live?verbose=1"
# Check readiness with details
kubectl exec -n meshery deployment/meshery -- curl -s "http://localhost:8080/healthz/ready?verbose=1"
Example verbose output:
[+]capabilities ok
[i]extension extension package found
healthz check passed
Where:
- [+] indicates a passing health check
- [-] indicates a failing health check
- [i] indicates informational status (doesn't affect health)

For specific deployment scenarios, you can customize probe settings:
probe:
  livenessProbe:
    enabled: true
    initialDelaySeconds: 90
    periodSeconds: 10
    failureThreshold: 3
    timeoutSeconds: 5
  readinessProbe:
    enabled: true
    initialDelaySeconds: 15
    periodSeconds: 3
    failureThreshold: 3
    successThreshold: 1
    timeoutSeconds: 3

probe:
  livenessProbe:
    enabled: true
    initialDelaySeconds: 150 # More time for slower startup
    periodSeconds: 20
    failureThreshold: 5
  readinessProbe:
    enabled: true
    initialDelaySeconds: 30
    periodSeconds: 10
    failureThreshold: 6
For better handling of slow-starting containers, consider adding a startup probe:
# Add to deployment.yaml template
{{- if .Values.probe.startupProbe.enabled }}
startupProbe:
  httpGet:
    path: /healthz/live
    port: http
  initialDelaySeconds: {{ .Values.probe.startupProbe.initialDelaySeconds }}
  periodSeconds: {{ .Values.probe.startupProbe.periodSeconds }}
  failureThreshold: {{ .Values.probe.startupProbe.failureThreshold }}
{{- end }}
And in values.yaml:
probe:
  startupProbe:
    enabled: true
    initialDelaySeconds: 0
    periodSeconds: 10
    failureThreshold: 30 # 30 * 10s = 5 minutes max startup time
Pods stuck in CrashLoopBackOff

- Check whether initialDelaySeconds is too short

Pods not becoming ready

Frequent restarts

- Increase failureThreshold to tolerate temporary issues
- Increase periodSeconds to reduce probe frequency

# Check pod status
kubectl get pods -n meshery
# View pod events
kubectl describe pod -n meshery <pod-name>
# Check health endpoint directly (port-forward blocks, so run curl from a second terminal)
kubectl port-forward -n meshery deployment/meshery 8080:8080
curl "http://localhost:8080/healthz/ready?verbose=1"
# View container logs
kubectl logs -n meshery deployment/meshery -f
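When the verbose output is long, it can help to show only the failing checks. The sketch below is a small filter under the assumption that failing checks appear as lines beginning with `[-]`, mirroring the `[+]` and `[i]` lines in the example output; `failing_checks` is a hypothetical helper name.

```shell
# Hypothetical helper: read verbose health output on stdin and print only
# the failing checks (lines starting with "[-]").
# Returns 0 when no failing checks are found, 1 otherwise.
failing_checks() {
  # grep exits 0 when it finds a match, so invert that for the caller.
  if grep '^\[-]'; then
    return 1
  fi
  return 0
}

# Demonstration using the example verbose output from this document
printf '%s\n' \
  '[+]capabilities ok' \
  '[i]extension extension package found' \
  'healthz check passed' | failing_checks && echo "no failing checks"
```

Against a live deployment, the same filter could be fed from the port-forwarded endpoint, for example `curl -s "http://localhost:8080/healthz/ready?verbose=1" | failing_checks`.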
If upgrading from a version without enhanced health checks: