SERVICE-DEFAULTS-CONFIGURATION-GUIDE.md
This guide explains how to configure service-defaults config entries in Consul, with a focus on outlier detection (passive health checking).
service-defaults is a Consul config entry that defines default settings for a service, such as its protocol and the passive health check settings applied to its upstreams.
Create a file web-defaults.hcl:
Kind = "service-defaults"
Name = "web"
Protocol = "http"
UpstreamConfig {
Defaults {
# Apply to all upstreams of this service
PassiveHealthCheck {
Interval = "30s"
MaxFailures = 5
EnforcingConsecutive5xx = 100
MaxEjectionPercent = 10
BaseEjectionTime = "30s"
}
}
Overrides = [
{
# Override for specific upstream
Name = "database"
PassiveHealthCheck {
Interval = "10s"
MaxFailures = 3
EnforcingConsecutive5xx = 100
MaxEjectionPercent = 50
BaseEjectionTime = "60s"
}
}
]
}
Apply it:
consul config write web-defaults.hcl
Or apply the same defaults through the HTTP API:
curl -X PUT http://localhost:8500/v1/config \
-H "Content-Type: application/json" \
-d '{
"Kind": "service-defaults",
"Name": "web",
"Protocol": "http",
"UpstreamConfig": {
"Defaults": {
"PassiveHealthCheck": {
"Interval": "30s",
"MaxFailures": 5,
"EnforcingConsecutive5xx": 100,
"MaxEjectionPercent": 10,
"BaseEjectionTime": "30s"
}
}
}
}'
In your service registration file:
services {
name = "web"
port = 8080
connect {
sidecar_service {
proxy {
upstreams = [
{
destination_name = "database"
local_bind_port = 5432
config {
passive_health_check {
interval = "22s"
max_failures = 4
enforcing_consecutive_5xx = 99
max_ejection_percent = 50
base_ejection_time = "60s"
}
}
}
]
}
}
}
}
Register it:
consul services register web-service.hcl
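Or use the Go API client:
package main
import (
"time"

"github.com/hashicorp/consul/api"
)
func main() {
// Connect to the local Consul agent with the default client configuration.
client, clientErr := api.NewClient(api.DefaultConfig())
if clientErr != nil {
panic(clientErr)
}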
interval := 30 * time.Second
maxFailures := uint32(5)
enforcing := uint32(100)
maxEjection := uint32(10)
baseEjection := 30 * time.Second
entry := &api.ServiceConfigEntry{
Kind: api.ServiceDefaults,
Name: "web",
Protocol: "http",
UpstreamConfig: &api.UpstreamConfiguration{
Defaults: &api.UpstreamConfig{
PassiveHealthCheck: &api.PassiveHealthCheck{
Interval: interval,
MaxFailures: maxFailures,
EnforcingConsecutive5xx: &enforcing,
MaxEjectionPercent: &maxEjection,
BaseEjectionTime: &baseEjection,
},
},
},
}
_, _, err := client.ConfigEntries().Set(entry, nil)
if err != nil {
panic(err)
}
}
| Parameter | Type | Description | Default |
|---|---|---|---|
| Interval | duration | Time between health check analysis sweeps | - |
| MaxFailures | uint32 | Consecutive failures before ejection | - |
| EnforcingConsecutive5xx | uint32 | % chance of ejection (0-100) | 100 |
| MaxEjectionPercent | uint32 | Max % of cluster that can be ejected | 10 |
| BaseEjectionTime | duration | Base ejection duration (multiplied by ejection count) | 30s |
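Consul applies outlier detection configuration in this order (highest to lowest priority): the upstream config in the service registration, then a matching UpstreamConfig.Overrides entry in service-defaults, then UpstreamConfig.Defaults in service-defaults.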
Kind = "service-defaults"
Name = "api"
Protocol = "http"
UpstreamConfig {
Defaults {
PassiveHealthCheck {
Interval = "10s"
MaxFailures = 3
}
}
}
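This enables outlier detection with a 10s analysis interval and ejection after 3 consecutive failures. Parameters that are not set keep the defaults listed in the parameter table.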
Kind = "service-defaults"
Name = "payment-service"
Protocol = "http"
UpstreamConfig {
Defaults {
PassiveHealthCheck {
Interval = "5s"
MaxFailures = 2
EnforcingConsecutive5xx = 100
MaxEjectionPercent = 50
BaseEjectionTime = "120s"
}
}
}
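This configuration analyzes hosts every 5s, ejects a host after 2 consecutive failures, always enforces the ejection (100%), allows up to 50% of the upstream's hosts to be ejected at once, and keeps an ejected host out for at least 120s.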
Kind = "service-defaults"
Name = "frontend"
Protocol = "http"
UpstreamConfig {
# Default for all upstreams
Defaults {
PassiveHealthCheck {
Interval = "30s"
MaxFailures = 5
}
}
# Stricter settings for critical database
Overrides = [
{
Name = "postgres"
PassiveHealthCheck {
Interval = "10s"
MaxFailures = 2
MaxEjectionPercent = 30
}
},
{
Name = "redis"
PassiveHealthCheck {
Interval = "5s"
MaxFailures = 3
}
}
]
}
consul config read -kind service-defaults -name web
# Get cluster configuration
curl http://localhost:19000/config_dump | jq '.configs[1].dynamic_active_clusters[] | select(.cluster.name=="db") | .cluster.outlier_detection'
Expected output:
{
"interval": "22s",
"consecutive_5xx": 4,
"enforcing_consecutive_5xx": 99,
"max_ejection_percent": 50,
"base_ejection_time": "60s"
}
# Check Envoy stats for outlier detection
curl http://localhost:19000/stats | grep outlier_detection
# Example output:
# cluster.db.outlier_detection.ejections_active: 0
# cluster.db.outlier_detection.ejections_consecutive_5xx: 2
# cluster.db.outlier_detection.ejections_total: 2
Apply to all services:
Kind = "service-defaults"
Name = "*"
UpstreamConfig {
Defaults {
PassiveHealthCheck {
Interval = "30s"
MaxFailures = 5
}
}
}
To effectively disable outlier detection for a service, set the enforcement to 0 and the failure threshold very high:
Kind = "service-defaults"
Name = "legacy-service"
UpstreamConfig {
Defaults {
PassiveHealthCheck {
MaxFailures = 999999
EnforcingConsecutive5xx = 0
}
}
}
Start with low enforcement, increase gradually:
# Week 1: 25% enforcement
PassiveHealthCheck {
EnforcingConsecutive5xx = 25
}
# Week 2: 50% enforcement
PassiveHealthCheck {
EnforcingConsecutive5xx = 50
}
# Week 3: 100% enforcement
PassiveHealthCheck {
EnforcingConsecutive5xx = 100
}
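The effect of partial enforcement can be simulated. The Go sketch below models EnforcingConsecutive5xx as the percentage chance that a host which has crossed the failure threshold is actually ejected, as described in the parameter table. It is a simplified model rather than Envoy's code; `shouldEject` and the fixed random seed are assumptions made for the illustration.

```go
package main

import (
	"fmt"
	"math/rand"
)

// shouldEject models the enforcement percentage: a host that crossed the
// failure threshold is ejected only when the roll (0-99) is below it.
func shouldEject(enforcing uint32, roll uint32) bool {
	return roll < enforcing
}

func main() {
	rng := rand.New(rand.NewSource(1)) // fixed seed so runs are repeatable
	const trials = 10000

	for _, pct := range []uint32{25, 50, 100} {
		ejected := 0
		for i := 0; i < trials; i++ {
			if shouldEject(pct, uint32(rng.Intn(100))) {
				ejected++
			}
		}
		fmt.Printf("enforcing=%d%%: %d of %d threshold hits led to an ejection\n",
			pct, ejected, trials)
	}
}
```

At 25% enforcement roughly three out of four threshold hits are only counted, not acted on, which makes the first week a low-risk way to see how often ejections would trigger.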
Check that the config entry has been applied (consul config read) and that Envoy has received it (/config_dump).
Solution: Increase MaxEjectionPercent or MaxFailures:
PassiveHealthCheck {
MaxFailures = 10
MaxEjectionPercent = 30
}
Solution: Decrease BaseEjectionTime:
PassiveHealthCheck {
BaseEjectionTime = "10s"
}
Start with a high MaxFailures and a low MaxEjectionPercent.
Tune Interval based on request volume.