OUTLIER-DETECTION-EDS-RELATIONSHIP.md
This document explains the relationship between Outlier Detection (passive health checking) and EDS (Endpoint Discovery Service) in Consul's service mesh implementation.
EDS (Endpoint Discovery Service) is part of Envoy's xDS (discovery service) protocol. It dynamically provides Envoy with the list of healthy endpoints (IP addresses and ports) for upstream services.
From agent/xds/clusters.go:480-482:
// Endpoints are managed separately by EDS
// Having an empty config enables outlier detection with default config.
OutlierDetection: &envoy_cluster_v3.OutlierDetection{},
Why? Outlier detection monitors the health of individual endpoints. It needs:
From agent/xds/clusters.go:1344-1365:
useEDS := true
if _, ok := cfgSnap.ConnectProxy.PeerUpstreamEndpointsUseHostnames[uid]; ok {
// If we're using local mesh gw, the fact that upstreams use hostnames don't matter.
// If we're not using local mesh gw, then resort to CDS.
if upstreamConfig.MeshGateway.Mode != structs.MeshGatewayModeLocal {
useEDS = false
}
}
// If none of the service instances are addressed by a hostname we
// provide the endpoint IP addresses via EDS
if useEDS {
c.ClusterDiscoveryType = &envoy_cluster_v3.Cluster_Type{Type: envoy_cluster_v3.Cluster_EDS}
c.EdsClusterConfig = &envoy_cluster_v3.Cluster_EdsClusterConfig{
EdsConfig: &envoy_core_v3.ConfigSource{
InitialFetchTimeout: cfgSnap.GetXDSCommonConfig(s.Logger).GetXDSFetchTimeout(),
ResourceApiVersion: envoy_core_v3.ApiVersion_V3,
ConfigSourceSpecifier: &envoy_core_v3.ConfigSource_Ads{
Ads: &envoy_core_v3.AggregatedConfigSource{},
},
},
}
}
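EDS is used when none of the upstream's instances are addressed by a hostname, or when a peered upstream does use hostnames but is reached through a local mesh gateway (MeshGatewayModeLocal).
EDS is NOT used when a peered upstream's endpoints use hostnames and the mesh gateway mode is anything other than local. In that case the endpoints are delivered through CDS.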
From agent/xds/endpoints.go:223-224:
// Also skip gateways with a hostname as their address. EDS cannot resolve hostnames,
// so we provide them through CDS instead.
Reason: EDS expects IP addresses, not DNS names. For hostname-based services, Consul embeds the endpoints directly in the CDS (Cluster Discovery Service) configuration.
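The choice between EDS and CDS-embedded endpoints can be sketched as a small predicate. This is a minimal illustration of the condition in the excerpt above, not Consul's code: `MeshGatewayMode`, its constants, and `UseEDS` are hypothetical stand-ins for the real `structs` types.

```go
package main

import "fmt"

// MeshGatewayMode is a hypothetical stand-in for Consul's structs.MeshGatewayMode.
type MeshGatewayMode string

const (
	ModeNone   MeshGatewayMode = "none"
	ModeLocal  MeshGatewayMode = "local"
	ModeRemote MeshGatewayMode = "remote"
)

// UseEDS mirrors the decision in the excerpt: hostname-addressed peer
// upstreams fall back to CDS unless traffic goes through a local mesh gateway.
func UseEDS(peerUsesHostnames bool, mode MeshGatewayMode) bool {
	if peerUsesHostnames && mode != ModeLocal {
		return false
	}
	return true
}

func main() {
	fmt.Println("IP upstream:", UseEDS(false, ModeNone))
	fmt.Println("hostname upstream, local gateway:", UseEDS(true, ModeLocal))
	fmt.Println("hostname upstream, remote gateway:", UseEDS(true, ModeRemote))
}
```

Only the last case returns false, which is the case where endpoints are embedded in the cluster itself.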
Configuration Phase
PassiveHealthCheck in service-defaults config entry
UpstreamConfig.PassiveHealthCheck
Cluster Generation (CDS)
ClusterDiscoveryType: &envoy_cluster_v3.Cluster_Type{Type: envoy_cluster_v3.Cluster_EDS}
OutlierDetection: config.ToOutlierDetection(passiveHealthCheck, override, allowZero)
Endpoint Generation (EDS)
Runtime Monitoring
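To make the runtime phase concrete, here is a deliberately simplified model of consecutive-5xx ejection. It is a sketch, not Envoy's implementation: the `Detector` type and its methods are hypothetical, and real Envoy also returns ejected hosts to the pool after an ejection period, which this model omits. The threshold of 5 matches Envoy's documented default for `consecutive_5xx`.

```go
package main

import "fmt"

// Detector is a hypothetical, simplified model of consecutive-5xx outlier detection.
type Detector struct {
	Threshold int            // consecutive 5xx responses before ejection
	failures  map[string]int // current failure streak per endpoint
	ejected   map[string]bool
}

// NewDetector builds a detector with empty per-endpoint state.
func NewDetector(threshold int) *Detector {
	return &Detector{
		Threshold: threshold,
		failures:  map[string]int{},
		ejected:   map[string]bool{},
	}
}

// Observe records one response and reports whether the endpoint is ejected afterwards.
func (d *Detector) Observe(endpoint string, status int) bool {
	if status >= 500 {
		d.failures[endpoint]++
		if d.failures[endpoint] >= d.Threshold {
			d.ejected[endpoint] = true
		}
	} else {
		d.failures[endpoint] = 0 // any non-5xx response resets the streak
	}
	return d.ejected[endpoint]
}

func main() {
	d := NewDetector(5)
	for i := 1; i <= 5; i++ {
		ejected := d.Observe("10.0.0.1:8080", 503)
		fmt.Printf("failure %d: ejected=%v\n", i, ejected)
	}
}
```

The per-endpoint maps are the reason the detector needs a concrete endpoint list: without addresses from EDS there is nothing to key the failure streaks on.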
From agent/xds/delta.go:583-589:
// 3. EDS updates (if any) must arrive after CDS updates for the respective clusters.
{TypeUrl: xdscommon.EndpointType, Upsert: true},
// 4. LDS updates must arrive after corresponding CDS/EDS updates.
{TypeUrl: xdscommon.ListenerType, Upsert: true, Remove: true},
Critical: EDS updates must come after CDS to ensure Envoy has the cluster configuration (including outlier detection settings) before receiving endpoints.
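The ordering constraint can be illustrated by sorting pending updates by resource type before sending. This is a sketch under assumptions: `Update`, `sendOrder`, and `OrderUpdates` are hypothetical names, and Consul's delta xDS server handles more resource types and removal ordering than shown here. The type URLs are the standard Envoy v3 ones.

```go
package main

import (
	"fmt"
	"sort"
)

// Standard Envoy v3 type URLs for the three resource types discussed above.
const (
	ClusterType  = "type.googleapis.com/envoy.config.cluster.v3.Cluster"
	EndpointType = "type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment"
	ListenerType = "type.googleapis.com/envoy.config.listener.v3.Listener"
)

// sendOrder ranks resource types: clusters first, then endpoints, then listeners.
var sendOrder = map[string]int{ClusterType: 0, EndpointType: 1, ListenerType: 2}

// Update is a hypothetical pending upsert for one named resource.
type Update struct {
	TypeURL string
	Name    string
}

// OrderUpdates returns a copy of updates sorted so that CDS precedes EDS
// and EDS precedes LDS. The sort is stable within a type.
func OrderUpdates(updates []Update) []Update {
	out := append([]Update(nil), updates...)
	sort.SliceStable(out, func(i, j int) bool {
		return sendOrder[out[i].TypeURL] < sendOrder[out[j].TypeURL]
	})
	return out
}

func main() {
	pending := []Update{
		{ListenerType, "public_listener"},
		{EndpointType, "db"},
		{ClusterType, "db"},
	}
	for _, u := range OrderUpdates(pending) {
		fmt.Println(u.TypeURL, u.Name)
	}
}
```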
From agent/xds/clusters.go:1316-1320:
outlierDetection := config.ToOutlierDetection(upstreamConfig.PassiveHealthCheck, nil, true)
// We can't rely on health checks for services on cluster peers because they
// don't take into account service resolvers, splitters and routers. Setting
// MaxEjectionPercent to 100% gives outlier detection the power to eject the
// entire cluster.
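Special behavior: For peered services, MaxEjectionPercent is set to 100% because health checks for services on cluster peers do not take service resolvers, splitters, and routers into account. Raising the limit gives outlier detection the power to eject every endpoint in the cluster when they are all failing.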
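To see what raising MaxEjectionPercent changes, the sketch below estimates how many endpoints may be ejected at once. It is a simplified model: `MaxEjectable` is a hypothetical helper and Envoy's exact rounding may differ. The floor of one host reflects Envoy's documented behavior that at least one host can be ejected regardless of the percentage, and 10% is Envoy's documented default.

```go
package main

import "fmt"

// MaxEjectable approximates how many of hostCount endpoints outlier detection
// may eject at the same time for a given max ejection percentage.
// Simplified model: integer floor, with a minimum of one host.
func MaxEjectable(hostCount, maxEjectionPercent int) int {
	if hostCount <= 0 {
		return 0
	}
	n := hostCount * maxEjectionPercent / 100
	if n < 1 {
		n = 1
	}
	return n
}

func main() {
	fmt.Println("10 hosts at 10%:", MaxEjectable(10, 10))
	fmt.Println("10 hosts at 100%:", MaxEjectable(10, 100))
}
```

With the default limit most of a failing peered cluster would stay in rotation; at 100% every endpoint can be removed.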
From agent/xds/clusters.go:1198-1205:
// Configure the outlier detector for upstream service
var override *structs.PassiveHealthCheck
if svc != nil {
override = svc.PassiveHealthCheck
}
outlierDetection := config.ToOutlierDetection(cfgSnap.IngressGateway.Defaults.PassiveHealthCheck, override, false)
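Two-level configuration: the ingress gateway's Defaults.PassiveHealthCheck supplies the base settings, and a service's own PassiveHealthCheck, when present, is passed as the override.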
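The gateway-default and per-service levels in the excerpt above can be pictured as a field-by-field merge. This is a sketch under assumptions: the `PassiveHealthCheck` struct here is a trimmed local stand-in, `Merge` is a hypothetical helper, and treating a zero value as "not set" is an assumption rather than a statement about how `ToOutlierDetection` behaves.

```go
package main

import (
	"fmt"
	"time"
)

// PassiveHealthCheck is a trimmed, hypothetical stand-in for the Consul struct.
type PassiveHealthCheck struct {
	MaxFailures uint32
	Interval    time.Duration
}

// Merge starts from the defaults and lets each non-zero override field win.
// Assumption: a zero value in the override means the field was not set.
func Merge(defaults, override *PassiveHealthCheck) PassiveHealthCheck {
	var out PassiveHealthCheck
	if defaults != nil {
		out = *defaults
	}
	if override != nil {
		if override.MaxFailures != 0 {
			out.MaxFailures = override.MaxFailures
		}
		if override.Interval != 0 {
			out.Interval = override.Interval
		}
	}
	return out
}

func main() {
	gatewayDefaults := &PassiveHealthCheck{MaxFailures: 5, Interval: 10 * time.Second}
	serviceOverride := &PassiveHealthCheck{MaxFailures: 3}
	fmt.Printf("%+v\n", Merge(gatewayDefaults, serviceOverride))
}
```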
consul agent -dev -log-level=debug
# View cluster configuration (includes outlier detection)
curl http://localhost:19000/config_dump | jq '.configs[1].dynamic_active_clusters'
# View endpoint health status
curl http://localhost:19000/clusters | grep -E "outlier|health_flags"
Look for ClusterDiscoveryType: EDS in cluster config:
curl http://localhost:19000/config_dump | jq '.configs[1].dynamic_active_clusters[] | select(.cluster.name=="db") | .cluster.type'
# Check Envoy stats for ejections
curl http://localhost:19000/stats | grep outlier_detection
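When the stats output needs to be checked programmatically, the `name: value` lines can be parsed directly. This is a minimal sketch: `OutlierStats` is a hypothetical helper, the sample text in `main` is illustrative rather than captured from a real proxy, and the cluster name `db` is assumed.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// OutlierStats extracts outlier-detection counters and gauges from the plain
// text returned by Envoy's /stats admin endpoint ("name: value" per line).
func OutlierStats(statsText string) map[string]uint64 {
	out := map[string]uint64{}
	sc := bufio.NewScanner(strings.NewReader(statsText))
	for sc.Scan() {
		name, value, ok := strings.Cut(sc.Text(), ": ")
		if !ok || !strings.Contains(name, ".outlier_detection.") {
			continue
		}
		n, err := strconv.ParseUint(strings.TrimSpace(value), 10, 64)
		if err != nil {
			continue // skip non-numeric values such as histograms
		}
		out[name] = n
	}
	return out
}

func main() {
	// Illustrative sample only; real output contains many more lines.
	sample := "cluster.db.outlier_detection.ejections_active: 1\n" +
		"cluster.db.outlier_detection.ejections_enforced_total: 3\n" +
		"cluster.db.upstream_rq_total: 120\n"
	stats := OutlierStats(sample)
	fmt.Println("active ejections:", stats["cluster.db.outlier_detection.ejections_active"])
}
```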
Set breakpoints in:
agent/xds/clusters.go:1316 - Where outlier detection is configured
agent/xds/config/config.go:207 - ToOutlierDetection conversion
agent/xds/endpoints.go:32 - Endpoint generation
Outlier Detection and EDS are tightly coupled:
Key Takeaway: For outlier detection to work properly, services must use IP-based discovery (EDS), not hostname-based discovery (CDS with embedded endpoints).