Dapr 1.15.4

This update includes bug fixes:

Fix degradation of Workflow runtime performance over time
Fix remote Actor invocation 500 retry
Fix Global Actors Enabled Configuration
Prevent panic of reminder operations on slow Actor Startup
Remove client-side rate limiter from Sentry
Allow Service Account for MetalBear mirrord operator in sidecar injector
Fix Scheduler Client connection pruning

Fix degradation of Workflow runtime performance over time

Problem

Running a Workflow app multiple times would cause the performance of the Workflow runtime to degrade significantly over multiple runs.

Impact

Workflow applications would not complete in a timely manner.

Root cause

There was an issue whereby Scheduler client (daprd) connections where not properly pruned from the connection pool for a given Namespace's appID/actorTypes set. This would lead to jobs/actor reminders being sent to stale client connections that were no longer active. This caused Jobs to fail, and enter failure policy retry loops.

Solution

Refactor the Scheduler connection pool logic to properly prune stale connections to prevent job execution occurring on stale connections and causing failure policy loops.

Fix remote Actor invocation 500 retry

Problem

An actor invocation across hosts which result in a 500 HTTP header response code would result in the request being retried 5 times.

Impact

Services which return a 500 HTTP header response code would result in requests under normal operation to return slowly, and request the service on the same request multiple times.

Root cause

The Actor engine considered a 500 HTTP header response code to be a retriable error, rather than a successful request which returned a non-200 status code.

Solution

Remove the 500 HTTP header response code from the list of retriable errors.

Problem

Fix Global Actors Enabled Configuration

Problem

When global.actors.enabled was set to false via Helm or the environment variable ACTORS_ENABLED=false, the Dapr sidecar would still attempt to connect to the placement service, causing readiness probe failures and repeatedly logged errors about failing to connect to placement. Fixes this issue.

Impact

Dapr sidecars would fail their readiness probes and log errors like:

Failed to connect to placement dns:///dapr-placement-server.dapr-system.svc.cluster.local:50005: failed to create placement client: rpc error: code = Unavailable desc = last resolver error: produced zero addresses

Root cause

The sidecar injector was not properly respecting the global actors enabled configuration when setting up the placement service connection.

Solution

The sidecar injector now properly respects the global.actors.enabled helm configuration and ACTORS_ENABLED environment variable. When set to false, it will not attempt to connect to the placement service, allowing the sidecar to start successfully without actor functionality.

Prevent panic of reminder operations on slow Actor Startup

Problem

The Dapr runtime HTTP server would panic if a reminder operation timed out while an Actor was starting up.

Impact

The HTTP server would panic, causing degraded performance.

Root cause

The Dapr runtime would attempt to use the reminder service before it was initialized.

Solution

Correctly return an errors that the actor runtime was not ready in time for the reminder operation.

Remove client-side rate limiter from Sentry

Problem

A cold start of many Dapr deployments would take a long time, and even cause some crash loops.

Impact

A large Dapr deployment would take a non-linear more amount of time that a smaller one to completely roll out.

Root cause

The Sentry Kubernetes client was configured with a rate limiter which would be exhausted when services all new Dapr deployment at once, cause many client to wait significantly.

Solution

Remove the client-side rate limiting from the Sentry Kubernetes client.

Allow Service Account for MetalBear mirrord operator in sidecar injector

Problem

Mirrord Operator is not on the allow list of Service Accounts for the dapr sidecar injector.

Impact

Running mirrord in copy_target mode would cause the pod to initalise without the dapr container.

Root cause

Mirrord Operator is not on the allow list of Service Accounts for the dapr sidecar injector.

Solution

Add the Mirrord Operator into the allow list of Service Accounts for the dapr sidecar injector.

Fix Scheduler Client connection pruning

Problem

Daprd would attempt to connect to stale Scheduler addresses.

Impact

Network resource usage and error reporting from service mesh sidecars.

Root cause

Daprd would not close Scheduler gRPC connections to hosts which no longer exist.

Solution

Daprd now closes connections to Scheduler hosts when they are no longer in the list of active hosts.