docs/release_notes/v1.15.4.md
This update includes bug fixes:
Running a Workflow app multiple times would cause the performance of the Workflow runtime to degrade significantly over successive runs.
Workflow applications would not complete in a timely manner.
There was an issue whereby Scheduler client (daprd) connections were not properly pruned from the connection pool for a given Namespace's appID/actorTypes set. This would lead to jobs/actor reminders being sent to stale client connections that were no longer active. This caused Jobs to fail and enter failure policy retry loops.
Refactor the Scheduler connection pool logic to properly prune stale connections to prevent job execution occurring on stale connections and causing failure policy loops.
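The pruning idea can be illustrated with a minimal sketch. This is not the Scheduler's actual code: the `pool` and `conn` types, the `closed` flag, and the `"namespace/appID"` key format are all hypothetical names chosen for illustration.

```go
package main

import "sync"

// conn is a hypothetical stand-in for a Scheduler client (daprd) connection.
type conn struct {
	id     int
	closed bool
}

// pool tracks connections per key. The key is an assumed
// "namespace/appID" string, not the real Scheduler data structure.
type pool struct {
	mu    sync.Mutex
	conns map[string][]*conn
}

func newPool() *pool {
	return &pool{conns: make(map[string][]*conn)}
}

// add registers a connection under the given key.
func (p *pool) add(key string, c *conn) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.conns[key] = append(p.conns[key], c)
}

// prune drops every closed connection and deletes keys that are left
// empty, so pick does not return them after a prune.
func (p *pool) prune() {
	p.mu.Lock()
	defer p.mu.Unlock()
	for key, list := range p.conns {
		live := make([]*conn, 0, len(list))
		for _, c := range list {
			if !c.closed {
				live = append(live, c)
			}
		}
		if len(live) == 0 {
			delete(p.conns, key)
			continue
		}
		p.conns[key] = live
	}
}

// pick returns the first pooled connection for key, or false if none remains.
func (p *pool) pick(key string) (*conn, bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	list := p.conns[key]
	if len(list) == 0 {
		return nil, false
	}
	return list[0], true
}
```

In this sketch a job is only ever dispatched to a connection returned by `pick`, so once `prune` has run, a closed connection can no longer receive a job and trigger the failure policy.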
An actor invocation across hosts which resulted in a 500 HTTP header response code would be retried 5 times.
Services which returned a 500 HTTP header response code would cause requests under normal operation to return slowly, and the service would be called multiple times for the same request.
The Actor engine considered a 500 HTTP header response code to be a retriable error, rather than a successful request which returned a non-200 status code.
Remove the 500 HTTP header response code from the list of retriable errors.
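The distinction between a failed request and a delivered request that returned a non-200 status can be sketched as follows. This is an illustration under assumptions, not the Actor engine's implementation: `invoke`, `errTransport`, and the rule "retry only when the call itself returns an error" are hypothetical.

```go
package main

import "errors"

// errTransport is a hypothetical error meaning the request never
// reached the remote actor host.
var errTransport = errors.New("transport failure")

// invoke calls fn up to maxAttempts times. It retries only when fn
// returns an error. A response that carries a status code, including
// 500, is treated as a delivered request and returned as-is.
func invoke(maxAttempts int, fn func() (int, error)) (status int, attempts int, err error) {
	for attempts < maxAttempts {
		attempts++
		status, err = fn()
		if err == nil {
			// Delivered: the status code is the application's answer.
			return status, attempts, nil
		}
	}
	return status, attempts, err
}
```

With this rule, an application that answers 500 is called once and its response is passed back to the caller, while a call that fails with `errTransport` is still retried until the attempt limit is reached.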
When global.actors.enabled was set to false via Helm or the environment variable ACTORS_ENABLED=false, the Dapr sidecar would still attempt to connect to the placement service, causing readiness probe failures and repeatedly logged errors about failing to connect to placement.
Fixes this issue.
Dapr sidecars would fail their readiness probes and log errors like:
Failed to connect to placement dns:///dapr-placement-server.dapr-system.svc.cluster.local:50005: failed to create placement client: rpc error: code = Unavailable desc = last resolver error: produced zero addresses
The sidecar injector was not properly respecting the global actors enabled configuration when setting up the placement service connection.
The sidecar injector now properly respects the global.actors.enabled Helm configuration and the ACTORS_ENABLED environment variable. When set to false, it will not attempt to connect to the placement service, allowing the sidecar to start successfully without actor functionality.
The Dapr runtime HTTP server would panic if a reminder operation timed out while an Actor was starting up.
The HTTP server would panic, causing degraded performance.
The Dapr runtime would attempt to use the reminder service before it was initialized.
Correctly return an error indicating that the actor runtime was not ready in time for the reminder operation.
A cold start of many Dapr deployments would take a long time, and even cause some crash loops.
A large Dapr deployment would take disproportionately (non-linearly) longer than a smaller one to roll out completely.
The Sentry Kubernetes client was configured with a rate limiter which would be exhausted when servicing all new Dapr deployments at once, causing many clients to wait significantly.
Remove the client-side rate limiting from the Sentry Kubernetes client.
Mirrord Operator is not on the allow list of Service Accounts for the dapr sidecar injector.
Running mirrord in copy_target mode would cause the pod to initialise without the dapr container.
Mirrord Operator is not on the allow list of Service Accounts for the dapr sidecar injector.
Add the Mirrord Operator into the allow list of Service Accounts for the dapr sidecar injector.
Daprd would attempt to connect to stale Scheduler addresses.
This caused unnecessary network resource usage and error reporting from service mesh sidecars.
Daprd would not close Scheduler gRPC connections to hosts which no longer exist.
Daprd now closes connections to Scheduler hosts when they are no longer in the list of active hosts.