docs/strategies/retry.md
The retry reactive resilience strategy, which you add to a pipeline with the AddRetry extension method, re-executes the same callback method if its execution fails. A failure can be either an Exception or a result object that indicates unsuccessful processing. Between retry attempts the strategy waits for a specified amount of time, and you have fine-grained control over how the next delay is calculated. The strategy stops invoking the callback when it reaches the maximum allowed number of retry attempts, or when an exception is thrown or a result is returned that the strategy does not handle.
```cs
// Retry using the default options.
// See https://www.pollydocs.org/strategies/retry#defaults for defaults.
var optionsDefaults = new RetryStrategyOptions();

// For instant retries with no delay
var optionsNoDelay = new RetryStrategyOptions
{
    Delay = TimeSpan.Zero
};

// For advanced control over the retry behavior, including the number of attempts,
// delay between retries, and the types of exceptions to handle.
var optionsComplex = new RetryStrategyOptions
{
    ShouldHandle = new PredicateBuilder().Handle<SomeExceptionType>(),
    BackoffType = DelayBackoffType.Exponential,
    UseJitter = true, // Adds a random factor to the delay
    MaxRetryAttempts = 4,
    Delay = TimeSpan.FromSeconds(3),
};

// To use a custom function to generate the delay for retries
var optionsDelayGenerator = new RetryStrategyOptions
{
    MaxRetryAttempts = 2,
    DelayGenerator = static args =>
    {
        var delay = args.AttemptNumber switch
        {
            0 => TimeSpan.Zero,
            1 => TimeSpan.FromSeconds(1),
            _ => TimeSpan.FromSeconds(5)
        };

        // This example uses a synchronous delay generator,
        // but the API also supports asynchronous implementations.
        return new ValueTask<TimeSpan?>(delay);
    }
};

// To extract the delay from the result object
var optionsExtractDelay = new RetryStrategyOptions<HttpResponseMessage>
{
    DelayGenerator = static args =>
    {
        if (args.Outcome.Result is HttpResponseMessage responseMessage &&
            TryGetDelay(responseMessage, out TimeSpan delay))
        {
            return new ValueTask<TimeSpan?>(delay);
        }

        // Returning null means the retry strategy will use its internal delay for this attempt.
        return new ValueTask<TimeSpan?>((TimeSpan?)null);
    }
};

// To get notifications when a retry is performed
var optionsOnRetry = new RetryStrategyOptions
{
    MaxRetryAttempts = 2,
    OnRetry = static args =>
    {
        Console.WriteLine("OnRetry, Attempt: {0}", args.AttemptNumber);

        // Event handlers can be asynchronous; here, we return an empty ValueTask.
        return default;
    }
};

// To keep retrying indefinitely or until success use int.MaxValue.
var optionsIndefiniteRetry = new RetryStrategyOptions
{
    MaxRetryAttempts = int.MaxValue,
};

// Add a retry strategy with a RetryStrategyOptions{<TResult>} instance to the pipeline
new ResiliencePipelineBuilder().AddRetry(optionsDefaults);
new ResiliencePipelineBuilder<HttpResponseMessage>().AddRetry(optionsExtractDelay);
```
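The TryGetDelay helper used in the delay-extraction example is not defined in the snippet. One possible shape for it is sketched below, assuming the server communicates the wait time through the standard Retry-After response header. The class name `RetryAfterSketch` and the decision to ignore dates that are already in the past are illustrative assumptions, not part of Polly.

```cs
using System;
using System.Net.Http;

public static class RetryAfterSketch
{
    // Hypothetical helper: reads the Retry-After header of a response.
    // Returns true and sets 'delay' when the header carries a usable value.
    public static bool TryGetDelay(HttpResponseMessage response, out TimeSpan delay)
    {
        var retryAfter = response.Headers.RetryAfter;

        // Retry-After expressed as a number of seconds.
        if (retryAfter?.Delta is TimeSpan delta && delta >= TimeSpan.Zero)
        {
            delay = delta;
            return true;
        }

        // Retry-After expressed as an absolute date.
        if (retryAfter?.Date is DateTimeOffset date)
        {
            var remaining = date - DateTimeOffset.UtcNow;
            if (remaining > TimeSpan.Zero)
            {
                delay = remaining;
                return true;
            }
        }

        // No usable header: the caller can fall back to the strategy's own delay.
        delay = TimeSpan.Zero;
        return false;
    }
}
```

When the helper returns false, the DelayGenerator shown earlier returns null and the strategy falls back to its internal delay calculation.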
| Property | Default Value | Description |
|---|---|---|
| ShouldHandle | Any exceptions other than OperationCanceledException. | Defines a predicate to determine what results and/or exceptions are handled by the retry strategy. |
| MaxRetryAttempts | 3 | The maximum number of retry attempts to use, in addition to the original call. |
| BackoffType | Constant | The back-off algorithm type to generate the delay(s) between retry attempts. |
| Delay | 2 seconds | The base delay between retry attempts. See the next section for more details. |
| MaxDelay | null | If provided then the strategy caps the calculated retry delay to this value. |
| UseJitter | False | If set to true, a jitter (random value) is added to retry delays. See the next section for more details. |
| DelayGenerator | null | This optional delegate allows you to dynamically calculate the retry delay by utilizing information that is only available at runtime (like the attempt number). |
| OnRetry | null | If provided then it will be invoked before the strategy delays the next attempt. |
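ShouldHandle can cover results as well as exceptions. The following is a minimal sketch, assuming an HTTP scenario in which a 429 (Too Many Requests) response should be retried together with HttpRequestException. The status code and the variable names are only examples.

```cs
using System.Net;
using System.Net.Http;
using Polly;
using Polly.Retry;

// Retry when the callback throws HttpRequestException or returns HTTP 429.
var optionsHandleResult = new RetryStrategyOptions<HttpResponseMessage>
{
    ShouldHandle = new PredicateBuilder<HttpResponseMessage>()
        .Handle<HttpRequestException>()
        .HandleResult(static response => response.StatusCode == HttpStatusCode.TooManyRequests),
};

// Result-based predicates need the generic pipeline builder.
var resultAwarePipeline = new ResiliencePipelineBuilder<HttpResponseMessage>()
    .AddRetry(optionsHandleResult)
    .Build();
```

All other properties keep the defaults from the table, so this pipeline performs up to three retries with a constant two-second delay.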
The retry strategy reports the following telemetry events:
| Event Name | Event Severity | When? |
|---|---|---|
| ExecutionAttempt | Information / Warning / Error | Just before the strategy calculates the next delay |
| OnRetry | Warning | Just before the strategy calls the OnRetry delegate |
Here are some sample events:
If the retry strategy does not perform any retries then the reported telemetry events' severity will be Information:
```none
Execution attempt. Source: 'MyPipeline/MyPipelineInstance/MyRetryStrategy', Operation Key: 'MyRetryableOperation', Result: '1', Handled: 'False', Attempt: '0', Execution Time: 110.952ms
```

```none
Execution attempt. Source: 'MyPipeline/MyPipelineInstance/MyRetryStrategy', Operation Key: 'MyRetryableOperation', Result: 'Failed', Handled: 'False', Attempt: '0', Execution Time: 5.2194ms
    System.Exception: Failed
        at Program.<>c.<Main>b__0_1(ResilienceContext ctx)
        ...
        at Polly.ResiliencePipeline.<>c.<<ExecuteAsync>b__1_0>d.MoveNext() in /_/src/Polly.Core/ResiliencePipeline.Async.cs:line 67
```
If the retry strategy performs some retries then the reported telemetry events' severity will be Warning. If the retry strategy runs out of retry attempts then the last event's severity will be Error:
```none
Execution attempt. Source: 'MyPipeline/MyPipelineInstance/MyRetryStrategy', Operation Key: 'MyRetryableOperation', Result: 'Failed', Handled: 'True', Attempt: '0', Execution Time: 5.0397ms
    System.Exception: Failed
        at Program.<>c.<Main>b__0_1(ResilienceContext ctx)
        ...
        at Polly.ResiliencePipeline.<>c.<<ExecuteAsync>b__1_0>d.MoveNext() in /_/src/Polly.Core/ResiliencePipeline.Async.cs:line 67

Resilience event occurred. EventName: 'OnRetry', Source: 'MyPipeline/MyPipelineInstance/MyRetryStrategy', Operation Key: 'MyRetryableOperation', Result: 'Failed'
    System.Exception: Failed
        at Program.<>c.<Main>b__0_1(ResilienceContext ctx)
        ...
        at Polly.ResiliencePipeline.<>c.<<ExecuteAsync>b__1_0>d.MoveNext() in /_/src/Polly.Core/ResiliencePipeline.Async.cs:line 67

Execution attempt. Source: 'MyPipeline/MyPipelineInstance/MyRetryStrategy', Operation Key: 'MyRetryableOperation', Result: 'Failed', Handled: 'True', Attempt: '1', Execution Time: 0.1159ms
    System.Exception: Failed
        at Program.<>c.<Main>b__0_1(ResilienceContext ctx)
        ...
        at Polly.ResiliencePipeline.<>c.<<ExecuteAsync>b__1_0>d.MoveNext() in /_/src/Polly.Core/ResiliencePipeline.Async.cs:line 67
```
[!NOTE] The OnRetry telemetry event is reported only if the retry strategy performs at least one retry attempt. The Execution attempt event, on the other hand, is always reported, regardless of whether the strategy has to perform any retries. Also remember that Attempt: '0' relates to the original execution attempt. Only the last event has Error severity, and only if that attempt was unsuccessful.
For further information please check out the telemetry page.
If the ShouldHandle predicate returns true and the next attempt number is not greater than MaxRetryAttempts then the retry strategy calculates the next delay.
There are many properties that may contribute to this calculation:
- BackoffType: Specifies which calculation algorithm should run.
- Delay: If only this property is specified then it will be used as-is. If others are also specified then this will be used as a base delay.
- DelayGenerator: If specified, overrides other property-based calculations, except if it returns null or a negative TimeSpan, in which case the other property-based calculations are used.
- MaxDelay: If specified, caps the delay if the calculated delay is greater than this value, except if DelayGenerator is used, where no capping is applied.
- UseJitter: If enabled, adds a random value between -25% and +25% of the calculated Delay, except if BackoffType is Exponential, where a DecorrelatedJitterBackoffV2 formula is used for jitter calculation.
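The interplay of these properties can be sketched in a few lines of plain C#. This is a simplified illustration and not Polly's implementation: the `BackoffKind` enum, the `DelaySketch.NextDelay` helper, and its parameters are hypothetical names, the attempt number is assumed to be zero-based (as in the telemetry output, where Attempt: '0' is the original call), and jitter is left out to keep the results deterministic.

```cs
using System;

// Hypothetical stand-in for the back-off type, used only in this sketch.
public enum BackoffKind { Constant, Linear, Exponential }

public static class DelaySketch
{
    public static TimeSpan NextDelay(
        BackoffKind backoff,
        int attempt,                      // zero-based attempt number (assumption)
        TimeSpan baseDelay,
        TimeSpan? maxDelay = null,
        TimeSpan? generatedDelay = null)  // what a delay generator returned, if any
    {
        // A non-negative generated delay overrides everything else and is not capped.
        if (generatedDelay is TimeSpan generated && generated >= TimeSpan.Zero)
        {
            return generated;
        }

        // Otherwise the back-off type decides how the base delay grows.
        TimeSpan calculated = backoff switch
        {
            BackoffKind.Constant => baseDelay,
            BackoffKind.Linear => TimeSpan.FromMilliseconds(baseDelay.TotalMilliseconds * (attempt + 1)),
            BackoffKind.Exponential => TimeSpan.FromMilliseconds(baseDelay.TotalMilliseconds * Math.Pow(2, attempt)),
            _ => throw new ArgumentOutOfRangeException(nameof(backoff)),
        };

        // The cap only applies to the property-based calculation.
        return maxDelay is TimeSpan cap && calculated > cap ? cap : calculated;
    }
}
```

With a base delay of one second, attempts 0 to 4 yield 1000, 2000, 4000, 8000 and 16000 ms in the exponential case, and a 4500 ms cap turns the linear series into 1000, 2000, 3000, 4000 and 4500 ms.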
[!IMPORTANT] The summarized description below is an implementation detail. It may change in the future without notice.
The BackoffType property's data type is the DelayBackoffType enumeration. This primarily controls how the calculation is done.
```mermaid
stateDiagram-v2

    state if_state_step1 <<choice>>
    state if_state_step2 <<choice>>
    state if_state_step3 <<choice>>

    constant: Delay
    constantWJitter: Delay + Random
    compare: MaxDelay < BaseDelay
    setBase: Set BaseDelay
    setNormalized: Set NormalizedDelay
    setNext: Set NextDelay

    UseJitter --> if_state_step1
    if_state_step1 --> constantWJitter:true
    if_state_step1 --> constant: false
    constantWJitter --> setBase
    constant --> setBase

    setBase --> compare
    compare --> if_state_step2
    if_state_step2 --> MaxDelay: true
    if_state_step2 --> BaseDelay: false
    MaxDelay --> setNormalized
    BaseDelay --> setNormalized

    setNormalized --> DelayGenerator
    DelayGenerator --> if_state_step3
    if_state_step3 --> GeneratedDelay: positive
    if_state_step3 --> NormalizedDelay: null or negative
    GeneratedDelay --> setNext
    NormalizedDelay --> setNext
    setNext --> [*]
```
The delays column contains an example series of five values to depict the patterns.
| Settings | Delays in milliseconds |
|---|---|
| Delay: 1sec | [ 1000, 1000, 1000, 1000, 1000 ] |
| Delay: 1sec, UseJitter: true | [ 986, 912, 842, 972, 1007 ] |
| Delay: 1sec, UseJitter: true, MaxDelay: 1100ms | [ 1100, 978, 1100, 1041, 916 ] |
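The jittered values in the table stay within 25% of the one-second base delay. A rough sketch of such a symmetric jitter follows, assuming a randomizer that returns values in the range [0, 1). `JitterSketch` and `ApplyJitter` are hypothetical names; this illustrates the described range rather than reproducing Polly's code.

```cs
using System;

public static class JitterSketch
{
    // Shifts the delay by a random amount between -25% and +25% of its value.
    // 'randomizer' is expected to return a value in the range [0, 1).
    public static TimeSpan ApplyJitter(TimeSpan delay, Func<double> randomizer)
    {
        const double JitterFactor = 0.5; // total width of the jitter window: 50% of the delay

        // Half of the window lies below the delay and half above it.
        var offset = delay.TotalMilliseconds * JitterFactor / 2;
        var randomPart = (delay.TotalMilliseconds * JitterFactor * randomizer()) - offset;

        return TimeSpan.FromMilliseconds(delay.TotalMilliseconds + randomPart);
    }
}
```

For real use the randomizer could be `Random.Shared.NextDouble`. A randomizer that returns 0 gives the lower bound of 750 ms for a one-second delay, and 0.5 leaves the delay unchanged.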
```mermaid
stateDiagram-v2

    state if_state_step1 <<choice>>
    state if_state_step2 <<choice>>
    state if_state_step3 <<choice>>

    linear: Delay * AttemptNumber
    linearWJitter: (Delay * AttemptNumber) + Random
    compare: MaxDelay < BaseDelay
    setBase: Set BaseDelay
    setNormalized: Set NormalizedDelay
    setNext: Set NextDelay

    UseJitter --> if_state_step1
    if_state_step1 --> linearWJitter:true
    if_state_step1 --> linear: false
    linearWJitter --> setBase
    linear --> setBase

    setBase --> compare
    compare --> if_state_step2
    if_state_step2 --> MaxDelay: true
    if_state_step2 --> BaseDelay: false
    MaxDelay --> setNormalized
    BaseDelay --> setNormalized

    setNormalized --> DelayGenerator
    DelayGenerator --> if_state_step3
    if_state_step3 --> GeneratedDelay: positive
    if_state_step3 --> NormalizedDelay: null or negative
    GeneratedDelay --> setNext
    NormalizedDelay --> setNext
    setNext --> [*]
```
The delays column contains an example series of five values to depict the patterns.
[!NOTE] Because the jitter calculation is based on the newly calculated delay, the new delay could be less than the previous value.
| Settings | Delays in milliseconds |
|---|---|
| Delay: 1sec | [ 1000, 2000, 3000, 4000, 5000 ] |
| Delay: 1sec, UseJitter: true | [ 1129, 2147, 2334, 4894, 4102 ] |
| Delay: 1sec, UseJitter: true, MaxDelay: 4500ms | [ 907, 2199, 2869, 4500, 4500 ] |
```mermaid
stateDiagram-v2

    state if_state_step1 <<choice>>
    state if_state_step2 <<choice>>
    state if_state_step3 <<choice>>

    exponential: Delay * 2^AttemptNumber
    exponentialWJitter: Decorrelated Jitter Backoff V2
    compare: MaxDelay < BaseDelay
    setBase: Set BaseDelay
    setNormalized: Set NormalizedDelay
    setNext: Set NextDelay

    UseJitter --> if_state_step1
    if_state_step1 --> exponentialWJitter:true
    if_state_step1 --> exponential: false
    exponentialWJitter --> setBase
    exponential --> setBase

    setBase --> compare
    compare --> if_state_step2
    if_state_step2 --> MaxDelay: true
    if_state_step2 --> BaseDelay: false
    MaxDelay --> setNormalized
    BaseDelay --> setNormalized

    setNormalized --> DelayGenerator
    DelayGenerator --> if_state_step3
    if_state_step3 --> GeneratedDelay: positive
    if_state_step3 --> NormalizedDelay: null or negative
    GeneratedDelay --> setNext
    NormalizedDelay --> setNext
    setNext --> [*]
```
The delays column contains an example series of five values to depict the patterns.
[!NOTE] Because the jitter calculation is based on the newly calculated delay, the new delay could be less than the previous value.
| Settings | Delays in milliseconds |
|---|---|
| Delay: 1sec | [ 1000, 2000, 4000, 8000, 16000 ] |
| Delay: 1sec, UseJitter: true | [ 393, 1453, 4235, 5369, 16849 ] |
| Delay: 1sec, UseJitter: true, MaxDelay: 15000ms | [ 477, 793, 2227, 5651, 15000 ] |
[!TIP] For more details please check out the RetryHelper and the RetryResilienceStrategy classes.
Let's suppose we have a retry strategy with MaxRetryAttempts: 2.
```mermaid
sequenceDiagram
    actor C as Caller
    participant P as Pipeline
    participant R as Retry
    participant D as DecoratedUserCallback

    C->>P: Calls ExecuteAsync
    P->>R: Calls ExecuteCore
    Note over R,D: Initial attempt
    R->>+D: Invokes
    D->>-R: Fails
    R-->>R: Sleeps
    Note over R,D: 1st retry attempt
    R->>+D: Invokes
    D->>-R: Returns result
    R->>P: Returns result
    P->>C: Returns result
```

```mermaid
sequenceDiagram
    actor C as Caller
    participant P as Pipeline
    participant R as Retry
    participant D as DecoratedUserCallback

    C->>P: Calls ExecuteAsync
    P->>R: Calls ExecuteCore
    Note over R,D: Initial attempt
    R->>+D: Invokes
    D->>-R: Fails
    R-->>R: Sleeps
    Note over R,D: 1st retry attempt
    R->>+D: Invokes
    D->>-R: Fails
    R-->>R: Sleeps
    Note over R,D: 2nd retry attempt
    R->>+D: Invokes
    D->>-R: Fails
    R->>P: Propagates failure
    P->>C: Propagates failure
```
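Both diagrams boil down to a loop in which MaxRetryAttempts counts retries on top of the initial attempt, so MaxRetryAttempts: 2 allows at most three invocations. Below is a simplified sketch of that control flow. It is not Polly's implementation: `RetryLoopSketch` is a hypothetical name, every exception is treated as handled, and a constant delay is assumed.

```cs
using System;
using System.Threading.Tasks;

public static class RetryLoopSketch
{
    // Invokes 'callback' once, then up to 'maxRetryAttempts' more times on failure.
    // The callback receives the zero-based attempt number.
    public static async Task<T> ExecuteAsync<T>(
        Func<int, Task<T>> callback,
        int maxRetryAttempts,
        TimeSpan delay)
    {
        for (var attempt = 0; ; attempt++)
        {
            try
            {
                // Attempt 0 is the initial attempt; 1..maxRetryAttempts are retries.
                return await callback(attempt);
            }
            catch (Exception) when (attempt < maxRetryAttempts)
            {
                // Retries remain: sleep, then loop again.
                await Task.Delay(delay);
            }

            // When no retries remain the filter is false and the failure propagates.
        }
    }
}
```

In the first diagram the callback succeeds on attempt 1 and the result is returned. In the second it fails on attempts 0, 1 and 2, and the last exception reaches the caller.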
In some cases, you might want to set a limit on the calculated delay. This is beneficial when multiple retries are anticipated, and you wish to prevent excessive wait times between these retries.
Consider the following example of a long-running background job:
<!-- snippet: retry-pattern-max-delay -->
```cs
ResiliencePipeline pipeline = new ResiliencePipelineBuilder()
    .AddRetry(new()
    {
        Delay = TimeSpan.FromSeconds(2),
        MaxRetryAttempts = int.MaxValue,
        BackoffType = DelayBackoffType.Exponential,

        // Initially, we aim for an exponential backoff, but after a certain number of retries, we set a maximum delay of 15 minutes.
        MaxDelay = TimeSpan.FromMinutes(15),
        UseJitter = true
    })
    .Build();

// Background processing
while (!cancellationToken.IsCancellationRequested)
{
    await pipeline.ExecuteAsync(async token =>
    {
        // In the event of a prolonged service outage, we can afford to wait for a successful retry since this is a background task.
        await SynchronizeDataAsync(token);
    },
    cancellationToken);

    await Task.Delay(TimeSpan.FromMinutes(30)); // The sync runs every 30 minutes.
}
```
Over the years, many developers have used Polly in various ways. Some of these recurring patterns may not be ideal. The sections below highlight anti-patterns to avoid.
❌ DON'T
Overuse Handle/HandleResult:
```cs
var retry = new ResiliencePipelineBuilder()
    .AddRetry(new()
    {
        ShouldHandle = new PredicateBuilder()
            .Handle<HttpRequestException>()
            .Handle<BrokenCircuitException>()
            .Handle<TimeoutRejectedException>()
            .Handle<SocketException>()
            .Handle<RateLimitRejectedException>(),
        MaxRetryAttempts = 3,
    })
    .Build();
```
Reasoning:
Using multiple Handle/HandleResult methods is redundant. Instead of specifying to retry if the decorated code throws a certain exception repeatedly, it's more efficient to state that retries should occur if any of the retryable exceptions are thrown.
✅ DO
Use collections and simple predicate functions:
<!-- snippet: retry-pattern-overusing-builder -->
```cs
ImmutableArray<Type> networkExceptions =
[
    typeof(SocketException),
    typeof(HttpRequestException),
];

ImmutableArray<Type> strategyExceptions =
[
    typeof(TimeoutRejectedException),
    typeof(BrokenCircuitException),
    typeof(RateLimitRejectedException),
];

ImmutableArray<Type> retryableExceptions = [.. networkExceptions.Union(strategyExceptions)];

var retry = new ResiliencePipelineBuilder()
    .AddRetry(new()
    {
        ShouldHandle = args =>
            ValueTask.FromResult(args.Outcome.Exception is not null &&
                retryableExceptions.Contains(args.Outcome.Exception.GetType())),
        MaxRetryAttempts = 3,
    })
    .Build();
```
Reasoning:
Grouping exceptions simplifies the configuration and improves reusability. For example, the networkExceptions array can be reused in various strategies such as retry, circuit breaker, and more.
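One caveat of a Contains(GetType()) check is that it matches the listed types exactly, so an exception type derived from one of them is not retried. If derived types should count as well, a small helper along the following lines could be shared between strategies. `ExceptionGroups.Matches` is a hypothetical name and not part of Polly.

```cs
using System;
using System.Collections.Immutable;

public static class ExceptionGroups
{
    // Returns true when 'exception' is an instance of any listed type,
    // including types derived from a listed type.
    public static bool Matches(Exception? exception, ImmutableArray<Type> handledTypes)
    {
        if (exception is null)
        {
            return false;
        }

        foreach (var type in handledTypes)
        {
            if (type.IsInstanceOfType(exception))
            {
                return true;
            }
        }

        return false;
    }
}
```

A ShouldHandle delegate could then read `args => ValueTask.FromResult(ExceptionGroups.Matches(args.Outcome.Exception, retryableExceptions))`, and the same call works unchanged for other strategies that accept a predicate.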
❌ DON'T
Use a retry strategy to run indefinitely at a specified interval:
<!-- snippet: retry-anti-pattern-periodic-execution -->
```cs
var retry = new ResiliencePipelineBuilder()
    .AddRetry(new()
    {
        ShouldHandle = _ => ValueTask.FromResult(true),
        Delay = TimeSpan.FromHours(24),
    })
    .Build();
```
Reasoning:
The waiting period can be either blocking or non-blocking, based on the defined strategy/pipeline. Even when it is not used in a blocking manner, it unnecessarily consumes memory that can't be reclaimed by the garbage collector.
✅ DO
Use a suitable tool to schedule recurring tasks, such as Quartz.Net, Hangfire, or others.
Reasoning:
❌ DON'T
Mix increasing values with constant ones:
<!-- snippet: retry-anti-pattern-sleeping-strategies -->
```cs
var retry = new ResiliencePipelineBuilder()
    .AddRetry(new()
    {
        DelayGenerator = args =>
        {
            var delay = args.AttemptNumber switch
            {
                <= 5 => TimeSpan.FromSeconds(Math.Pow(2, args.AttemptNumber)),
                _ => TimeSpan.FromMinutes(3)
            };

            return new ValueTask<TimeSpan?>(delay);
        }
    })
    .Build();
```
Reasoning:
Using this approach essentially turns the logic into a state machine. Although this offers a concise way to express sleep durations, it has disadvantages, most visibly that the sleep duration logic is tightly coupled to AttemptNumber.
✅ DO
Use two distinct retry strategy options and combine them:
<!-- snippet: retry-pattern-sleeping-strategies -->
```cs
var slowRetries = new RetryStrategyOptions
{
    MaxRetryAttempts = 5,
    Delay = TimeSpan.FromMinutes(3),
    BackoffType = DelayBackoffType.Constant
};

var quickRetries = new RetryStrategyOptions
{
    MaxRetryAttempts = 5,
    Delay = TimeSpan.FromSeconds(1),
    UseJitter = true,
    BackoffType = DelayBackoffType.Exponential
};

var retry = new ResiliencePipelineBuilder()
    .AddRetry(slowRetries)
    .AddRetry(quickRetries)
    .Build();
```
Reasoning:
Suppose you have an HttpClient and you want to add a retry only for specific endpoints.
❌ DON'T
Use ResiliencePipeline.Empty and the ?: operator:
```cs
var retry =
    IsRetryable(request.RequestUri)
        ? new ResiliencePipelineBuilder<HttpResponseMessage>().AddRetry(new()).Build()
        : ResiliencePipeline<HttpResponseMessage>.Empty;
```
Reasoning:
The triggering conditions and logic are spread across different sections. This design is not ideal for extensibility since adding more conditions can make the code less readable.
✅ DO
Use the ShouldHandle clause to define the triggering logic:
```cs
var retry = new ResiliencePipelineBuilder<HttpResponseMessage>()
    .AddRetry(new()
    {
        ShouldHandle = _ => ValueTask.FromResult(IsRetryable(request.RequestUri))
    })
    .Build();
```
Reasoning:
❌ DON'T
Call a specific method before Execute/ExecuteAsync:
```cs
var retry = new ResiliencePipelineBuilder()
    .AddRetry(new()
    {
        OnRetry = args =>
        {
            BeforeEachAttempt();
            return ValueTask.CompletedTask;
        },
    })
    .Build();

BeforeEachAttempt();
await retry.ExecuteAsync(DoSomething);
```
Reasoning:
- The OnRetry function is triggered before each retry attempt, but it doesn't activate before the initial attempt since that is not considered a retry.
- As a consequence, the caller has to remember to add the BeforeEachAttempt call before every Execute.

✅ DO
Group the two method calls:
<!-- snippet: retry-pattern-calling-method-before -->
```cs
var retry = new ResiliencePipelineBuilder()
    .AddRetry(new())
    .Build();

await retry.ExecuteAsync(ct =>
{
    BeforeEachAttempt();
    return DoSomething(ct);
});
```
Reasoning:
If DoSomething and BeforeEachAttempt are interdependent, group them or declare a simple wrapper to invoke them in the correct sequence.
Suppose we have an HttpClient that issues a request and then we try to parse a large JSON response.
❌ DON'T
Use a single strategy for everything:
<!-- snippet: retry-anti-pattern-multiple-failures -->
```cs
var builder = new ResiliencePipelineBuilder()
    .AddRetry(new()
    {
        ShouldHandle = new PredicateBuilder().Handle<HttpRequestException>(),
        MaxRetryAttempts = 3
    });

builder.AddTimeout(TimeSpan.FromMinutes(1));

var pipeline = builder.Build();
await pipeline.ExecuteAsync(static async (httpClient, ct) =>
{
    var stream = await httpClient.GetStreamAsync(new Uri("endpoint"), ct);
    var foo = await JsonSerializer.DeserializeAsync<Foo>(stream, cancellationToken: ct);
},
httpClient);
```
Reasoning:
Previously, it was suggested that you should combine X and Y only if they are part of the same failure domain. In simpler terms, a pipeline should address only one type of failure.
✅ DO
Define a strategy for each failure domain:
<!-- snippet: retry-pattern-multiple-failures -->
```cs
var retry = new ResiliencePipelineBuilder()
    .AddRetry(new()
    {
        ShouldHandle = new PredicateBuilder().Handle<HttpRequestException>(),
        MaxRetryAttempts = 3
    })
    .Build();

var stream = await retry.ExecuteAsync(
    static async (httpClient, ct) =>
        await httpClient.GetStreamAsync(new Uri("endpoint"), ct),
    httpClient);

var timeout = new ResiliencePipelineBuilder<Foo>()
    .AddTimeout(TimeSpan.FromMinutes(1))
    .Build();

var foo = await timeout.ExecuteAsync((ct) => JsonSerializer.DeserializeAsync<Foo>(stream, cancellationToken: ct));
```
Reasoning:
The failure domain of a network call is different from that of deserialization. Using dedicated strategies makes the application more resilient to various transient failures.
If you encounter a TimeoutException, you may not want to retry the operation.
❌ DON'T
Embed cancellation logic within OnRetry:
```cs
var ctsKey = new ResiliencePropertyKey<CancellationTokenSource>("cts");
var retry = new ResiliencePipelineBuilder()
    .AddRetry(new()
    {
        OnRetry = async args =>
        {
            if (args.Outcome.Exception is TimeoutException)
            {
                if (args.Context.Properties.TryGetValue(ctsKey, out var cts))
                {
                    await cts.CancelAsync();
                }
            }
        }
    })
    .Build();
```
Reasoning:
Conditions for triggering retries should be located in ShouldHandle. Bypassing the strategy from within a user-defined delegate—either through an Exception or a CancellationToken—unnecessarily complicates the control flow.
✅ DO
Set the condition for retry within ShouldHandle:
```cs
var retry = new ResiliencePipelineBuilder()
    .AddRetry(new()
    {
        ShouldHandle = args => ValueTask.FromResult(args.Outcome.Exception is not TimeoutException)
    })
    .Build();
```
Reasoning:
As previously mentioned, always use the designated area to define retry conditions. Re-frame your original exit conditions to specify when a retry should be initiated.