Resilience Policies
Detailed reference for all resilience strategies in Pragmatic.Resilience. Strategies are composable and ordered by their Order value (ascending). Lower order means the strategy wraps more of the pipeline (more external).
Strategy Execution Order
```
Request
   |
   v
Timeout (Order 100) --------> cancels if total time exceeded
   |
   v
Bulkhead (Order 200) -------> rejects if max concurrency reached
   |
   v
CircuitBreaker (Order 300) -> rejects if circuit is open
   |
   v
Retry (Order 400) ----------> retries on transient exception
   |
   v
Fallback (Order 500) -------> catches exception, returns alternative
   |
   v
Operation
```

All strategies implement `IResilienceStrategy`:

```csharp
public interface IResilienceStrategy
{
    int Order { get; }

    Task<TResult> ExecuteAsync<TResult>(
        Func<ResilienceContext, CancellationToken, Task<TResult>> next,
        ResilienceContext context,
        CancellationToken ct);
}
```

Retry Strategy
Retries on exceptions with configurable backoff and jitter.
RetryOptions
| Option | Type | Default | Description |
|---|---|---|---|
| `MaxRetries` | `int` | 3 | Maximum retry attempts (0 = no retries, max 100) |
| `BaseDelay` | `TimeSpan` | 200ms | Base delay between retries |
| `BackoffType` | `BackoffType` | `Exponential` | `Constant`, `Linear`, or `Exponential` |
| `MaxDelay` | `TimeSpan` | 30s | Upper bound on delay (prevents unbounded growth) |
| `UseJitter` | `bool` | `true` | Decorrelated jitter to prevent thundering herd |
| `ShouldRetry` | `Func<Exception, bool>?` | `null` (all) | Predicate to filter which exceptions trigger retry |
Backoff Formulas
- Constant: `baseDelay`
- Linear: `baseDelay * (attempt + 1)`
- Exponential: `baseDelay * 2^attempt`
When UseJitter is enabled, the computed delay is multiplied by a random factor in [0.5, 1.5) (decorrelated jitter, per the AWS recommendation). Thread-local Random avoids lock contention in high-throughput scenarios.
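The formulas and the jitter factor can be sketched as a single function. This is a minimal illustration, not the library's implementation: `BackoffSketch` and `ComputeDelay` are hypothetical names, and it assumes that `attempt` is zero-based and that the `MaxDelay` cap is applied before jitter, neither of which is stated above.

```csharp
#nullable enable
using System;

public enum BackoffType { Constant, Linear, Exponential }

public static class BackoffSketch
{
    // Computes the delay to wait after a failed attempt.
    // Assumptions: attempt is zero-based; the cap is applied before jitter.
    public static TimeSpan ComputeDelay(
        BackoffType type, int attempt, TimeSpan baseDelay, TimeSpan maxDelay,
        bool useJitter = false, Random? random = null)
    {
        double baseMs = baseDelay.TotalMilliseconds;
        double ms = type switch
        {
            BackoffType.Constant => baseMs,
            BackoffType.Linear => baseMs * (attempt + 1),
            BackoffType.Exponential => baseMs * Math.Pow(2, attempt),
            _ => throw new ArgumentOutOfRangeException(nameof(type))
        };

        // Cap the delay so exponential growth stays bounded.
        ms = Math.Min(ms, maxDelay.TotalMilliseconds);

        if (useJitter)
        {
            // NextDouble() is in [0, 1), so the factor is in [0.5, 1.5).
            double factor = 0.5 + (random ?? Random.Shared).NextDouble();
            ms *= factor;
        }

        return TimeSpan.FromMilliseconds(ms);
    }
}
```

Under these assumptions and the default options (200ms base, exponential backoff), the first three delays before jitter are 200ms, 400ms, and 800ms, and every delay from the ninth onward is capped at 30s.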
Exception
Throws `RetryExhaustedException` when all attempts are exhausted. The inner exception contains the last failure.
Example
```csharp
services.AddResiliencePolicy("transient-calls", o =>
{
    o.Retry = new RetryOptions
    {
        MaxRetries = 5,
        BaseDelay = TimeSpan.FromMilliseconds(100),
        BackoffType = BackoffType.Exponential,
        MaxDelay = TimeSpan.FromSeconds(10),
        UseJitter = true,
        ShouldRetry = ex => ex is HttpRequestException or TimeoutException
    };
});
```

Timeout Strategy
Cancels the operation if it exceeds the configured duration.
TimeoutOptions
| Option | Type | Default | Description |
|---|---|---|---|
| `Timeout` | `TimeSpan` | 30s | Maximum allowed duration |
| `TimeoutType` | `TimeoutType` | `Optimistic` | Cancellation approach |
Timeout Types
- Optimistic: Creates a linked `CancellationToken` and cancels it after the timeout. Preferred for operations that honor cancellation tokens (most async .NET APIs).
- Pessimistic: Races `Task.Delay` against the operation via `Task.WhenAny`. For operations that do not honor cancellation (e.g., legacy synchronous code wrapped in a task). The operation may continue running in the background after timeout.
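The difference between the two approaches is easier to see side by side. The following is a rough sketch built only on standard .NET primitives. `TimeoutSketch` and its method names are hypothetical, and `TimeoutException` stands in for the library's `TimeoutRejectedException`.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TimeoutSketch
{
    // Optimistic: cancel a linked token and rely on the operation to observe it.
    public static async Task<T> OptimisticAsync<T>(
        Func<CancellationToken, Task<T>> operation, TimeSpan timeout, CancellationToken ct)
    {
        using var timeoutCts = CancellationTokenSource.CreateLinkedTokenSource(ct);
        timeoutCts.CancelAfter(timeout);
        try
        {
            return await operation(timeoutCts.Token);
        }
        catch (OperationCanceledException)
            when (timeoutCts.IsCancellationRequested && !ct.IsCancellationRequested)
        {
            // The timer fired, not the caller's token.
            throw new TimeoutException($"Operation exceeded {timeout.TotalMilliseconds} ms.");
        }
    }

    // Pessimistic: race the operation against a delay; no cooperation needed.
    public static async Task<T> PessimisticAsync<T>(
        Func<CancellationToken, Task<T>> operation, TimeSpan timeout, CancellationToken ct)
    {
        using var timerCts = CancellationTokenSource.CreateLinkedTokenSource(ct);
        Task<T> work = operation(ct);
        Task delay = Task.Delay(timeout, timerCts.Token);

        Task winner = await Task.WhenAny(work, delay);
        if (winner == work)
        {
            timerCts.Cancel();  // stop the timer
            return await work;  // propagate the result or the exception
        }

        // The delay finished first: either the caller cancelled or time ran out.
        ct.ThrowIfCancellationRequested();
        // 'work' is abandoned here and may keep running in the background.
        throw new TimeoutException($"Operation exceeded {timeout.TotalMilliseconds} ms.");
    }
}
```

In the pessimistic variant nothing stops the abandoned task, which is why the optimistic variant is preferable whenever the operation observes its token.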
Exception
Throws `TimeoutRejectedException` when the timeout is exceeded.
Example
```csharp
o.Timeout = new TimeoutOptions
{
    Timeout = TimeSpan.FromSeconds(5),
    TimeoutType = TimeoutType.Optimistic
};
```

Circuit Breaker Strategy
Opens after consecutive failures, rejects requests while open, allows a probe after the break duration elapses.
CircuitBreakerOptions
| Option | Type | Default | Description |
|---|---|---|---|
| `FailureThreshold` | `int` | 5 | Consecutive failures before opening |
| `BreakDuration` | `TimeSpan` | 30s | How long the circuit stays open |
| `ShouldHandle` | `Func<Exception, bool>?` | `null` (all) | Predicate to filter which exceptions count as failures |
State Machine
```
Closed ---[threshold failures]--> Open ---[break elapsed]--> HalfOpen
  ^                                                            |
  |                                                            |
  +----[probe succeeds]----<------<------<------<------<-------+
                                                               |
  Open <----[probe fails]----<------<------<------<------<-----+
```

- Closed: normal operation. Failures are counted.
- Open: all requests rejected immediately with `CircuitBrokenException`.
- HalfOpen: one probe request allowed through. If it succeeds, circuit closes. If it fails, circuit reopens.
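The transitions above can be sketched as a small lock-based class. This is an illustration under stated assumptions rather than the library's code: `CircuitSketch`, `TryEnter`, `OnSuccess`, and `OnFailure` are hypothetical names, the clock is injectable only to make the sketch testable, and results that arrive while the circuit is already open are simply ignored.

```csharp
#nullable enable
using System;

public enum CircuitState { Closed, Open, HalfOpen }

public sealed class CircuitSketch
{
    private readonly int _failureThreshold;
    private readonly TimeSpan _breakDuration;
    private readonly Func<DateTimeOffset> _clock;
    private readonly object _gate = new();

    private CircuitState _state = CircuitState.Closed;
    private int _consecutiveFailures;
    private DateTimeOffset _openedAt;

    public CircuitSketch(int failureThreshold, TimeSpan breakDuration,
        Func<DateTimeOffset>? clock = null)
    {
        _failureThreshold = failureThreshold;
        _breakDuration = breakDuration;
        _clock = clock ?? (() => DateTimeOffset.UtcNow);
    }

    public CircuitState State { get { lock (_gate) return _state; } }

    // Returns true if the caller may proceed; false means "reject".
    public bool TryEnter()
    {
        lock (_gate)
        {
            switch (_state)
            {
                case CircuitState.Closed:
                    return true;
                case CircuitState.Open when _clock() - _openedAt >= _breakDuration:
                    _state = CircuitState.HalfOpen; // admit exactly one probe
                    return true;
                default:
                    return false; // still open, or a probe is already in flight
            }
        }
    }

    public void OnSuccess()
    {
        lock (_gate)
        {
            if (_state == CircuitState.Open) return; // late result, ignore
            _consecutiveFailures = 0;
            _state = CircuitState.Closed;
        }
    }

    public void OnFailure()
    {
        lock (_gate)
        {
            if (_state == CircuitState.Open) return; // late result, ignore
            _consecutiveFailures++;
            if (_state == CircuitState.HalfOpen || _consecutiveFailures >= _failureThreshold)
            {
                _state = CircuitState.Open;
                _openedAt = _clock();
            }
        }
    }
}
```

A caller would check `TryEnter()` before invoking the operation, then report the outcome through `OnSuccess()` or `OnFailure()`.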
State Store
Circuit breaker state is managed by `ICircuitBreakerStateStore`. The default `InMemoryCircuitBreakerStateStore` is thread-safe and per-process. For distributed scenarios (multiple instances sharing circuit state), implement the interface with Redis or a database backend.
Exception
Throws `CircuitBrokenException` when the circuit is open and a request is rejected.
Example
```csharp
o.CircuitBreaker = new CircuitBreakerOptions
{
    FailureThreshold = 3,
    BreakDuration = TimeSpan.FromSeconds(60),
    ShouldHandle = ex => ex is not ArgumentException // Don't count argument errors
};
```

Bulkhead Strategy
Limits concurrent executions using `SemaphoreSlim`.
BulkheadOptions
| Option | Type | Default | Description |
|---|---|---|---|
| `MaxConcurrency` | `int` | 10 | Maximum concurrent executions |
| `MaxQueuedActions` | `int` | 0 | Overflow queue size (0 = no queue) |
| `QueueTimeout` | `TimeSpan` | `TimeSpan.Zero` | Maximum wait time in queue |
When all slots are taken and the queue is full (or disabled), the request is rejected immediately with BulkheadRejectedException.
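One way to picture the slot and queue behavior is a `SemaphoreSlim` for the slots plus a counter for queued callers. The sketch below is an assumption-laden illustration: `BulkheadSketch` is a hypothetical type, `InvalidOperationException` stands in for `BulkheadRejectedException`, and rejecting when `QueueTimeout` elapses is a guess at the behavior, since the text above does not say what happens in that case.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class BulkheadSketch : IDisposable
{
    private readonly SemaphoreSlim _slots;
    private readonly int _maxQueued;
    private readonly TimeSpan _queueTimeout;
    private int _queued;

    public BulkheadSketch(int maxConcurrency, int maxQueued, TimeSpan queueTimeout)
    {
        _slots = new SemaphoreSlim(maxConcurrency, maxConcurrency);
        _maxQueued = maxQueued;
        _queueTimeout = queueTimeout;
    }

    public async Task<T> ExecuteAsync<T>(
        Func<CancellationToken, Task<T>> operation, CancellationToken ct)
    {
        // Fast path: take a free slot without waiting.
        if (!_slots.Wait(0))
        {
            // No free slot: join the queue if there is room, otherwise reject.
            if (Interlocked.Increment(ref _queued) > _maxQueued)
            {
                Interlocked.Decrement(ref _queued);
                throw new InvalidOperationException("Bulkhead full: request rejected.");
            }

            try
            {
                // Wait for a slot, but no longer than the queue timeout.
                if (!await _slots.WaitAsync(_queueTimeout, ct))
                    throw new InvalidOperationException("Timed out waiting in the bulkhead queue.");
            }
            finally
            {
                Interlocked.Decrement(ref _queued);
            }
        }

        try
        {
            return await operation(ct);
        }
        finally
        {
            _slots.Release(); // free the slot for the next caller
        }
    }

    public void Dispose() => _slots.Dispose();
}
```

This sketch does not guarantee ordering: a new caller can take a freed slot ahead of callers that are already queued.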
Example
```csharp
o.Bulkhead = new BulkheadOptions
{
    MaxConcurrency = 5,
    MaxQueuedActions = 10,
    QueueTimeout = TimeSpan.FromSeconds(2)
};
```

Fallback Strategy
Catches exceptions and provides an alternative result. This is a generic strategy (`FallbackStrategy<TResult>`) that only activates when the result type matches.
FallbackOptions<TResult>
| Option | Type | Description |
|---|---|---|
| `FallbackAction` | `Func<Exception, ResilienceContext, CancellationToken, Task<TResult>>` | Factory that produces the fallback value (required) |
| `ShouldHandle` | `Func<Exception, bool>?` | Predicate to filter which exceptions trigger the fallback |
| `OnFallback` | `Action<Exception, ResilienceContext>?` | Callback invoked when fallback is used (for logging/metrics) |
Fallback is not configurable via ResiliencePolicyOptions (JSON/DI). It must be added via the fluent builder because it requires a typed factory delegate.
Example
```csharp
var pipeline = new ResiliencePipelineBuilder()
    .AddRetry()
    .AddStrategy(new FallbackStrategy<UserDto>(new FallbackOptions<UserDto>
    {
        FallbackAction = (ex, ctx, ct) => Task.FromResult(UserDto.Default),
        ShouldHandle = ex => ex is HttpRequestException,
        OnFallback = (ex, ctx) => logger.LogWarning("Using fallback for {Op}", ctx.OperationName)
    }))
    .Build();
```

Custom Strategies
Implement `IResilienceStrategy` to create custom strategies:
```csharp
public class RateLimitStrategy : IResilienceStrategy
{
    public int Order => 150; // Between Timeout (100) and Bulkhead (200)

    public async Task<TResult> ExecuteAsync<TResult>(
        Func<ResilienceContext, CancellationToken, Task<TResult>> next,
        ResilienceContext context,
        CancellationToken ct)
    {
        // Custom logic here
        return await next(context, ct);
    }
}

var pipeline = new ResiliencePipelineBuilder()
    .AddStrategy(new RateLimitStrategy())
    .AddRetry()
    .Build();
```

ResilienceContext
Cross-strategy communication without coupling strategies to each other:

```csharp
var context = new ResilienceContext
{
    OperationName = "FetchUserData"
};
```

The `OperationName` is used in logging, metrics, and tracing.
Composing Policies
A `ResiliencePolicyOptions` composes multiple strategies into a single pipeline:

```csharp
new ResiliencePolicyOptions
{
    Timeout = new() { ... },        // null = no timeout
    Bulkhead = new() { ... },       // null = no bulkhead
    CircuitBreaker = new() { ... }, // null = no circuit breaker
    Retry = new() { ... }           // null = no retry
}
```

Set any strategy to `null` to exclude it from the pipeline. Only non-null strategies are composed.
Observability
Distributed Tracing
ActivitySource: `"Pragmatic.Resilience"`. Each pipeline execution creates an activity `Resilience.{policyName}` with tags: `policy.name`, `outcome`, `attempt`.
Metrics
Meter: `"Pragmatic.Resilience"`.
| Instrument | Type | Name |
|---|---|---|
| Pipeline duration | Histogram | pragmatic.resilience.duration |
| Pipeline executions | Counter | pragmatic.resilience.executions |
| Retry attempts | Counter | pragmatic.resilience.retry_attempts |
| Circuit rejections | Counter | pragmatic.resilience.circuit_rejections |
| Timeouts | Counter | pragmatic.resilience.timeouts |
| Bulkhead rejections | Counter | pragmatic.resilience.bulkhead_rejections |
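For local debugging, these instruments can be observed in-process with the standard `System.Diagnostics.Metrics.MeterListener`. The sketch below relies only on the meter name listed above. `ResilienceMetricsSketch` is a hypothetical helper, and the numeric types of the instruments (`long` for counters, `double` for the histogram) are assumptions, so callbacks are registered for both.

```csharp
using System;
using System.Diagnostics.Metrics;

public static class ResilienceMetricsSketch
{
    // Starts a listener that forwards every measurement from the
    // "Pragmatic.Resilience" meter to the supplied callback.
    // Dispose the returned listener to stop listening.
    public static MeterListener Listen(Action<string, double> onMeasurement)
    {
        var listener = new MeterListener();

        // Subscribe only to instruments published by the resilience meter.
        listener.InstrumentPublished = (instrument, l) =>
        {
            if (instrument.Meter.Name == "Pragmatic.Resilience")
                l.EnableMeasurementEvents(instrument);
        };

        // Assumed measurement types: long (counters) and double (histogram).
        listener.SetMeasurementEventCallback<long>(
            (instrument, value, tags, state) => onMeasurement(instrument.Name, value));
        listener.SetMeasurementEventCallback<double>(
            (instrument, value, tags, state) => onMeasurement(instrument.Name, value));

        listener.Start();
        return listener;
    }
}
```

In production these instruments would more typically be exported through an OpenTelemetry or similar metrics pipeline than read with a hand-written listener.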
Structured Logging
All log messages use `[LoggerMessage]` source-generated partial methods:
| Level | Message |
|---|---|
| Warning | Retry attempt {N}/{Max} for {Op} after {Delay}ms |
| Warning | Operation {Op} timed out after {Timeout}ms |
| Error | All {Max} retry attempts exhausted for {Op} |
| Warning | Circuit '{Key}' rejected request -- circuit is open |
| Warning | Circuit '{Key}' opened after {N} consecutive failures |
| Warning | Bulkhead rejected '{Op}' -- max concurrency {N} reached |
| Information | Fallback used for '{Op}'. Original error: {Msg} |