Pragmatic.Resilience

Native, AOT-safe resilience library with strategy composition, fluent builder, DI integration, and source generator support.

Distributed systems fail. HTTP calls time out, databases go down, third-party APIs return errors. Without resilience patterns, every failure propagates immediately to the user. The standard approach — wrapping calls in Polly policies or hand-rolling try/catch with retry loops — scatters resilience logic across the codebase, pulls in external dependencies, and treats every failure the same regardless of whether it is transient (network blip) or permanent (validation error).

// Without Pragmatic: manual resilience for every external call
var retryPolicy = Policy.Handle<HttpRequestException>()
    .WaitAndRetryAsync(3, attempt => TimeSpan.FromMilliseconds(200 * Math.Pow(2, attempt)));
var circuitBreaker = Policy.Handle<HttpRequestException>()
    .CircuitBreakerAsync(5, TimeSpan.FromSeconds(30));
var timeout = Policy.TimeoutAsync(10);
var combined = Policy.WrapAsync(timeout, circuitBreaker, retryPolicy);
// Must repeat for every service, every action, every call site
await combined.ExecuteAsync(ct => http.PostAsJsonAsync("/charges", request, ct), ct);

Pragmatic.Resilience inverts the model. You declare a policy name and the framework composes the pipeline. One attribute, five strategies, zero manual wiring.

// With Pragmatic: declare the policy, the source generator (SG) wires the pipeline
[DomainAction]
[ResiliencePolicy("payment-gateway")]
public partial class ChargeCustomerAction : DomainAction<PaymentResult>
{
    public override async Task<Result<PaymentResult, IError>> Execute(CancellationToken ct)
    {
        // Wrapped by the "payment-gateway" pipeline automatically.
        // Exceptions trigger retry + circuit breaker.
        // Result failures (validation, not found) pass through unchanged.
        var response = await http.PostAsJsonAsync("/charges", request, ct);
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadFromJsonAsync<PaymentResult>(ct);
    }
}

The policy is defined once in configuration — no code changes needed to tune retry counts, timeouts, or circuit breaker thresholds:

{
  "Resilience": {
    "Policies": {
      "payment-gateway": {
        "Timeout": { "Timeout": "00:00:10" },
        "Retry": { "MaxRetries": 3, "BackoffType": "Exponential", "BaseDelay": "00:00:00.200" },
        "CircuitBreaker": { "FailureThreshold": 5, "BreakDuration": "00:00:30" }
      }
    }
  }
}

Native, AOT-safe, zero external dependencies. Only exceptions trigger resilience strategies — Result<T, E> failures are business errors and pass through unchanged. Unknown policy names resolve to PassthroughPipeline with zero overhead.

dotnet add package Pragmatic.Resilience

For source generator integration with [ResiliencePolicy]:

<ProjectReference Include="..\Pragmatic.SourceGenerator\src\Pragmatic.SourceGenerator\Pragmatic.SourceGenerator.csproj"
                  OutputItemType="Analyzer"
                  ReferenceOutputAssembly="false" />

| Problem | Solution |
|---|---|
| Transient failures in external calls | RetryStrategy with configurable backoff and jitter |
| Operations hanging indefinitely | TimeoutStrategy with optimistic or pessimistic cancellation |
| Cascading failures from unhealthy dependencies | CircuitBreakerStrategy with pluggable state store |
| Resource exhaustion from unbounded concurrency | BulkheadStrategy with SemaphoreSlim-based limiter |
| Hard failures that need a graceful degradation path | FallbackStrategy<TResult> with alternative value factory |
| Manual pipeline wiring per action | [ResiliencePolicy("name")] attribute + source generator |
| Configuration scattered across code | Named policies in appsettings.json with IOptions<T> binding |
| No observability into resilience behavior | Built-in ActivitySource, Meter instruments, and [LoggerMessage] logging |

var stateStore = new InMemoryCircuitBreakerStateStore();
var pipeline = new ResiliencePipelineBuilder()
    .AddRetry(o =>
    {
        o.MaxRetries = 3;
        o.BackoffType = BackoffType.Exponential;
    })
    .AddTimeout(o => o.Timeout = TimeSpan.FromSeconds(5))
    .AddCircuitBreaker(stateStore, o =>
    {
        o.FailureThreshold = 5;
        o.BreakDuration = TimeSpan.FromSeconds(30);
    })
    .Build();
var result = await pipeline.ExecuteAsync(
    (ctx, ct) => httpClient.GetStringAsync(url, ct),
    new ResilienceContext { OperationName = "FetchData" });

services.AddPragmaticResilience(options =>
{
    options.Policies["external-api"] = new ResiliencePolicyOptions
    {
        Timeout = new() { Timeout = TimeSpan.FromSeconds(10) },
        Retry = new() { MaxRetries = 3, BaseDelay = TimeSpan.FromMilliseconds(200) },
        CircuitBreaker = new() { FailureThreshold = 5, BreakDuration = TimeSpan.FromSeconds(30) }
    };
});

Resolve and use a named pipeline at runtime:

// HttpClient is injected alongside the provider so that httpClient is in scope.
public class ExternalApiClient(IResiliencePipelineProvider pipelines, HttpClient httpClient)
{
    public async Task<string> FetchAsync(string url, CancellationToken ct)
    {
        var pipeline = pipelines.GetPipeline("external-api");
        return await pipeline.ExecuteAsync(
            (ctx, token) => httpClient.GetStringAsync(url, token),
            new ResilienceContext { OperationName = "FetchData" },
            ct);
    }
}

You can also register named policies individually:

services.AddPragmaticResilience();
services.AddResiliencePolicy("external-api", o =>
{
    o.Timeout = new() { Timeout = TimeSpan.FromSeconds(10) };
    o.Retry = new() { MaxRetries = 3 };
});

Annotate a DomainAction with [ResiliencePolicy] to automatically wrap execution with the named pipeline:

[DomainAction]
[ResiliencePolicy("external-api")]
public partial class FetchUserAction : DomainAction<UserDto>
{
    public override async Task<Result<UserDto, IError>> Execute(CancellationToken ct)
    {
        // This execution is wrapped by the "external-api" resilience pipeline.
        // Exceptions trigger retry/circuit breaker; Result failures pass through.
        var response = await httpClient.GetAsync("/users/123", ct);
        // ...
    }
}

Strategies are ordered by their Order value (ascending). Lower order = more external, wrapping more of the pipeline:

| Strategy | Order | Purpose | Default | Exception |
|---|---|---|---|---|
| Timeout | 100 | Cancel if total time exceeded | 30s, Optimistic | TimeoutRejectedException |
| Bulkhead | 200 | Limit concurrent executions | MaxConcurrency=10, MaxQueued=0 | BulkheadRejectedException |
| Circuit Breaker | 300 | Reject fast if service is unhealthy | Threshold=5, Break=30s | CircuitBrokenException |
| Retry | 400 | Retry on transient failures | MaxRetries=3, BaseDelay=200ms, Exponential | RetryExhaustedException |
| Fallback | 500 | Provide alternative value on failure | | |

Request
   |
   v
Timeout (Order 100) ---------> cancels if total time exceeded
   |
   v
Bulkhead (Order 200) --------> rejects if max concurrency reached
   |
   v
CircuitBreaker (Order 300) --> rejects if circuit is open
   |
   v
Retry (Order 400) -----------> retries on transient exception
   |
   v
Fallback (Order 500) --------> catches exception, returns alternative
   |
   v
Operation
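
The ordering rule can be illustrated without the library. The sketch below is only an illustration under stated assumptions: strategies are reduced to hypothetical (Order, Name) tuples and each wrapper just records its name. It is not how ResiliencePipelineBuilder is implemented, but it shows why sorting by Order and wrapping from the highest value inward leaves the lowest Order outermost.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

var log = new List<string>();

// Hypothetical stand-ins for strategies, deliberately registered out of order.
var strategies = new[]
{
    (Order: 400, Name: "Retry"),
    (Order: 100, Name: "Timeout"),
    (Order: 300, Name: "CircuitBreaker"),
};

// The innermost delegate is the operation itself.
Func<Task<string>> pipeline = () =>
{
    log.Add("Operation");
    return Task.FromResult("ok");
};

// Wrap from the highest Order to the lowest, so the lowest Order ends up outermost.
foreach (var strategy in strategies.OrderByDescending(x => x.Order))
{
    var next = pipeline;
    pipeline = async () =>
    {
        log.Add(strategy.Name); // a real strategy would apply its behavior here
        return await next();
    };
}

await pipeline();
Console.WriteLine(string.Join(" -> ", log));
// prints "Timeout -> CircuitBreaker -> Retry -> Operation"
```

Because Timeout sits outermost, its budget covers every retry attempt rather than a single attempt.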

Retries on exceptions with configurable backoff and jitter.

| Option | Default | Description |
|---|---|---|
| MaxRetries | 3 | Maximum retry attempts (0 = no retries) |
| BaseDelay | 200ms | Base delay between retries |
| BackoffType | Exponential | Constant, Linear, or Exponential |
| MaxDelay | 30s | Upper bound on delay (prevents unbounded growth) |
| UseJitter | true | Decorrelated jitter (AWS recommendation) to prevent thundering herd |
| ShouldRetry | null (all) | Predicate to filter which exceptions trigger retry |

Backoff formulas:

  • Constant: baseDelay
  • Linear: baseDelay * (attempt + 1)
  • Exponential: baseDelay * 2^attempt

When UseJitter is enabled, the computed delay is multiplied by a random factor in [0.5, 1.5) (decorrelated jitter).
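
To make the formulas concrete, here is a small sketch that restates them in code. BackoffKind and BackoffSketch are hypothetical names used only for illustration. The sketch assumes a zero-based attempt index and assumes MaxDelay is applied after jitter; the library may differ on both points.

```csharp
using System;

// Un-jittered delays for the default options (BaseDelay = 200ms, Exponential).
for (var attempt = 0; attempt < 3; attempt++)
{
    var delay = BackoffSketch.ComputeDelay(
        BackoffKind.Exponential, attempt,
        baseDelay: TimeSpan.FromMilliseconds(200),
        maxDelay: TimeSpan.FromSeconds(30));
    Console.WriteLine($"attempt {attempt}: {delay.TotalMilliseconds} ms");
}
// prints 200 ms, 400 ms, 800 ms

// Hypothetical types: they mirror the documented formulas, not the library source.
enum BackoffKind { Constant, Linear, Exponential }

static class BackoffSketch
{
    public static TimeSpan ComputeDelay(
        BackoffKind kind, int attempt, TimeSpan baseDelay, TimeSpan maxDelay,
        Random? jitter = null)
    {
        var ms = kind switch
        {
            BackoffKind.Constant => baseDelay.TotalMilliseconds,
            BackoffKind.Linear => baseDelay.TotalMilliseconds * (attempt + 1),
            BackoffKind.Exponential => baseDelay.TotalMilliseconds * Math.Pow(2, attempt),
            _ => throw new ArgumentOutOfRangeException(nameof(kind)),
        };

        // Jitter: multiply by a random factor in [0.5, 1.5).
        if (jitter is not null)
            ms *= 0.5 + jitter.NextDouble();

        // Assumption: the MaxDelay cap is applied last.
        return TimeSpan.FromMilliseconds(Math.Min(ms, maxDelay.TotalMilliseconds));
    }
}
```

With the 30s cap, exponential growth from 200ms stops increasing at attempt 8 (200ms * 2^8 = 51.2s, capped to 30s).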

Cancels the operation if it exceeds the configured duration.

| Option | Default | Description |
|---|---|---|
| Timeout | 30s | Maximum allowed duration |
| TimeoutType | Optimistic | Cancellation approach |

Timeout types:

  • Optimistic — Creates a linked CancellationToken and cancels it after the timeout. Preferred for operations that honor cancellation.
  • Pessimistic — Races Task.Delay against the operation via Task.WhenAny. For operations that do not honor cancellation. The operation may continue running in the background.
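
The difference between the two timeout types is easier to see in code. The following is a minimal sketch using only BCL types. TimeoutSketch is a hypothetical name, and the BCL TimeoutException stands in for the library's TimeoutRejectedException.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

try
{
    // The operation honors the token, so the optimistic variant stops it.
    await TimeoutSketch.OptimisticAsync(
        async token => { await Task.Delay(TimeSpan.FromSeconds(5), token); return 1; },
        TimeSpan.FromMilliseconds(50));
}
catch (TimeoutException ex)
{
    Console.WriteLine(ex.Message);
}

// Hypothetical helper, not the library's TimeoutStrategy.
static class TimeoutSketch
{
    // Optimistic: link a token source to the caller's token and cancel it after the timeout.
    public static async Task<T> OptimisticAsync<T>(
        Func<CancellationToken, Task<T>> operation, TimeSpan timeout, CancellationToken ct = default)
    {
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(ct);
        cts.CancelAfter(timeout);
        try
        {
            return await operation(cts.Token);
        }
        catch (OperationCanceledException) when (cts.IsCancellationRequested && !ct.IsCancellationRequested)
        {
            // Cancelled by the timer, not by the caller.
            throw new TimeoutException($"Operation exceeded {timeout}.");
        }
    }

    // Pessimistic: race the operation against Task.Delay.
    public static async Task<T> PessimisticAsync<T>(
        Func<CancellationToken, Task<T>> operation, TimeSpan timeout, CancellationToken ct = default)
    {
        var work = operation(ct);
        var winner = await Task.WhenAny(work, Task.Delay(timeout, ct));
        if (winner != work)
        {
            ct.ThrowIfCancellationRequested();
            // The operation is abandoned here but may keep running in the background.
            throw new TimeoutException($"Operation exceeded {timeout}.");
        }
        return await work;
    }
}
```

The pessimistic variant returns control on time even when the operation ignores its token, at the cost of leaving that work running unobserved.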

Opens after consecutive failures, rejects requests while open, allows a probe after the break duration elapses.

| Option | Default | Description |
|---|---|---|
| FailureThreshold | 5 | Consecutive failures before opening |
| BreakDuration | 30s | How long the circuit stays open |
| ShouldHandle | null (all) | Predicate to filter which exceptions count as failures |

State machine:

Closed ---[threshold failures]---> Open ---[break elapsed]---> HalfOpen
  ^                                 ^                             |
  |                                 +--------[probe fails]--------+
  |                                                               |
  +-----------------------[probe succeeds]------------------------+

The state store is pluggable via ICircuitBreakerStateStore. The default InMemoryCircuitBreakerStateStore is a thread-safe, per-process singleton. For distributed scenarios, implement the interface with Redis or a database backend.
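
The transitions above can be sketched as a small state machine. CircuitSketch and CircuitState are hypothetical names, the clock is passed in explicitly so the example is deterministic, and the class is not thread-safe. It illustrates the transitions only and does not reflect the ICircuitBreakerStateStore contract.

```csharp
using System;

var t0 = DateTimeOffset.UtcNow;
var circuit = new CircuitSketch(threshold: 2, breakDuration: TimeSpan.FromSeconds(30));

circuit.OnFailure(t0);
circuit.OnFailure(t0);
Console.WriteLine(circuit.State);                        // Open
Console.WriteLine(circuit.TryAcquire(t0.AddSeconds(1))); // False: still inside the break
Console.WriteLine(circuit.TryAcquire(t0.AddSeconds(31))); // True: probe allowed (HalfOpen)
circuit.OnSuccess();
Console.WriteLine(circuit.State);                        // Closed

enum CircuitState { Closed, Open, HalfOpen }

// Hypothetical, single-threaded illustration of the state machine.
sealed class CircuitSketch
{
    private readonly int _threshold;
    private readonly TimeSpan _breakDuration;
    private int _consecutiveFailures;
    private DateTimeOffset _openedAt;

    public CircuitSketch(int threshold, TimeSpan breakDuration)
    {
        _threshold = threshold;
        _breakDuration = breakDuration;
    }

    public CircuitState State { get; private set; } = CircuitState.Closed;

    // Returns false when the call should be rejected without running.
    public bool TryAcquire(DateTimeOffset now)
    {
        if (State == CircuitState.Open && now - _openedAt >= _breakDuration)
            State = CircuitState.HalfOpen; // break elapsed: let a probe through
        return State != CircuitState.Open;
    }

    public void OnSuccess()
    {
        _consecutiveFailures = 0;
        State = CircuitState.Closed;
    }

    public void OnFailure(DateTimeOffset now)
    {
        _consecutiveFailures++;
        // A failed probe reopens immediately; otherwise wait for the threshold.
        if (State == CircuitState.HalfOpen || _consecutiveFailures >= _threshold)
        {
            State = CircuitState.Open;
            _openedAt = now;
        }
    }
}
```

A distributed store would need to persist the same three pieces of state: the current state, the consecutive failure count, and the time the circuit opened.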

Limits concurrent executions using SemaphoreSlim.

| Option | Default | Description |
|---|---|---|
| MaxConcurrency | 10 | Maximum concurrent executions |
| MaxQueuedActions | 0 | Overflow queue size (0 = no queue) |
| QueueTimeout | TimeSpan.Zero | Maximum wait time in queue |

When all slots are taken and the queue is full (or disabled), the request is rejected immediately with BulkheadRejectedException.
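
A minimal sketch of the no-queue case (MaxQueuedActions = 0) shows how a SemaphoreSlim can reject immediately. BulkheadSketch is a hypothetical name, and InvalidOperationException stands in for BulkheadRejectedException. Queueing and QueueTimeout are deliberately left out.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

var bulkhead = new BulkheadSketch(maxConcurrency: 1);
var gate = new TaskCompletionSource<int>();

// The first call occupies the only slot until the gate is released.
var first = bulkhead.ExecuteAsync(() => gate.Task);

try
{
    await bulkhead.ExecuteAsync(() => Task.FromResult(2));
}
catch (InvalidOperationException ex)
{
    Console.WriteLine(ex.Message); // prints "Bulkhead full."
}

gate.SetResult(1);
Console.WriteLine(await first); // prints 1

// Hypothetical illustration, not the library's BulkheadStrategy.
sealed class BulkheadSketch
{
    private readonly SemaphoreSlim _slots;

    public BulkheadSketch(int maxConcurrency) =>
        _slots = new SemaphoreSlim(maxConcurrency, maxConcurrency);

    public async Task<T> ExecuteAsync<T>(Func<Task<T>> operation)
    {
        // Wait(0) never blocks: it returns true only if a slot was free.
        if (!_slots.Wait(0))
            throw new InvalidOperationException("Bulkhead full.");
        try
        {
            return await operation();
        }
        finally
        {
            _slots.Release(); // always free the slot, even on failure
        }
    }
}
```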

Catches exceptions and provides an alternative result. This is a generic strategy (FallbackStrategy<TResult>) that only activates when the result type matches.

var fallbackOptions = new FallbackOptions<UserDto>
{
    FallbackAction = (ex, ctx, ct) => Task.FromResult(UserDto.Default),
    ShouldHandle = ex => ex is HttpRequestException,
    OnFallback = (ex, ctx) => logger.LogWarning("Using fallback for {Op}", ctx.OperationName)
};
builder.AddStrategy(new FallbackStrategy<UserDto>(fallbackOptions));

| Option | Description |
|---|---|
| FallbackAction | Factory that produces the fallback value (required) |
| ShouldHandle | Predicate to filter which exceptions trigger the fallback |
| OnFallback | Callback invoked when fallback is used (for logging/metrics) |

Implement IResilienceStrategy and add it to the pipeline:

public class RateLimitStrategy : IResilienceStrategy
{
    public int Order => 150; // Between Timeout and Bulkhead

    public async Task<TResult> ExecuteAsync<TResult>(
        Func<ResilienceContext, CancellationToken, Task<TResult>> next,
        ResilienceContext context,
        CancellationToken ct)
    {
        // Your logic here
        return await next(context, ct);
    }
}

var pipeline = new ResiliencePipelineBuilder()
    .AddStrategy(new RateLimitStrategy())
    .AddRetry()
    .Build();

{
  "Resilience": {
    "Default": {
      "Timeout": {
        "Timeout": "00:00:30",
        "TimeoutType": "Optimistic"
      }
    },
    "Policies": {
      "external-api": {
        "Timeout": {
          "Timeout": "00:00:10",
          "TimeoutType": "Optimistic"
        },
        "Retry": {
          "MaxRetries": 3,
          "BaseDelay": "00:00:00.200",
          "BackoffType": "Exponential",
          "MaxDelay": "00:00:30",
          "UseJitter": true
        },
        "CircuitBreaker": {
          "FailureThreshold": 5,
          "BreakDuration": "00:00:30"
        }
      },
      "database": {
        "Timeout": {
          "Timeout": "00:00:05"
        },
        "Retry": {
          "MaxRetries": 2,
          "BackoffType": "Constant",
          "BaseDelay": "00:00:00.100"
        }
      }
    }
  }
}

When IResiliencePipelineProvider.GetPipeline(name) is called, the pipeline is resolved in this order, and the first match wins:

  1. Fluent overrides: policies registered via AddPolicy() on the provider
  2. Configuration: policies from the ResilienceOptions.Policies dictionary
  3. Default: ResilienceOptions.Default, if set
  4. Passthrough: PassthroughPipeline.Instance (zero overhead, no wrapping)
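
The lookup chain can be pictured as a series of fallbacks. In this sketch plain strings stand in for pipelines and the two dictionaries are hypothetical. It mirrors the documented order only and says nothing about how the provider caches or builds pipelines.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical registrations; strings stand in for built pipelines.
var fluent = new Dictionary<string, string> { ["payments"] = "fluent:payments" };
var configured = new Dictionary<string, string>
{
    ["payments"] = "config:payments",
    ["database"] = "config:database",
};
string? defaultPipeline = null; // no Default section configured in this example

string Resolve(string name) =>
    fluent.TryGetValue(name, out var fromFluent) ? fromFluent        // 1. fluent override
    : configured.TryGetValue(name, out var fromConfig) ? fromConfig  // 2. configuration
    : defaultPipeline ?? "passthrough";                              // 3. default, else 4. passthrough

Console.WriteLine(Resolve("payments")); // prints "fluent:payments"
Console.WriteLine(Resolve("database")); // prints "config:database"
Console.WriteLine(Resolve("unknown"));  // prints "passthrough"
```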

Resilience errors implement Pragmatic.Result.Error for integration with the Result pattern:

| Error | Code | HTTP Status | When |
|---|---|---|---|
| TimeoutError | TIMEOUT | 504 | Operation exceeded timeout duration |
| RetryExhaustedError | RETRY_EXHAUSTED | 503 | All retry attempts failed |
| CircuitBrokenError | CIRCUIT_BROKEN | 503 | Circuit is open, requests rejected |
| BulkheadRejectedError | BULKHEAD_REJECTED | 429 | Max concurrency exceeded |

Each strategy also throws a corresponding exception (TimeoutRejectedException, RetryExhaustedException, CircuitBrokenException, BulkheadRejectedException) for pipeline-level control flow. The error records are for mapping to Result<T, E> at the action/endpoint layer.


ActivitySource: "Pragmatic.Resilience"

Each pipeline execution creates an activity Resilience.{policyName} with tags:

  • policy.name: the resolved policy name
  • outcome: success, retry, or exception
  • attempt: retry attempt number (if retried)

Meter: "Pragmatic.Resilience"

| Instrument | Type | Name | Description |
|---|---|---|---|
| Pipeline duration | Histogram | pragmatic.resilience.duration | Execution duration in ms |
| Pipeline executions | Counter | pragmatic.resilience.executions | Total pipeline executions |
| Retry attempts | Counter | pragmatic.resilience.retry_attempts | Total retry attempts |
| Circuit rejections | Counter | pragmatic.resilience.circuit_rejections | Requests rejected by open circuits |
| Timeouts | Counter | pragmatic.resilience.timeouts | Total timeout occurrences |
| Bulkhead rejections | Counter | pragmatic.resilience.bulkhead_rejections | Requests rejected by bulkhead |
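
For quick local inspection without an exporter, the counters can be observed in-process with the BCL MeterListener. The sketch below creates a stand-in Meter with the same name so that it runs on its own. It assumes the counters are Counter<long>, which the table does not state; a histogram recorded as double would need a separate double callback.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

var totals = new Dictionary<string, long>();

using var listener = new MeterListener();
listener.InstrumentPublished = (instrument, l) =>
{
    // Subscribe only to instruments from the resilience meter.
    if (instrument.Meter.Name == "Pragmatic.Resilience")
        l.EnableMeasurementEvents(instrument);
};
listener.SetMeasurementEventCallback<long>((instrument, value, tags, state) =>
{
    totals.TryGetValue(instrument.Name, out var current);
    totals[instrument.Name] = current + value;
});
listener.Start();

// Stand-in for the library: in an application these measurements would
// come from Pragmatic.Resilience itself (counter type assumed to be long).
using var meter = new Meter("Pragmatic.Resilience");
var retries = meter.CreateCounter<long>("pragmatic.resilience.retry_attempts");
retries.Add(1);
retries.Add(1);

Console.WriteLine(totals["pragmatic.resilience.retry_attempts"]); // prints 2
```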

All log messages use [LoggerMessage] source-generated partial methods for zero-allocation structured logging:

| Event | Level | Message |
|---|---|---|
| Retry attempt | Warning | Retry attempt {N}/{Max} for {Op} after {Delay}ms |
| Timeout | Warning | Operation {Op} timed out after {Timeout}ms |
| Retry exhausted | Error | All {Max} retry attempts exhausted for {Op} |
| Circuit rejected | Warning | Circuit '{Key}' rejected request -- circuit is open |
| Circuit opened | Warning | Circuit '{Key}' opened after {N} consecutive failures |
| Bulkhead rejected | Warning | Bulkhead rejected '{Op}' -- max concurrency {N} reached |
| Fallback used | Information | Fallback used for '{Op}'. Original error: {Msg} |

| Decision | Rationale |
|---|---|
| Native implementation, not a Polly wrapper | AOT-safe, zero external dependencies, full control over strategy composition |
| Only exceptions trigger resilience | Result<T, E> failures are business errors (validation, not found); retrying them is wrong |
| PassthroughPipeline for unknown policies | Zero overhead when no resilience is configured; no runtime errors for missing policies |
| Strategies sorted by Order ascending | Lower order = more external wrapper. Timeout at 100 wraps everything; Retry at 400 is close to the operation |
| Pluggable ICircuitBreakerStateStore | In-memory default for single-process; swap to Redis/DB for distributed circuit state |
| Thread-local Random for jitter | Avoids lock contention on Random.Shared in high-throughput retry scenarios |
| ResilienceContext with Properties dictionary | Cross-strategy communication without coupling strategies to each other |

| With Module | Integration |
|---|---|
| Pragmatic.Actions | [ResiliencePolicy("name")] on DomainAction wraps execution with named pipeline |
| Pragmatic.Result | Error records (TimeoutError, etc.) integrate with Result<T, E> |
| Pragmatic.Composition | Auto-registered by SG when referenced; AddPragmaticResilience() in IStartupStep for custom config |

See samples/Pragmatic.Resilience.Samples/ for 7 runnable scenarios: fluent builder, named policies, SG attributes, error types, retry demo (transient failure recovery with backoff), timeout demo (combined strategies), and circuit breaker demo (fail-fast after threshold).

  • .NET 10.0+
  • Pragmatic.Result (for error types)
  • Pragmatic.SourceGenerator analyzer (for [ResiliencePolicy] integration)

Part of the Pragmatic.Design ecosystem.