distribute/retry

Retry policy with exponential backoff and jitter.

This module provides type-safe retry logic for distributed operations. It implements capped exponential backoff with optional jitter, a widely used approach to handling transient failures in distributed systems.

Jitter Strategies

Jitter is crucial to prevent the “thundering herd” problem where many clients retry simultaneously after a failure. This module supports multiple jitter strategies as recommended by AWS and Google Cloud.

Example

import distribute/retry

// Create a policy with full jitter (recommended)
let policy = retry.default_with_jitter()

// Calculate the delay for attempt 3
let result = retry.calculate_delay(policy, 3)
// result.delay_ms will be random(0, min(5000, 100 * 2^2)) = random(0, 400)

Types

Result of a delay calculation, including metadata for observability.

pub type DelayResult {
  DelayResult(
    delay_ms: Int,
    base_delay_ms: Int,
    attempt: Int,
    is_final_attempt: Bool,
  )
}

Constructors

  • DelayResult(
      delay_ms: Int,
      base_delay_ms: Int,
      attempt: Int,
      is_final_attempt: Bool,
    )

    Arguments

    delay_ms

    The actual delay to use (in milliseconds)

    base_delay_ms

    The base delay before jitter was applied

    attempt

    Current attempt number (1-indexed)

    is_final_attempt

    Whether this is the last attempt
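As a rough illustration of how this metadata might feed a log line, the sketch below formats a DelayResult as key=value pairs. The describe function is hypothetical and not part of this module; it only reads the four fields documented above.

```gleam
import distribute/retry
import gleam/bool
import gleam/int

// Hypothetical helper (not provided by distribute/retry): renders the
// documented DelayResult fields as a single loggable line.
pub fn describe(result: retry.DelayResult) -> String {
  "attempt="
  <> int.to_string(result.attempt)
  <> " delay_ms="
  <> int.to_string(result.delay_ms)
  <> " base_delay_ms="
  <> int.to_string(result.base_delay_ms)
  <> " final="
  <> bool.to_string(result.is_final_attempt)
}
```

Logging both delay_ms and base_delay_ms makes it possible to see how much the jitter strategy shortened a particular wait.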

Jitter strategy for randomizing retry delays.

Jitter helps prevent the “thundering herd” problem where many clients retry simultaneously after a shared failure.

pub type JitterStrategy {
  NoJitter
  FullJitter
  EqualJitter
  DecorrelatedJitter
}

Constructors

  • NoJitter

    No jitter - deterministic delays. NOT recommended for production. Use only for testing or when deterministic behavior is required.

  • FullJitter

    Full jitter: random(0, calculated_delay). Recommended for most use cases. Provides maximum spread of retries. Reference: AWS Architecture Blog.

  • EqualJitter

    Equal jitter: calculated_delay/2 + random(0, calculated_delay/2). Guarantees a minimum delay while still adding randomness. Good when you need both a progress guarantee and jitter.

  • DecorrelatedJitter

    Decorrelated jitter: random(base_delay, previous_delay * 3). Each delay is based on the previous one, creating natural variation. Good for API rate limiting scenarios.
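The four formulas above can be sketched as a single function. This is an illustration under stated assumptions, not this module's implementation: the Jitter type, random_between, and apply_jitter are hypothetical names, and capping the decorrelated result at the maximum delay follows the AWS formulation rather than anything confirmed here.

```gleam
import gleam/int

// Hypothetical mirror of JitterStrategy, redefined so the sketch stands alone.
pub type Jitter {
  NoJitter
  FullJitter
  EqualJitter
  DecorrelatedJitter
}

// Random integer in [low, high]. int.random(n) yields a value in [0, n).
fn random_between(low: Int, high: Int) -> Int {
  case high > low {
    True -> low + int.random(high - low + 1)
    False -> low
  }
}

// `calculated` is the capped exponential delay, `base` the base delay,
// `previous` the delay used for the prior attempt (read only by
// DecorrelatedJitter), and `cap` the maximum delay. All in milliseconds.
pub fn apply_jitter(
  jitter: Jitter,
  calculated: Int,
  base: Int,
  previous: Int,
  cap: Int,
) -> Int {
  case jitter {
    NoJitter -> calculated
    FullJitter -> random_between(0, calculated)
    EqualJitter -> {
      let half = calculated / 2
      half + random_between(0, half)
    }
    // Assumption: the result is capped, as in the AWS formulation.
    DecorrelatedJitter -> int.min(cap, random_between(base, previous * 3))
  }
}
```

Note that decorrelated jitter needs the previous delay as state, which the other three strategies do not.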

Retry policy configuration.

Controls how failed operations are retried with exponential backoff and optional jitter for distributed systems.

Fields

  • max_attempts: Maximum number of attempts (1 = no retry)
  • base_delay_ms: Initial delay in milliseconds before jitter
  • max_delay_ms: Maximum delay to prevent runaway waits
  • multiplier: Backoff multiplier (typically 2.0 for exponential)
  • jitter: Jitter strategy for randomizing delays

Example

// Custom policy for critical operations
let policy = RetryPolicy(
  max_attempts: 5,
  base_delay_ms: 50,
  max_delay_ms: 10_000,
  multiplier: 2.0,
  jitter: FullJitter,
)

pub type RetryPolicy {
  RetryPolicy(
    max_attempts: Int,
    base_delay_ms: Int,
    max_delay_ms: Int,
    multiplier: Float,
    jitter: JitterStrategy,
  )
}

Constructors

  • RetryPolicy(
      max_attempts: Int,
      base_delay_ms: Int,
      max_delay_ms: Int,
      multiplier: Float,
      jitter: JitterStrategy,
    )

Values

pub fn aggressive() -> RetryPolicy

Aggressive retry policy for critical operations.

More retries with shorter base delays:

  • 5 attempts total
  • 50ms base delay
  • 3000ms max delay
  • Full jitter enabled

pub fn calculate_delay(
  policy: RetryPolicy,
  attempt: Int,
) -> DelayResult

Calculate the delay for a given attempt number.

Returns a DelayResult containing the delay in milliseconds and metadata. Attempt numbers are 1-indexed: attempt 1 is the initial try, so its delay is the wait before the first retry.

Algorithm

  1. Calculate base exponential delay: base_delay * (multiplier ^ (attempt - 1))
  2. Cap at max_delay_ms
  3. Apply jitter strategy
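Steps 1 and 2 can be sketched as follows. The capped_delay function is a hypothetical name, and rounding to the nearest millisecond is an assumption; the module may round differently.

```gleam
import gleam/float
import gleam/int

// Capped exponential delay before jitter, for a 1-indexed attempt.
// Hypothetical helper, not part of distribute/retry.
pub fn capped_delay(
  base_delay_ms: Int,
  multiplier: Float,
  max_delay_ms: Int,
  attempt: Int,
) -> Int {
  // Step 1: base_delay * (multiplier ^ (attempt - 1))
  let exponent = int.to_float(int.max(attempt - 1, 0))
  let factor = case float.power(multiplier, of: exponent) {
    Ok(value) -> value
    // float.power returns an Error for undefined cases; use no growth.
    Error(Nil) -> 1.0
  }
  // Assumption: round to the nearest whole millisecond.
  let raw = float.round(int.to_float(base_delay_ms) *. factor)
  // Step 2: cap at max_delay_ms
  int.min(raw, max_delay_ms)
}
```

With a 100ms base, a 2.0 multiplier, and a 5000ms cap, attempts 1, 3, and 7 give 100, 400, and 5000 (6400 before the cap). The jitter strategy in step 3 is then applied to this value.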

Example

let policy = retry.default_with_jitter()
let result = retry.calculate_delay(policy, 1)
// result.delay_ms will be random(0, 100)
process.sleep(result.delay_ms)

pub fn conservative() -> RetryPolicy

Conservative retry policy for non-critical operations.

Fewer retries with longer delays:

  • 2 attempts total
  • 500ms base delay
  • 10000ms max delay
  • Full jitter enabled

pub fn default() -> RetryPolicy

Default retry policy without jitter.

Moderate settings suitable for general use:

  • 3 attempts total
  • 100ms base delay
  • 5000ms max delay
  • 2.0 multiplier (exponential)
  • No jitter

For distributed systems, prefer default_with_jitter().

pub fn default_with_jitter() -> RetryPolicy

Default retry policy with full jitter (RECOMMENDED).

Same as default() but with FullJitter enabled. This is the recommended policy for distributed systems to prevent thundering herd problems.

Delay progression (ranges shown; actual values are randomized within them):

  • Attempt 1: random(0, 100ms)
  • Attempt 2: random(0, 200ms)
  • Attempt 3: random(0, 400ms)

pub fn delay_ms(policy: RetryPolicy, attempt: Int) -> Int

Get just the delay value in milliseconds.

Convenience function when you don’t need the full DelayResult.

let delay = retry.delay_ms(policy, attempt)
process.sleep(delay)

pub fn is_final_attempt(
  policy: RetryPolicy,
  attempt: Int,
) -> Bool

Check if this is the final attempt.

case retry.is_final_attempt(policy, attempt) {
  True -> log.error("Final attempt, no more retries")
  False -> log.warn("Retrying...")
}

pub fn jitter_to_string(jitter: JitterStrategy) -> String

Convert jitter strategy to string for logging.

pub fn no_retry() -> RetryPolicy

No retry policy (single attempt only).

Use when retry is handled at a different layer or for operations that should not be retried (e.g., non-idempotent operations).

pub fn policy_to_string(policy: RetryPolicy) -> String

Convert retry policy to a loggable string representation.

pub fn should_retry(policy: RetryPolicy, attempt: Int) -> Bool

Check if we should retry after this attempt.

Returns True if attempt < max_attempts.

case retry.should_retry(policy, attempt) {
  True -> {
    process.sleep(retry.delay_ms(policy, attempt))
    try_operation(attempt + 1)
  }
  False -> Error(MaxRetriesExceeded)
}
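The fragment above can be expanded into a complete retry loop. This is a minimal sketch: run, do_run, and the RetryError type are hypothetical names introduced for illustration, and process.sleep is assumed to come from gleam/erlang/process.

```gleam
import distribute/retry
import gleam/erlang/process

// Hypothetical error type for this sketch; carries the last failure.
pub type RetryError(e) {
  MaxRetriesExceeded(last_error: e)
}

// Runs `operation` until it succeeds or the policy is exhausted.
pub fn run(
  policy: retry.RetryPolicy,
  operation: fn() -> Result(a, e),
) -> Result(a, RetryError(e)) {
  do_run(policy, operation, 1)
}

fn do_run(
  policy: retry.RetryPolicy,
  operation: fn() -> Result(a, e),
  attempt: Int,
) -> Result(a, RetryError(e)) {
  case operation() {
    Ok(value) -> Ok(value)
    Error(reason) ->
      case retry.should_retry(policy, attempt) {
        True -> {
          // Wait for the delay that belongs to the attempt that just failed.
          process.sleep(retry.delay_ms(policy, attempt))
          do_run(policy, operation, attempt + 1)
        }
        False -> Error(MaxRetriesExceeded(reason))
      }
  }
}
```

A call such as run(retry.default_with_jitter(), fn() { fetch() }) would retry a hypothetical fetch function under the recommended policy. Keeping the last error in MaxRetriesExceeded is a design choice that lets callers report why the final attempt failed.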
pub fn total_attempts(policy: RetryPolicy) -> Int

Get the total number of attempts that will be made.

let total = retry.total_attempts(policy)
log.info("Will try up to " <> int.to_string(total) <> " times")

pub fn with_base_delay_ms(
  policy: RetryPolicy,
  delay_ms: Int,
) -> RetryPolicy

Set the base delay in milliseconds.

retry.default()
|> retry.with_base_delay_ms(200)

pub fn with_full_jitter(policy: RetryPolicy) -> RetryPolicy

Enable full jitter (convenience function).

retry.default()
|> retry.with_full_jitter()

pub fn with_jitter(
  policy: RetryPolicy,
  jitter: JitterStrategy,
) -> RetryPolicy

Set the jitter strategy.

retry.default()
|> retry.with_jitter(retry.FullJitter)

pub fn with_max_attempts(
  policy: RetryPolicy,
  attempts: Int,
) -> RetryPolicy

Set the maximum number of attempts, including the initial one (1 = no retry).

retry.default()
|> retry.with_max_attempts(5)

pub fn with_max_delay_ms(
  policy: RetryPolicy,
  delay_ms: Int,
) -> RetryPolicy

Set the maximum delay cap in milliseconds.

retry.default()
|> retry.with_max_delay_ms(10_000)

pub fn with_multiplier(
  policy: RetryPolicy,
  multiplier: Float,
) -> RetryPolicy

Set the backoff multiplier.

retry.default()
|> retry.with_multiplier(1.5)  // Slower growth
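
The builder functions can be chained to derive a custom policy from a preset. In the sketch below, payment_policy is a hypothetical name and the values are illustrative only.

```gleam
import distribute/retry

// Hypothetical policy: starts from the default preset and overrides
// every field through the with_* builders documented above.
pub fn payment_policy() -> retry.RetryPolicy {
  retry.default()
  |> retry.with_max_attempts(4)
  |> retry.with_base_delay_ms(200)
  |> retry.with_max_delay_ms(10_000)
  |> retry.with_multiplier(1.5)
  |> retry.with_jitter(retry.EqualJitter)
}
```

Starting from a preset keeps any field that is not overridden at a known value.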