Error Handling with Compensation and Undo
In this tutorial, you'll learn how to make your reactors resilient by adding proper error handling, retry logic, and rollback capabilities.
What you'll build
You'll enhance the user registration workflow from the first tutorial to handle:
- Network failures with automatic retries
- Validation errors with graceful failure
- Rollback scenarios when later steps fail
- Email service failures with compensation
You'll learn
- Implementing compensation for retryable errors
- Adding undo logic for rollback scenarios
- The difference between compensation and undo
- Building resilient workflows that handle failures gracefully
Error Handling Flow
Here's how Reactor handles errors through compensation and undo:
sequenceDiagram
participant Reactor
participant StepA
participant StepB
participant StepC
Reactor->>StepA: run()
StepA-->>Reactor: {:ok, result}
Reactor->>StepB: run()
StepB-->>Reactor: {:ok, result}
Reactor->>StepC: run()
StepC-->>Reactor: {:error, reason}
Note over Reactor: Compensate the failed step
Reactor->>StepC: compensate()
StepC-->>Reactor: :ok
Note over Reactor: Undo completed steps in reverse order
Reactor->>StepB: undo()
StepB-->>Reactor: :ok
Reactor->>StepA: undo()
StepA-->>Reactor: :ok
Reactor-->>Reactor: Return compensated error
Prerequisites
- Complete the Getting Started tutorial
- Basic knowledge of Elixir error handling
Step 1: Set up the project
If you don't have the project from the previous tutorial, create it:
mix igniter.new reactor_tutorial --install reactor
cd reactor_tutorial
Step 2: Understanding Reactor error handling
Reactor provides two main mechanisms for error handling:
Compensation
When: A step fails during execution
Purpose: Decide whether to retry, continue, or fail the reactor
Return values:
- :retry - Try the step again
- {:continue, value} - Continue execution with the provided value
- :ok - Successfully compensated, but still triggers rollback
- {:error, reason} - Fail the entire reactor
Undo
When: A step succeeded but a later step failed
Purpose: Roll back the successful step's changes
Return values:
- :ok - Successfully undone
- {:error, reason} - Failed to undo (this will fail the reactor)
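To make the ordering concrete, here is a minimal sketch in plain Elixir of how "undo the steps that already succeeded" can work. It is a toy model, not Reactor's implementation: the `SagaSketch` module, the step maps, and the three-element error tuple are all assumptions made for illustration.

```elixir
defmodule SagaSketch do
  @moduledoc "Toy model of rollback ordering; not Reactor's implementation."

  # Runs steps in order. Each step is a map with :name, :run and :undo keys.
  def execute(steps), do: do_execute(steps, [])

  # Every step succeeded: report results in execution order.
  defp do_execute([], completed) do
    results = for {step, result} <- Enum.reverse(completed), do: {step.name, result}
    {:ok, results}
  end

  defp do_execute([step | rest], completed) do
    case step.run.() do
      {:ok, result} ->
        do_execute(rest, [{step, result} | completed])

      {:error, reason} ->
        # `completed` is newest-first, so undo runs in reverse execution order.
        undone =
          Enum.map(completed, fn {done, result} ->
            :ok = done.undo.(result)
            done.name
          end)

        {:error, reason, undone}
    end
  end
end

# Hypothetical steps: the third one fails, so the first two are undone.
steps = [
  %{name: :create_user, run: fn -> {:ok, %{id: 1}} end, undo: fn _user -> :ok end},
  %{name: :send_email, run: fn -> {:ok, "msg_1"} end, undo: fn _msg -> :ok end},
  %{name: :notify_admin, run: fn -> {:error, :service_down} end, undo: fn _ -> :ok end}
]

SagaSketch.execute(steps)
# => {:error, :service_down, [:send_email, :create_user]}
```

The failed step itself is never undone, because it produced no result to roll back. Deciding what to do about the failure itself is what compensation is for.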
Step 3: Create services with realistic error handling
Let's create services that demonstrate different types of failures. Create lib/email_service.ex:
defmodule EmailService do
  use Reactor.Step

  # Simulate realistic email service failures based on email content
  @impl true
  def run(arguments, _context, _options) do
    email = arguments.email

    cond do
      # Simulate network timeout (temporary failure)
      String.contains?(email, "timeout") ->
        {:error, %{type: :network_timeout, message: "Network timeout - please retry"}}

      # Simulate rate limiting (temporary failure)
      String.contains?(email, "ratelimit") ->
        {:error, %{type: :rate_limit, message: "Rate limit exceeded - please retry"}}

      # Simulate blocked email (permanent failure)
      String.contains?(email, "blocked") ->
        {:error, %{type: :blocked_email, message: "Email address is blocked"}}

      # Simulate invalid email (permanent failure)
      not String.contains?(email, "@") ->
        {:error, %{type: :invalid_email, message: "Invalid email format"}}

      # Success case - all other emails work
      true ->
        {:ok, %{
          message_id: "msg_#{:rand.uniform(10000)}",
          sent_at: DateTime.utc_now(),
          recipient: email
        }}
    end
  end

  @impl true
  def compensate(error, _arguments, _context, _options) do
    case error do
      # Temporary failures - retry with helpful logging
      %{type: :network_timeout} ->
        IO.puts("🔄 Network timeout - retrying email send...")
        :retry

      %{type: :rate_limit} ->
        IO.puts("🔄 Rate limited - retrying email send...")
        :retry

      # Permanent failures - don't retry
      %{type: :blocked_email} ->
        IO.puts("❌ Email blocked - cannot retry")
        :ok

      %{type: :invalid_email} ->
        IO.puts("❌ Invalid email - cannot retry")
        :ok

      _other ->
        :ok
    end
  end

  @impl true
  def undo(result, _arguments, _context, _options) do
    IO.puts("📧 Canceling email #{result.message_id} to #{result.recipient}")
    :ok
  end
end

Now create lib/notification_service.ex for internal admin notifications:
defmodule NotificationService do
  use Reactor.Step

  @impl true
  def run(arguments, _context, _options) do
    user = arguments.user

    # Admin notifications always succeed (internal system)
    {:ok, %{
      notification_id: "notif_#{:rand.uniform(10000)}",
      sent_at: DateTime.utc_now(),
      message: "New user registered: #{user.email}"
    }}
  end

  @impl true
  def undo(result, _arguments, _context, _options) do
    IO.puts("🔔 Canceling admin notification #{result.notification_id}")
    :ok
  end
end

Step 4: Create a database service that needs rollback
Create lib/database_service.ex:
defmodule DatabaseService do
  use Reactor.Step

  @impl true
  def run(arguments, _context, _options) do
    user = %{
      id: :rand.uniform(10000),
      email: arguments.email,
      password_hash: arguments.password_hash,
      created_at: DateTime.utc_now()
    }

    {:ok, user}
  end

  @impl true
  def compensate(_error, _arguments, _context, _options) do
    # Database errors are usually retryable
    :retry
  end

  @impl true
  def undo(user, _arguments, _context, _options) do
    IO.puts("Rolling back user creation for #{user.email} (ID: #{user.id})")
    :ok
  end
end

Step 5: Build a reactor with error handling
Now create lib/resilient_user_registration.ex:
defmodule ResilientUserRegistration do
  use Reactor

  input :email
  input :password

  step :validate_email do
    argument :email, input(:email)

    run fn %{email: email}, _context ->
      if String.contains?(email, "@") and String.length(email) > 5 do
        {:ok, email}
      else
        {:error, "Email must contain @ and be longer than 5 characters"}
      end
    end
  end

  step :hash_password do
    argument :password, input(:password)

    run fn %{password: password}, _context ->
      if String.length(password) >= 8 do
        # SHA-256 keeps the demo dependency-free; use a dedicated password
        # hashing library (such as bcrypt or Argon2) in real applications.
        hashed = :crypto.hash(:sha256, password) |> Base.encode16()
        {:ok, hashed}
      else
        {:error, "Password must be at least 8 characters"}
      end
    end
  end

  step :create_user, DatabaseService do
    argument :email, result(:validate_email)
    argument :password_hash, result(:hash_password)
    max_retries 3
  end

  step :send_welcome_email, EmailService do
    argument :email, result(:validate_email)
    argument :user, result(:create_user)
    max_retries 2
  end

  step :send_admin_notification, NotificationService do
    argument :user, result(:create_user)
    max_retries 1
  end

  return :create_user
end

Step 6: Test the error handling
Let's test our reactor in IEx:
iex -S mix
# ✅ SUCCESS: Normal email succeeds
{:ok, user} = Reactor.run(ResilientUserRegistration, %{
  email: "alice@example.com",
  password: "secretpassword123"
})

# 🔄 RETRY: Network timeout triggers retry logic
{:error, reason} = Reactor.run(ResilientUserRegistration, %{
  email: "timeout@example.com",  # Will trigger network timeout
  password: "secretpassword123"
})

# 🔄 RETRY: Rate limiting triggers retry logic
{:error, reason} = Reactor.run(ResilientUserRegistration, %{
  email: "ratelimit@example.com",  # Will trigger rate limit
  password: "secretpassword123"
})

# ❌ PERMANENT FAILURE: Blocked email fails immediately
{:error, reason} = Reactor.run(ResilientUserRegistration, %{
  email: "blocked@example.com",  # Will fail permanently
  password: "secretpassword123"
})

# ❌ VALIDATION FAILURE: Invalid inputs fail immediately
{:error, reason} = Reactor.run(ResilientUserRegistration, %{
  email: "invalid-email",  # No @ symbol
  password: "short"        # Too short
})

Step 7: Understanding the behaviour
When you run the tests, you'll see different behaviours based on the email content:
Successful execution (alice@example.com): All steps succeed, user is created, welcome email is sent, and admin notification is sent.
Retry scenarios:
- timeout@example.com - Triggers network timeout, compensation returns :retry, step retries up to max_retries limit
- ratelimit@example.com - Triggers rate limiting, compensation returns :retry, step retries up to max_retries limit
Permanent failures:
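- blocked@example.com - Email is blocked, compensation returns :ok (no retry)
- Addresses without @ - The email service reports an invalid format and compensation returns :ok (no retry). In this reactor the validate_email step rejects such input first, so this branch only acts as a safety net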
Validation failures: Invalid input (short passwords, malformed emails) fails immediately without retries - these are caught by the validation steps before reaching the email service.
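The interaction between compensation and max_retries can be modelled with a small loop in plain Elixir. This is a hedged sketch rather than Reactor's executor: `RetrySketch` and its return tuples are hypothetical, and it assumes max_retries counts retries after the first attempt (Reactor's exact counting may differ).

```elixir
defmodule RetrySketch do
  # Runs `run_fun`, asking `compensate_fun` what to do after each failure.
  # Assumption: `max_retries` counts retries, so the step runs at most
  # max_retries + 1 times. Returns the outcome plus the number of attempts.
  def run_with_retries(run_fun, compensate_fun, max_retries, attempt \\ 0) do
    case run_fun.(attempt) do
      {:ok, result} ->
        {:ok, result, attempt + 1}

      {:error, reason} ->
        case compensate_fun.(reason) do
          :retry when attempt < max_retries ->
            run_with_retries(run_fun, compensate_fun, max_retries, attempt + 1)

          _give_up ->
            {:error, reason, attempt + 1}
        end
    end
  end
end

# Same decision table as the tutorial's EmailService.compensate/4.
compensate = fn
  %{type: :network_timeout} -> :retry
  _other -> :ok
end

# Fails twice with a timeout, then succeeds on the third attempt.
flaky = fn
  attempt when attempt < 2 -> {:error, %{type: :network_timeout}}
  _attempt -> {:ok, :sent}
end

RetrySketch.run_with_retries(flaky, compensate, 2)
# => {:ok, :sent, 3}

# A permanent failure is never retried, whatever max_retries says.
always_blocked = fn _attempt -> {:error, %{type: :blocked_email}} end

RetrySketch.run_with_retries(always_blocked, compensate, 2)
# => {:error, %{type: :blocked_email}, 1}
```

The retry budget only matters when compensation asks for a retry. A step whose compensation returns :ok fails after a single attempt.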
Step 8: Adding retry backoff for better resilience
When steps retry immediately, they might overwhelm failing external services. Reactor supports backoff - adding delays between retry attempts. Importantly, the executor doesn't block during backoff - it continues processing other ready steps while the failed step waits to be rescheduled.
Note
Backoff delays are minimum delays - the actual retry time will be at least the specified delay, but may be longer because the executor prioritises processing other ready steps before checking for expired backoffs.

Let's enhance our email service with intelligent retry delays.
Understanding backoff timing
Here's how backoff integrates with Reactor's retry flow:
sequenceDiagram
participant Executor
participant Step
participant BackoffStep as Step.backoff/4
participant Scheduler
participant OtherSteps as Other Ready Steps
Executor->>Step: run()
Step-->>Executor: {:error, reason}
Note over Executor: Begin compensation
Executor->>Step: compensate()
Step-->>Executor: :retry
Note over Executor: Check for backoff
Executor->>BackoffStep: backoff(error, args, context, step)
BackoffStep-->>Executor: 5000 (5 seconds)
Note over Executor: Schedule retry after delay
Executor->>Scheduler: schedule step retry in 5000ms
Note over Executor: Continue with other work
Executor->>OtherSteps: process ready steps
OtherSteps-->>Executor: results
Note over Scheduler: 5 seconds later...
Scheduler-->>Executor: step ready for retry
Note over Executor: Retry now ready
Executor->>Step: run()
Step-->>Executor: {:ok, result}
Implementing backoff strategies
Update the EmailService to include backoff logic:
defmodule EmailService do
  use Reactor.Step

  @impl true
  def run(arguments, _context, _options) do
    email = arguments.email

    cond do
      String.contains?(email, "timeout") ->
        {:error, %{type: :network_timeout, message: "Network timeout - please retry"}}

      String.contains?(email, "ratelimit") ->
        {:error, %{type: :rate_limit, message: "Rate limit exceeded - please retry"}}

      String.contains?(email, "blocked") ->
        {:error, %{type: :blocked_email, message: "Email address is blocked"}}

      not String.contains?(email, "@") ->
        {:error, %{type: :invalid_email, message: "Invalid email format"}}

      true ->
        {:ok, %{
          message_id: "msg_#{:rand.uniform(10000)}",
          sent_at: DateTime.utc_now(),
          recipient: email
        }}
    end
  end

  @impl true
  def compensate(error, _arguments, _context, _options) do
    case error do
      %{type: :network_timeout} ->
        IO.puts("🔄 Network timeout - retrying email send...")
        :retry

      %{type: :rate_limit} ->
        IO.puts("🔄 Rate limited - retrying email send...")
        :retry

      %{type: :blocked_email} ->
        IO.puts("❌ Email blocked - cannot retry")
        :ok

      %{type: :invalid_email} ->
        IO.puts("❌ Invalid email - cannot retry")
        :ok

      _other ->
        :ok
    end
  end

  # NEW: Backoff implementation
  @impl true
  def backoff(error, _arguments, context, _step) do
    case error do
      %{type: :network_timeout} ->
        # Exponential backoff for network issues, capped at 30 seconds
        retry_count = Map.get(context, :current_try, 0)
        delay_ms = (:math.pow(2, retry_count) * 1000) |> round() |> min(30_000)
        IO.puts("⏰ Network timeout - backing off for #{delay_ms}ms")
        delay_ms

      %{type: :rate_limit} ->
        # Longer fixed delay for rate limiting
        delay_ms = 10_000  # 10 seconds
        IO.puts("⏰ Rate limited - backing off for #{delay_ms}ms")
        delay_ms

      _other ->
        # No backoff for non-retryable errors
        :now
    end
  end

  @impl true
  def undo(result, _arguments, _context, _options) do
    IO.puts("📧 Canceling email #{result.message_id} to #{result.recipient}")
    :ok
  end
end

Using backoff in DSL steps
You can also define backoff logic directly in DSL steps when using anonymous functions for run, compensate, etc. (The DSL backoff option is not available when using implementation modules):
defmodule BackoffUserRegistration do
  use Reactor

  input :email
  input :password

  step :validate_email do
    argument :email, input(:email)

    run fn %{email: email}, _context ->
      if String.contains?(email, "@") and String.length(email) > 5 do
        {:ok, email}
      else
        {:error, "Email must contain @ and be longer than 5 characters"}
      end
    end
  end

  step :hash_password do
    argument :password, input(:password)

    run fn %{password: password}, _context ->
      if String.length(password) >= 8 do
        hashed = :crypto.hash(:sha256, password) |> Base.encode16()
        {:ok, hashed}
      else
        {:error, "Password must be at least 8 characters"}
      end
    end
  end

  step :create_user do
    argument :email, result(:validate_email)
    argument :password_hash, result(:hash_password)
    max_retries 3

    run fn %{email: email, password_hash: hash}, _context ->
      user = %{
        id: :rand.uniform(10000),
        email: email,
        password_hash: hash,
        created_at: DateTime.utc_now()
      }

      {:ok, user}
    end

    compensate fn _error, _args, _context ->
      :retry  # Database errors are usually retryable
    end

    # DSL backoff function (only available with anonymous run functions)
    backoff fn _error, _args, context, _step ->
      retry_count = Map.get(context, :current_try, 0)
      # Exponential backoff: 1s, 2s, 4s, 8s...
      delay = (:math.pow(2, retry_count) * 1000) |> round()
      IO.puts("🔄 Database retry #{retry_count + 1} - waiting #{delay}ms")
      delay
    end
  end

  step :send_welcome_email, EmailService do
    argument :email, result(:validate_email)
    argument :user, result(:create_user)
    max_retries 3
    # EmailService module has its own backoff/4 callback
  end

  step :send_admin_notification, NotificationService do
    argument :user, result(:create_user)
    max_retries 1
  end

  return :create_user
end

Testing backoff behaviour
Test the improved retry behaviour:
# This will now retry with exponential backoff delays
{:error, reason} = Reactor.run(BackoffUserRegistration, %{
  email: "timeout@example.com",  # Triggers network timeout with backoff
  password: "secretpassword123"
})

# Watch the console output:
# 🔄 Network timeout - retrying email send...
# ⏰ Network timeout - backing off for 1000ms
# (1 second delay)
# 🔄 Network timeout - retrying email send...
# ⏰ Network timeout - backing off for 2000ms
# (2 second delay)
# 🔄 Network timeout - retrying email send...
# ⏰ Network timeout - backing off for 4000ms
# (4 second delay - final retry)

Backoff strategies explained
Exponential backoff: Doubles delay each retry (1s, 2s, 4s, 8s...) - good for network issues and service overload.
Fixed backoff: Same delay each time - good for rate limiting where you know the reset interval.
No backoff: Use :now for errors that don't benefit from delays.
Custom strategies: Implement any timing logic based on error type, retry count, or external factors.
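The delay calculations behind these strategies are plain arithmetic, so they can be tried without Reactor. The sketch below uses a hypothetical `BackoffSketch` module; the default base, cap, and fixed delay mirror the tutorial's values. The jittered variant is not part of the tutorial: it is one common example of a custom strategy, included here as an assumption.

```elixir
defmodule BackoffSketch do
  # Exponential: base_ms * 2^try, capped at cap_ms.
  def exponential(try, base_ms \\ 1_000, cap_ms \\ 30_000) do
    min(base_ms * Integer.pow(2, try), cap_ms)
  end

  # Fixed: the same delay on every retry.
  def fixed(delay_ms \\ 10_000), do: delay_ms

  # Custom: a random delay between 0 and the exponential delay, so that many
  # clients retrying at once do not all hit the service at the same moment.
  def jittered(try, base_ms \\ 1_000, cap_ms \\ 30_000) do
    :rand.uniform(exponential(try, base_ms, cap_ms) + 1) - 1
  end
end

Enum.map(0..5, &BackoffSketch.exponential/1)
# => [1000, 2000, 4000, 8000, 16000, 30000]
```

A function like `exponential/3` could be called from a backoff callback with the current try count taken from the context, as the EmailService example does with :current_try.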
What you learned
You now understand Reactor's error handling mechanisms:
- Compensation handles step failures with retry logic
- Backoff strategies add intelligent delays between retry attempts to prevent overwhelming services
- Undo operations roll back successful steps when later steps fail
- Max retries controls how many times compensation can retry a step
- Error types should be handled differently (retry vs fail, with or without backoff)
- Context contains retry state for intelligent retry and backoff logic
- DSL backoff functions allow inline backoff logic without full step modules
What's next
Now that you can handle errors, you're ready for more advanced concepts:
- Async Workflows - Explore concurrent processing patterns
- Composition - Build complex workflows with sub-reactors
- Testing Strategies - Learn how to test error scenarios
Common issues
Steps retry infinitely: Always set max_retries and ensure compensation doesn't always return :retry
Undo operations fail: Make undo operations idempotent - they should succeed even if called multiple times
Reactor fails instead of retrying: Check that your compensation function returns :retry, not {:error, reason}
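As an illustration of the idempotency advice, here is a minimal sketch of an undo that succeeds even when the work has already been rolled back. The `UserStoreSketch` module is a hypothetical in-memory stand-in for a real users table, not part of Reactor or of the tutorial project.

```elixir
defmodule UserStoreSketch do
  @moduledoc "Hypothetical in-memory stand-in for a users table, keyed by id."

  def insert(store, user), do: Map.put(store, user.id, user)

  # Idempotent undo: deleting an id that is already gone still succeeds,
  # so a repeated or retried undo cannot fail on "not found".
  def undo_insert(store, user), do: {:ok, Map.delete(store, user.id)}
end

user = %{id: 42, email: "alice@example.com"}
store = UserStoreSketch.insert(%{}, user)

{:ok, store} = UserStoreSketch.undo_insert(store, user)

# Calling undo a second time is harmless and leaves the store unchanged.
{:ok, ^store} = UserStoreSketch.undo_insert(store, user)
```

With a real database the same idea usually means treating "row not found" during a delete as success rather than as an error.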
Happy building resilient workflows! 🛡️