Architecture

This guide explains the technical architecture and design decisions behind Ex Outlines. It covers system organization, key components, and implementation details that enable reliable structured LLM output validation.

System Overview
Module Organization
Schema Module Design
Validation Engine
Backend Architecture
Retry-Repair Loop Implementation
Batch Processing Design
Telemetry Integration
Ecto Integration
Extension Points
Design Decisions
Future Architecture

System Overview

Ex Outlines follows a modular architecture with clear separation of concerns. The system consists of four main layers:

┌─────────────────────────────────────────┐
│          Public API Layer               │
│         (ExOutlines module)             │
└─────────────┬───────────────────────────┘
              │
┌─────────────┴───────────────────────────┐
│       Schema & Validation Layer         │
│    (Spec.Schema, Spec modules)          │
└─────────────┬───────────────────────────┘
              │
┌─────────────┴───────────────────────────┐
│         Backend Layer                   │
│   (Backend.HTTP, Backend.Anthropic)     │
└─────────────┬───────────────────────────┘
              │
┌─────────────┴───────────────────────────┐
│         LLM APIs                        │
│   (OpenAI, Anthropic, etc.)             │
└─────────────────────────────────────────┘

Data Flow

1. The user calls ExOutlines.generate/2 with a schema and options.
2. Schema validates configuration.
3. The Prompt module constructs the initial prompt, including the JSON Schema.
4. The backend calls the LLM API.
5. The response is validated against the schema.
6. If the response is invalid, a repair prompt is constructed and steps 4 and 5 repeat, up to max_retries.
7. The validated result or an error is returned to the user.

Design Principles

Simple: Minimal abstractions, predictable behavior
Composable: Small modules with single responsibilities
Testable: All components can be tested in isolation
Concurrent: Leverage BEAM for parallel processing
Observable: Telemetry events for monitoring

Module Organization

lib/ex_outlines/
├── ex_outlines.ex                 # Public API
├── spec/
│   ├── spec.ex                    # Validation interface
│   └── schema.ex                  # Schema definition & validation
├── backend/
│   ├── backend.ex                 # Backend behaviour
│   ├── http.ex                    # OpenAI-compatible backend
│   ├── anthropic.ex               # Native Claude API backend
│   └── mock.ex                    # Testing backend
├── prompt.ex                      # Prompt construction (internal)
└── ecto.ex                        # Optional Ecto integration

Module Responsibilities

ExOutlines (Public API)

Main entry point: generate/2, generate_batch/2
Configuration validation
Telemetry event emission
Error handling and normalization

ExOutlines.Spec.Schema

Schema definition and storage
Field specification normalization
JSON Schema generation
Type validation dispatch

ExOutlines.Spec

Validation orchestration
Diagnostics generation
Key transformation (string to atom)

ExOutlines.Backend.*

LLM API communication
Request formatting
Response parsing
Error handling

ExOutlines.Prompt (Internal)

Initial prompt construction
Repair prompt generation
Message formatting

ExOutlines.Ecto (Optional)

Schema conversion from Ecto
Changeset validation extraction
Type mapping

Schema Module Design

The Schema module is the core of Ex Outlines' validation system.

Internal Representation

defmodule ExOutlines.Spec.Schema do
  @type t :: %__MODULE__{
    fields: %{atom() => field_spec()}
  }

  @type field_spec :: %{
    type: field_type(),
    required: boolean(),
    description: String.t() | nil,
    # Type-specific constraints...
  }

  @type field_type ::
    :string
    | :integer
    | :boolean
    | :number
    | {:enum, [any()]}
    | {:array, item_spec()}
    | {:object, t()}
    | {:union, [field_spec()]}

  defstruct fields: %{}
end

Normalization Process

When a schema is created, field specifications are normalized:

def new(fields) when is_map(fields) do
  normalized_fields =
    fields
    |> Enum.map(fn {name, spec} ->
      {to_atom(name), normalize_field_spec(spec)}
    end)
    |> Enum.into(%{})

  %__MODULE__{fields: normalized_fields}
end

Normalization includes:

Convert field names to atoms
Set default values (required: true, description: nil)
Apply Ecto normalization if available
Validate field specification structure
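The first two normalization steps can be sketched in isolation. This is a minimal illustration, not the library's code: the module name NormalizeSketch is hypothetical, the defaults are the ones listed above, and the Ecto step is left out.

```elixir
defmodule NormalizeSketch do
  # Defaults from the list above; caller-supplied values take precedence.
  @defaults %{required: true, description: nil}

  # Normalize a map of field name => field spec.
  def normalize_fields(fields) when is_map(fields) do
    Map.new(fields, fn {name, spec} ->
      {to_atom(name), normalize_field_spec(spec)}
    end)
  end

  # A spec must at least declare a :type; everything else is defaulted.
  def normalize_field_spec(%{type: _type} = spec), do: Map.merge(@defaults, spec)

  def normalize_field_spec(other) do
    raise ArgumentError,
          "field spec must be a map with a :type key, got: #{inspect(other)}"
  end

  # Field names come from developer-written schemas, so String.to_atom/1
  # is acceptable here (assumption: never called on LLM output).
  defp to_atom(name) when is_atom(name), do: name
  defp to_atom(name) when is_binary(name), do: String.to_atom(name)
end

NormalizeSketch.normalize_fields(%{
  "name" => %{type: :string},
  age: %{type: :integer, required: false}
})
```

Because Map.merge/2 prefers values from its second argument, an explicit required: false survives normalization while missing keys pick up the defaults.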

Validation Dispatch

Validation uses pattern matching for type dispatch:

defp validate_field_type(name, %{type: :string} = spec, value)
     when is_binary(value) do
  # Validate string constraints
  []
end

defp validate_field_type(name, %{type: :string}, value) do
  # Type mismatch error
  [build_error(name, :string, value)]
end

defp validate_field_type(name, %{type: :integer} = spec, value)
     when is_integer(value) do
  # Validate integer constraints
  []
end

# Pattern continues for each type...

This pattern provides:

Type safety through guard clauses
Clear error paths for type mismatches
Easy extension for new types
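A self-contained version of the dispatch pattern shows how a constraint check sits inside the guarded clause. The names here are assumptions for illustration: the module DispatchSketch, the :max_length constraint key, and the plain-map error shape are not taken from the library.

```elixir
defmodule DispatchSketch do
  # Value has the declared type: check constraints (only :max_length here).
  def validate_field_type(name, %{type: :string} = spec, value) when is_binary(value) do
    max = Map.get(spec, :max_length)

    if is_integer(max) and String.length(value) > max do
      [%{field: name, expected: "string of at most #{max} characters", got: value}]
    else
      []
    end
  end

  def validate_field_type(_name, %{type: :integer}, value) when is_integer(value), do: []

  def validate_field_type(_name, %{type: :boolean}, value) when is_boolean(value), do: []

  # Fallback: the value did not satisfy the guard for the declared type.
  def validate_field_type(name, %{type: type}, value) do
    [%{field: name, expected: inspect(type), got: value}]
  end
end

DispatchSketch.validate_field_type(:name, %{type: :string}, "Ada")
DispatchSketch.validate_field_type(:age, %{type: :integer}, "42")
```

Supporting a new type in this sketch means adding one guarded clause above the fallback; the fallback keeps producing the mismatch error for every type.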

JSON Schema Generation

Schemas can be converted to JSON Schema format for LLM prompts:

def to_json_schema(%Schema{fields: fields}) do
  %{
    "type" => "object",
    "properties" => generate_properties(fields),
    "required" => required_fields(fields)
  }
end

This allows the LLM to understand the expected output structure.
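The helpers generate_properties/1 and required_fields/1 are not shown above. One plausible shape for them, written as a standalone sketch over a plain fields map (the module name JsonSchemaSketch is hypothetical, and object and union types are omitted for brevity):

```elixir
defmodule JsonSchemaSketch do
  # Takes the normalized fields map directly instead of a %Schema{} struct.
  def to_json_schema(fields) when is_map(fields) do
    %{
      "type" => "object",
      "properties" => generate_properties(fields),
      "required" => required_fields(fields)
    }
  end

  defp generate_properties(fields) do
    Map.new(fields, fn {name, spec} -> {Atom.to_string(name), property(spec)} end)
  end

  # Names of fields marked required: true, sorted for stable output.
  defp required_fields(fields) do
    fields
    |> Enum.filter(fn {_name, spec} -> Map.get(spec, :required, false) end)
    |> Enum.map(fn {name, _spec} -> Atom.to_string(name) end)
    |> Enum.sort()
  end

  defp property(spec) do
    spec.type |> type_schema() |> put_description(spec)
  end

  defp type_schema(:string), do: %{"type" => "string"}
  defp type_schema(:integer), do: %{"type" => "integer"}
  defp type_schema(:number), do: %{"type" => "number"}
  defp type_schema(:boolean), do: %{"type" => "boolean"}
  defp type_schema({:enum, values}), do: %{"enum" => values}

  defp type_schema({:array, item_spec}),
    do: %{"type" => "array", "items" => property(item_spec)}

  # Only emit "description" when the spec carries one.
  defp put_description(schema, %{description: desc}) when is_binary(desc),
    do: Map.put(schema, "description", desc)

  defp put_description(schema, _spec), do: schema
end

JsonSchemaSketch.to_json_schema(%{
  name: %{type: :string, required: true, description: "Full name"},
  tags: %{type: {:array, %{type: :string}}, required: false}
})
```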

Validation Engine

The validation engine processes input data against schemas and collects errors.

Validation Algorithm

def validate(%Schema{fields: fields}, input) do
  # 1. Parse JSON if needed
  data = parse_input(input)

  # 2. Validate each field
  errors =
    fields
    |> Enum.flat_map(fn {name, spec} ->
      validate_field(name, spec, data)
    end)

  # 3. Build diagnostics
  if Enum.empty?(errors) do
    # 4. Transform keys and return
    validated = transform_keys(data)
    {:ok, validated}
  else
    diagnostics = %Diagnostics{valid?: false, errors: errors}
    {:error, diagnostics}
  end
end

Error Collection Strategy

Ex Outlines validates all fields before returning errors (not fail-fast):

Advantages:

Complete feedback in one pass
Better error messages for LLM repair
Fewer retry cycles

Implementation:

# Collect all errors using flat_map; a valid field contributes an empty list
errors =
  Enum.flat_map(fields, fn {name, spec} ->
    validate_field(name, spec, data)
  end)

Nested Validation

Nested objects are validated recursively:

defp validate_field_type(name, %{type: {:object, nested_schema}}, value)
     when is_map(value) do
  case Spec.validate(nested_schema, value) do
    {:ok, _} ->
      []
    {:error, diagnostics} ->
      # Prefix error paths with parent field name
      diagnostics.errors
      |> Enum.map(fn error ->
        prefix_error_path(error, name)
      end)
  end
end

This provides error paths like "address.city" for nested fields.
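The path-prefixing helper is small enough to sketch in full. This version assumes errors are plain maps whose :field key holds the path as a string; the library's actual error struct may differ, and ErrorPathSketch is a hypothetical module name.

```elixir
defmodule ErrorPathSketch do
  # Prepend the parent field name to the error's path, separated by a dot.
  def prefix_error_path(%{field: field} = error, parent) do
    %{error | field: "#{parent}.#{field}"}
  end
end

errors = [%{field: "city", message: "is required"}]

errors
|> Enum.map(&ErrorPathSketch.prefix_error_path(&1, :address))
|> Enum.map(&ErrorPathSketch.prefix_error_path(&1, :user))
# => [%{field: "user.address.city", message: "is required"}]
```

Because each nesting level prefixes the errors it receives from the level below, deeper structures build up the full path as the recursion unwinds.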

Array Validation

Arrays validate items with index tracking:

defp validate_array_items(name, item_spec, items) do
  items
  |> Enum.with_index()
  |> Enum.flat_map(fn {item, index} ->
    # Build the path as a string (e.g. "tags[2]") so error messages include
    # the index without creating a new atom for every array element
    validate_field_type("#{name}[#{index}]", item_spec, item)
  end)
end

Union Type Validation

Union types try each specification in order:

defp validate_field_type(name, %{type: {:union, specs}}, value) do
  # Enum.any?/2 tries the specs in order and stops at the first one
  # that validates without errors
  matches? =
    Enum.any?(specs, fn spec ->
      validate_field_type(name, spec, value) == []
    end)

  if matches? do
    # Success
    []
  else
    # All specs failed
    [build_union_error(name, specs, value)]
  end
end

Backend Architecture

Backends handle communication with LLM APIs through a common interface.

Backend Behaviour

defmodule ExOutlines.Backend do
  @type message :: %{
    role: String.t(),
    content: String.t()
  }

  @callback call_llm(messages :: [message()], opts :: keyword()) ::
    {:ok, String.t()} | {:error, term()}
end

HTTP Backend Implementation

The HTTP backend supports OpenAI-compatible APIs:

defmodule ExOutlines.Backend.HTTP do
  @behaviour ExOutlines.Backend

  @impl true
  def call_llm(messages, opts) do
    with {:ok, config} <- validate_config(opts),
         {:ok, body} <- build_request_body(messages, config),
         {:ok, response} <- make_http_request(config, body) do
      parse_response(response)
    end
  end

  defp validate_config(opts) do
    required = [:api_key, :model]
    case Enum.find(required, &(!Keyword.has_key?(opts, &1))) do
      nil -> {:ok, build_config(opts)}
      missing -> {:error, {:missing_config, missing}}
    end
  end

  defp make_http_request(config, body) do
    url = config.api_url
    headers = [
      {~c"content-type", ~c"application/json"},
      {~c"authorization", ~c"Bearer #{config.api_key}"}
    ]

    case :httpc.request(:post, {url, headers, ~c"application/json", body}, [], []) do
      {:ok, {{_, 200, _}, _, response_body}} ->
        {:ok, to_string(response_body)}
      {:ok, {{_, status, _}, _, _}} ->
        {:error, {:http_error, status}}
      {:error, reason} ->
        {:error, {:connection_error, reason}}
    end
  end
end

Anthropic Backend Implementation

The Anthropic backend handles Claude-specific API format:

defmodule ExOutlines.Backend.Anthropic do
  @behaviour ExOutlines.Backend

  @impl true
  def call_llm(messages, opts) do
    with {:ok, config} <- validate_config(opts),
         {:ok, {system, conversation}} <- extract_system_message(messages),
         {:ok, body} <- build_anthropic_body(system, conversation, config),
         {:ok, response} <- make_anthropic_request(config, body) do
      parse_anthropic_response(response)
    end
  end

  defp extract_system_message(messages) do
    case Enum.split_with(messages, &(&1.role == "system")) do
      {[system | _], rest} -> {:ok, {system.content, rest}}
      {[], rest} -> {:ok, {"", rest}}
    end
  end

  defp build_anthropic_body(system, messages, config) do
    body = %{
      model: config.model,
      max_tokens: config.max_tokens,
      system: system,
      messages: Enum.map(messages, &format_message/1)
    }
    Jason.encode(body)
  end
end

Mock Backend Implementation

The Mock backend provides deterministic responses for testing:

defmodule ExOutlines.Backend.Mock do
  @behaviour ExOutlines.Backend

  defstruct [:agent_pid, call_count: 0]

  def new(responses) when is_list(responses) do
    {:ok, agent_pid} = Agent.start_link(fn -> {responses, 0} end)
    %__MODULE__{agent_pid: agent_pid, call_count: 0}
  end

  @impl true
  def call_llm(_messages, opts) do
    mock = Keyword.get(opts, :mock)
    if mock, do: get_next_response(mock), else: {:error, :no_mock_provided}
  end

  defp get_next_response(%__MODULE__{agent_pid: pid}) do
    Agent.get_and_update(pid, fn {responses, count} ->
      case responses do
        [] -> {{:error, :no_more_responses}, {[], count + 1}}
        [response | rest] -> {response, {rest, count + 1}}
      end
    end)
  end
end

Retry-Repair Loop Implementation

The retry-repair loop is implemented in the main ExOutlines module.

Loop Structure

defp generate_loop(schema, messages, backend, backend_opts, attempt, max_retries) do
  # Emit telemetry
  :telemetry.execute([:ex_outlines, :generate, :attempt], %{attempt: attempt}, %{})

  # Call backend
  case backend.call_llm(messages, backend_opts) do
    {:ok, response_text} ->
      # Validate response
      case Spec.validate(schema, response_text) do
        {:ok, validated} ->
          # Success
          {:ok, validated}

        {:error, diagnostics} when attempt < max_retries ->
          # Build repair prompt
          repair_message = build_repair_message(diagnostics)
          new_messages = messages ++ [
            %{role: "assistant", content: response_text},
            %{role: "user", content: repair_message}
          ]

          # Retry
          generate_loop(schema, new_messages, backend, backend_opts, attempt + 1, max_retries)

        {:error, _diagnostics} ->
          # Max retries exceeded
          {:error, :max_retries_exceeded}
      end

    {:error, reason} ->
      # Backend error
      {:error, {:backend_error, reason}}
  end
end

Repair Prompt Construction

defp build_repair_message(%Diagnostics{errors: errors}) do
  error_list =
    errors
    |> Enum.map(fn error ->
      "- Field: #{error.field}\n  Expected: #{error.expected}\n  Got: #{inspect(error.got)}\n  Issue: #{error.message}"
    end)
    |> Enum.join("\n\n")

  """
  Your previous output had validation errors:

  #{error_list}

  Please provide corrected JSON that addresses all errors.
  Respond with valid JSON only.
  """
end
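To see what the model actually receives, the same formatting can be run against a sample error. This sketch takes a list of plain maps instead of a %Diagnostics{} struct so that it runs on its own; the module name RepairSketch and the sample error are illustrative.

```elixir
defmodule RepairSketch do
  # Same formatting as build_repair_message/1 above, over plain error maps.
  def build_repair_message(errors) do
    error_list =
      Enum.map_join(errors, "\n\n", fn error ->
        "- Field: #{error.field}\n  Expected: #{error.expected}\n  Got: #{inspect(error.got)}\n  Issue: #{error.message}"
      end)

    """
    Your previous output had validation errors:

    #{error_list}

    Please provide corrected JSON that addresses all errors.
    Respond with valid JSON only.
    """
  end
end

message =
  RepairSketch.build_repair_message([
    %{field: "age", expected: "integer", got: "42", message: "must be an integer"}
  ])

IO.puts(message)
# Prints:
#   Your previous output had validation errors:
#
#   - Field: age
#     Expected: integer
#     Got: "42"
#     Issue: must be an integer
#
#   Please provide corrected JSON that addresses all errors.
#   Respond with valid JSON only.
```

Using inspect/1 for the received value keeps the distinction between the string "42" and the integer 42 visible to the model.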

Telemetry Events

# Start event
:telemetry.execute(
  [:ex_outlines, :generate, :start],
  %{system_time: System.system_time()},
  %{schema: schema, backend: backend}
)

# Stop event
:telemetry.execute(
  [:ex_outlines, :generate, :stop],
  %{
    duration: duration,
    attempt_count: final_attempt
  },
  %{
    schema: schema,
    backend: backend,
    status: status
  }
)

Batch Processing Design

Batch processing uses Task.async_stream for concurrent generation.

Implementation

def generate_batch(tasks, opts \\ []) do
  max_concurrency = Keyword.get(opts, :max_concurrency, System.schedulers_online())
  timeout = Keyword.get(opts, :timeout, 60_000)
  ordered = Keyword.get(opts, :ordered, true)

  # Emit batch start telemetry
  :telemetry.execute(
    [:ex_outlines, :batch, :start],
    %{system_time: System.system_time(), total_tasks: length(tasks)},
    %{max_concurrency: max_concurrency}
  )

  start_time = System.monotonic_time()

  # Process tasks concurrently
  results =
    tasks
    |> Task.async_stream(
      fn {schema, task_opts} -> generate(schema, task_opts) end,
      max_concurrency: max_concurrency,
      timeout: timeout,
      ordered: ordered
    )
    |> Enum.map(fn
      {:ok, result} -> result
      {:exit, reason} -> {:error, {:task_exit, reason}}
    end)

  # Emit batch stop telemetry
  duration = System.monotonic_time() - start_time
  {success_count, error_count} = count_results(results)

  :telemetry.execute(
    [:ex_outlines, :batch, :stop],
    %{
      duration: duration,
      total_tasks: length(tasks),
      success_count: success_count,
      error_count: error_count
    },
    %{}
  )

  results
end

Concurrency Model

Ex Outlines leverages BEAM's lightweight processes:

Advantages:

Thousands of concurrent tasks possible
Fault isolation (one failure does not affect others)
Efficient CPU scheduling
Built-in backpressure through max_concurrency

Performance:

Sequential: total time ≈ N × average_time
Concurrent (max_concurrency=10): total time ≈ N × average_time / 10, assuming the calls are I/O-bound and the provider does not rate-limit them
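The estimate can be checked with a small experiment that replaces the LLM call with a sleep. This is a sketch under the assumption that each call is I/O-bound with a fixed latency; BatchTimingSketch and fake_call/2 are hypothetical names, and the measured times will vary by machine.

```elixir
defmodule BatchTimingSketch do
  # Stand-in for an LLM call: wait, then return the input tagged :ok.
  def fake_call(n, delay_ms) do
    Process.sleep(delay_ms)
    {:ok, n}
  end

  # Runs all inputs through fake_call/2 and returns {elapsed_ms, results}.
  def run(inputs, max_concurrency, delay_ms \\ 50) do
    {micros, results} =
      :timer.tc(fn ->
        inputs
        |> Task.async_stream(&fake_call(&1, delay_ms),
          max_concurrency: max_concurrency,
          timeout: 5_000
        )
        |> Enum.map(fn {:ok, result} -> result end)
      end)

    {div(micros, 1000), results}
  end
end

{sequential_ms, _results} = BatchTimingSketch.run(1..10, 1)
{concurrent_ms, _results} = BatchTimingSketch.run(1..10, 10)
IO.puts("sequential: #{sequential_ms}ms, concurrent: #{concurrent_ms}ms")
```

With max_concurrency: 1 the ten simulated calls run back to back; with max_concurrency: 10 they overlap, so the total approaches the latency of a single call.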

Telemetry Integration

Ex Outlines emits telemetry events for observability.

Event Design

Events follow the pattern: [:ex_outlines, operation, phase]

Generation events:

[:ex_outlines, :generate, :start] - Generation begins
[:ex_outlines, :generate, :stop] - Generation completes

Batch events:

[:ex_outlines, :batch, :start] - Batch processing begins
[:ex_outlines, :batch, :stop] - Batch processing completes

Measurement Data

Each event includes measurements and metadata:

# Generate stop event
:telemetry.execute(
  [:ex_outlines, :generate, :stop],
  %{
    duration: integer(),        # Native time units (convert with System.convert_time_unit/3)
    attempt_count: integer()    # Number of attempts
  },
  %{
    schema: Schema.t(),
    backend: module(),
    status: :ok | :error,
    error_reason: term() | nil
  }
)

Handler Example

:telemetry.attach(
  "ex-outlines-logger",
  [:ex_outlines, :generate, :stop],
  fn _event, measurements, metadata, _config ->
    duration_ms = System.convert_time_unit(measurements.duration, :native, :millisecond)

    level = if metadata.status == :ok, do: :info, else: :warning

    Logger.log(level, """
    Generation #{metadata.status}:
      Backend: #{inspect(metadata.backend)}
      Duration: #{duration_ms}ms
      Attempts: #{measurements.attempt_count}
    """)
  end,
  nil
)

Ecto Integration

The Ecto integration is optional and conditionally compiled.

Conditional Compilation

defmodule ExOutlines.Ecto do
  if Code.ensure_loaded?(Ecto) do
    # Ecto integration code
    def from_ecto_schema(ecto_schema, opts \\ []) do
      # Implementation
    end
  else
    def from_ecto_schema(_ecto_schema, _opts \\ []) do
      raise "Ecto is not available. Add {:ecto, \"~> 3.11\"} to your dependencies."
    end
  end
end

Schema Conversion

def from_ecto_schema(ecto_schema, opts) do
  fields =
    ecto_schema.__schema__(:fields)
    |> Enum.map(fn field_name ->
      type = ecto_schema.__schema__(:type, field_name)
      field_spec = convert_ecto_type(type, field_name, ecto_schema, opts)
      {field_name, field_spec}
    end)
    |> Enum.into(%{})

  Schema.new(fields)
end

Type Mapping

defp convert_ecto_type(:string, _name, _schema, _opts) do
  %{type: :string, required: false}
end

defp convert_ecto_type(:integer, _name, _schema, _opts) do
  %{type: :integer, required: false}
end

defp convert_ecto_type({:array, inner_type}, _name, _schema, opts) do
  item_spec = convert_ecto_type(inner_type, nil, nil, opts)
  %{type: {:array, item_spec}, required: false}
end

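# Note: this tuple shape matches Ecto 3.12 and later; earlier releases
# represent the type as {:parameterized, Ecto.Enum, params}.
defp convert_ecto_type({:parameterized, {Ecto.Enum, %{mappings: mappings}}}, _name, _schema, _opts) do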
  values = Keyword.keys(mappings)
  %{type: {:enum, values}, required: false}
end

Changeset Analysis

defp extract_validations_from_changeset(ecto_schema, changeset_function) do
  # Create sample changeset
  sample = struct(ecto_schema)
  changeset = apply(ecto_schema, changeset_function, [sample, %{}])

  # Extract required fields
  required_fields = extract_required_fields(changeset)

  # Extract validation rules
  validations = extract_validation_rules(changeset)

  {required_fields, validations}
end

Extension Points

Ex Outlines is designed for extensibility.

Custom Backends

Implement the Backend behaviour:

defmodule MyApp.CustomBackend do
  @behaviour ExOutlines.Backend

  @impl true
  def call_llm(messages, opts) do
    # Custom implementation
    # Must return {:ok, response_text} or {:error, reason}
  end
end
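A complete, if trivial, backend makes the return contract concrete. So that the example compiles without the library, it declares a local copy of the callback; BackendContractSketch, StaticBackendSketch, and the :response option are all hypothetical names. A real backend would declare @behaviour ExOutlines.Backend instead.

```elixir
defmodule BackendContractSketch do
  # Local copy of the callback shown in "Backend Behaviour".
  @callback call_llm(messages :: [map()], opts :: keyword()) ::
              {:ok, String.t()} | {:error, term()}
end

defmodule StaticBackendSketch do
  @behaviour BackendContractSketch

  # Returns the text passed in opts[:response] instead of calling an API.
  @impl true
  def call_llm(messages, opts) when is_list(messages) do
    case Keyword.fetch(opts, :response) do
      {:ok, text} when is_binary(text) -> {:ok, text}
      _ -> {:error, :missing_response}
    end
  end
end

StaticBackendSketch.call_llm(
  [%{role: "user", content: "Return a user as JSON"}],
  response: ~s({"name": "Ada"})
)
# => {:ok, "{\"name\": \"Ada\"}"}
```

The backend returns raw response text; parsing and validation stay in the validation layer, which is what keeps backends interchangeable.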

Custom Validators

Currently, validation is handled through the schema system. Future versions may support custom validator behaviours:

# Future API
defmodule MyApp.CustomValidator do
  @behaviour ExOutlines.Validator

  @impl true
  def validate(field_name, value, opts) do
    # Custom validation logic
    # Return [] for valid, [error] for invalid
  end
end

Telemetry Handlers

Attach custom telemetry handlers for monitoring:

:telemetry.attach_many(
  "my-app-ex-outlines",
  [
    [:ex_outlines, :generate, :start],
    [:ex_outlines, :generate, :stop],
    [:ex_outlines, :batch, :start],
    [:ex_outlines, :batch, :stop]
  ],
  &MyApp.Telemetry.handle_event/4,
  %{}
)

Design Decisions

Post-Generation Validation vs. Token-Level Constraint

Decision: Use post-generation validation with repair loop

Rationale:

Simpler implementation (no FSM compilation)
Backend-agnostic (works with any LLM API)
No special model support needed
Clear error diagnostics for debugging
LLMs are good at error correction

Trade-off:

More LLM calls on validation failures
Higher latency on repair cycles
Potentially higher API costs

Mitigation:

Configurable max_retries
Good initial prompts reduce retries
Telemetry for monitoring retry rates

Atom Keys vs. String Keys

Decision: Convert validated output to use atom keys

Rationale:

Elixir convention
Better pattern matching
Struct compatibility
Clearer intent in code

Implementation:

Input accepts string keys (JSON standard)
Output uses atom keys (Elixir standard)
Conversion happens after validation
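One way to perform the conversion safely is to drive it from the schema's field names rather than from the input's keys, so no atoms are ever created from LLM-controlled strings. The sketch below assumes a flat fields map and drops keys the schema does not declare; both choices, and the name KeySketch, are assumptions rather than the library's documented behavior.

```elixir
defmodule KeySketch do
  # For every declared field, look up its string key in the validated data.
  # Only atoms that already exist in the schema appear in the output.
  def transform_keys(data, fields) when is_map(data) and is_map(fields) do
    for {name, _spec} <- fields,
        key = Atom.to_string(name),
        Map.has_key?(data, key),
        into: %{} do
      {name, Map.fetch!(data, key)}
    end
  end
end

KeySketch.transform_keys(
  %{"name" => "Ada", "active" => false, "extra" => 1},
  %{name: %{type: :string}, active: %{type: :boolean}}
)
# => %{active: false, name: "Ada"}
```

Checking presence with Map.has_key?/2 rather than the value itself keeps false and nil values from being dropped. Nested objects would need the same treatment applied recursively.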

Error Collection vs. Fail-Fast

Decision: Collect all validation errors before returning

Rationale:

Complete feedback to LLM for repair
Fewer retry cycles
Better developer experience

Trade-off:

Slightly more computation per validation
More complex error structure

Benefit:

Single repair prompt can fix multiple issues
Reduced total LLM calls
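The difference between the two strategies is easy to see side by side. This standalone sketch uses a fixed three-field schema and string errors; CollectSketch and its helpers are illustrative and not part of the library.

```elixir
defmodule CollectSketch do
  @fields [name: :string, age: :integer, active: :boolean]

  # Collect-all: every field is checked and all errors are concatenated.
  def collect_all(data) do
    Enum.flat_map(@fields, &check(&1, data))
  end

  # Fail-fast: stop at the first field that reports an error.
  def fail_fast(data) do
    Enum.find_value(@fields, [], fn field ->
      case check(field, data) do
        [] -> nil
        errors -> errors
      end
    end)
  end

  defp check({name, type}, data) do
    value = Map.get(data, Atom.to_string(name))

    if valid?(type, value) do
      []
    else
      ["#{name}: expected #{type}, got #{inspect(value)}"]
    end
  end

  defp valid?(:string, value), do: is_binary(value)
  defp valid?(:integer, value), do: is_integer(value)
  defp valid?(:boolean, value), do: is_boolean(value)
end

bad = %{"name" => 1, "age" => "x", "active" => true}

CollectSketch.collect_all(bad)
# => ["name: expected string, got 1", "age: expected integer, got \"x\""]

CollectSketch.fail_fast(bad)
# => ["name: expected string, got 1"]
```

With fail-fast, the age error would only surface after a repair round had fixed name, costing an extra LLM call.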

Task.async_stream vs. GenServer Pool

Decision: Use Task.async_stream for batch processing

Rationale:

Built-in, no dependencies
Simple API
BEAM scheduler handles load balancing
Automatic cleanup

When to reconsider:

Need for persistent worker processes
Complex state management
Advanced backpressure handling
Custom scheduling logic

Future: May add GenStage support for advanced use cases

Behavior vs. Protocol for Backends

Decision: Use behavior for backend interface

Rationale:

Simpler for HTTP client abstraction
Compile-time guarantees
Clear contract with @callback

When protocols might be better:

Need for polymorphic dispatch
External implementations
Dynamic backend selection

Future Architecture

Planned Enhancements

Template System (v0.3)

# EEx-based prompt templates
template = ExOutlines.Template.new("""
<%= for example <- @examples do %>
Q: <%= example.question %>
A: <%= example.answer %>
<% end %>
Q: <%= @question %>
A:
""")

ExOutlines.generate(schema,
  template: template,
  assigns: %{examples: examples, question: question}
)

Streaming Support (v0.3)

# Incremental validation
ExOutlines.generate_stream(schema, opts)
|> Stream.each(fn
  {:partial, data} -> IO.write(data)
  {:complete, validated} -> IO.puts("\nDone")
  {:error, reason} -> IO.puts("Error: #{inspect(reason)}")
end)
|> Stream.run()

Generator Abstraction (v0.3)

# Reusable model + schema combination
generator = ExOutlines.Generator.new(
  backend: HTTP,
  backend_opts: opts,
  schema: user_schema
)

# Reuse compiled schema for multiple prompts
{:ok, user1} = ExOutlines.Generator.generate(generator, prompt1)
{:ok, user2} = ExOutlines.Generator.generate(generator, prompt2)

Context-Free Grammars (v0.4)

# Grammar-based validation
grammar = """
expression := term (('+' | '-') term)*
term := factor (('*' | '/') factor)*
factor := number | '(' expression ')'
number := [0-9]+
"""

schema = Schema.new(%{
  formula: %{type: {:grammar, grammar}}
})

Architectural Improvements

Caching Layer

# Cache schema compilation and LLM responses
config :ex_outlines,
  cache: [
    enabled: true,
    ttl: 3600,
    backend: ExOutlines.Cache.ETS
  ]

Circuit Breaker

# Prevent cascading failures
config :ex_outlines,
  circuit_breaker: [
    enabled: true,
    threshold: 5,
    timeout: 60_000
  ]

Middleware System

# Request/response middleware
ExOutlines.generate(schema,
  middleware: [
    MyApp.LoggingMiddleware,
    MyApp.RateLimitMiddleware,
    MyApp.CacheMiddleware
  ]
)

Summary

The Ex Outlines architecture prioritizes simplicity, testability, and composability. Its key architectural features are:

Modular Design - Clear separation of concerns across modules
Validation-First - Post-generation validation with repair loop
Backend Agnostic - Behavior-based backend system
BEAM Native - Leverage lightweight processes for concurrency
Observable - Comprehensive telemetry integration
Extensible - Clear extension points for customization

The architecture supports the current feature set while providing a foundation for future enhancements such as streaming, templates, and grammars.

For implementation details, see the source code in lib/ex_outlines/.
