Overview

OpenResponses is a production-grade Elixir implementation of the Open Responses specification — a provider-agnostic API for interacting with large language models.

What it does

At its simplest: you send a request, you get a response. Behind the scenes, OpenResponses manages the agentic loop, routes to the right provider, streams events back to your client, dispatches tool calls, and tracks conversation history — all without you writing any of that infrastructure.

POST /v1/responses
{
  "model": "gpt-4o",
  "input": [{"role": "user", "content": "What's the weather in London?"}],
  "tools": [{"type": "function", "name": "get_weather", "parameters": {...}}]
}

Your client receives a stream of Server-Sent Events, or a single JSON response — your choice.
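For the streaming case, a client has to split the response body into Server-Sent Event frames. The following is a minimal sketch of that step, not part of OpenResponses itself: the `SSESketch` module is hypothetical, the event names in the sample are illustrative assumptions, and it assumes frames are separated by a blank line with `\n` line endings.

```elixir
defmodule SSESketch do
  @moduledoc "Hypothetical helper: splits a raw SSE body into {event, data} tuples."

  # Frames are separated by a blank line; each frame holds `field: value` lines.
  def parse(raw) when is_binary(raw) do
    raw
    |> String.split("\n\n", trim: true)
    |> Enum.map(&parse_frame/1)
  end

  defp parse_frame(frame) do
    fields =
      for line <- String.split(frame, "\n", trim: true),
          [key, value] <- [String.split(line, ":", parts: 2)] do
        # Drop the single optional space that follows the colon.
        {key, String.replace_prefix(value, " ", "")}
      end

    # A frame without an `event` field defaults to the "message" event type.
    event =
      Enum.find_value(fields, "message", fn
        {"event", value} -> value
        _other -> nil
      end)

    # Multiple `data` lines in one frame are joined with newlines.
    data = for({"data", value} <- fields, do: value) |> Enum.join("\n")

    {event, data}
  end
end

# Sample body; the event names are assumptions made for illustration.
raw =
  "event: response.output_text.delta\ndata: {\"delta\":\"Hel\"}\n\n" <>
    "event: response.completed\ndata: {}\n\n"

IO.inspect(SSESketch.parse(raw))
```

The data payloads are left as raw strings here; a real client would decode them with whichever JSON library it already uses.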

Why Elixir?

The Open Responses spec describes a system that maps directly onto what the BEAM was built for:

| Spec requirement | BEAM advantage |
| --- | --- |
| Agentic loop with tool dispatch | One GenServer per request: crash isolation, OTP supervision |
| Semantic SSE streaming | Phoenix handles chunked HTTP natively; PubSub fans out to multiple consumers |
| Multi-provider concurrency | Thousands of simultaneous loops on a single node without threads |
| State machine on responses | AshStateMachine enforces valid transitions at compile time |
| Extensible provider routing | Pattern-match on model names in config; no code changes |

Supported providers

| Provider | Model pattern | Notes |
| --- | --- | --- |
| OpenAI | gpt-* | Near pass-through |
| Anthropic | claude-* | Full event translation |
| Google Gemini | gemini-* | contents/parts format |
| Ollama | llama*, mistral*, phi*, qwen* | Local models, no API key |
| z.ai | Any | Use Anthropic adapter with custom base_url |
| Custom | Anything | Implement OpenResponses.Adapter |
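Routing by model pattern can be pictured as a lookup over an ordered list of patterns. The sketch below is only an illustration of that idea: `RoutingSketch`, its route list, and the adapter atoms are hypothetical placeholders, and the library's actual configuration format and adapter module names may differ.

```elixir
defmodule RoutingSketch do
  # Hypothetical route table mirroring the provider table above.
  # The adapter atoms are placeholders, not the library's module names.
  @routes [
    {"gpt-*", :openai},
    {"claude-*", :anthropic},
    {"gemini-*", :gemini},
    {"llama*", :ollama},
    {"mistral*", :ollama},
    {"phi*", :ollama},
    {"qwen*", :ollama}
  ]

  # Returns {:ok, adapter} for the first matching pattern, otherwise :error.
  def resolve(model, routes \\ @routes) when is_binary(model) do
    Enum.find_value(routes, :error, fn {pattern, adapter} ->
      if matches?(pattern, model), do: {:ok, adapter}
    end)
  end

  # Only a single trailing "*" wildcard is supported in this sketch.
  defp matches?(pattern, model) do
    case String.split(pattern, "*", parts: 2) do
      [prefix, ""] -> String.starts_with?(model, prefix)
      [exact] -> model == exact
      _other -> false
    end
  end
end

IO.inspect(RoutingSketch.resolve("claude-3-5-sonnet"))
IO.inspect(RoutingSketch.resolve("unknown-model"))
```

Because the routes are plain data, adding a provider in this sketch means adding a tuple to the list rather than changing the matching code.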

Architecture in one diagram

Client
  │
  ▼
POST /v1/responses
  │
  ▼
ResponseController
  │  creates Response (Ash / ETS)
  ▼
LoopSupervisor.start_loop/1
  │
  ▼
Loop (GenServer, one per request)
  │  resolves adapter from model name
  │  applies middleware before_sample
  │  calls adapter.stream/2
  │  processes events
  │  dispatches tool calls
  │  applies middleware after_sample
  │  transitions state machine
  │  caches completed response
  │
  ├─ broadcasts events via PubSub → SSE chunks to client
  │
  └─ stores in ResponseCache → available for previous_response_id
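The "one GenServer per request" shape in the diagram can be sketched with only the standard library. This is an assumption-laden illustration, not the library's code: `LoopSketch` is hypothetical, the adapter is a plain two-argument function standing in for `adapter.stream/2`, and `send/2` to a subscriber process stands in for the PubSub broadcast. Middleware, tool dispatch, and the state machine are left out.

```elixir
defmodule LoopSketch do
  # :temporary means a finished or crashed loop is not restarted.
  use GenServer, restart: :temporary

  # Hypothetical entry point; the real LoopSupervisor.start_loop/1 may differ.
  def start_loop(supervisor, request) do
    DynamicSupervisor.start_child(supervisor, {__MODULE__, request})
  end

  def start_link(request), do: GenServer.start_link(__MODULE__, request)

  @impl true
  def init(request) do
    # Defer the work so that start_child/2 returns without waiting for it.
    {:ok, request, {:continue, :run}}
  end

  @impl true
  def handle_continue(:run, %{adapter: adapter, subscriber: pid} = request) do
    # Stand-in for adapter.stream/2: a function returning a list of events.
    for event <- adapter.(request.model, request.input) do
      # Stand-in for the PubSub broadcast shown in the diagram.
      send(pid, {:event, event})
    end

    send(pid, :completed)
    # The process exits normally once the request is done.
    {:stop, :normal, request}
  end
end

{:ok, sup} = DynamicSupervisor.start_link(strategy: :one_for_one)

# A fake adapter that echoes the input as a single delta event.
adapter = fn _model, input -> [{:delta, "echo: " <> input}] end

{:ok, _pid} =
  LoopSketch.start_loop(sup, %{
    model: "gpt-4o",
    input: "hi",
    adapter: adapter,
    subscriber: self()
  })
```

Each request gets its own process under the supervisor, so a crash while handling one request does not affect loops that are serving other requests.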

Next steps