Conversation History
OpenResponses maintains conversation history automatically using previous_response_id. You never need to replay prior messages — just reference the last response ID and OpenResponses reconstructs the full context.
How it works
When a response completes, OpenResponses stores it in ResponseCache (backed by Cachex). On the next request, if previous_response_id is present, the loop loads the prior response and prepends its input and output to the new request's input before sending to the provider.
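In code, that reconstruction step might look like the sketch below. The module name ResponseCache matches this page, but the function head, request shape, and field names are illustrative assumptions, not the actual OpenResponses internals:

```elixir
# Sketch: merge prior context before forwarding to the provider.
# ResponseCache.get/1 returning {:ok, prev} is an assumed API for illustration.
defp build_input(%{"previous_response_id" => prev_id} = request) do
  case ResponseCache.get(prev_id) do
    {:ok, prev} ->
      # prev.input already carries the accumulated history,
      # so one lookup is enough — no recursive chain walking.
      prev.input ++ prev.output ++ request["input"]

    {:error, :not_found} ->
      # No cached prior response: fall back to the new input alone.
      request["input"]
  end
end

defp build_input(request), do: request["input"]
```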
Request 2: previous_response_id = "resp_01"
│
┌─────────────┘
▼
ResponseCache.get("resp_01")
│
▼
prev.input + prev.output + new_input
│
▼
sent to provider
Basic usage
# Turn 1
curl -X POST http://localhost:4000/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "claude-sonnet-4-6",
"input": [{"role": "user", "content": "My favourite language is Elixir."}]
}'
# → {"id": "resp_abc", "status": "completed", ...}
# Turn 2 — no history needed in the request
curl -X POST http://localhost:4000/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "claude-sonnet-4-6",
"previous_response_id": "resp_abc",
"input": [{"role": "user", "content": "What is my favourite language?"}]
}'
# → Model knows it's Elixir
Cache configuration
By default, responses are cached in memory for 24 hours. Responses are stored via Cachex, which you can configure at startup to change the TTL or cache size:
# application.ex
{Cachex, name: :response_cache, limit: 10_000}
For cross-node or cross-restart persistence (Phase 3), add AshPostgres as a data layer and responses will be stored durably.
What gets cached
For each completed response, the cache stores:
id — the response ID
model — the model used
status — terminal state (completed, failed, or incomplete)
input — the original input sent by the client
output — all output items produced by the model
usage — token counts
created_at — timestamp
Responses in failed or incomplete states are cached but their output may be partial.
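Put together, a cached entry might look like the following map. This is a sketch of the fields listed above; the exact struct shape and item format may differ:

```elixir
# Hypothetical cached entry for a completed response.
%{
  id: "resp_abc",
  model: "claude-sonnet-4-6",
  status: "completed",
  input: [%{role: "user", content: "My favourite language is Elixir."}],
  output: [%{role: "assistant", content: "Noted — Elixir it is."}],
  usage: %{input_tokens: 12, output_tokens: 8},
  created_at: ~U[2025-01-01 12:00:00Z]
}
```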
Chaining multiple turns
Each turn only needs to reference the immediately preceding response — not the entire chain. OpenResponses handles the reconstruction:
resp_001 ← resp_002 ← resp_003 ← resp_004 (current)
When processing resp_004, OpenResponses loads resp_003 from cache. resp_003's own context was already reconstructed when it was created, so its input field contains the full accumulated history up to that point.
Branching conversations
Because previous_response_id is just a reference, you can branch at any point:
resp_001
├── resp_002a (branch A)
│ └── resp_003a
└── resp_002b (branch B)
    └── resp_003b
Both branches reference resp_001 but diverge from there. This is useful for showing users alternative continuations or implementing undo.
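In practice, branching is just two requests that share the same previous_response_id. The request maps below are illustrative (the prompts are made up); each would be POSTed to /v1/responses as in the earlier curl examples:

```elixir
# Both branches fork from resp_001; only the new input differs.
base = %{"model" => "claude-sonnet-4-6", "previous_response_id" => "resp_001"}

branch_a = Map.put(base, "input", [%{"role" => "user", "content" => "Continue formally."}])
branch_b = Map.put(base, "input", [%{"role" => "user", "content" => "Continue casually."}])
```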