Deploying


OpenResponses is a standard Phoenix application. Any deployment approach that works for Phoenix works here.

Environment variables

At minimum, set the API keys for the providers you use, plus Phoenix's own requirements:

OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GEMINI_API_KEY=AIza...
SECRET_KEY_BASE=...   # Phoenix requirement — generate with mix phx.gen.secret
PHX_HOST=your-domain.com
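
These variables are typically read at boot in config/runtime.exs. A minimal sketch, assuming keys are stored under the application env (the exact config keys used by this project may differ):

```elixir
# config/runtime.exs — illustrative sketch; real key names may differ
import Config

if config_env() == :prod do
  # Fail fast at boot if a required variable is missing
  config :open_responses, :openai_api_key, System.fetch_env!("OPENAI_API_KEY")
  config :open_responses, :anthropic_api_key, System.get_env("ANTHROPIC_API_KEY")

  config :open_responses, OpenResponsesWeb.Endpoint,
    url: [host: System.fetch_env!("PHX_HOST"), port: 443, scheme: "https"],
    secret_key_base: System.fetch_env!("SECRET_KEY_BASE")
end
```

System.fetch_env!/1 raises at startup when the variable is unset, which surfaces misconfiguration immediately instead of at the first request.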

Release builds

Build a release with mix release:

MIX_ENV=prod mix assets.deploy
MIX_ENV=prod mix release

The release is self-contained. Run it:

PHX_HOST=your-domain.com \
OPENAI_API_KEY=sk-... \
_build/prod/rel/open_responses/bin/open_responses start
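
On a bare VM, one way to keep the release running is a systemd unit. A sketch — the install path, env file location, and unit name here are all hypothetical:

```ini
# /etc/systemd/system/open_responses.service — illustrative; paths are hypothetical
[Unit]
Description=OpenResponses
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
Environment=PHX_HOST=your-domain.com
# Keep API keys in a root-readable env file instead of the unit itself
EnvironmentFile=/etc/open_responses/env
ExecStart=/opt/open_responses/bin/open_responses start
ExecStop=/opt/open_responses/bin/open_responses stop
Restart=on-failure

[Install]
WantedBy=multi-user.target
```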

Docker

FROM hexpm/elixir:1.18.3-erlang-27.3-alpine-3.21.0 AS build

WORKDIR /app
RUN mix local.hex --force && mix local.rebar --force

COPY mix.exs mix.lock ./
RUN MIX_ENV=prod mix deps.get --only prod
RUN MIX_ENV=prod mix deps.compile

COPY lib lib
COPY priv priv
COPY guides guides
COPY config config

RUN MIX_ENV=prod mix assets.deploy
RUN MIX_ENV=prod mix release

FROM alpine:3.21.0 AS runtime

RUN apk add --no-cache libstdc++ openssl ncurses-libs

WORKDIR /app
COPY --from=build /app/_build/prod/rel/open_responses ./

ENV MIX_ENV=prod
ENV PHX_SERVER=true

EXPOSE 4000

ENTRYPOINT ["bin/open_responses"]
CMD ["start"]
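
With that Dockerfile, a typical build-and-run looks like the following (the image tag is arbitrary, and the key values are placeholders):

```shell
docker build -t open_responses .

docker run -p 4000:4000 \
  -e SECRET_KEY_BASE="$(openssl rand -base64 48)" \
  -e PHX_HOST=your-domain.com \
  -e OPENAI_API_KEY=sk-... \
  open_responses
```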

Fly.io

fly launch
fly secrets set OPENAI_API_KEY=sk-... ANTHROPIC_API_KEY=sk-ant-...
fly deploy
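
fly launch generates a fly.toml; a fragment worth checking afterwards (app name here is hypothetical) is the HTTP service block, which can also probe the /health route this guide defines:

```toml
# fly.toml — illustrative fragment; fly launch generates the full file
app = "open-responses"

[http_service]
  internal_port = 4000
  force_https = true

  [[http_service.checks]]
    method = "GET"
    path = "/health"
    interval = "15s"
    timeout = "2s"
```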

Clustering

OpenResponses uses Phoenix.PubSub for event broadcasting. The default PG2 adapter already relays broadcasts across all connected BEAM nodes, so no PubSub configuration change is needed for a multi-node deployment — the nodes just have to form a cluster.

Use dns_cluster (already included) for automatic node discovery on platforms like Fly.io:

config :open_responses, :dns_cluster_query, System.get_env("DNS_CLUSTER_QUERY")
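
For reference, the application supervisor in recent Phoenix apps starts DNSCluster ahead of PubSub; a typical shape, assuming the standard generator layout:

```elixir
# lib/open_responses/application.ex — typical Phoenix 1.7+ layout
children = [
  # :ignore skips clustering when DNS_CLUSTER_QUERY is unset (e.g. in dev)
  {DNSCluster, query: Application.get_env(:open_responses, :dns_cluster_query) || :ignore},
  {Phoenix.PubSub, name: OpenResponses.PubSub},
  OpenResponsesWeb.Endpoint
]
```

On Fly.io the query is usually the app's internal DNS name (something like your-app.internal), set via DNS_CLUSTER_QUERY.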

Response cache in a cluster

The default Cachex cache is per-node in-memory. In a cluster, a previous_response_id might reference a response stored on a different node.

For production clustering, switch to a distributed cache or enable the AshPostgres persistence layer (Phase 3):

# Option 1: Cachex with distributed adapter (Nebulex)
# Option 2: AshPostgres — responses stored in Postgres, accessible across nodes

Until then, use sticky sessions to route a user's requests to the same node.
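
Whichever option you pick, a lookup that misses should fail gracefully rather than crash the loop. A hypothetical sketch — the cache name and error atoms here are assumptions, but Cachex.get/2 does return {:ok, nil} on a miss:

```elixir
# Hypothetical guard: treat a cross-node (or expired) cache miss as a
# client-visible error instead of crashing the request.
case Cachex.get(:response_cache, previous_response_id) do
  {:ok, nil} -> {:error, :previous_response_not_found}
  {:ok, response} -> {:ok, response}
  {:error, reason} -> {:error, reason}
end
```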

Scaling considerations

| Concern | Guidance |
| --- | --- |
| Concurrent requests | The BEAM handles thousands of simultaneous loops comfortably. No special config needed. |
| Long-running agentic loops | Use streaming. Non-streaming requests hold a connection open. |
| Provider rate limits | Add a rate-limiting middleware (MyApp.Middleware.RateLimit). |
| Memory | Each active loop holds the response state in memory. Monitor open_responses_loop_iterations_total to track concurrency. |
| Ollama | Run Ollama on the same host or a fast private network. GPU latency dominates. |
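
The rate-limiting middleware itself is left to the application. As one possible shape, a simple per-provider token bucket — everything here is a hypothetical sketch, not part of OpenResponses:

```elixir
# Hypothetical token-bucket limiter: allow `max` calls per `interval_ms`.
# A middleware could call RateLimiter.acquire/1 before each provider request.
defmodule RateLimiter do
  use GenServer

  def start_link(opts),
    do: GenServer.start_link(__MODULE__, opts, name: opts[:name] || __MODULE__)

  # :ok if a token is available, {:error, :rate_limited} otherwise
  def acquire(server \\ __MODULE__), do: GenServer.call(server, :acquire)

  @impl true
  def init(opts) do
    max = opts[:max] || 60
    state = %{max: max, tokens: max, interval: opts[:interval_ms] || 60_000}
    Process.send_after(self(), :refill, state.interval)
    {:ok, state}
  end

  @impl true
  def handle_call(:acquire, _from, %{tokens: 0} = state),
    do: {:reply, {:error, :rate_limited}, state}

  def handle_call(:acquire, _from, state),
    do: {:reply, :ok, %{state | tokens: state.tokens - 1}}

  @impl true
  def handle_info(:refill, state) do
    # Refill the whole bucket once per interval
    Process.send_after(self(), :refill, state.interval)
    {:noreply, %{state | tokens: state.max}}
  end
end
```

Failing fast with {:error, :rate_limited} lets the caller decide whether to queue, retry with backoff, or surface a 429 to the client.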

Health check

The Phoenix endpoint is healthy when it responds to HTTP requests. Add a health check route:

# router.ex
get "/health", OpenResponsesWeb.HealthController, :check

# health_controller.ex
defmodule OpenResponsesWeb.HealthController do
  use OpenResponsesWeb, :controller

  def check(conn, _params) do
    json(conn, %{status: "ok"})
  end
end