A behaviour and supervision framework for long-running LLM agent processes, modeled as OTP state machines.
Each agent is a :gen_statem process wrapping a persistent LLM session.
Every interaction is a prompt-response turn, and the implementation decides
what happens between turns.
Think of it as a GenServer in which every call is a prompt.
GenAgent handles the mechanics of turns. Implementations handle the semantics of turns.
Installation

    def deps do
      [
        {:gen_agent, "~> 0.2.0"},
        # Plus at least one backend:
        {:gen_agent_claude, "~> 0.1.0"},
        {:gen_agent_codex, "~> 0.1.0"},
        {:gen_agent_anthropic, "~> 0.1.0"}
      ]
    end

Quick start
Define an implementation module by using the GenAgent behaviour:

    defmodule MyApp.Coder do
      use GenAgent

      defmodule State do
        defstruct [:path, responses: []]
      end

      @impl true
      def init_agent(opts) do
        path = Keyword.fetch!(opts, :cwd)

        backend_opts = [
          cwd: path,
          system_prompt: "You are a coding assistant."
        ]

        {:ok, backend_opts, %State{path: path}}
      end

      @impl true
      def handle_response(_ref, response, state) do
        {:noreply, %{state | responses: state.responses ++ [response.text]}}
      end
    end

Start the agent under the supervision tree and interact with it by name:

    {:ok, _pid} = GenAgent.start_agent(MyApp.Coder,
      name: "my-coder",
      backend: GenAgent.Backends.Claude,
      cwd: "/path/to/project"
    )

    # Synchronous prompt.
    {:ok, response} = GenAgent.ask("my-coder", "What does lib/foo.ex do?")
    IO.puts(response.text)

    # Async prompt.
    {:ok, ref} = GenAgent.tell("my-coder", "Add tests for lib/foo.ex")

    # Poll for the result. This match only succeeds once the turn has completed.
    {:ok, :completed, response} = GenAgent.poll("my-coder", ref)

    # Push an external event into handle_event/2.
    GenAgent.notify("my-coder", {:ci_failed, "test_auth"})

    GenAgent.stop("my-coder")

State model
An agent is a state machine with two states:

    idle --- ask/tell/notify ---> processing
                                      |
                                      v
    idle <--- handle_response --- processing (turn done)

- :idle -- waiting for work. On enter, drains the mailbox (queued prompts) in FIFO order.
- :processing -- a prompt is in flight. One at a time. New prompts queue.
- Self-chaining -- handle_response/3 can return {:prompt, text, state} to immediately dispatch another turn without going through the mailbox. Useful for multi-step work that the agent drives itself.
- Halting -- any callback can return {:halt, state} to go idle but freeze the mailbox. A halted agent ignores queued prompts until GenAgent.resume/1 is called.
- Watchdog -- a :state_timeout kills any turn that runs longer than the configured deadline (default 10 minutes). Configurable per agent.
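
The self-chaining and halting returns described above can be sketched as a small phase-driven callback. This is a minimal illustration, not code from the package: the ResearchSketch module, its State struct, and the prompt wording are hypothetical, and use GenAgent plus @impl true are left out so the snippet runs on its own. Only the return shapes {:prompt, text, state} and {:halt, state} come from the description above.

```elixir
defmodule ResearchSketch do
  # Hypothetical module. A real agent would add `use GenAgent` and mark the
  # callback with `@impl true`; both are omitted so the sketch runs standalone.

  defmodule State do
    defstruct phases: [], notes: []
  end

  # Record the reply, then either chain into the next phase or halt.
  def handle_response(_ref, response, %State{} = state) do
    state = %{state | notes: state.notes ++ [response.text]}

    case state.phases do
      [next | rest] ->
        # Self-chain: ask for another turn without going through the mailbox.
        {:prompt, "Next phase: #{next}", %{state | phases: rest}}

      [] ->
        # No phases left: go idle and freeze the mailbox.
        {:halt, state}
    end
  end
end

# Drive the callback by hand with stand-in responses (plain maps with :text).
state = %ResearchSketch.State{phases: ["summarize"]}
{:prompt, prompt, state} = ResearchSketch.handle_response(make_ref(), %{text: "found 3 sources"}, state)
{:halt, final} = ResearchSketch.handle_response(make_ref(), %{text: "summary"}, state)
IO.inspect({prompt, final.notes})
```

Keeping the remaining phases in the agent state means each turn decides the next step from data alone, which makes the callback easy to exercise without a backend.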
Lifecycle hooks
In addition to the core callbacks, v0.2 adds four optional lifecycle hooks for fine-grained control over what happens around each turn and around the agent's full run:
| Hook | When it fires | Typical use |
|---|---|---|
| pre_run/1 | Once, after init_agent/1, before the first turn | Slow async setup: clone a repo, create a worktree, fetch secrets |
| pre_turn/2 | Before each prompt dispatch | Prompt augmentation, rate limiting, :skip/:halt as a gate |
| post_turn/3 | After each turn, post-decision | State-mutating side effects: commit per turn, record usage |
| post_run/1 | On clean {:halt, state} from any callback | Completion actions: open a PR, post a summary |
All four are optional with default no-op implementations. The guiding
principle is telemetry first, callbacks for state mutation --
observational use cases (log tokens, emit metrics) should use the
existing telemetry events; callbacks exist specifically for hooks that
need to mutate agent_state or block the next transition.
See the Workspace pattern guide for a complete example exercising all four hooks in sequence around a git workspace.
Backends
GenAgent ships with a GenAgent.Backend behaviour and no built-in backend.
Pick one of the sibling packages or write your own:
| Backend | Package | Transport |
|---|---|---|
| Claude (Anthropic) | gen_agent_claude | claude CLI via claude_wrapper |
| Codex (OpenAI) | gen_agent_codex | codex CLI via codex_wrapper |
| Anthropic HTTP | gen_agent_anthropic | direct HTTP API via req |
A backend owns its session lifecycle, translates the LLM-specific event
stream into the normalized GenAgent.Event values the state machine
consumes, and carries any state it needs (session id, message history) in
an opaque session term.
The contract is deliberately small: five callbacks
(start_session/1, prompt/2, update_session/2, resume_session/2,
terminate_session/1), of which two are optional. See GenAgent.Backend
for details.
Public API
| Function | What it does |
|---|---|
| start_agent/2 | Start an agent under the supervision tree. |
| ask/3 | Synchronous prompt. Blocks until the turn finishes. |
| tell/3 | Async prompt. Returns a ref for poll/3. |
| poll/3 | Check on a previously-issued tell/3. |
| notify/2 | Push an external event into handle_event/2. |
| interrupt/1 | Cancel an in-flight turn. |
| resume/1 | Unhalt an agent and drain its mailbox. |
| status/2 | Read the agent's current state. |
| stop/1 | Terminate the agent. |
| whereis/1 | Look up an agent's pid. |
Names resolve through a Registry. Callers hold names (any term), never
pids, so agents can be restarted without breaking callers.
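
The name-over-pid idea can be seen with nothing but the standard library. The sketch below is an illustration under stated assumptions, not GenAgent's internals: SketchRegistry and the lookup helper are hypothetical, and a plain Agent stands in for an agent process. It registers a process under a name, kills it, starts a replacement under the same name, and shows that a caller holding only the name still reaches a live process.

```elixir
# Hypothetical registry for the sketch; GenAgent starts its own registry.
{:ok, _} = Registry.start_link(keys: :unique, name: SketchRegistry)

# Build a :via tuple so OTP registers and resolves the name through Registry.
via = fn name -> {:via, Registry, {SketchRegistry, name}} end

# Resolve a name to a pid, or nil when nothing is registered.
lookup = fn name ->
  case Registry.lookup(SketchRegistry, name) do
    [{pid, _value}] -> pid
    [] -> nil
  end
end

# A plain Agent stands in for an agent process. Any term can be a name.
{:ok, pid1} = Agent.start(fn -> 0 end, name: via.("my-coder"))
^pid1 = lookup.("my-coder")

# Simulate a crash and wait until the process is really gone.
ref = Process.monitor(pid1)
Process.exit(pid1, :kill)

receive do
  {:DOWN, ^ref, :process, ^pid1, _reason} -> :ok
end

# A replacement registers under the same name; callers are unaffected.
{:ok, pid2} = Agent.start(fn -> 0 end, name: via.("my-coder"))
IO.inspect(Agent.get(via.("my-coder"), & &1))
```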
Supervision
The package starts a fixed supervision tree on application boot:

    GenAgent.Supervisor
      GenAgent.Registry          (Registry, keys: :unique)
      GenAgent.TaskSupervisor    (Task.Supervisor)
      GenAgent.AgentSupervisor   (DynamicSupervisor)
        <your agents under here>

Each prompt turn runs as a Task under the shared TaskSupervisor. A
crashed task delivers :DOWN to the owning agent, which turns it into an
{:error, {:task_crashed, reason}} response for the caller -- it does not
take down the agent process.
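
The crash isolation described above is standard Task.Supervisor behaviour and can be reproduced in a few lines. This is a sketch under assumptions, not the package's code: run_turn is a hypothetical helper, and it blocks in receive for simplicity, whereas a state machine would handle the same messages as info events.

```elixir
{:ok, sup} = Task.Supervisor.start_link()

# Hypothetical helper: run one "turn" and normalize the outcome.
run_turn = fn fun ->
  # async_nolink monitors the task without linking to it, so a crash shows
  # up as a :DOWN message instead of an exit signal to the owner.
  task = Task.Supervisor.async_nolink(sup, fun)
  ref = task.ref

  receive do
    {^ref, result} ->
      # Normal completion: drop the monitor and flush any pending :DOWN.
      Process.demonitor(ref, [:flush])
      {:ok, result}

    {:DOWN, ^ref, :process, _pid, reason} ->
      {:error, {:task_crashed, reason}}
  end
end

{:ok, "pong"} = run_turn.(fn -> "pong" end)
{:error, {:task_crashed, _reason}} = run_turn.(fn -> raise "backend exploded" end)
IO.puts("owner still alive: #{Process.alive?(self())}")
```

The crashing task is logged by the runtime, but the owning process keeps running and can report the failure to its caller.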
Patterns
Ten common topologies are documented as ex_doc guides shipped with the package. Each guide is a complete worked example you can read, copy, and adapt -- they are not installed as public API modules:
- Switchboard -- human-managed named agent fleet with non-blocking send/poll/inbox, the base for manager-driven UIs
- Research -- one agent self-chaining through phases
- Debate -- two agents pushing each other via cross-notify
- Pipeline -- linear stage chain, one-way notify
- Supervisor -- coordinator + dynamic workers (fan-out/in)
- Pool -- reusable worker pool with round-robin dispatch
- Watcher -- reactive event-driven agent, idle until triggered
- Checkpointer -- human-in-the-loop review workflow
- Retry -- handle_error self-chain for transient failures
- Workspace -- all four lifecycle hooks around a git workspace
Start with the patterns overview for a "choose your pattern" decision tree.
Telemetry
GenAgent emits telemetry events for observability:

    [:gen_agent, :prompt, :start]     # %{agent, ref}
    [:gen_agent, :prompt, :stop]      # %{agent, ref, duration}
    [:gen_agent, :prompt, :error]     # %{agent, ref, reason}
    [:gen_agent, :event, :received]   # %{agent, event}
    [:gen_agent, :state, :changed]    # %{agent, from, to}
    [:gen_agent, :mailbox, :queued]   # %{agent, depth}
    [:gen_agent, :halted]             # %{agent}

That is enough to build a communication graph, track latency, and alert on stuck
agents. Attach handlers with :telemetry.attach/4.
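
A handler for these events might look like the following sketch. TelemetrySketch is a hypothetical module, and the event list above does not say whether fields such as duration arrive in the measurements map or the metadata map, so the handler merges both; treat that, and the unit of duration, as assumptions to check against the package docs. The attach call is guarded so the snippet also runs where the :telemetry dependency is not installed.

```elixir
defmodule TelemetrySketch do
  # Hypothetical handler module, not part of GenAgent.

  # Turn completion: report the agent and the duration as delivered.
  def handle_event([:gen_agent, :prompt, :stop], measurements, metadata, _config) do
    # Assumption: fields may live in either map, so merge before reading.
    data = Map.merge(metadata, measurements)
    line = "turn finished agent=#{inspect(data[:agent])} duration=#{inspect(data[:duration])}"
    IO.puts(line)
    line
  end

  # Any other event: report its name and the agent.
  def handle_event(event, _measurements, metadata, _config) do
    line = "#{inspect(event)} agent=#{inspect(metadata[:agent])}"
    IO.puts(line)
    line
  end

  # Attach one handler per event; each handler id must be unique.
  def attach do
    events = [
      [:gen_agent, :prompt, :stop],
      [:gen_agent, :prompt, :error],
      [:gen_agent, :halted]
    ]

    for event <- events do
      :telemetry.attach({__MODULE__, event}, event, &__MODULE__.handle_event/4, nil)
    end
  end
end

# Attach only when the :telemetry library is available.
if Code.ensure_loaded?(:telemetry), do: TelemetrySketch.attach()
```

Passing the handler as a captured module function rather than an anonymous function follows the recommendation in the telemetry documentation.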
What GenAgent does not do
- Prescribe agent behavior. No retry logic, no STATUS line conventions, no summary format. That is all an implementation concern.
- Prescribe inter-agent communication. Agents can notify/2 each other by name, but the message format is up to you.
- Manage persistence. If you want to persist agent state across restarts, do it in terminate_agent/2 and init_agent/1.
- Manage pools. One agent = one session = one process. If you want a pool, start multiple and route to them.
- Track costs or budgets. Usage data is in GenAgent.Response.usage. Do what you want with it.
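
For the persistence point above, one possible approach is to snapshot the agent state to disk and read it back on start. The sketch below is an assumption-laden illustration: SnapshotSketch, save/2, load/2, and the file location are all hypothetical and not part of GenAgent. You would call save from terminate_agent/2 and load from init_agent/1 in your own implementation.

```elixir
defmodule SnapshotSketch do
  # Hypothetical helpers; GenAgent does not ship these.

  # Serialize any term to a file. Returns :ok or {:error, reason}.
  def save(path, state) do
    File.write(path, :erlang.term_to_binary(state))
  end

  # Read a snapshot back, falling back to `default` when none exists.
  def load(path, default) do
    case File.read(path) do
      # :safe refuses to create new atoms from the stored data.
      {:ok, binary} -> :erlang.binary_to_term(binary, [:safe])
      {:error, _reason} -> default
    end
  end
end

# Round trip through a temporary file.
path = Path.join(System.tmp_dir!(), "gen_agent_snapshot_#{System.unique_integer([:positive])}.bin")
:ok = SnapshotSketch.save(path, %{responses: ["first reply"]})
restored = SnapshotSketch.load(path, %{responses: []})
File.rm(path)
IO.inspect(restored)
```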
Testing

    mix test
    mix format --check-formatted
    mix credo --strict
    mix dialyzer

The test suite uses an in-process GenAgent.Backends.Mock (in
test/support/) that lets you script backend responses without any
external process. See test/gen_agent/server_test.exs for examples.
License
MIT. See LICENSE.