Legion is an Elixir-native framework for building AI agents. Unlike traditional function-calling approaches, Legion agents generate and execute actual Elixir code, giving them the full power of the language while staying safely sandboxed.
## Quick Start

### 1. Define your tools

Tools are regular Elixir modules that expose functions to your agents:
```elixir
defmodule MyApp.Tools.ScraperTool do
  use Legion.Tool

  @doc "Fetches recent posts from HackerNews"
  def fetch_posts do
    Req.get!("https://hn.algolia.com/api/v1/search_by_date").body["hits"]
  end
end

defmodule MyApp.Tools.DatabaseTool do
  use Legion.Tool

  @doc "Saves a post title to the database"
  def insert_post(title), do: Repo.insert!(%Post{title: title})
end
```

### 2. Define an Agent
Agents are long- or short-lived Elixir processes that maintain state and can be messaged.

```elixir
defmodule MyApp.ResearchAgent do
  @moduledoc """
  Fetch posts, evaluate their relevance and quality, and save the good ones.
  """
  use Legion.Agent

  def tools, do: [MyApp.Tools.ScraperTool, MyApp.Tools.DatabaseTool]
end
```

### 3. Run the Agent
```elixir
{:ok, result} = Legion.execute(MyApp.ResearchAgent, "Find cool Elixir posts about Advent of Code and save them")
# => {:ok, "Found 3 relevant posts and saved 2 that met quality criteria."}
```

## Features
- **Code Generation over Function Calling** - Agents write Elixir code instead of making dozens of tool-call round-trips. This makes them more capable and reduces token usage. See Anthropic's post on this approach.
- **Sandboxed Execution** - Generated code runs in a restricted environment with controlled access to tools. You have full control over which tools are exposed to which agents, and you can monitor agent behavior using the `legion_web` dashboard.
- **Simple Tool Definition** - Expose any Elixir module as a tool with `use Legion.Tool`. This lets you reuse your existing app's logic. If you want to expose a third-party module as a set of tools, you can do that too.
- **Authorization baked in** - The safest way to authorize tool calls is via the `Vault` library. Put all the data needed to authorize an LLM call into `Vault` before starting the agent, and validate it inside the tool call. Everything stays available thanks to `Vault`'s design.
- **Long-lived Agents** - Treat your agents as GenServers; context is preserved naturally. Start your agent with `Legion.start_link/2`, just as you'd start a GenServer. Agents can reference variables across turns (tasks) - just use the `share_bindings: true` option.
- **Multi-Agent Systems** - Agents can orchestrate other agents, letting you create complex systems that manage themselves. Agents spawn other agents as linked processes: when a parent dies, all children are stopped too. Your agent is just another BEAM process.
- **Human in the Loop** - Human-in-the-loop is just a built-in tool called `HumanTool`. You could have written it yourself, but I wrote it for you. It simply blocks the agent's execution until it receives a message from the user.
- **Structured Output** - Define schemas to get typed, validated responses from agents, or omit types and operate on plain text. You have full control over prompts and schemas.
- **Configurable** - Global defaults with per-agent overrides for model, timeouts, and limits.
- **Telemetry** - Built-in observability with events for calls, iterations, LLM requests, and more.
- **All BEAM/Elixir features** - Since Legion is built on top of raw processes, everything that works with processes works with Legion: process groups, hot code reloading, super-lightweight isolated processes, and much more.
## Installation

Add `legion` to your list of dependencies in `mix.exs`:

```elixir
def deps do
  [
    {:legion, "~> 0.2"}
  ]
end
```

Configure your LLM API key (see the `req_llm` configuration docs for all options):

```elixir
# config/runtime.exs
config :req_llm, openai_api_key: System.get_env("OPENAI_API_KEY")
```

## How It Works
When you ask an agent: "Find cool Elixir posts about Advent of Code and save them"

The agent first fetches and filters relevant posts:

```elixir
ScraperTool.fetch_posts()
|> Enum.filter(fn post ->
  title = String.downcase(post["title"] || "")
  String.contains?(title, "elixir") and String.contains?(title, "advent")
end)
```

The LLM reviews the results, decides which posts are actually "cool", then saves them:

```elixir
["Elixir Advent of Code 2024 - Day 5 walkthrough", "My first AoC in Elixir!"]
|> Enum.each(&DatabaseTool.insert_post/1)
```

Traditional function calling would need dozens of round-trips. Legion lets the LLM write expressive pipelines and make subjective judgments at the same time.
## Long-lived Agents

For multi-turn conversations or persistent agents:

```elixir
# Start an agent that maintains context
{:ok, pid} = Legion.start_link(MyApp.AssistantAgent, "Help me analyze this data")

# Send follow-up messages
{:ok, response} = Legion.call(pid, "Now filter for items over $100")

# Or fire-and-forget
Legion.cast(pid, "Also check the reviews")
```

## Configuration
Configure Legion in your `config/config.exs`:

```elixir
config :legion, :config, %{
  model: "openai:gpt-4o-mini",
  max_iterations: 10,
  max_retries: 3,
  sandbox_timeout: 60_000,
  share_bindings: false
}
```

- **Iterations** are successful execution steps: the agent fetches data, processes it, calls another tool, etc. Each productive action counts as one iteration.
- **Retries** are consecutive failures: when the LLM generates invalid code or a tool raises an error. The counter resets after each successful iteration.
- **`share_bindings`** - when `true`, variable bindings from code execution carry over between turns in a long-lived agent. For example, if the LLM assigns `posts = ScraperTool.fetch_posts()` in one turn, the `posts` variable will be available in the next turn. Defaults to `false` (each turn starts with a clean slate).
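As a sketch of how this plays out in practice (assuming `share_bindings: true` can be passed as a `start_link/2` config override like the other keys, and reusing `MyApp.ResearchAgent` from the Quick Start):

```elixir
# Sketch: a long-lived agent with binding sharing enabled.
{:ok, pid} = Legion.start_link(MyApp.ResearchAgent, share_bindings: true)

# Turn 1: the LLM may internally bind something like
#   posts = ScraperTool.fetch_posts()
{:ok, _} = Legion.call(pid, "Fetch the latest posts and keep them around")

# Turn 2: generated code can still reference `posts` from the previous turn,
# so the agent doesn't need to re-fetch.
{:ok, summary} = Legion.call(pid, "Summarize the posts you fetched earlier")
```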
Agents can override global settings:

```elixir
defmodule MyApp.DataAgent do
  use Legion.Agent

  def tools, do: [MyApp.HTTPTool]
  def config, do: %{model: "anthropic:claude-sonnet-4-20250514", max_iterations: 5}
end
```

## Agent Callbacks
All callbacks are optional with sensible defaults:

| Callback | Default | Description |
|---|---|---|
| `tools/0` | `[]` | Tool modules available to the agent |
| `description/0` | `@moduledoc` | Agent description for the system prompt |
| `output_schema/0` | `%{"type" => "string"}` | JSON Schema for structured output |
| `tool_config/1` | `[]` | Per-tool keyword config |
| `system_prompt/0` | auto-generated | Override the entire system prompt |
| `config/0` | `%{}` | Model, timeouts, limits |
```elixir
defmodule MyApp.DataAgent do
  use Legion.Agent

  def tools, do: [MyApp.HTTPTool]

  # Structured output schema
  def output_schema do
    [
      summary: [type: :string, required: true],
      count: [type: :integer, required: true]
    ]
  end

  # Additional instructions for the LLM
  def system_prompt do
    "Always validate URLs before fetching. Prefer JSON responses."
  end

  # Pass options to specific tools (accessible via Vault)
  def tool_config(MyApp.HTTPTool), do: [timeout: 10_000]
end
```

## Authorization
To authorize tool calls for a specific user, put auth data into `Vault` before starting the agent and read it inside the tool. LLM-generated code has no access to `Vault`.

```elixir
# Before starting the agent
Vault.init(:current_user, %{id: user.id})
{:ok, result} = Legion.execute(MyApp.PostsAgent, "Find my posts from today and summarize them")
```

```elixir
# Inside your tool
defmodule MyApp.Tools.PostsTool do
  use Legion.Tool

  def get_my_posts do
    %{id: user_id} = Vault.get(:current_user)
    Repo.all(from p in Post, where: p.user_id == ^user_id)
  end
end
```

## Human in the Loop tool
Request human input during agent execution:

```elixir
# The agent can use the built-in HumanTool (if you allow it to)
HumanTool.ask("Should I proceed with this operation?")

# Your application responds
Legion.call(agent_pid, {:respond, "Yes, proceed"})
```

## Multi-Agent Systems
Agents can spawn and communicate with other agents using the built-in `AgentTool`:

```elixir
defmodule MyApp.OrchestratorAgent do
  use Legion.Agent

  def tools, do: [Legion.Tools.AgentTool, MyApp.Tools.DatabaseTool]
  def tool_config(Legion.Tools.AgentTool), do: [agents: [MyApp.ResearchAgent, MyApp.WriterAgent]]
end
```

The orchestrator agent can then delegate tasks:
```elixir
# One-off task delegation
{:ok, research} = AgentTool.call(MyApp.ResearchAgent, "Find info about Elixir 1.18")

# Start a long-lived sub-agent
{:ok, pid} = AgentTool.start_link(MyApp.WriterAgent, "Write a blog post")
AgentTool.cast(pid, "Add a section about pattern matching")
{:ok, draft} = AgentTool.call(pid, "Show me what you have so far")
```

## Agent Pools
Since agents are regular BEAM processes, you can use Erlang's `:pg` (process groups) to create agent pools with no external infrastructure:

```elixir
# Spawn a pool of support agents
for _ <- 1..5 do
  {:ok, pid} = Legion.start_link(SupportAgent)
  :pg.join(:support_pool, pid)
end

# Route incoming tickets to a random agent in the pool
defp handle_ticket(ticket) do
  pool = :pg.get_members(:support_pool)
  agent = Enum.random(pool)
  Legion.cast(agent, "Handle this support ticket: #{ticket}")
end
```

## Hot Code Reloading
Since tools and agents are regular Elixir modules, the BEAM's hot code reloading works out of the box. You can update tool implementations, swap agent behaviors, or add entirely new capabilities to running agents — without restarting the VM, without dropping conversations, without losing state.
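For example (a sketch using standard IEx and `:code` facilities rather than any Legion API; the module and file path are illustrative), a tool can be updated from an IEx shell attached to the running node:

```elixir
# In an IEx shell attached to the running node (e.g. a remote console).
# MyApp.Tools.ScraperTool and its path are illustrative names.

# Recompile and reload the module straight from source:
c("lib/my_app/tools/scraper_tool.ex")

# Or, if an updated .beam is already on the code path:
:code.purge(MyApp.Tools.ScraperTool)
:code.load_file(MyApp.Tools.ScraperTool)

# Running agents pick up the new implementation on their next tool call,
# without losing process state or conversation context.
```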
## Telemetry

Legion emits telemetry events for observability:

```elixir
Legion.Telemetry.attach_default_logger()
```

- `[:legion, :agent, :started | :stopped]` - agent lifecycle
- `[:legion, :agent, :message, :start | :stop]` - per-message lifecycle
- `[:legion, :iteration, :start | :stop]` - each execution step
- `[:legion, :llm, :request, :start | :stop]` - LLM API calls
- `[:legion, :sandbox, :eval, :start | :stop]` - code evaluation
- `[:legion, :human, :input_required | :input_received]` - human-in-the-loop

Plus, Legion emits Req telemetry events for HTTP requests.
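Beyond the default logger, you can attach your own handler with the standard `:telemetry` API. A sketch (the exact measurements and metadata Legion attaches to each event aren't listed here, so the handler just inspects whatever it receives):

```elixir
# Attach a custom handler to one of the events listed above.
# Handler functions receive (event_name, measurements, metadata, config).
:telemetry.attach(
  "my-app-llm-request-logger",
  [:legion, :llm, :request, :stop],
  fn event, measurements, metadata, _config ->
    IO.inspect({event, measurements, metadata}, label: "llm request finished")
  end,
  nil
)
```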
## Limitations

### Sandboxing

Legion's sandbox restricts what LLM-generated code can do, but it is not a full process-isolation sandbox yet. Generated code runs inside the same BEAM VM as your application.

What the sandbox does:

- Blocks dangerous language constructs: `defmodule`, `import`, `spawn`, `send`, `receive`, `apply`, etc.
- Restricts module access to an explicit allowlist (standard library + your tools)
- Kills the evaluation process if it exceeds `sandbox_timeout`

What it does not do:

- Isolate memory: runaway allocations affect the whole VM
- Prevent atom table exhaustion: `String.to_atom/1` is available and atoms are never garbage collected
- Restrict access to the BEAM node name, process pid, or refs via `Kernel` functions
The practical implication: Legion is designed for trusted code generators (your own LLM-backed agents with controlled tool access), not for running arbitrary untrusted code from unknown sources. If your threat model requires full process isolation, you might want to spawn Legion agents in an isolated BEAM instance.
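A sketch of that isolated setup using plain distributed Erlang (the node names are illustrative, and it assumes `Legion.execute/2` works unchanged on the remote node):

```elixir
# Sketch: run agents on a dedicated BEAM node, started separately with e.g.
#   iex --sname agents --cookie secret -S mix
# and call into it from the main application node.
agents_node = :"agents@myhost"
true = Node.connect(agents_node)

# Execute the agent task on the remote node; a crash or runaway
# allocation there cannot take down the main application VM.
{:ok, result} =
  :erpc.call(agents_node, Legion, :execute, [
    MyApp.ResearchAgent,
    "Find cool Elixir posts about Advent of Code and save them"
  ])
```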
## Summary

### Functions

- `call/2` - Sends a message to a running agent and waits for the result.
- `cast/2` - Sends a message to a running agent without waiting for a result.
- `execute/2` - Runs an agent on a single task and returns the result.
- `parallel/2` - Runs multiple agent tasks concurrently and collects results.
- `pipeline/1` - Runs agent tasks sequentially, threading each result to the next step.
- `start_link/2` - Starts a long-lived agent process.
- `then/3` - Chains an agent task after a previous result.
## Functions

### call/2

Sends a message to a running agent and waits for the result.

Examples:

```elixir
{:ok, pid} = Legion.start_link(AssistantAgent)
{:ok, answer} = Legion.call(pid, "What is the capital of France?")
{:ok, follow_up} = Legion.call(pid, "And its population?")
```
### cast/2

Sends a message to a running agent without waiting for a result.

Examples:

```elixir
{:ok, pid} = Legion.start_link(ReportAgent)
Legion.cast(pid, "Generate the weekly report and email it")
```
### execute/2

Runs an agent on a single task and returns the result.

Starts a temporary agent process, blocks until the task completes, then stops it.

Examples:

```elixir
{:ok, summary} = Legion.execute(ResearchAgent, "Summarize the Elixir getting started guide")
{:cancel, :reached_max_iterations} = Legion.execute(ResearchAgent, "impossible task")
```
### parallel/2

Runs multiple agent tasks concurrently and collects results.

Returns `{:ok, results}` if all succeed, or the first `{:cancel, reason}`.

Examples:

```elixir
# Run two agents in parallel
{:ok, [research, analysis]} =
  Legion.parallel([
    {ResearchAgent, "Find recent Elixir blog posts"},
    {AnalysisAgent, "Summarize market trends"}
  ])

# With a timeout (in milliseconds)
{:ok, results} =
  Legion.parallel(
    [{FastAgent, "task 1"}, {FastAgent, "task 2"}],
    30_000
  )
```
### pipeline/1

Runs agent tasks sequentially, threading each result to the next step.

Each step is `{agent, task}` where `task` is a string or a function that receives the previous result and returns a task string. Halts early if any step returns `{:cancel, reason}`.

Examples:

```elixir
# Static tasks - each runs independently
{:ok, final} =
  Legion.pipeline([
    {ResearchAgent, "Find info about Elixir OTP"},
    {WriterAgent, "Write a blog post about OTP"}
  ])

# Thread results - each step receives the previous result
{:ok, post} =
  Legion.pipeline([
    {ResearchAgent, "Find recent Elixir news"},
    {WriterAgent, fn research -> "Write a summary based on: #{research}" end},
    {EditorAgent, fn draft -> "Polish this draft: #{draft}" end}
  ])
```
### start_link/2

Starts a long-lived agent process.

Options:

- `:name` - register the process under a name
- Any config overrides (`:model`, `:max_iterations`, etc.)

Examples:

```elixir
{:ok, pid} = Legion.start_link(AssistantAgent)
{:ok, pid} = Legion.start_link(AssistantAgent, name: MyAssistant, model: "openai:gpt-4o")
```
### then/3

Chains an agent task after a previous result. Useful for piping from `parallel/2` or `pipeline/1`.

Examples:

```elixir
# Chain after parallel
Legion.parallel([
  {ResearchAgent, "Find Elixir news"},
  {ResearchAgent, "Find Erlang news"}
])
|> Legion.then(WriterAgent, fn results ->
  "Summarize these findings: #{inspect(results)}"
end)

# Passes through cancellations
{:cancel, reason} |> Legion.then(WriterAgent, fn _ -> "ignored" end)
#=> {:cancel, reason}
```