Legion (Legion v0.4.0)

An Elixir framework for building AI agents that write and execute code instead of making function calls.

Traditional agents call tools one at a time (fetch, wait, decide, fetch again), spending tokens and latency on every round-trip. Legion agents write Elixir code that fetches, filters, decides, and acts in a single step, running in a restricted sandbox. The result is fewer LLM calls, smarter behavior, and full language expressivity. See "Why code execution beats function calling" for the rationale.

Quick Start

1. Define your tools

Tools are regular Elixir modules. The LLM sees their source code and can call any public function.

defmodule MyApp.Tools.ScraperTool do
  use Legion.Tool

  @doc "Fetches recent posts from HackerNews"
  def fetch_posts do
    Req.get!("https://hn.algolia.com/api/v1/search_by_date").body["hits"]
  end
end

defmodule MyApp.Tools.DatabaseTool do
  use Legion.Tool

  @doc "Saves a post title to the database"
  def insert_post(title), do: Repo.insert!(%Post{title: title})
end

2. Define an agent

Agents are Elixir processes that receive tasks, write code to solve them, and maintain conversation state.

defmodule MyApp.ResearchAgent do
  @moduledoc """
  Fetch posts, evaluate their relevance and quality, and save the good ones.
  """
  use Legion.Agent

  def tools, do: [MyApp.Tools.ScraperTool, MyApp.Tools.DatabaseTool]
end

3. Run it

{:ok, result} = Legion.execute(MyApp.ResearchAgent, "Find cool Elixir posts about Advent of Code and save them")
# => {:ok, "Found 3 relevant posts and saved 2 that met quality criteria."}

How It Works

When you send "Find cool Elixir posts about Advent of Code and save them", the agent writes:

ScraperTool.fetch_posts()
|> Enum.filter(fn post ->
  title = String.downcase(post["title"] || "")
  String.contains?(title, "elixir") and String.contains?(title, "advent")
end)

It sees the results, decides which posts are worth saving, and writes:

["Elixir Advent of Code 2024 - Day 5 walkthrough", "My first AoC in Elixir!"]
|> Enum.each(&DatabaseTool.insert_post/1)

A traditional agent would need a separate LLM call for each filter decision and each insert. Legion handles filtering, judgment, and action in two steps - with the full power of Elixir's Enum, pattern matching, and pipelines available at every step.
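
To make the first step concrete, here is the same filter run against a small in-memory list instead of the live API. The sample posts are invented, and the module name FilterSketch is hypothetical; only the shape (maps with a "title" key) is taken from the snippet above.

```elixir
defmodule FilterSketch do
  # Keeps titles that mention both "elixir" and "advent", case-insensitively.
  def relevant_titles(posts) do
    posts
    |> Enum.filter(fn post ->
      # `|| ""` guards against posts whose title is nil or missing.
      title = String.downcase(post["title"] || "")
      String.contains?(title, "elixir") and String.contains?(title, "advent")
    end)
    |> Enum.map(& &1["title"])
  end
end

# Hypothetical sample data standing in for ScraperTool.fetch_posts/0.
posts = [
  %{"title" => "Elixir Advent of Code 2024 - Day 5 walkthrough"},
  %{"title" => "Advent of Code in Rust"},
  %{"title" => nil},
  %{"title" => "Phoenix LiveView tips"}
]

IO.inspect(FilterSketch.relevant_titles(posts))
```

Only the first sample title passes both checks. The judgment about which of the surviving posts are "cool" is what the agent does in its second step.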

Features

  • Code generation over function calling - Agents write Elixir pipelines, not individual tool calls. Fewer tokens, fewer round-trips, smarter behavior.
  • Sandboxed execution - Generated code runs in a restricted environment. Dangerous constructs (defmodule, spawn, send, import) are blocked at the AST level. Module access is limited to stdlib + your tools.
  • Tools are just modules - use Legion.Tool on any module to expose it. The LLM reads your source code and calls your functions. No schemas to write, no wrappers - reuse existing app logic directly.
  • Authorization via Vault - Set auth context before the agent starts, validate inside tools at runtime. LLM-generated code never touches credentials. See Vault.
  • Long-lived agents - Start agents with Legion.start_link/2 and message them with call/2 and cast/2, just like a GenServer. Variables can persist across turns with binding_scope: :conversation.
  • Multi-agent orchestration - Agents delegate to other agents via the built-in AgentTool. Fan out with parallel/2, chain with pipeline/1. Sub-agents are linked processes - when a parent dies, children stop too.
  • Human in the loop - The built-in HumanTool pauses agent execution until a human responds. It's just message passing - your handler receives a question and sends back an answer.
  • Structured output - Define a JSON Schema via output_schema/0 to get typed, validated responses. Or skip it and work with plain text.
  • Telemetry - Events for agent lifecycle, messages, iterations, LLM calls, and code evaluation. Plug into any monitoring stack.
  • Process-native - Agents are BEAM processes. Supervision trees, process groups, hot code reloading, lightweight concurrency - all work out of the box.

Installation

Add legion to your dependencies:

def deps do
  [
    {:legion, "~> 0.4"}
  ]
end

Configure your LLM provider through the :req_llm application environment:

# config/runtime.exs
config :req_llm, openai_api_key: System.get_env("OPENAI_API_KEY")

Web Dashboard

legion_web provides a real-time Phoenix LiveView dashboard for monitoring agents, viewing conversation traces, and inspecting generated code.

Long-lived Agents

# Start an agent that maintains context
{:ok, pid} = Legion.start_link(MyApp.AssistantAgent)

# Send follow-up messages
{:ok, response} = Legion.call(pid, "Now filter for items over $100")

# Or fire-and-forget
Legion.cast(pid, "Also check the reviews")

Multi-Agent Systems

Agents orchestrate other agents through the built-in AgentTool:

defmodule MyApp.OrchestratorAgent do
  @moduledoc "Coordinates research and writing sub-agents to produce finished content."
  use Legion.Agent

  def tools, do: [Legion.Tools.AgentTool, MyApp.Tools.DatabaseTool]
  def tool_config(Legion.Tools.AgentTool), do: [agents: [MyApp.ResearchAgent, MyApp.WriterAgent]]
end

The orchestrator's generated code can then delegate:

{:ok, research} = AgentTool.call(MyApp.ResearchAgent, "Find info about Elixir 1.18")
{:ok, draft} = AgentTool.call(MyApp.WriterAgent, "Write a blog post using: #{research}")

Run independent tasks in parallel or chain them sequentially:

{:ok, [posts, trends]} = Legion.parallel([
  {MyApp.ResearchAgent, "Find recent Elixir posts"},
  {MyApp.AnalysisAgent, "Summarize Elixir trends"}
])

{:ok, result} = Legion.pipeline([
  {MyApp.ResearchAgent, "Find Elixir blog posts from this week"},
  {MyApp.WriterAgent, &"Summarize these posts: #{&1}"}
])

Authorization

Set auth context before starting the agent. Tools read it at runtime via Vault. LLM-generated code has no access to Vault.

Vault.init(current_user: %{id: user.id})
{:ok, result} = Legion.execute(MyApp.PostsAgent, "Find my posts from today and summarize them")

defmodule MyApp.Tools.PostsTool do
  use Legion.Tool
  # The `from` query macro comes from Ecto.Query.
  import Ecto.Query, only: [from: 2]

  @doc "Returns the posts that belong to the user stored in Vault"
  def get_my_posts do
    %{id: user_id} = Vault.get(:current_user)
    Repo.all(from p in Post, where: p.user_id == ^user_id)
  end
end

Human in the Loop

The HumanTool pauses agent execution and sends a question to your handler process:

defmodule MyApp.AssistantAgent do
  @moduledoc "An assistant that can ask the user questions."
  use Legion.Agent

  def tools, do: [Legion.Tools.HumanTool]
  def tool_config(Legion.Tools.HumanTool), do: [handler: MyApp.ChatHandler, timeout: 30_000]
end

Your handler receives {:human_request, ref, from_pid, question, meta} and replies with {:human_response, ref, answer}.
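
A minimal sketch of such a handler as a plain process. The message shapes are the ones quoted above; everything else is an assumption for illustration: the module name ChatHandlerSketch, the idea that the reply is sent to from_pid, and the use of a pid rather than a registered module like MyApp.ChatHandler.

```elixir
defmodule ChatHandlerSketch do
  # Hypothetical handler: answers every question by calling `answer_fun`.
  def start(answer_fun) when is_function(answer_fun, 1) do
    spawn(fn -> loop(answer_fun) end)
  end

  defp loop(answer_fun) do
    receive do
      {:human_request, ref, from_pid, question, _meta} ->
        # Assumption: the reply goes back to `from_pid`, tagged with the same ref.
        send(from_pid, {:human_response, ref, answer_fun.(question)})
        loop(answer_fun)
    end
  end
end

# Simulate the agent side of the exchange from the current process.
handler = ChatHandlerSketch.start(fn question -> "You asked: " <> question end)
ref = make_ref()
send(handler, {:human_request, ref, self(), "Deploy to production?", %{}})

receive do
  {:human_response, ^ref, answer} -> IO.puts(answer)
after
  1_000 -> IO.puts("no answer within 1s")
end
```

In a real application the answer function would typically forward the question to a UI (a LiveView, a chat channel) and reply once the person responds, within the configured timeout.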

Configuration

config :legion, :config, %{
  model: "openai:gpt-4o-mini",
  max_iterations: 10,
  max_retries: 3,
  sandbox_timeout: 60_000,
  binding_scope: :turn,
  max_message_length: 20_000
}

| Option | Description |
| ------ | ----------- |
| max_iterations | Successful execution steps before the agent is stopped. |
| max_retries | Consecutive failures (bad code, tool errors) before giving up. Resets after each success. |
| binding_scope | :iteration (fresh each step), :turn (persist within a message, default), or :conversation (persist across messages). |
| max_message_length | Byte limit for any single message. Longer content is truncated. Set to :infinity to disable. |
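
The difference between the binding scopes can be illustrated with plain Code.eval_string/2. This is a sketch of the idea, not Legion's evaluator: BindingScopeSketch and its persist? flag are hypothetical, with persist? = false standing in for :iteration and persist? = true for the scopes that keep variables between steps.

```elixir
defmodule BindingScopeSketch do
  # Evaluates each code string in order and returns the per-step results.
  # persist? = true  -> bindings carry over to the next step
  # persist? = false -> every step starts from an empty binding
  def run(steps, persist?) do
    {results, _final_binding} =
      Enum.map_reduce(steps, [], fn code, binding ->
        {result, new_binding} = eval_step(code, binding)
        {result, if(persist?, do: new_binding, else: [])}
      end)

    results
  end

  defp eval_step(code, binding) do
    {value, new_binding} = Code.eval_string(code, binding)
    {{:ok, value}, new_binding}
  rescue
    # Referencing a variable that is not bound fails at compile time.
    CompileError -> {{:error, :compile_error}, binding}
  end
end

steps = ["x = 2", "x * 21"]
IO.inspect(BindingScopeSketch.run(steps, true))
IO.inspect(BindingScopeSketch.run(steps, false))
```

With persisted bindings the second step can use x from the first. With fresh bindings it cannot, so generated code has to recompute or refetch whatever it needs in each step.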

Agents override global config by defining config/0:

defmodule MyApp.DataAgent do
  @moduledoc "Fetches and processes data from HTTP APIs."
  use Legion.Agent

  def tools, do: [MyApp.HTTPTool]
  def config, do: %{model: "anthropic:claude-sonnet-4-20250514", max_iterations: 5}
end
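
One way to picture how overrides layer is a shallow map merge. The sketch below is an assumption about precedence (global config, then the agent's config/0, then start_link/2 options), not a description of Legion's internals; ConfigMergeSketch is a hypothetical name.

```elixir
defmodule ConfigMergeSketch do
  # Global defaults, copied from the config example above.
  @global %{
    model: "openai:gpt-4o-mini",
    max_iterations: 10,
    max_retries: 3,
    sandbox_timeout: 60_000,
    binding_scope: :turn,
    max_message_length: 20_000
  }

  # Assumed precedence: global config < agent config/0 < start_link/2 options.
  def effective(agent_config, start_opts \\ []) do
    # :name registers the process and is not a config key.
    overrides = start_opts |> Keyword.delete(:name) |> Map.new()

    @global
    |> Map.merge(agent_config)
    |> Map.merge(overrides)
  end
end

config =
  ConfigMergeSketch.effective(%{model: "anthropic:claude-sonnet-4-20250514", max_iterations: 5})

IO.inspect(Map.take(config, [:model, :max_iterations, :max_retries]))
```

Keys the agent does not mention, such as max_retries here, keep their global values.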

Agent Callbacks

All optional with sensible defaults:

| Callback | Default | Description |
| -------- | ------- | ----------- |
| tools/0 | [] | Tool modules available to the agent |
| output_schema/0 | %{"type" => "string"} | JSON Schema for structured output |
| tool_config/1 | [] | Per-tool keyword config (accessible via Vault) |
| system_prompt/0 | auto-generated | Override the entire system prompt |
| config/0 | %{} | Model, timeouts, limits |
| action_types/0 | all four actions | Restrict which actions the LLM can take |

Third-Party Modules as Tools

Expose any module - even third-party ones like Req or Jason - without writing a wrapper:

# config/config.exs
config :legion, extra_source_modules: [Req, Jason]

defmodule MyApp.APIAgent do
  @moduledoc "Fetches data from JSON APIs and decodes responses."
  use Legion.Agent

  def tools, do: [Req, Jason]
end

The LLM receives the module's full source and can call any public function in the sandbox.

For large libraries or when you want a curated interface, write a thin facade instead:

defmodule MyApp.Tools.JSONTool do
  use Legion.Tool

  def description do
    """
    JSONTool - encode and decode JSON.

    ## Functions
    - `encode!(term)` - returns a JSON string
    - `decode!(binary)` - returns a decoded term
    """
  end

  defdelegate encode!(term), to: Jason
  defdelegate decode!(binary), to: Jason
end

Agent Pools

Agents are BEAM processes - use :pg for pooling with zero external infrastructure:

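# :pg's default scope must be running first, e.g. via :pg.start_link/0
# or a :pg child in your supervision tree.
for _ <- 1..5 do
  {:ok, pid} = Legion.start_link(SupportAgent)
  :pg.join(:support_pool, pid)
end

# Inside one of your modules: pick a random pool member for each ticket.
defp handle_ticket(ticket) do
  agent = :pg.get_members(:support_pool) |> Enum.random()
  Legion.cast(agent, "Handle this support ticket: #{ticket}")
end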

Telemetry

Legion.Telemetry.attach_default_logger()

Events emitted at every level:

  • [:legion, :agent, :started | :stopped] - agent lifecycle

  • [:legion, :agent, :message, :start | :stop | :exception] - per-message

  • [:legion, :iteration, :start | :stop | :exception] - each execution step

  • [:legion, :llm, :request, :start | :stop | :exception] - LLM API calls

  • [:legion, :sandbox, :eval, :start | :stop | :exception] - code evaluation
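
If the default logger is not enough, a custom handler can be attached with :telemetry.attach_many/4. The sketch below assumes that the :stop events carry a :duration measurement in native time units, as events produced by :telemetry.span/3 do; the source does not confirm which measurements Legion emits. LegionTelemetrySketch and the "legion-sketch" handler id are hypothetical names.

```elixir
defmodule LegionTelemetrySketch do
  # A subset of the events listed above.
  @events [
    [:legion, :llm, :request, :stop],
    [:legion, :sandbox, :eval, :stop],
    [:legion, :sandbox, :eval, :exception]
  ]

  # Standard :telemetry handler signature: event, measurements, metadata, config.
  def handle_event(event, measurements, _metadata, _config) do
    IO.puts(format(event, measurements))
  end

  # Builds a log line; includes the duration only when one is present.
  def format(event, measurements) do
    name = Enum.join(event, ".")

    case measurements do
      %{duration: native} ->
        # Assumption: durations are reported in native time units.
        ms = System.convert_time_unit(native, :native, :millisecond)
        "#{name} took #{ms}ms"

      _ ->
        name
    end
  end

  def attach do
    :telemetry.attach_many("legion-sketch", @events, &__MODULE__.handle_event/4, nil)
  end
end

# Only attach when the :telemetry library is actually available.
if Code.ensure_loaded?(:telemetry), do: LegionTelemetrySketch.attach()
```

Passing a captured module function rather than an anonymous one follows the :telemetry recommendation for handlers.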

Limitations

Sandboxing

Legion's sandbox restricts what LLM-generated code can do, but it is not full process isolation. Generated code runs inside the same BEAM VM as your application.

What the sandbox does:

  • Blocks dangerous constructs at the AST level: defmodule, import, spawn, send, receive, apply, and others
  • Restricts module access to an explicit allowlist (stdlib + your tools)
  • Kills evaluation if it exceeds sandbox_timeout
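
AST-level blocking can be sketched with Code.string_to_quoted/1 and Macro.prewalk/3. This illustrates the technique, not Legion's implementation: AstGuardSketch is a hypothetical module, its denylist covers only the constructs named above, and it catches only the local-call form (a remote call such as Kernel.spawn/1 would have to be stopped by the module allowlist).

```elixir
defmodule AstGuardSketch do
  # Illustrative denylist; the real list is longer ("and others").
  @blocked [:defmodule, :import, :spawn, :send, :receive, :apply]

  # Returns :ok, {:error, {:blocked, names}}, or the parser's error tuple.
  def check(code) when is_binary(code) do
    with {:ok, ast} <- Code.string_to_quoted(code) do
      {_ast, found} =
        Macro.prewalk(ast, [], fn
          # A call node is {name, meta, args} with a list of arguments;
          # variables have an atom in the third position and are skipped.
          {name, _meta, args} = node, acc when name in @blocked and is_list(args) ->
            {node, [name | acc]}

          node, acc ->
            {node, acc}
        end)

      case Enum.reverse(found) do
        [] -> :ok
        names -> {:error, {:blocked, names}}
      end
    end
  end
end

IO.inspect(AstGuardSketch.check("Enum.map([1, 2], &(&1 * 2))"))
IO.inspect(AstGuardSketch.check("spawn(fn -> send(self(), :hi) end)"))
```

Because the check runs on the parsed code before anything is evaluated, a rejected snippet never executes at all.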

What it does not do yet:

  • Isolate memory - runaway allocations affect the whole VM
  • Prevent atom table exhaustion - String.to_atom/1 is available and atoms are never garbage collected
  • Restrict access to BEAM node name, process pid, or refs

Legion is built for trusted code generators (your own LLM-backed agents with controlled tool access), not for running arbitrary code from unknown sources. If your threat model requires full isolation, run agents in a separate BEAM instance.

Summary

Functions

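call(pid, message, timeout \\ :infinity)
Sends a message to a running agent and waits for the result.

cast(pid, message)
Sends a message to a running agent without waiting for a result.

execute(agent_module, task)
Runs an agent on a single task and returns the result.

get_messages(pid)
Returns the conversation history from a running agent.

parallel(tasks, timeout \\ :infinity)
Runs multiple agent tasks concurrently and collects results.

pipeline(steps)
Runs agent tasks sequentially, threading each result to the next step.

start_link(agent_module, opts \\ [])
Starts a long-lived agent process.

then(cancelled, agent, fun)
Chains an agent task after a previous result.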

Functions

call(pid, message, timeout \\ :infinity)

Sends a message to a running agent and waits for the result.

Examples

{:ok, pid} = Legion.start_link(AssistantAgent)
{:ok, answer} = Legion.call(pid, "What is the capital of France?")
{:ok, follow_up} = Legion.call(pid, "And its population?")

cast(pid, message)

Sends a message to a running agent without waiting for a result.

Examples

{:ok, pid} = Legion.start_link(ReportAgent)
Legion.cast(pid, "Generate the weekly report and email it")

execute(agent_module, task)

Runs an agent on a single task and returns the result.

Starts a temporary agent process, blocks until the task completes, then stops it.

Examples

{:ok, summary} = Legion.execute(ResearchAgent, "Summarize the Elixir getting started guide")
{:cancel, :reached_max_iterations} = Legion.execute(ResearchAgent, "impossible task")

get_messages(pid)

Returns the conversation history from a running agent.

Examples

{:ok, pid} = Legion.start_link(AssistantAgent)
{:ok, _} = Legion.call(pid, "Hello")
messages = Legion.get_messages(pid)

parallel(tasks, timeout \\ :infinity)

Runs multiple agent tasks concurrently and collects results.

Returns {:ok, results} if all succeed, or the first {:cancel, reason}.
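
The return contract can be sketched with plain tasks. This is an illustration of the collect-or-cancel shape, not Legion's code: ParallelSketch is hypothetical, each task is a zero-arity function standing in for an agent run, and "first" is interpreted here as first in list order, which the source does not specify.

```elixir
defmodule ParallelSketch do
  # `tasks` is a list of zero-arity functions, each returning
  # {:ok, value} or {:cancel, reason}.
  def run(tasks, timeout \\ :infinity) do
    tasks
    |> Enum.map(&Task.async/1)
    |> Task.await_many(timeout)
    |> Enum.reduce_while({:ok, []}, fn
      {:ok, value}, {:ok, acc} -> {:cont, {:ok, [value | acc]}}
      # Assumption: "first" means first in list order, not first to finish.
      {:cancel, _reason} = cancel, _acc -> {:halt, cancel}
    end)
    |> case do
      {:ok, values} -> {:ok, Enum.reverse(values)}
      cancel -> cancel
    end
  end
end

IO.inspect(ParallelSketch.run([fn -> {:ok, "posts"} end, fn -> {:ok, "trends"} end]))
IO.inspect(ParallelSketch.run([fn -> {:ok, "posts"} end, fn -> {:cancel, :reached_max_iterations} end]))
```

Results come back in the order the tasks were given, which is what makes destructuring like {:ok, [research, analysis]} reliable.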

Examples

# Run two agents in parallel
{:ok, [research, analysis]} =
  Legion.parallel([
    {ResearchAgent, "Find recent Elixir blog posts"},
    {AnalysisAgent, "Summarize market trends"}
  ])

# With a timeout (in milliseconds)
{:ok, results} =
  Legion.parallel(
    [{FastAgent, "task 1"}, {FastAgent, "task 2"}],
    30_000
  )

pipeline(steps)

Runs agent tasks sequentially, threading each result to the next step.

Each step is {agent, task} where task is a string or a function that receives the previous result and returns a task string.

Halts early if any step returns {:cancel, reason}.
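
The threading and early-halt behavior can be sketched as a reduce over the steps. PipelineSketch and its run argument are hypothetical: run stands in for executing an agent and returns {:ok, result} or {:cancel, reason}.

```elixir
defmodule PipelineSketch do
  # `run` is a stand-in for executing an agent: it takes (agent, task_string)
  # and returns {:ok, result} or {:cancel, reason}.
  def pipeline(steps, run) when is_function(run, 2) do
    Enum.reduce_while(steps, {:ok, nil}, fn {agent, task}, {:ok, previous} ->
      # A function task builds its prompt from the previous result;
      # a string task is used as-is and ignores the previous result.
      task_string = if is_function(task, 1), do: task.(previous), else: task

      case run.(agent, task_string) do
        {:ok, _result} = ok -> {:cont, ok}
        {:cancel, _reason} = cancel -> {:halt, cancel}
      end
    end)
  end
end

# A fake runner that echoes which agent ran which task.
echo = fn agent, task -> {:ok, "#{agent}(#{task})"} end

IO.inspect(
  PipelineSketch.pipeline([{:research, "find news"}, {:writer, &"summarize #{&1}"}], echo)
)
```

Only the last step's result is returned, so intermediate results survive only if a later step's function interpolates them into its task string.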

Examples

# Static tasks — each runs independently
{:ok, final} =
  Legion.pipeline([
    {ResearchAgent, "Find info about Elixir OTP"},
    {WriterAgent, "Write a blog post about OTP"}
  ])

# Thread results — each step receives the previous result
{:ok, post} =
  Legion.pipeline([
    {ResearchAgent, "Find recent Elixir news"},
    {WriterAgent, fn research -> "Write a summary based on: #{research}" end},
    {EditorAgent, fn draft -> "Polish this draft: #{draft}" end}
  ])

start_link(agent_module, opts \\ [])

Starts a long-lived agent process.

Options

  • :name - register the process under a name
  • Any config overrides (:model, :max_iterations, etc.)

Examples

{:ok, pid} = Legion.start_link(AssistantAgent)
{:ok, pid} = Legion.start_link(AssistantAgent, name: MyAssistant, model: "openai:gpt-4o")

then(cancelled, agent, fun)

Chains an agent task after a previous result.

Useful for piping from parallel/2 or pipeline/1.

Examples

# Chain after parallel
Legion.parallel([
  {ResearchAgent, "Find Elixir news"},
  {ResearchAgent, "Find Erlang news"}
])
|> Legion.then(WriterAgent, fn results ->
  "Summarize these findings: #{inspect(results)}"
end)

# Passes through cancellations
{:cancel, reason} |> Legion.then(WriterAgent, fn _ -> "ignored" end)
#=> {:cancel, reason}