Subagents

Subagents let Codex break a task into child agent threads and then bring the results back to the parent. They are useful when the work can be split into clean, bounded pieces such as codebase exploration, focused review passes, log analysis, or a simple one-parent -> one-child investigation.

In this SDK, a good subagent workflow has two parts:

the SDK control surface for structured operations such as configuration, discovery, inspection, streaming, and waiting on known child threads
prompt instructions that tell Codex whether to delegate, how many children to spawn, which agent to use, whether to wait, and what summary to return

For broader background, setup details, and product-level guidance, see the official Codex docs.

When Subagents Help

Subagents are a good fit when the work can be divided into independent pieces.

Typical examples:

read-heavy codebase exploration
multiple review passes with different goals
tracing separate code paths in parallel
log analysis or incident triage
one child doing bounded research while the parent keeps the main thread clean

Be more careful with write-heavy workflows. Multiple agents editing the same area at once can create conflicts and extra coordination work.

Availability and Behavior

The official product docs describe subagents as generally available. The vendored Codex runtime in this repository, however, still gates the runtime tool surface behind features.multi_agent, which is experimental and disabled by default in the current Rust source tree. Enable it explicitly before you expect a parent turn to spawn children.

Codex still only uses subagents when you explicitly ask for them, and each child agent adds token cost because it does its own model and tool work.

Subagents inherit the parent session's sandbox and approval posture. Custom agent files can set child defaults, but the runtime reapplies the parent turn's live overrides after role config is layered in, so interactive sandbox or approval changes on the parent still win.

The Two Parts of Using Subagents From Elixir

The clean mental model is:

use the SDK control surface for deterministic, structured operations
use prompts for delegation behavior

The direct SDK surface for this guide is:

Concern	Use the SDK control surface	Use prompt instructions
Enable or tune subagent settings	Yes	No
Set `agents.max_threads` and `agents.max_depth`	Yes	No
Discover child threads for a parent	Yes	No
Inspect child metadata such as parent id, depth, role, or nickname	Yes	No
Observe collaboration events and tool-call items	Yes	No
Read or continue a known child thread	Yes	No
Decide whether to delegate	No	Yes
Decide how many children to spawn	No	Yes
Choose `explorer`, `worker`, or a custom agent	No	Yes
Tell the parent to wait, summarize, or keep working	No	Yes

This split matters. The SDK should own the structured parts. Your prompt should own the delegation strategy.

Just as important, the SDK does not expose prompt-template helper APIs for this workflow. Prompt snippets in this guide are documented practice, not helper functions. There is no host-side spawn_agent/3, delegate/2, or wait_and_summarize/2 surface because those choices are still model-mediated.

Configure Subagents

Global subagent settings live under [agents] in .codex/config.toml. The multi_agent feature flag that gates the tool surface lives under [features] in the same file.

[features]
multi_agent = true

[agents]
max_threads = 2
max_depth = 1

Useful defaults from the Codex docs:

agents.max_threads defaults to 6
agents.max_depth defaults to 1
agents.job_max_runtime_seconds defaults to nil in config; the spawn_agents_on_csv tool falls back to 1800 seconds only when both the config value and the per-call override are unset

For a simple one-parent -> one-child workflow, max_depth = 1 is usually what you want. It allows the parent to spawn a child but prevents the child from building a deeper tree.
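One way to picture the depth limit is the small model below. It is a hedged mental model, not the runtime's actual implementation: it assumes the parent thread sits at depth 0 and that a spawn is allowed only when the new child's depth stays within max_depth. The module and function names are hypothetical.

```elixir
defmodule DepthSketch do
  # Hypothetical model: the root (parent) thread is assumed to be at depth 0.
  # A spawn is allowed only if the new child's depth stays within max_depth.
  def can_spawn?(current_depth, max_depth)
      when is_integer(current_depth) and is_integer(max_depth) do
    current_depth + 1 <= max_depth
  end
end

# Parent (depth 0) spawning a child with max_depth = 1.
IO.inspect(DepthSketch.can_spawn?(0, 1))

# Child (depth 1) trying to spawn a grandchild with max_depth = 1.
IO.inspect(DepthSketch.can_spawn?(1, 1))
```

Under these assumptions the first call is allowed and the second is refused, which matches the one-parent -> one-child shape described above.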

You can also set these values through the SDK when you are connected to Codex:

# codex_opts is assumed to have been built earlier in your application.
{:ok, conn} = Codex.AppServer.connect(codex_opts, experimental_api: true)
{:ok, _} = Codex.AppServer.config_write(conn, "features.multi_agent", true)
{:ok, _} = Codex.AppServer.config_write(conn, "agents.max_threads", 2)
{:ok, _} = Codex.AppServer.config_write(conn, "agents.max_depth", 1)
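If you would rather not crash on the first failed match, the writes above can be folded into one helper that stops at the first error. The sketch below is a hypothetical application-side helper, not part of the SDK. It takes the writer as a function so it runs without a live connection, and it assumes that a failed write returns {:error, reason}.

```elixir
defmodule ConfigSketch do
  # Applies each {key, value} pair in order through `write_fun` and stops at
  # the first failure. `write_fun` is injected so the sketch runs without a
  # live connection. ASSUMPTION: a failed write returns {:error, reason}.
  def apply_settings(settings, write_fun) when is_function(write_fun, 2) do
    Enum.reduce_while(settings, :ok, fn {key, value}, :ok ->
      case write_fun.(key, value) do
        {:ok, _} -> {:cont, :ok}
        {:error, reason} -> {:halt, {:error, {key, reason}}}
      end
    end)
  end
end

settings = [
  {"features.multi_agent", true},
  {"agents.max_threads", 2},
  {"agents.max_depth", 1}
]

# Stand-in writer that always succeeds. With a real connection you would pass
# fn key, value -> Codex.AppServer.config_write(conn, key, value) end
fake_write = fn _key, value -> {:ok, value} end

IO.inspect(ConfigSketch.apply_settings(settings, fake_write))
```

Returning the failing key alongside the reason makes it easier to tell a disabled feature flag apart from a rejected limit.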

Runtime Tool Surface

Prompt-shaped delegation still happens inside the Codex turn, not through direct Elixir helper functions. In the current Rust runtime, enabling features.multi_agent makes these model-callable tools available:

spawn_agent
send_input
resume_agent
wait
close_agent

When features.enable_fanout is enabled as well, the runtime also exposes the experimental CSV batch tools:

spawn_agents_on_csv
report_agent_job_result

Codex.Subagents intentionally does not wrap those tools. It only gives host code deterministic visibility into the threads and metadata they create.

The current app-server protocol uses plural fields such as receiverThreadIds and agentsStates for collaboration tool-call items. Some upstream summaries still mention older singular variants such as receiverThreadId, newThreadId, and agentStatus. The SDK normalizes both shapes when it parses app-server payloads.
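To make the plural and singular shapes concrete, here is a minimal sketch of that kind of normalization. It is illustrative only and is not the SDK's parser: the module name is hypothetical, and it assumes the payload is a string-keyed map whose thread ids are strings.

```elixir
defmodule CollabFieldsSketch do
  # Returns receiver thread ids as a list, accepting either the current plural
  # field or the older singular one. Illustrative only; not the SDK's parser.
  def receiver_thread_ids(%{"receiverThreadIds" => ids}) when is_list(ids), do: ids
  def receiver_thread_ids(%{"receiverThreadId" => id}) when is_binary(id), do: [id]
  def receiver_thread_ids(_payload), do: []
end

IO.inspect(CollabFieldsSketch.receiver_thread_ids(%{"receiverThreadIds" => ["t1", "t2"]}))
IO.inspect(CollabFieldsSketch.receiver_thread_ids(%{"receiverThreadId" => "t1"}))
```

Normalizing to a list in both cases lets downstream code treat one receiver and many receivers the same way.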

Built-In and Custom Agents

Codex ships with three useful built-in agents:

default for general-purpose fallback work
worker for implementation and fixes
explorer for read-heavy exploration

When you need a narrower role, define a custom agent in one standalone TOML file under .codex/agents/ for project-scoped agents or ~/.codex/agents/ for personal agents.

Every custom agent file should define:

name
description
developer_instructions

Optional fields such as nickname_candidates, model, model_reasoning_effort, sandbox_mode, mcp_servers, and skills.config inherit from the parent when you omit them.

Example:

name = "reviewer"
description = "PR reviewer focused on correctness, security, and missing tests."
developer_instructions = """
Review code like an owner.
Prioritize correctness, security, behavior regressions, and missing test coverage.
"""
nickname_candidates = ["Atlas", "Delta", "Echo"]
model = "gpt-5.4-mini"
model_reasoning_effort = "medium"
sandbox_mode = "read-only"

Keep custom agents narrow and opinionated. A good custom agent has one clear job and instructions that keep it from drifting into adjacent work.

If a custom agent uses the same name as a built-in agent such as explorer, the custom definition takes precedence.

If you use nickname_candidates, keep the list non-empty and unique, and stick to ASCII letters, digits, spaces, hyphens, and underscores. The runtime validates those constraints when it loads agent files.
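If you generate agent files from Elixir, you can check the same constraints before writing the file. The sketch below is a hypothetical pre-flight check based only on the rules stated above. The runtime's own validation may differ in details such as whitespace trimming or case sensitivity.

```elixir
defmodule NicknameSketch do
  # Mirrors the documented constraints: a non-empty list, unique entries, and
  # only ASCII letters, digits, spaces, hyphens, and underscores.
  def validate([]), do: {:error, :empty}

  def validate(candidates) when is_list(candidates) do
    cond do
      Enum.uniq(candidates) != candidates -> {:error, :duplicates}
      not Enum.all?(candidates, &valid_name?/1) -> {:error, :invalid_characters}
      true -> :ok
    end
  end

  # A name is valid when it is a non-empty string made of allowed characters.
  defp valid_name?(name) do
    is_binary(name) and Regex.match?(~r/\A[A-Za-z0-9 _-]+\z/, name)
  end
end

IO.inspect(NicknameSketch.validate(["Atlas", "Delta", "Echo"]))
IO.inspect(NicknameSketch.validate(["Atlas", "Atlas"]))
```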

Basic SDK Workflow

The normal Elixir flow is:

Configure subagent limits.
Start a parent thread.
Prompt the parent to spawn exactly the children you want.
Observe the workflow in streamed events.
Discover the child thread or threads from the SDK control surface.
Inspect the child metadata.
Read, follow up on, or await the child thread as needed.

Here is the shape of a simple one-parent -> one-child flow:

{:ok, conn} = Codex.AppServer.connect(codex_opts, experimental_api: true)
{:ok, _} = Codex.AppServer.config_write(conn, "features.multi_agent", true)
{:ok, _} = Codex.AppServer.config_write(conn, "agents.max_threads", 2)
{:ok, _} = Codex.AppServer.config_write(conn, "agents.max_depth", 1)

{:ok, parent} =
  Codex.start_thread(codex_opts, %{
    transport: {:app_server, conn},
    working_directory: File.cwd!(),
    model: "gpt-5.4-mini"
  })

prompt = """
Spawn exactly one child agent for this task.
Use the explorer agent.
Do not spawn any additional agents.
The child must not spawn more agents.
Inspect lib/codex/subagents.ex and summarize what host-side controls it exposes.
Wait for the child before answering.
If subagents are unavailable, continue solo and say so explicitly.
"""

{:ok, parent_result} = Codex.Thread.run(parent, prompt, %{timeout_ms: 120_000})

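# Matching on [child] asserts that exactly one child was spawned; the match
# fails if the parent spawned none or more than one.
{:ok, [child]} = Codex.Subagents.children(conn, parent_result.thread.thread_id)
source = Codex.Subagents.source(child)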

IO.inspect(%{
  child_thread_id: child["id"],
  parent_thread_id: Codex.Subagents.parent_thread_id(source),
  source_kind: Codex.Protocol.SessionSource.source_kind(source),
  depth: source.sub_agent.depth,
  agent_role: source.sub_agent.agent_role,
  agent_nickname: source.sub_agent.agent_nickname
})

{:ok, _child_snapshot} = Codex.Subagents.read(conn, child["id"], include_turns: true)

{:ok, child_thread} =
  Codex.resume_thread(child["id"], codex_opts, %{
    transport: {:app_server, conn},
    working_directory: File.cwd!()
  })

{:ok, _child_result} =
  Codex.Thread.run(child_thread, "Reply with one sentence that starts with 'child follow-up:'")

{:ok, :completed} = Codex.Subagents.await(conn, child["id"], timeout: 30_000)

The important pattern is simple:

the prompt tells Codex how to delegate
the SDK gives you structured visibility and control over the resulting child thread
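Because delegation is model-mediated, the parent may spawn zero children or more than one, and a strict {:ok, [child]} match raises in that case. A hedged alternative is to turn the result into a tagged tuple first. The helper below is hypothetical application code: it assumes only the {:ok, list} result shape used in the flow above, plus an {:error, reason} shape for failures.

```elixir
defmodule ChildPickSketch do
  # Accepts the result of a children lookup and returns a tagged tuple
  # instead of raising when the child count is not exactly one.
  def single_child({:ok, [child]}), do: {:ok, child}
  def single_child({:ok, []}), do: {:error, :no_children}

  def single_child({:ok, children}) when is_list(children),
    do: {:error, {:unexpected_child_count, length(children)}}

  # ASSUMPTION: lookup failures arrive as {:error, reason} and pass through.
  def single_child({:error, _reason} = error), do: error
end

# Plain maps stand in for child thread records here.
IO.inspect(ChildPickSketch.single_child({:ok, [%{"id" => "child-1"}]}))
IO.inspect(ChildPickSketch.single_child({:ok, []}))
```

In the flow above you would pipe the result of Codex.Subagents.children/3 into ChildPickSketch.single_child/1 and handle the :no_children case, for example when the parent reported that subagents were unavailable.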

For a runnable end-to-end version of this flow, see examples/live_subagent_host_controls.exs. That example exercises the full public Codex.Subagents helper surface:

list/2
children/3
source/1
parent_thread_id/1
child_thread?/1
read/3
await/3

It also drives the current prompt-mediated multi-agent tool surface across successive parent turns so you can observe spawn_agent, send_input, resume_agent, wait, and close_agent in the live event stream.

Streaming and Observability

Subagent workflows are much easier to debug when you stream events instead of waiting for only the final answer.

The SDK lets you observe collaboration activity such as:

child spawn begin and end
follow-up interaction begin and end
waiting begin and end
close begin and end
typed collaboration tool-call items in the item stream

That gives you a reliable way to answer questions such as:

did the parent actually spawn a child?
how many child threads were created?
which child thread ids were used?
did the parent wait for the child or continue immediately?
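Those questions can be answered with a small fold over the collected items. The sketch below uses plain maps with hypothetical :tool and :receiver_thread_ids keys as stand-ins. The real typed items in your SDK version will have their own field names, so treat this as a shape to adapt rather than the SDK's event schema.

```elixir
defmodule CollabSummarySketch do
  # `items` is a list of plain maps with HYPOTHETICAL keys :tool and
  # :receiver_thread_ids. Adapt these to the typed items your SDK emits.
  def summarize(items) do
    spawns = Enum.filter(items, &(&1.tool == "spawn_agent"))
    child_ids = spawns |> Enum.flat_map(& &1.receiver_thread_ids) |> Enum.uniq()

    %{
      spawned?: spawns != [],
      child_count: length(child_ids),
      child_thread_ids: child_ids,
      waited?: Enum.any?(items, &(&1.tool == "wait"))
    }
  end
end

items = [
  %{tool: "spawn_agent", receiver_thread_ids: ["child-1"]},
  %{tool: "wait", receiver_thread_ids: ["child-1"]}
]

IO.inspect(CollabSummarySketch.summarize(items))
```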

Prompting Strategy

Codex does not spawn subagents automatically. If you want subagents, say so clearly.

Good subagent prompts usually specify:

whether to delegate at all
the exact number of children to create
which built-in or custom agent to use
whether children may spawn additional children
whether the parent should wait or keep working
what final answer shape to return

These prompt patterns are documentation only. They are not wrapped in helper APIs because delegation remains a model decision inside the turn.

A Reliable One-Child Prompt

Spawn exactly one child agent for this task.
Use the explorer agent.
Do not spawn any additional agents.
Inspect lib/my_app/payments.ex and explain the payment lifecycle.
Wait for the child to finish before answering.
Return a concise summary with file references.
If subagents are unavailable, continue solo and say so explicitly.

This is a good default pattern because it keeps the workflow bounded and easy to inspect from Elixir.

A Good Parallel Review Prompt

Review this branch with parallel subagents.
Spawn one child for security risks, one for test gaps, and one for maintainability.
Wait for all children, then summarize the findings by category with file references.
Do not create any additional agents beyond those three.

Prompting Tips

ask for an exact number of children, not "some" or "a few"
name the agent you want when you care about behavior
say whether the parent should wait
say what the final answer should look like
add an explicit fallback so the run still succeeds without subagents
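If you assemble prompts in application code, a small builder can keep every one of these choices explicit. The module below is a hypothetical helper you would own yourself, not an SDK API. It only concatenates strings, and the model still decides how to act on the result.

```elixir
defmodule PromptSketch do
  # Builds a delegation prompt from explicit choices so none of them is left
  # implicit. Plain string assembly; nothing here calls the SDK.
  def build(task, opts \\ []) do
    count = Keyword.get(opts, :children, 1)
    agent = Keyword.get(opts, :agent, "explorer")
    wait? = Keyword.get(opts, :wait, true)

    [
      "Spawn exactly #{count} child agent(s) for this task.",
      "Use the #{agent} agent.",
      "Do not spawn any additional agents.",
      task,
      # Evaluates to nil when wait? is false; nil lines are dropped below.
      if(wait?, do: "Wait for the child agent(s) to finish before answering."),
      "Return a concise summary with file references.",
      "If subagents are unavailable, continue solo and say so explicitly."
    ]
    |> Enum.reject(&is_nil/1)
    |> Enum.join("\n")
  end
end

IO.puts(PromptSketch.build("Inspect lib/my_app/payments.ex and explain the payment lifecycle."))
```

Keeping the fallback line unconditional means the run can still succeed when the multi-agent tools are not enabled.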

Working With Child Threads

Once a child exists, treat it as a first-class thread.

The SDK control surface lets you:

list the children for a parent
inspect the child's source metadata
confirm the parent/child relationship
read the child thread
stream or run direct follow-up work on the child
wait for the child to reach a final state

That is the part Elixir should own. It is structured, deterministic, and useful for application code.

In practice, Codex.Subagents.source/1 returns a %Codex.Protocol.SessionSource{}, and subagent threads use %Codex.Protocol.SubAgentSource{} for variant-specific metadata. Children with a thread_spawn source expose the structured fields host code usually needs most:

parent_thread_id
depth
agent_nickname
agent_role

Approvals and Sandbox Controls

Subagents inherit the parent session's sandbox and approval posture. Custom agent files can set child defaults, but the runtime still reapplies the parent turn's live sandbox and approval overrides after the role config is applied.

That means:

a read-only parent usually leads to read-only children unless a custom agent file sets a different sandbox_mode
a stricter custom agent can be safer for review or exploration tasks
approval failures in a child flow back into the broader workflow instead of silently disappearing

For review, exploration, and documentation tasks, a read-only child is often the right default.
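The layering order described in this section can be pictured as successive map merges, where later layers win. This is a hedged mental model rather than the runtime's code: the module name and map keys are illustrative, and the real runtime tracks more settings than these two.

```elixir
defmodule PostureSketch do
  # Later layers win: parent session defaults, then the custom agent's role
  # config, then whatever the parent turn has overridden live.
  def effective(parent_session, role_config, parent_live_overrides) do
    parent_session
    |> Map.merge(role_config)
    |> Map.merge(parent_live_overrides)
  end
end

parent_session = %{sandbox_mode: "workspace-write", approval_policy: "on-request"}
role_config = %{sandbox_mode: "read-only"}

# No live overrides: the custom agent's stricter sandbox applies.
IO.inspect(PostureSketch.effective(parent_session, role_config, %{}))

# The parent turn changed its sandbox interactively: that override wins.
IO.inspect(
  PostureSketch.effective(parent_session, role_config, %{sandbox_mode: "danger-full-access"})
)
```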

CSV Fan-Out Jobs

The underlying runtime also ships an experimental CSV batch workflow for repeated subagent work items. This is distinct from Codex.Subagents:

enable features.enable_fanout = true before expecting the tools to exist
ask Codex to call spawn_agents_on_csv
each worker must call report_agent_job_result exactly once
agents.max_threads still caps concurrent open threads
agents.job_max_runtime_seconds provides a config default, while the tool's max_runtime_seconds argument overrides it per job

This workflow is useful for repeated audits such as one file/package/service per row. It remains model-mediated and is not wrapped as a direct Elixir API in this SDK.
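The runtime limit for a job resolves in three steps: the per-call max_runtime_seconds argument, then agents.job_max_runtime_seconds from config, then the 1800-second fallback. A minimal sketch of that precedence, with a hypothetical module name and with nil standing for "unset":

```elixir
defmodule JobRuntimeSketch do
  @fallback_seconds 1800

  # Precedence: per-call max_runtime_seconds, then the config value
  # agents.job_max_runtime_seconds, then the 1800-second fallback.
  # nil means "unset" at that level.
  def effective_seconds(per_call, config_default) do
    per_call || config_default || @fallback_seconds
  end
end

IO.inspect(JobRuntimeSketch.effective_seconds(600, 900))
IO.inspect(JobRuntimeSketch.effective_seconds(nil, 900))
IO.inspect(JobRuntimeSketch.effective_seconds(nil, nil))
```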

Choosing Agents and Models

Start simple.

use gpt-5.4 for the parent and for agents handling harder reasoning or ambiguous work
use gpt-5.4-mini for faster, lower-cost day-to-day parent/child examples
use gpt-5.3-codex-spark only when your account/runtime exposes it and you want a faster read-heavy or summarization-focused agent
use medium reasoning effort as the default unless you have a clear reason to go lower or higher

If you create custom agents, pin model or reasoning settings only when the role truly benefits from it. Otherwise, let the child inherit the parent session's defaults.

Recommended Starting Point

If you are new to subagents, start with this exact pattern:

Set agents.max_threads = 2.
Set agents.max_depth = 1.
Start one parent thread on gpt-5.4-mini.
Ask the parent to spawn exactly one explorer child.
Tell the parent to wait for the child.
Use the SDK control surface to confirm the child exists, inspect its source, and await completion.

That keeps the workflow small, easy to reason about, and easy to test.