Codex.Realtime (Codex SDK v0.7.2)


Realtime audio streaming with OpenAI's Realtime API.

This module provides a high-level interface for building voice-enabled AI applications using WebSocket-based real-time communication.

Quick Start

# Define an agent (weather_tool is a tool defined elsewhere)
agent = Codex.Realtime.agent(
  name: "VoiceAssistant",
  instructions: "You are a helpful voice assistant.",
  tools: [weather_tool]
)

# Create and run a session
{:ok, session} = Codex.Realtime.run(agent)

# Subscribe before sending audio so that no events are missed
Codex.Realtime.subscribe(session, self())
Codex.Realtime.send_audio(session, audio_bytes)

receive do
  {:session_event, event} -> handle_event(event)
end

Features

  • Real-time audio streaming (PCM16, G.711)
  • Voice activity detection (semantic VAD, server VAD)
  • Tool execution during conversations
  • Agent handoffs
  • Output guardrails
  • Dynamic instructions (string or function)

Architecture

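The realtime feature consists of several modules. Those referenced on this page are:

  • Codex.Realtime.Agent - the agent struct returned by agent/1
  • Codex.Realtime.Runner - creates and starts sessions
  • Codex.Realtime.Config - configuration structs (RunConfig, ModelConfig, SessionModelSettings, TurnDetectionConfig)
  • Codex.Realtime.Events - event structs delivered to subscribers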

Configuration

Pass a %Codex.Realtime.Config.RunConfig{} under the :config option to configure a session, for example its voice and turn detection:

{:ok, session} = Codex.Realtime.run(agent,
  config: %Codex.Realtime.Config.RunConfig{
    model_settings: %Codex.Realtime.Config.SessionModelSettings{
      voice: "nova",
      turn_detection: %Codex.Realtime.Config.TurnDetectionConfig{
        type: :semantic_vad,
        eagerness: :medium
      }
    }
  }
)

Event Handling

Subscribers receive events as {:session_event, event} messages:

Codex.Realtime.subscribe(session, self())

receive do
  {:session_event, %Codex.Realtime.Events.AgentStartEvent{}} ->
    IO.puts("Agent started")

  {:session_event, %Codex.Realtime.Events.AudioEvent{audio: audio}} ->
    play_audio(audio.data)

  {:session_event, %Codex.Realtime.Events.ToolStartEvent{tool: tool}} ->
    IO.puts("Calling tool: #{tool.name}")

  {:session_event, %Codex.Realtime.Events.AgentEndEvent{}} ->
    IO.puts("Turn completed")
end
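
A single receive handles only one message. The sketch below shows one way to keep receiving until a turn finishes or the session goes quiet. EventLoop and collect/3 are hypothetical names, not part of the SDK; the only thing assumed from this page is the {:session_event, event} message shape.

```elixir
defmodule EventLoop do
  # Hypothetical helper, not part of the Codex SDK.
  # Collects {:session_event, event} messages from the caller's mailbox until
  # done?.(event) returns true, or until no event arrives within timeout_ms.
  def collect(done?, timeout_ms \\ 5_000, acc \\ []) when is_function(done?, 1) do
    receive do
      {:session_event, event} ->
        if done?.(event) do
          {:ok, Enum.reverse([event | acc])}
        else
          collect(done?, timeout_ms, [event | acc])
        end
    after
      timeout_ms -> {:timeout, Enum.reverse(acc)}
    end
  end
end
```

With the SDK loaded, the predicate could be something like &match?(%Codex.Realtime.Events.AgentEndEvent{}, &1), so that collection stops at the end of a turn.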

Summary

Functions

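agent(opts)

Create a realtime agent.

close(session)

Close the session.

current_agent(session)

Get the current agent.

history(session)

Get the conversation history.

interrupt(session)

Interrupt the current response.

run(agent, opts \\ [])

Create and start a realtime session with an agent.

runner(agent, opts \\ [])

Create a runner for more control over session creation.

send_audio(session, audio, opts \\ [])

Send audio data to the model.

send_message(session, message)

Send a text message to the model.

send_raw_event(session, event)

Send a raw event to the model.

subscribe(session, pid)

Subscribe to session events.

unsubscribe(session, pid)

Unsubscribe from session events.

update_session(session, settings)

Update session settings.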

Functions

agent(opts)

@spec agent(keyword()) :: Codex.Realtime.Agent.t()

Create a realtime agent.

This is a convenience function for creating an agent struct from keyword options.

Options

  • :name - Agent name (default: "Agent")
  • :instructions - System instructions (string or function)
  • :model - Model name (default: "gpt-4o-realtime-preview")
  • :tools - List of tools available to the agent
  • :handoffs - List of agents or handoffs for transfers
  • :output_guardrails - Output guardrails to apply
  • :hooks - Event hooks

Example

agent = Codex.Realtime.agent(
  name: "Assistant",
  instructions: "Be helpful and concise.",
  tools: [my_tool],
  handoffs: [support_agent]
)

# With dynamic instructions
agent = Codex.Realtime.agent(
  name: "Greeter",
  instructions: fn ctx -> "Hello #{ctx.user_name}!" end
)
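
The :instructions option accepts either a string or a one-argument function. As a rough illustration of what that means, the sketch below resolves both forms to a final string. InstructionResolver is a hypothetical module and is not how the SDK is known to implement this; the context is assumed to be a map such as the one passed via :context.

```elixir
defmodule InstructionResolver do
  # Hypothetical helper, not part of the Codex SDK.
  # A string is used as-is; a function is called with the context.
  def resolve(instructions, _ctx) when is_binary(instructions), do: instructions
  def resolve(instructions, ctx) when is_function(instructions, 1), do: instructions.(ctx)
end
```

Keeping both forms behind one function means calling code never needs to check which kind of instructions an agent was created with.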

close(session)

@spec close(GenServer.server()) :: :ok

Close the session.

Closes the WebSocket connection and stops the session process.
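
Because close/1 releases the WebSocket connection, it is usually worth making sure it runs even when the code using the session raises. The following is a minimal sketch of that pattern with try/after. SessionScope is a hypothetical helper; the start and close steps are passed in as functions so the example does not depend on the SDK.

```elixir
defmodule SessionScope do
  # Hypothetical helper, not part of the Codex SDK.
  # Starts a session, runs work_fun with it, and always calls close_fun,
  # even if work_fun raises.
  def with_session(start_fun, close_fun, work_fun)
      when is_function(start_fun, 0) and is_function(close_fun, 1) and
             is_function(work_fun, 1) do
    case start_fun.() do
      {:ok, session} ->
        try do
          {:ok, work_fun.(session)}
        after
          close_fun.(session)
        end

      {:error, _reason} = error ->
        error
    end
  end
end
```

In an application this might be called as SessionScope.with_session(fn -> Codex.Realtime.run(agent) end, &Codex.Realtime.close/1, fn session -> ... end).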

current_agent(session)

@spec current_agent(GenServer.server()) :: Codex.Realtime.Agent.t()

Get the current agent.

history(session)

Get the conversation history.

Returns all items in the conversation history.

interrupt(session)

@spec interrupt(GenServer.server()) :: :ok

Interrupt the current response.

Sends a cancel signal to stop the model from generating more output.

run(agent, opts \\ [])

@spec run(
  Codex.Realtime.Agent.t(),
  keyword()
) :: {:ok, pid()} | {:error, term()}

Create and start a realtime session with an agent.

This is a convenience function that creates a runner and starts a session in one step. For more control, use Codex.Realtime.Runner directly.

Options

  • :config - Run configuration (%Codex.Realtime.Config.RunConfig{})
  • :model_config - Model connection config (%Codex.Realtime.Config.ModelConfig{})
  • :context - Context map passed to the session

Returns

  • {:ok, pid} - Session started successfully
  • {:error, reason} - Failed to start session

Example

{:ok, session} = Codex.Realtime.run(agent,
  config: %Codex.Realtime.Config.RunConfig{
    model_settings: %Codex.Realtime.Config.SessionModelSettings{voice: "nova"}
  }
)

# Use the session: subscribe first so the reply events are not missed
Codex.Realtime.subscribe(session, self())
Codex.Realtime.send_message(session, "Hello!")
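
Since run/2 returns {:ok, pid} or {:error, reason}, callers may want to handle the error case explicitly instead of pattern matching on {:ok, session}. Below is a sketch of a simple bounded retry around any such start function. SessionStart.with_retry/2 is a hypothetical helper, and whether retrying makes sense depends on the error reasons the SDK actually returns, which this page does not list.

```elixir
defmodule SessionStart do
  # Hypothetical helper, not part of the Codex SDK.
  # Calls start_fun up to `attempts` times and returns the first {:ok, session},
  # or the last {:error, reason} once the attempts are used up.
  def with_retry(start_fun, attempts)
      when is_function(start_fun, 0) and is_integer(attempts) and attempts >= 1 do
    case start_fun.() do
      {:ok, session} -> {:ok, session}
      {:error, _reason} when attempts > 1 -> with_retry(start_fun, attempts - 1)
      {:error, reason} -> {:error, reason}
    end
  end
end
```

For example, SessionStart.with_retry(fn -> Codex.Realtime.run(agent) end, 3) would try to start a session at most three times.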

runner(agent, opts \\ [])

Create a runner for more control over session creation.

Use this when you need to configure the runner separately from running it, or when you want to reuse the same runner for multiple sessions.

Example

runner = Codex.Realtime.runner(agent,
  config: %Codex.Realtime.Config.RunConfig{tracing_disabled: true}
)

{:ok, session1} = Codex.Realtime.Runner.run(runner)
{:ok, session2} = Codex.Realtime.Runner.run(runner, context: %{user: "Alice"})

send_audio(session, audio, opts \\ [])

@spec send_audio(GenServer.server(), binary(), keyword()) :: :ok

Send audio data to the model.

Options

  • :commit - Whether to commit the audio buffer (default: false)

Example

Codex.Realtime.send_audio(session, audio_bytes)
Codex.Realtime.send_audio(session, audio_bytes, commit: true)
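
When streaming a longer recording, the audio is typically sent as a series of small chunks rather than one large binary. The sketch below splits a binary into fixed-size chunks and computes a chunk size from a duration. AudioChunks is a hypothetical module. The 24 kHz mono sample rate is an assumption about the PCM16 format; check the format your session actually uses.

```elixir
defmodule AudioChunks do
  # Hypothetical helper, not part of the Codex SDK.

  # Number of bytes in `ms` milliseconds of mono PCM16 audio (2 bytes per sample).
  # The default sample rate of 24_000 Hz is an assumption.
  def bytes_for_ms(ms, sample_rate \\ 24_000) when is_integer(ms) and ms > 0 do
    div(sample_rate * 2 * ms, 1000)
  end

  # Splits a binary into chunks of at most chunk_bytes bytes, preserving order.
  def split(audio, chunk_bytes)
      when is_binary(audio) and is_integer(chunk_bytes) and chunk_bytes > 0 do
    do_split(audio, chunk_bytes, [])
  end

  defp do_split(<<>>, _size, acc), do: Enum.reverse(acc)

  defp do_split(audio, size, acc) when byte_size(audio) <= size do
    Enum.reverse([audio | acc])
  end

  defp do_split(audio, size, acc) do
    <<chunk::binary-size(size), rest::binary>> = audio
    do_split(rest, size, [chunk | acc])
  end
end
```

Each chunk could then be passed to Codex.Realtime.send_audio/3, with commit: true on the last one if the buffer should be committed manually.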

send_message(session, message)

@spec send_message(GenServer.server(), String.t() | map()) :: :ok

Send a text message to the model.

The message can be a plain string or a structured message map.

Example

Codex.Realtime.send_message(session, "Hello!")

Codex.Realtime.send_message(session, %{
  "type" => "message",
  "role" => "user",
  "content" => [%{"type" => "input_text", "text" => "Hello!"}]
})
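
If structured messages are built in several places, a small constructor keeps the map shape consistent. This sketch reproduces exactly the map shown above. MessageBuilder.user_text/1 is a hypothetical helper, not an SDK function.

```elixir
defmodule MessageBuilder do
  # Hypothetical helper, not part of the Codex SDK.
  # Builds the structured user message map shown in the example above.
  def user_text(text) when is_binary(text) do
    %{
      "type" => "message",
      "role" => "user",
      "content" => [%{"type" => "input_text", "text" => text}]
    }
  end
end
```

The result can be passed directly as the second argument of send_message/2.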

send_raw_event(session, event)

@spec send_raw_event(GenServer.server(), map()) :: :ok

Send a raw event to the model.

Use this for advanced scenarios where you need to send custom events.

subscribe(session, pid)

@spec subscribe(GenServer.server(), pid()) :: :ok

Subscribe to session events.

The subscriber process will receive {:session_event, event} messages for all session events.

Example

Codex.Realtime.subscribe(session, self())

receive do
  {:session_event, %Codex.Realtime.Events.AudioEvent{} = event} ->
    play_audio(event.audio.data)
end

unsubscribe(session, pid)

@spec unsubscribe(GenServer.server(), pid()) :: :ok

Unsubscribe from session events.
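
Events that were delivered before the unsubscribe call took effect may still be sitting in the subscriber's mailbox. As a sketch, a process can drain them with a zero-timeout receive. Mailbox.flush_session_events/0 is a hypothetical helper that relies only on the {:session_event, event} message shape documented on this page.

```elixir
defmodule Mailbox do
  # Hypothetical helper, not part of the Codex SDK.
  # Removes all pending {:session_event, _} messages from the caller's mailbox
  # and returns how many were dropped. Other messages are left untouched.
  def flush_session_events(count \\ 0) do
    receive do
      {:session_event, _event} -> flush_session_events(count + 1)
    after
      0 -> count
    end
  end
end
```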

update_session(session, settings)

Update session settings.

Use this to change model settings mid-session, such as voice or modalities.

Example

settings = %Codex.Realtime.Config.SessionModelSettings{voice: "nova"}
Codex.Realtime.update_session(session, settings)