Voice pipeline for speech-to-speech AI interactions.
This module provides a non-realtime voice pipeline that orchestrates:
- Speech-to-text (STT) - Transcribe audio input
- Workflow processing - Generate text responses
- Text-to-speech (TTS) - Synthesize audio output
Quick Start
# Create a workflow
workflow = Codex.Voice.SimpleWorkflow.new(fn text ->
["You said: #{text}"]
end)
# Create pipeline
pipeline = Codex.Voice.Pipeline.new(workflow: workflow)
# Run with audio input
audio = Codex.Voice.Input.AudioInput.new(audio_bytes)
{:ok, result} = Codex.Voice.Pipeline.run(pipeline, audio)
# Stream output events
result
|> Codex.Voice.Result.stream()
|> Enum.each(fn event ->
case event.type do
:voice_stream_event_audio -> play(event.data)
:voice_stream_event_lifecycle -> handle(event.event)
:voice_stream_event_error -> log(event.error)
end
end)Components
Codex.Voice.Pipeline- Main pipeline orchestratorCodex.Voice.Workflow- Workflow behaviourCodex.Voice.SimpleWorkflow- Function-based workflowCodex.Voice.AgentWorkflow- Agent-based workflowCodex.Voice.Input- Audio input typesCodex.Voice.Result- Streamed result
Convenience Functions
This module provides convenience functions to create and run voice pipelines without needing to import submodules directly:
# Run a complete pipeline in one call
{:ok, result} = Codex.Voice.run(audio,
workflow: my_workflow,
config: %Codex.Voice.Config{
tts_settings: %{voice: :nova}
}
)
# Create audio inputs
audio = Codex.Voice.audio_input(binary_data)
streamed = Codex.Voice.streamed_input()
Summary
Functions
Create an audio input from binary data.
Create a new pipeline configuration.
Create and run a voice pipeline.
Create a simple workflow from a function.
Create a streamed audio input.
Functions
@spec audio_input( binary(), keyword() ) :: Codex.Voice.Input.AudioInput.t()
Create an audio input from binary data.
Options
:frame_rate- Sample rate in Hz (default: 24000):sample_width- Bytes per sample (default: 2):channels- Number of audio channels (default: 1)
Example
audio = Codex.Voice.audio_input(File.read!("recording.pcm"))
@spec config(keyword()) :: Codex.Voice.Config.t()
Create a new pipeline configuration.
Options
:workflow_name- Name for tracing:stt_settings- STT configuration:tts_settings- TTS configuration:model_provider- Custom model provider module:trace_include_sensitive_data- Include sensitive data in traces:trace_include_sensitive_audio_data- Include audio data in traces
Example
config = Codex.Voice.config(
workflow_name: "Customer Support",
tts_settings: Codex.Voice.Config.TTSSettings.new(voice: :nova)
)
@spec run( Codex.Voice.Input.AudioInput.t() | Codex.Voice.Input.StreamedAudioInput.t(), keyword() ) :: {:ok, Codex.Voice.Result.t()}
Create and run a voice pipeline.
This is a convenience function that creates a pipeline and runs it in one
step. For more control, use Codex.Voice.Pipeline directly.
Options
:workflow- The workflow to use (required):stt_model- STT model name or instance:tts_model- TTS model name or instance:config- Pipeline configuration
Returns
{:ok, result}- AResultstruct that can be streamed for events
Example
workflow = Codex.Voice.SimpleWorkflow.new(fn text ->
["You said: #{text}"]
end)
{:ok, result} = Codex.Voice.run(audio,
workflow: workflow,
config: %Codex.Voice.Config{
tts_settings: %Codex.Voice.Config.TTSSettings{voice: :nova}
}
)
# Stream output events
for event <- Codex.Voice.Result.stream(result) do
IO.inspect(event)
end
@spec simple_workflow( (String.t() -> Enumerable.t()), keyword() ) :: Codex.Voice.SimpleWorkflow.t()
Create a simple workflow from a function.
The function should take a transcription string and return an enumerable of response text chunks.
Options
:greeting- Optional greeting to send when the workflow starts
Example
workflow = Codex.Voice.simple_workflow(fn text ->
["You said: #{text}"]
end, greeting: "Hello! How can I help?")
@spec streamed_input() :: Codex.Voice.Input.StreamedAudioInput.t()
Create a streamed audio input.
Use this for multi-turn conversations where you want to push audio chunks incrementally.
Example
input = Codex.Voice.streamed_input()
# Push chunks from another process
spawn(fn ->
for chunk <- audio_source do
Codex.Voice.Input.StreamedAudioInput.add(input, chunk)
end
Codex.Voice.Input.StreamedAudioInput.close(input)
end)
# Run the pipeline
{:ok, result} = Codex.Voice.run(input, workflow: my_workflow)