Codex.Voice.Models.OpenAITTS (Codex SDK v0.16.1)

OpenAI text-to-speech model implementation.

This module implements the Codex.Voice.Model.TTSModel behaviour using OpenAI's audio speech API. It converts text to audio and returns the result as a stream of PCM bytes.

Default Model

The default model is gpt-4o-mini-tts, which supports multiple voices and accepts instructions that steer how the speech is delivered.

Voices

The following voices are available:

:alloy - Neutral and balanced
:ash - Warm and conversational (default)
:coral - Clear and articulate
:echo - Soft and thoughtful
:fable - Expressive and dramatic
:onyx - Deep and authoritative
:nova - Friendly and upbeat
:sage - Calm and measured
:shimmer - Bright and energetic

Example

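alias Codex.Voice.Config.TTSSettings
alias Codex.Voice.Models.OpenAITTS

model = OpenAITTS.new()
settings = TTSSettings.new(voice: :nova, speed: 1.0)

audio_stream = OpenAITTS.run(model, "Hello, world!", settings)

Enum.each(audio_stream, fn chunk ->
  # Process each PCM audio chunk, e.g. report its size
  IO.puts("received #{byte_size(chunk)} bytes")
end)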
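
Since run/3 returns an enumerable of binaries, the chunks can be written to disk as they arrive instead of being held in memory. The following is a minimal sketch of that idea, not part of the SDK: PcmSink and write_raw/2 are hypothetical names, and fake_stream stands in for the value returned by OpenAITTS.run/3 so the snippet runs without network access.

```elixir
defmodule PcmSink do
  @moduledoc "Hypothetical helper for illustration; not part of the Codex SDK."

  # Writes each binary chunk to `path` as it is produced and returns the
  # total number of bytes written. Accepts any enumerable of binaries.
  @spec write_raw(Enumerable.t(), Path.t()) :: non_neg_integer()
  def write_raw(chunks, path) do
    File.open!(path, [:write, :binary], fn file ->
      Enum.reduce(chunks, 0, fn chunk, total ->
        :ok = IO.binwrite(file, chunk)
        total + byte_size(chunk)
      end)
    end)
  end
end

# Stand-in for the stream returned by OpenAITTS.run/3 (assumption:
# three chunks of 1024 zero bytes each).
fake_stream = Stream.map(1..3, fn _ -> :binary.copy(<<0>>, 1024) end)

path = Path.join(System.tmp_dir!(), "pcm_sink_demo.pcm")
bytes_written = PcmSink.write_raw(fake_stream, path)
```

In real use, the fake stream would be replaced by the result of OpenAITTS.run/3. The reduce keeps only a running byte count, so memory use stays flat regardless of how long the audio is.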

Types

t()

@type t() :: %Codex.Voice.Models.OpenAITTS{
  api_key: String.t() | nil,
  base_url: String.t(),
  client: term(),
  model: String.t()
}

Functions

new(model \\ nil, opts \\ [])

@spec new(
  String.t() | nil,
  keyword()
) :: t()

Create a new OpenAI TTS model. When model is nil, the default gpt-4o-mini-tts is used.

Options

:client - Optional HTTP client (for testing)
:api_key - API key (defaults to the OPENAI_API_KEY environment variable)
:base_url - API base URL (defaults to the OpenAI API)

Examples

iex> model = Codex.Voice.Models.OpenAITTS.new()
iex> model.model
"gpt-4o-mini-tts"

iex> model = Codex.Voice.Models.OpenAITTS.new("tts-1")
iex> model.model
"tts-1"

run(model, text, settings)

@spec run(t(), String.t(), Codex.Voice.Config.TTSSettings.t()) :: Enumerable.t()

Convert text to speech, returning a stream of PCM audio bytes.

Parameters

model - The OpenAITTS model struct
text - The text to convert to speech
settings - TTSSettings with voice and speed options

Returns

An enumerable that yields audio bytes in PCM format. Each chunk is approximately 1024 bytes.
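
The yielded bytes are raw PCM samples, so most players need a container header before they can open the result. The sketch below shows one way to wrap the chunks in a WAV header and estimate playback length. PcmWav is a hypothetical helper, not part of the SDK. It assumes 24 kHz, 16-bit, mono audio, which is how OpenAI documents its pcm response format; whether this module requests exactly that format is an assumption to verify.

```elixir
defmodule PcmWav do
  @moduledoc "Hypothetical helper for illustration; not part of the Codex SDK."

  # Assumed audio format: 24 kHz, 16-bit, mono. Verify before relying on it.
  @sample_rate 24_000
  @bits 16
  @channels 1

  # Playback duration in whole milliseconds for a given number of PCM bytes.
  def duration_ms(byte_count) do
    bytes_per_second = @sample_rate * @channels * div(@bits, 8)
    div(byte_count * 1000, bytes_per_second)
  end

  # Concatenates the chunks and prepends a 44-byte RIFF/WAVE header.
  def wrap(chunks) do
    pcm = IO.iodata_to_binary(Enum.to_list(chunks))
    data_size = byte_size(pcm)
    riff_size = data_size + 36
    block_align = @channels * div(@bits, 8)
    byte_rate = @sample_rate * block_align

    header = <<
      "RIFF", riff_size::little-32, "WAVE",
      "fmt ", 16::little-32, 1::little-16, @channels::little-16,
      @sample_rate::little-32, byte_rate::little-32,
      block_align::little-16, @bits::little-16,
      "data", data_size::little-32
    >>

    header <> pcm
  end
end

# Stand-in for the chunks yielded by run/3: 47 chunks of 1024 bytes.
fake_chunks = List.duplicate(:binary.copy(<<0>>, 1024), 47)
wav = PcmWav.wrap(fake_chunks)
```

Unlike a streaming write, wrap/1 needs the total size up front for the header, so it collects every chunk in memory first. That is fine for short utterances; for long audio, writing a placeholder header and patching the sizes afterwards would be the usual alternative.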

Example

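alias Codex.Voice.Config.TTSSettings
alias Codex.Voice.Models.OpenAITTS

model = OpenAITTS.new()
settings = TTSSettings.new(voice: :nova)

# Collect the whole stream into a list of PCM binaries
audio_chunks =
  model
  |> OpenAITTS.run("Hello!", settings)
  |> Enum.to_list()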