Codex.Voice.Models.OpenAISTT (Codex SDK v0.14.0)


OpenAI speech-to-text model implementation.

This module implements the Codex.Voice.Model.STTModel behaviour using OpenAI's audio transcription API. It supports both single-shot transcription and streamed-input sessions that transcribe buffered audio once the input closes.

Default Model

The default model is gpt-4o-transcribe, which provides high-quality transcriptions with support for multiple languages.

Example

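alias Codex.Voice.Config.STTSettings
alias Codex.Voice.Input.AudioInput
alias Codex.Voice.Models.OpenAISTT

model = OpenAISTT.new()
audio = AudioInput.new(wav_data)
settings = STTSettings.new(language: "en")

# The last two arguments are the trace flags for text and audio data.
{:ok, text} = OpenAISTT.transcribe(model, audio, settings, true, false)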

Summary

Types

t()

@type t() :: %Codex.Voice.Models.OpenAISTT{
  api_key: String.t() | nil,
  base_url: String.t(),
  client: term(),
  model: String.t()
}

Functions

new(model \\ nil, opts \\ [])

@spec new(
  String.t() | nil,
  keyword()
) :: t()

Create a new OpenAI STT model.

Options

  • :client - Optional HTTP client (for testing)
  • :api_key - API key (defaults to OPENAI_API_KEY env var)
  • :base_url - API base URL (defaults to OpenAI)

Examples

iex> model = Codex.Voice.Models.OpenAISTT.new()
iex> model.model
"gpt-4o-transcribe"

iex> model = Codex.Voice.Models.OpenAISTT.new("whisper-1")
iex> model.model
"whisper-1"
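
The options listed above are passed as a keyword list in the second argument. The following is a minimal sketch, assuming the SDK is available as a dependency; the proxy URL is a hypothetical placeholder, and whether the base URL should include a version path segment is an assumption not confirmed by this page.

```elixir
alias Codex.Voice.Models.OpenAISTT

# Read the key explicitly instead of relying on the default lookup.
api_key = System.get_env("OPENAI_API_KEY")

# Hypothetical endpoint, shown only to illustrate the :base_url option.
model =
  OpenAISTT.new("whisper-1",
    api_key: api_key,
    base_url: "https://stt-proxy.example.com/v1"
  )

# The model name is stored on the struct, as in the examples above.
IO.puts(model.model)
```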

transcribe(model, input, settings, trace_include_sensitive_data, trace_include_sensitive_audio_data)

@spec transcribe(
  t(),
  Codex.Voice.Input.AudioInput.t(),
  Codex.Voice.Config.STTSettings.t(),
  boolean(),
  boolean()
) :: {:ok, String.t()} | {:error, term()}

Transcribe audio input to text.

Makes a POST request to OpenAI's audio transcriptions endpoint with the audio data in WAV format.

Parameters

  • model - The OpenAISTT model struct
  • input - AudioInput with the audio data
  • settings - STTSettings with transcription options
  • trace_include_sensitive_data - Whether to include text in traces (currently unused)
  • trace_include_sensitive_audio_data - Whether to include audio in traces (currently unused)

Returns

  • {:ok, text} - The transcribed text
  • {:error, reason} - If the request fails
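
Because transcribe/5 returns a tagged tuple, callers will usually pattern match on both shapes rather than assert on {:ok, text}. The sketch below shows one way to do that. TranscriptionResult is a hypothetical helper that is not part of the Codex SDK, and the sample tuples stand in for a real call to transcribe/5. The :timeout reason is only an illustration, since the page does not document which error reasons can occur.

```elixir
defmodule TranscriptionResult do
  @moduledoc false
  # Hypothetical helper, not part of the Codex SDK.
  # Converts a transcribe/5 style result into a printable message.

  @spec describe({:ok, String.t()} | {:error, term()}) :: String.t()
  def describe({:ok, text}), do: "transcript: " <> String.trim(text)
  def describe({:error, reason}), do: "transcription failed: " <> inspect(reason)
end

# In real code the tuple would come from OpenAISTT.transcribe/5.
IO.puts(TranscriptionResult.describe({:ok, " hello world "}))
IO.puts(TranscriptionResult.describe({:error, :timeout}))
```

Using inspect/1 on the error reason keeps the helper safe for any term, since the spec only promises term() for the error case.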