OpenAI speech-to-text model implementation.
This module implements the Codex.Voice.Model.STTModel behaviour using
OpenAI's audio transcription API. It supports both single-shot transcription
and streamed-input sessions that transcribe buffered audio once the input
closes.
Default Model
The default model is gpt-4o-transcribe, which provides high-quality
transcriptions with support for multiple languages.
Example
model = OpenAISTT.new()
audio = AudioInput.new(wav_data)
settings = STTSettings.new(language: "en")
{:ok, text} = OpenAISTT.transcribe(model, audio, settings, true, false)
Functions
Create a new OpenAI STT model.
Options
- `:client` - Optional HTTP client (for testing)
- `:api_key` - API key (defaults to the `OPENAI_API_KEY` env var)
- `:base_url` - API base URL (defaults to OpenAI)
Examples
iex> model = Codex.Voice.Models.OpenAISTT.new()
iex> model.model
"gpt-4o-transcribe"
iex> model = Codex.Voice.Models.OpenAISTT.new("whisper-1")
iex> model.model
"whisper-1"
@spec transcribe(
        t(),
        Codex.Voice.Input.AudioInput.t(),
        Codex.Voice.Config.STTSettings.t(),
        boolean(),
        boolean()
      ) :: {:ok, String.t()} | {:error, term()}
Transcribe audio input to text.
Makes a POST request to OpenAI's audio transcriptions endpoint with the audio data in WAV format.
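A request of this shape is a standard multipart/form-data upload. The sketch below shows how such a body can be assembled; the `MultipartSketch` module, the boundary string, and the exact field layout are illustrative assumptions, not this module's actual code (the field names `file`, `model`, and `language` follow OpenAI's transcription API).

```elixir
defmodule MultipartSketch do
  # Illustrative helper, not part of Codex.Voice.Models.OpenAISTT.
  @boundary "codex-voice-boundary"

  @doc """
  Builds a {content_type, body} pair for a transcription request.
  """
  def body(wav_data, model, language) do
    field_parts =
      Enum.map([{"model", model}, {"language", language}], fn {name, value} ->
        "--#{@boundary}\r\n" <>
          "Content-Disposition: form-data; name=\"#{name}\"\r\n\r\n" <>
          "#{value}\r\n"
      end)

    file_part =
      "--#{@boundary}\r\n" <>
        "Content-Disposition: form-data; name=\"file\"; filename=\"audio.wav\"\r\n" <>
        "Content-Type: audio/wav\r\n\r\n" <>
        wav_data <> "\r\n"

    {"multipart/form-data; boundary=#{@boundary}",
     Enum.join(field_parts) <> file_part <> "--#{@boundary}--\r\n"}
  end
end
```

The returned content type carries the boundary, so the same value must be sent in the `Content-Type` header of the POST.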
Parameters
- `model` - The OpenAISTT model struct
- `input` - AudioInput with the audio data
- `settings` - STTSettings with transcription options
- `_trace_include_sensitive_data` - Whether to include text in traces (unused)
- `_trace_include_sensitive_audio_data` - Whether to include audio in traces (unused)
Returns
- `{:ok, text}` - The transcribed text
- `{:error, reason}` - If the request fails
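Callers typically pattern-match on this tuple. A minimal, self-contained sketch of that branching (the anonymous `handle` function stands in for code receiving the result of `transcribe/5`):

```elixir
# Branch on the {:ok, text} | {:error, reason} result shape.
# `handle` is an illustrative helper, not part of the module's API.
handle = fn
  {:ok, text} -> "transcribed: " <> text
  {:error, reason} -> "failed: " <> inspect(reason)
end

IO.puts(handle.({:ok, "hello world"}))
IO.puts(handle.({:error, :timeout}))
```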