Codex.Voice.Model.STTModel behaviour (Codex SDK v0.7.2)

Behaviour for speech-to-text models.

STT models convert audio input into text transcriptions. They support both single-shot transcription and streaming transcription sessions.

Note: The behaviour callbacks are module-level functions. Implementations should define a model struct and pass it as the first argument to functions that operate on a specific model, such as transcribe/5.

Summary

Callbacks

create_session(input, settings, trace_include_sensitive_data, trace_include_sensitive_audio_data)

Creates a streaming transcription session.

model_name()

Returns the name of the STT model.

Callbacks

create_session(input, settings, trace_include_sensitive_data, trace_include_sensitive_audio_data)

@callback create_session(
  input :: Codex.Voice.Input.StreamedAudioInput.t(),
  settings :: Codex.Voice.Config.STTSettings.t(),
  trace_include_sensitive_data :: boolean(),
  trace_include_sensitive_audio_data :: boolean()
) :: {:ok, pid()} | {:error, term()}

Creates a streaming transcription session.

The session receives audio through the given StreamedAudioInput and produces a text transcription for each detected turn.

Parameters

input - The streamed audio input, of type Codex.Voice.Input.StreamedAudioInput.t()
settings - The STT settings, of type Codex.Voice.Config.STTSettings.t()
trace_include_sensitive_data - Whether to include text in traces
trace_include_sensitive_audio_data - Whether to include audio in traces

Returns

{:ok, session_pid} - The session process
{:error, reason} - If session creation fails
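As an illustration of the callback contract above, here is a minimal sketch of a module that adopts the behaviour. The module name MyApp.StubSTT, the model name "stub-stt", and the use of an Agent as the session process are assumptions made for this example and are not part of the Codex SDK. A real session would consume the audio input and emit transcriptions. The sketch assumes the Codex SDK dependency is available so that the behaviour module can be resolved.

```elixir
defmodule MyApp.StubSTT do
  # Hypothetical implementation; the module name is not part of the Codex SDK.
  @behaviour Codex.Voice.Model.STTModel

  @impl true
  def model_name, do: "stub-stt"

  @impl true
  def create_session(input, settings, trace_text, trace_audio) do
    # An Agent stands in for a real session process (an assumption for this
    # sketch). It only stores the arguments; it does not transcribe anything.
    # Agent.start_link/1 returns {:ok, pid} or {:error, reason}, which matches
    # the return type of the callback.
    Agent.start_link(fn ->
      %{
        input: input,
        settings: settings,
        trace_text: trace_text,
        trace_audio: trace_audio
      }
    end)
  end
end
```

A caller would typically match on both return shapes, for example binding the pid from {:ok, session_pid} and logging or propagating reason from {:error, reason}.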

model_name()

@callback model_name() :: String.t()

Returns the name of the STT model.
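
Because model_name/0 takes no arguments, it can be called on any implementing module without creating a session. The sketch below shows one possible use: selecting an implementation by name. MyApp.FakeSTT, MyApp.STTModels, find/2, and the name "fake-stt-v1" are hypothetical and exist only for this example. The sketch assumes the Codex SDK dependency is available so that the behaviour module can be resolved.

```elixir
defmodule MyApp.FakeSTT do
  # Hypothetical implementation, defined only so there is a module to query.
  @behaviour Codex.Voice.Model.STTModel

  @impl true
  def model_name, do: "fake-stt-v1"

  @impl true
  def create_session(_input, _settings, _trace_text, _trace_audio) do
    # This stub never starts a session.
    {:error, :not_implemented}
  end
end

defmodule MyApp.STTModels do
  # Hypothetical helper: returns the first module whose model_name/0 equals
  # the requested name, or nil when none matches.
  def find(modules, name) when is_list(modules) and is_binary(name) do
    Enum.find(modules, fn mod -> mod.model_name() == name end)
  end
end
```

For example, MyApp.STTModels.find([MyApp.FakeSTT], "fake-stt-v1") returns MyApp.FakeSTT, while an unknown name returns nil.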