WhisperCt2.Transcription (whisper_ct2 v0.5.0)


Result of a WhisperCt2.transcribe/3 call.

text is the concatenated, whitespace-trimmed transcript across all segments. segments is the structured breakdown produced by CTranslate2, one entry per <|t_..|> timestamp-delimited span, each carrying absolute start/end times in seconds, no_speech_prob, sequence-level avg_logprob, and the underlying token IDs. language is the resolved ISO language code (auto-detected when not pinned by the caller). duration_s is the length of the input audio in seconds, which is useful for VAD/diarization pipelines that pass in short clips.
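As a rough illustration of how these fields might be consumed, the sketch below filters out likely-silent segments and sums the time the remaining ones cover. It uses plain maps as stand-ins so it runs without the library. The segment key names (:start, :end, :text, :no_speech_prob, :avg_logprob, :tokens), the TranscriptionSketch module, and the 0.6 threshold are assumptions made for this example, not the documented WhisperCt2.Segment definition.

```elixir
defmodule TranscriptionSketch do
  # Hypothetical helper module, not part of whisper_ct2.

  # Keep only segments whose no_speech_prob is below the threshold.
  # The default of 0.6 is an arbitrary value chosen for illustration.
  def speech_segments(%{segments: segments}, max_no_speech \\ 0.6) do
    Enum.filter(segments, fn seg -> seg.no_speech_prob < max_no_speech end)
  end

  # Sum the seconds covered by the given segments.
  def covered_seconds(segments) do
    segments
    |> Enum.map(fn %{start: s, end: e} -> e - s end)
    |> Enum.sum()
  end
end

# Stand-in for a transcription result: a plain map with the four documented
# top-level fields. Segment keys are assumed names, see the note above.
transcription = %{
  text: "hello world",
  language: "en",
  duration_s: 4.0,
  segments: [
    %{start: 0.0, end: 1.5, text: " hello", no_speech_prob: 0.02,
      avg_logprob: -0.21, tokens: [1, 2]},
    %{start: 1.5, end: 4.0, text: " world", no_speech_prob: 0.91,
      avg_logprob: -1.4, tokens: [3]}
  ]
}

speech = TranscriptionSketch.speech_segments(transcription)
covered = TranscriptionSketch.covered_seconds(speech)

IO.puts("language: #{transcription.language}")
IO.puts("speech covers #{covered}s of #{transcription.duration_s}s")
```

With a real result, the same functions should work on the struct, provided the segment fields carry these names; check WhisperCt2.Segment for the actual keys before relying on them.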

Summary

Types

t()

@type t() :: %WhisperCt2.Transcription{
  duration_s: float(),
  language: String.t(),
  segments: [WhisperCt2.Segment.t()],
  text: String.t()
}