One <|t_start|> text <|t_end|> segment of a transcription.
Times are absolute seconds within the input audio. tokens is the raw
text-token ID list (timestamp tokens stripped); useful for diarization or
custom decoding. no_speech_prob is the no-speech probability of the
parent 30 s chunk, repeated on every segment in that chunk. avg_logprob
is the sequence-level average log probability returned by CTranslate2;
a threshold such as avg_logprob < -1.0 can be used to reject
low-confidence, likely hallucinated segments.
words is nil unless :word_timestamps was set on the transcribe call;
when present it carries one %WhisperCt2.Word{} per Whisper word with its
own time span.
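
The filtering and nil-handling described above can be sketched as follows. This is a minimal illustration, not part of the library: SegmentFilter and its functions are hypothetical names, the -1.0 default is just the example threshold from the text, and the sample data uses plain maps with the same keys as the struct so the snippet runs without WhisperCt2 installed. Because the functions match on map keys, they should accept %WhisperCt2.Segment{} values as well.

```elixir
defmodule SegmentFilter do
  # Hypothetical helper module; not provided by WhisperCt2.

  # Example threshold taken from the prose above, not a library default.
  @default_min_avg_logprob -1.0

  # Keeps only segments whose avg_logprob is at or above the threshold.
  def confident(segments, opts \\ []) do
    min_logprob = Keyword.get(opts, :min_avg_logprob, @default_min_avg_logprob)

    Enum.filter(segments, fn %{avg_logprob: avg_logprob} ->
      avg_logprob >= min_logprob
    end)
  end

  # Collects word entries across segments, treating a nil words field
  # (word timestamps not requested) as an empty list.
  def words(segments) do
    Enum.flat_map(segments, fn %{words: words} -> words || [] end)
  end
end

# Made-up sample data shaped like the struct fields listed in the type below.
segments = [
  %{start: 0.0, end: 2.4, text: " Hello there.", tokens: [],
    avg_logprob: -0.21, no_speech_prob: 0.02, words: nil},
  %{start: 2.4, end: 5.0, text: " Thanks for watching!", tokens: [],
    avg_logprob: -1.37, no_speech_prob: 0.02, words: nil}
]

kept = SegmentFilter.confident(segments)
IO.inspect(Enum.map(kept, & &1.text))
```

The threshold is passed as an option so callers can tune it per recording; a stricter or looser cutoff may suit different audio quality.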
Types
@type t() :: %WhisperCt2.Segment{
  avg_logprob: float(),
  end: float(),
  no_speech_prob: float(),
  start: float(),
  text: String.t(),
  tokens: [non_neg_integer()],
  words: [WhisperCt2.Word.t()] | nil
}