BoldTranscriptsEx (bold_transcripts_ex v0.7.0)
BoldTranscriptsEx is a library for working with transcripts in the Bold Video platform.
It provides functionality for:
- Converting transcripts from various vendors (AssemblyAI, Deepgram) to Bold format
- Generating WebVTT subtitles from Bold transcripts
- Working with chapter markers in WebVTT format
Summary
Functions
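chapters_to_webvtt(chapters)
Converts a list of chapters into WebVTT format.
convert(service, transcript_data, opts)
Converts a transcript from a specific service to Bold format.
generate_subtitles(transcript)
Generates WebVTT subtitles from a Bold transcript.
parse_chapters(webvtt)
Parses WebVTT content to extract chapters.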
Functions
chapters_to_webvtt(chapters)
Converts a list of chapters into WebVTT format.
Parameters
chapters: A list of maps, each containing :start, :end, and :title keys, or :assemblyai and a transcript map for AssemblyAI format.
Returns
- When given a list of chapters: A string in WebVTT format
- When given :assemblyai format: {:ok, string} or {:error, reason}
Examples
iex> chapters = [%{start: "0:03", end: "0:16", title: "Chapter 1"}]
iex> BoldTranscriptsEx.chapters_to_webvtt(chapters)
"WEBVTT\n\n1\n00:00:03.000 --> 00:00:16.000\nChapter 1"
iex> transcript = %{"chapters" => [%{"start" => 3000, "end" => 16000, "gist" => "Chapter 1"}]}
iex> BoldTranscriptsEx.chapters_to_webvtt(:assemblyai, transcript)
{:ok, "WEBVTT\n\n1\n00:00:03.000 --> 00:00:16.000\nChapter 1"}
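The two examples above start from different time representations ("0:03" clock strings and AssemblyAI millisecond integers) and arrive at the same WebVTT timestamps. The following is a minimal sketch of that normalization, not the library's implementation. The module TimestampSketch and both of its functions are hypothetical names introduced for illustration.

```elixir
defmodule TimestampSketch do
  # Hypothetical helper, not part of BoldTranscriptsEx.

  # Parse a clock string such as "0:03" or "1:02:03" into whole seconds.
  def clock_to_seconds(clock) do
    clock
    |> String.split(":")
    |> Enum.map(&String.to_integer/1)
    |> Enum.reduce(0, fn part, acc -> acc * 60 + part end)
  end

  # Format a millisecond count as a WebVTT timestamp (HH:MM:SS.mmm).
  def ms_to_webvtt(ms) when is_integer(ms) and ms >= 0 do
    total_seconds = div(ms, 1000)
    hours = div(total_seconds, 3600)
    minutes = div(rem(total_seconds, 3600), 60)
    seconds = rem(total_seconds, 60)
    millis = rem(ms, 1000)

    "#{pad(hours, 2)}:#{pad(minutes, 2)}:#{pad(seconds, 2)}.#{pad(millis, 3)}"
  end

  # Left-pad an integer with zeros to a fixed width.
  defp pad(n, width), do: n |> Integer.to_string() |> String.pad_leading(width, "0")
end

# Clock string, as in the first example
TimestampSketch.ms_to_webvtt(TimestampSketch.clock_to_seconds("0:03") * 1000)
# "00:00:03.000"

# Milliseconds, as in the AssemblyAI example
TimestampSketch.ms_to_webvtt(16_000)
# "00:00:16.000"
```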
convert(service, transcript_data, opts)
Converts a transcript from a specific service to Bold format.
Parameters
- service: The service that generated the transcript (e.g., :assemblyai, :deepgram)
- transcript_data: The JSON string or decoded map of the transcript data
- opts: Options for the conversion:
  - :language: (required for Deepgram) The language code of the transcript (e.g., "en", "lt")
  - Other service-specific options
Returns
- {:ok, data}: A tuple with the :ok atom and the data in Bold Transcript format
- {:error, reason}: If the conversion fails or required options are missing
Examples
iex> transcript = ~s({"audio_duration": 10.5, "language_code": "en", "audio_url": "https://example.com/audio.mp3", "utterances": []})
iex> BoldTranscriptsEx.convert(:assemblyai, transcript)
{:ok, %{
  "metadata" => %{
    "duration" => 10.5,
    "language" => "en_us",
    "source_url" => "https://example.com/audio.mp3",
    "speakers" => %{},
    "source_model" => "",
    "source_vendor" => "assemblyai",
    "source_version" => "",
    "transcription_date" => nil,
    "version" => "2.0"
  },
  "utterances" => []
}}
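Because convert returns either {:ok, data} or {:error, reason}, callers will usually pattern match on the result. Here is a minimal sketch of one way to do that. ConvertResultSketch and summarize/1 are hypothetical names, and the shape of reason is an assumption, since the documentation above does not specify it.

```elixir
defmodule ConvertResultSketch do
  # Hypothetical helper, not part of BoldTranscriptsEx.

  # Success: the Bold transcript carries "metadata" and "utterances" keys,
  # as shown in the example above.
  def summarize({:ok, %{"metadata" => metadata, "utterances" => utterances}}) do
    language = Map.get(metadata, "language")
    "converted #{length(utterances)} utterance(s), language: #{inspect(language)}"
  end

  # Failure: the reason is inspected because its type is not documented here.
  def summarize({:error, reason}) do
    "conversion failed: #{inspect(reason)}"
  end
end

# With the library available, a Deepgram call site might look like this.
# deepgram_json is a placeholder, and passing opts as a keyword list is an assumption:
#
#   BoldTranscriptsEx.convert(:deepgram, deepgram_json, language: "en")
#   |> ConvertResultSketch.summarize()
```

Matching on both tuples keeps a missing :language option for Deepgram from surfacing later as a MatchError.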
generate_subtitles(transcript)
Generates WebVTT subtitles from a Bold transcript.
Parameters
transcript: A Bold transcript in v2 format.
Returns
A string containing the WebVTT subtitles with speaker labels. Speaker labels are only shown for named speakers using the WebVTT <v> tag. Single-letter speaker IDs (A, B, C) are not shown.
Examples
iex> transcript = %{
...>   "metadata" => %{"speakers" => %{"A" => "Jack Smith"}},
...>   "utterances" => [%{"words" => [%{"start" => 0.8, "end" => 1.2, "word" => "Hello", "speaker" => "A"}], "speaker" => "A"}]
...> }
iex> BoldTranscriptsEx.generate_subtitles(transcript)
"WEBVTT\n\n1\n00:00:00.800 --> 00:00:01.200\n<v Jack Smith>Hello</v>"
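The speaker-label rule described under Returns can be sketched as a lookup in the metadata "speakers" map. This illustrates one reading of that rule and is not the library's code. SpeakerLabelSketch and cue_text/3 are hypothetical names, and treating a one-character name the same as a missing name is an assumption.

```elixir
defmodule SpeakerLabelSketch do
  # Hypothetical helper, not part of BoldTranscriptsEx.
  #
  # speakers is the "speakers" map from the transcript metadata,
  # e.g. %{"A" => "Jack Smith"}.
  def cue_text(text, speaker_id, speakers) do
    case Map.get(speakers, speaker_id) do
      # Named speaker: wrap the text in a WebVTT voice tag.
      name when is_binary(name) and byte_size(name) > 1 ->
        "<v #{name}>#{text}</v>"

      # No name, or only a single-letter ID: emit the text unlabeled.
      _ ->
        text
    end
  end
end

SpeakerLabelSketch.cue_text("Hello", "A", %{"A" => "Jack Smith"})
# "<v Jack Smith>Hello</v>"

SpeakerLabelSketch.cue_text("Hello", "A", %{})
# "Hello"
```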
parse_chapters(webvtt)
Parses WebVTT content to extract chapters.
Parameters
webvtt: The WebVTT content as a string.
Returns
A list of maps, each containing :start, :end, and :title keys for a chapter.
Examples
iex> webvtt = "WEBVTT\n\n1\n00:00:03.000 --> 00:00:16.000\nComing soon: Back to Stanford"
iex> BoldTranscriptsEx.parse_chapters(webvtt)
[%{start: "0:03", end: "0:16", title: "Coming soon: Back to Stanford"}]
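To show how a cue maps back to a chapter, here is a minimal regex-based sketch that reproduces the result above for single-line cue titles. It is not the library's parser. ChapterParseSketch is a hypothetical name, and the "h:mm:ss" form used for chapters of an hour or longer is an assumption, because the example only covers times under one minute.

```elixir
defmodule ChapterParseSketch do
  # Hypothetical helper, not part of BoldTranscriptsEx.

  # Extract every "start --> end" timing line followed by a one-line title.
  def parse(webvtt) do
    cue_regex()
    |> Regex.scan(webvtt, capture: :all_but_first)
    |> Enum.map(fn [h1, m1, s1, h2, m2, s2, title] ->
      %{start: short(h1, m1, s1), end: short(h2, m2, s2), title: title}
    end)
  end

  # HH:MM:SS.mmm --> HH:MM:SS.mmm, then the title on the next line.
  defp cue_regex do
    ~r/(\d{2}):(\d{2}):(\d{2})\.\d{3} --> (\d{2}):(\d{2}):(\d{2})\.\d{3}\n(.+)/
  end

  # Under an hour: "m:ss". An hour or more: "h:mm:ss" (assumed format).
  defp short("00", m, s), do: "#{String.to_integer(m)}:#{s}"
  defp short(h, m, s), do: "#{String.to_integer(h)}:#{m}:#{s}"
end

webvtt = "WEBVTT\n\n1\n00:00:03.000 --> 00:00:16.000\nComing soon: Back to Stanford"
ChapterParseSketch.parse(webvtt)
# [%{start: "0:03", end: "0:16", title: "Coming soon: Back to Stanford"}]
```

The sketch drops the millisecond part of each timestamp, which matches the second-level precision of the chapter maps shown in the example.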