BoldTranscriptsEx.Convert.Language (bold_transcripts_ex v0.8.1)

Language code normalization for transcript vendors.

Converts vendor-specific language codes to a unified internal format:

  • Underscore-separated lowercase: en_us, en_uk, de_de
  • Base English defaults to en_us
  • Base languages get default region: de → de_de, es → es_es

Vendor Formats

  • Deepgram: BCP-47 format (en-US, en-GB, de-DE)
  • AssemblyAI: Underscore format (en_us, en_uk, de)
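  • Speechmatics: Base language only (en, de, es)
  • Mistral Voxtral: No language in the response; user-provided codes, typically base language (en, de)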
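
The rules and vendor formats above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: the module name `LanguageSketch`, the `@region_aliases` map, and the "repeat the base code as the region" rule are assumptions inferred only from the examples on this page (de → de_de, en-GB → en_uk). The real module may use an explicit lookup table instead.

```elixir
defmodule LanguageSketch do
  # Hypothetical illustration of the normalization rules described above.
  # Not part of bold_transcripts_ex.

  @default "en_us"

  # Assumption: the only region rename is GB -> uk, as suggested by the
  # Deepgram example (en-GB becomes en_uk).
  @region_aliases %{"gb" => "uk"}

  # Accepts BCP-47 ("en-US"), underscore ("en_us"), or base ("en") codes.
  def normalize(code) when is_binary(code) do
    code
    |> String.trim()
    |> String.downcase()
    |> String.replace("-", "_")
    |> String.split("_", parts: 2)
    |> case do
      # Empty input falls back to the default.
      [""] -> @default
      # Base English defaults to en_us.
      ["en"] -> @default
      # Assumption: other base languages reuse the base code as the region.
      [base] -> base <> "_" <> base
      # Language plus region: apply the alias map to the region part.
      [base, region] -> base <> "_" <> Map.get(@region_aliases, region, region)
    end
  end

  # nil and any other non-binary input fall back to the default.
  def normalize(_other), do: @default
end
```

With this sketch, `LanguageSketch.normalize("en-GB")` returns `"en_uk"`, `LanguageSketch.normalize("de")` returns `"de_de"`, and `LanguageSketch.normalize(nil)` returns `"en_us"`. Repeating the base code as the region would give wrong results for languages whose default region code differs from the language code, which is one reason a lookup table may be the safer design.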

Summary

Functions

normalize(code)
Generic language code normalizer with fallback to en_us.

normalize_assemblyai(code)
Normalizes AssemblyAI underscore-format language codes to internal format.

normalize_deepgram(code)
Normalizes Deepgram BCP-47 language codes to internal format.

normalize_mistral(code)
Normalizes Mistral Voxtral language codes to internal format.

normalize_speechmatics(code)
Normalizes Speechmatics base language codes to internal format.

Functions

normalize(code)

Generic language code normalizer with fallback to en_us.

Examples

iex> BoldTranscriptsEx.Convert.Language.normalize("en")
"en_us"

iex> BoldTranscriptsEx.Convert.Language.normalize(nil)
"en_us"

iex> BoldTranscriptsEx.Convert.Language.normalize("")
"en_us"

normalize_assemblyai(code)

Normalizes AssemblyAI underscore-format language codes to internal format.

Examples

iex> BoldTranscriptsEx.Convert.Language.normalize_assemblyai("en_us")
"en_us"

iex> BoldTranscriptsEx.Convert.Language.normalize_assemblyai("en_uk")
"en_uk"

iex> BoldTranscriptsEx.Convert.Language.normalize_assemblyai("de")
"de_de"

iex> BoldTranscriptsEx.Convert.Language.normalize_assemblyai("EN_US")
"en_us"

iex> BoldTranscriptsEx.Convert.Language.normalize_assemblyai(nil)
"en_us"

normalize_deepgram(code)

Normalizes Deepgram BCP-47 language codes to internal format.

Examples

iex> BoldTranscriptsEx.Convert.Language.normalize_deepgram("en-US")
"en_us"

iex> BoldTranscriptsEx.Convert.Language.normalize_deepgram("en-GB")
"en_uk"

iex> BoldTranscriptsEx.Convert.Language.normalize_deepgram("de-DE")
"de_de"

iex> BoldTranscriptsEx.Convert.Language.normalize_deepgram("en")
"en_us"

iex> BoldTranscriptsEx.Convert.Language.normalize_deepgram(nil)
"en_us"

normalize_mistral(code)

Normalizes Mistral Voxtral language codes to internal format.

Because Voxtral does not include the language in its response, this function normalizes the language code supplied by the caller (typically a base language such as "en" or "de").

Examples

iex> BoldTranscriptsEx.Convert.Language.normalize_mistral("en")
"en_us"

iex> BoldTranscriptsEx.Convert.Language.normalize_mistral("de")
"de_de"

iex> BoldTranscriptsEx.Convert.Language.normalize_mistral("es")
"es_es"

iex> BoldTranscriptsEx.Convert.Language.normalize_mistral(nil)
"en_us"
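
Because the language comes from the caller rather than from the Voxtral response, the calling code has to carry the requested language through to the converted transcript. The sketch below shows one way to do that. The module `VoxtralLanguageSketch`, the function `attach_language/3`, the `:language` option, and the transcript map shape are all hypothetical and not part of bold_transcripts_ex. The normalizer is passed in as a function so the example runs on its own.

```elixir
defmodule VoxtralLanguageSketch do
  # Hypothetical helper, not part of bold_transcripts_ex.
  #
  # `transcript` is an already-converted transcript map (assumed shape).
  # `opts` is the keyword list used for the request; `:language` is an
  # assumed option name holding the user-provided code, or absent.
  # `normalizer` is any 1-arity function, for example
  # &BoldTranscriptsEx.Convert.Language.normalize_mistral/1.
  def attach_language(transcript, opts, normalizer)
      when is_map(transcript) and is_list(opts) and is_function(normalizer, 1) do
    # Keyword.get/2 returns nil when the caller gave no language,
    # which the normalizer is expected to map to its fallback.
    requested = Keyword.get(opts, :language)
    Map.put(transcript, :language, normalizer.(requested))
  end
end
```

In an application that depends on the library, the third argument would be `&BoldTranscriptsEx.Convert.Language.normalize_mistral/1`, so a request made without a language ends up tagged with the documented `"en_us"` fallback.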

normalize_speechmatics(code)

Normalizes Speechmatics base language codes to internal format.

Examples

iex> BoldTranscriptsEx.Convert.Language.normalize_speechmatics("en")
"en_us"

iex> BoldTranscriptsEx.Convert.Language.normalize_speechmatics("de")
"de_de"

iex> BoldTranscriptsEx.Convert.Language.normalize_speechmatics("es")
"es_es"

iex> BoldTranscriptsEx.Convert.Language.normalize_speechmatics("fr")
"fr_fr"

iex> BoldTranscriptsEx.Convert.Language.normalize_speechmatics(nil)
"en_us"