LLMDB.Normalize (LLM DB v2026.3.0)


Normalization utilities that convert raw data into consistent formats.

This module handles ALL normalization in one place:

  • Provider IDs: string → atom (with hyphen → underscore conversion)
  • Model providers: string → atom
  • Modalities: string → atom (from valid set)
  • Tags: map → list, nil → []
  • Dates: DateTime/Date → ISO8601 string
  • Removing nil values from maps

Uses String.to_existing_atom/1 at runtime so that untrusted input cannot create new atoms. String.to_atom/1 is used ONLY in unsafe mode at build time (mix tasks).
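As a rough illustration of two of the rules listed above (dates to ISO8601 strings, and removing nil values), here is a minimal sketch. The module and function names are hypothetical and are not part of LLMDB.Normalize; the library's own implementation may differ.

```elixir
defmodule NormalizeSketch do
  # Hypothetical helper names; not part of LLMDB.Normalize.

  # DateTime/Date -> ISO8601 string; any other value passes through unchanged.
  def date_to_string(%DateTime{} = dt), do: DateTime.to_iso8601(dt)
  def date_to_string(%Date{} = d), do: Date.to_iso8601(d)
  def date_to_string(other), do: other

  # Drop every key whose value is nil.
  def drop_nils(map) when is_map(map) do
    map
    |> Enum.reject(fn {_key, value} -> is_nil(value) end)
    |> Map.new()
  end
end

NormalizeSketch.date_to_string(~D[2024-01-15])
# => "2024-01-15"

NormalizeSketch.drop_nils(%{id: "gpt-4", name: nil})
# => %{id: "gpt-4"}
```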

Summary

Functions

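normalize_date(date_string)

Normalizes a date string to "YYYY-MM-DD" format.

normalize_model_identity(model, opts \\ [])

Normalizes a model's identity to a {provider_atom, model_id} tuple.

normalize_models(models)

Normalizes a list of model maps.

normalize_provider_id(provider_id, opts \\ [])

Normalizes a provider ID to an atom.

normalize_providers(providers)

Normalizes a list of provider maps.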

Functions

normalize_date(date_string)

@spec normalize_date(String.t() | nil) :: String.t() | nil

Normalizes a date string to "YYYY-MM-DD" format.

Attempts to parse and normalize various date formats. If the date cannot be normalized, it is returned as-is.

Examples

iex> LLMDB.Normalize.normalize_date("2024-01-15")
"2024-01-15"

iex> LLMDB.Normalize.normalize_date("2024/01/15")
"2024-01-15"

iex> LLMDB.Normalize.normalize_date("invalid-date")
"invalid-date"

iex> LLMDB.Normalize.normalize_date(nil)
nil
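The behavior in these examples could be approximated as follows. This is only a sketch under the assumption that slashes are the sole alternative separator handled; DateSketch is a hypothetical module, and the real function may accept more formats.

```elixir
defmodule DateSketch do
  # Hypothetical stand-in for normalize_date/1; the real parsing rules may differ.
  def normalize_date(nil), do: nil

  def normalize_date(date_string) when is_binary(date_string) do
    # Assumption: "/" is the only alternative separator worth rewriting.
    candidate = String.replace(date_string, "/", "-")

    case Date.from_iso8601(candidate) do
      {:ok, date} -> Date.to_iso8601(date)
      # Unparseable input is returned untouched, as the docs describe.
      {:error, _reason} -> date_string
    end
  end
end
```

Returning the original string on failure, rather than an error tuple, keeps the function total: callers never have to branch on the result.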

normalize_model_identity(model, opts \\ [])

@spec normalize_model_identity(
  map(),
  keyword()
) :: {:ok, {atom(), String.t()}} | {:error, term()}

Normalizes a model's identity to a {provider_atom, model_id} tuple.

Extracts the provider (as an atom) and id from a model map.

Examples

iex> LLMDB.Normalize.normalize_model_identity(%{provider: "google-vertex", id: "gemini-pro"})
{:ok, {:google_vertex, "gemini-pro"}}

iex> LLMDB.Normalize.normalize_model_identity(%{provider: :openai, id: "gpt-4"})
{:ok, {:openai, "gpt-4"}}

iex> LLMDB.Normalize.normalize_model_identity(%{provider: "openai"})
{:error, :missing_id}
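One way to produce the tagged results shown above is a `with` chain that fetches each field and stops at the first failure. The sketch below is an assumption about the shape of the logic, not the library's code: IdentitySketch is hypothetical, and the `:missing_provider` reason is an invented name (only `:missing_id` appears in the documented examples).

```elixir
defmodule IdentitySketch do
  # Hypothetical stand-in for normalize_model_identity/2. Only :missing_id
  # comes from the documented examples; :missing_provider is an assumed name.
  def identity(model) when is_map(model) do
    with {:provider, {:ok, raw}} <- {:provider, Map.fetch(model, :provider)},
         {:ok, provider} <- to_provider(raw),
         {:id, {:ok, id}} <- {:id, Map.fetch(model, :id)} do
      {:ok, {provider, id}}
    else
      {:provider, :error} -> {:error, :missing_provider}
      {:id, :error} -> {:error, :missing_id}
      {:error, reason} -> {:error, reason}
    end
  end

  # Atoms pass through; strings are converted without creating new atoms.
  defp to_provider(atom) when is_atom(atom), do: {:ok, atom}

  defp to_provider(name) when is_binary(name) do
    {:ok, name |> String.replace("-", "_") |> String.to_existing_atom()}
  rescue
    ArgumentError -> {:error, :bad_provider}
  end
end
```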

normalize_models(models)

@spec normalize_models([map()]) :: [map()]

Normalizes a list of model maps.

Applies normalize_provider_id to the :provider field and ensures :id is present.

Examples

iex> LLMDB.Normalize.normalize_models([%{provider: "google-vertex", id: "gemini-pro"}])
[%{provider: :google_vertex, id: "gemini-pro"}]

normalize_provider_id(provider_id, opts \\ [])

@spec normalize_provider_id(
  binary() | atom(),
  keyword()
) :: {:ok, atom()} | {:error, :bad_provider}

Normalizes a provider ID to an atom.

Converts binary provider IDs to atoms, replacing hyphens with underscores. Uses String.to_existing_atom/1 so that arbitrary input cannot create new atoms at runtime. During the activation task, unsafe conversion is allowed.

Examples

iex> LLMDB.Normalize.normalize_provider_id("google-vertex")
{:ok, :google_vertex}

iex> LLMDB.Normalize.normalize_provider_id(:openai)
{:ok, :openai}

iex> LLMDB.Normalize.normalize_provider_id("maliciousaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa")
{:error, :bad_provider}
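The safe and unsafe conversion paths might look roughly like this. ProviderIdSketch is a hypothetical module, and the `:unsafe` option name is an assumption; the documentation says an unsafe mode exists but does not name the option.

```elixir
defmodule ProviderIdSketch do
  # Hypothetical stand-in for normalize_provider_id/2. The :unsafe option
  # name is an assumption; the docs only say an unsafe mode exists.
  def normalize(id, opts \\ [])

  def normalize(id, _opts) when is_atom(id), do: {:ok, id}

  def normalize(id, opts) when is_binary(id) do
    name = String.replace(id, "-", "_")

    if Keyword.get(opts, :unsafe, false) do
      # Build-time only: may create a brand-new atom.
      {:ok, String.to_atom(name)}
    else
      # Runtime: succeeds only if the atom already exists.
      {:ok, String.to_existing_atom(name)}
    end
  rescue
    # String.to_existing_atom/1 raises ArgumentError for unknown atoms.
    ArgumentError -> {:error, :bad_provider}
  end
end

ProviderIdSketch.normalize("google-vertex", unsafe: true)
# => {:ok, :google_vertex}
```

Atoms are never garbage collected, so converting untrusted strings with String.to_atom/1 can eventually exhaust the atom table. Restricting runtime conversion to existing atoms bounds the set of possible provider IDs to those already loaded.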

normalize_providers(providers)

@spec normalize_providers([map()]) :: [map()]

Normalizes a list of provider maps.

Applies normalize_provider_id to the :id field of each provider map.

Examples

iex> LLMDB.Normalize.normalize_providers([%{id: "google-vertex"}, %{id: :openai}])
[%{id: :google_vertex}, %{id: :openai}]