Normalization utilities that convert raw data into consistent formats.
This module handles ALL normalization in one place:
- Provider IDs: string → atom (with hyphen → underscore conversion)
- Model providers: string → atom
- Modalities: string → atom (from valid set)
- Tags: map → list, nil → []
- Dates: DateTime/Date → ISO8601 string
- Removing nil values from maps
Uses String.to_existing_atom/1 at runtime so that untrusted input cannot create new atoms.
Uses String.to_atom/1 only in unsafe mode at build time (mix tasks).
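Two of the simpler conversions in the list above, nil removal and Date/DateTime to ISO 8601 strings, can be sketched with standard-library calls only. The module NormalizeSketch and its function names are hypothetical and not part of LLMDB; this is an illustration of the idea, not the library's actual implementation.

```elixir
defmodule NormalizeSketch do
  # Hypothetical helper: drop every key whose value is nil.
  def drop_nils(map) when is_map(map) do
    for {key, value} <- map, not is_nil(value), into: %{}, do: {key, value}
  end

  # Hypothetical helper: render Date/DateTime values as ISO 8601 strings.
  def date_to_string(%DateTime{} = datetime), do: DateTime.to_iso8601(datetime)
  def date_to_string(%Date{} = date), do: Date.to_iso8601(date)

  # Anything else (strings, nil) is passed through unchanged.
  def date_to_string(other), do: other
end

NormalizeSketch.drop_nils(%{id: "gpt-4", name: nil})
# => %{id: "gpt-4"}

NormalizeSketch.date_to_string(~D[2024-01-15])
# => "2024-01-15"
```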
Functions
Normalizes a date string to "YYYY-MM-DD" format.
Attempts to parse and normalize various date formats. If the date cannot be normalized, it is returned as-is.
Examples
iex> LLMDB.Normalize.normalize_date("2024-01-15")
"2024-01-15"
iex> LLMDB.Normalize.normalize_date("2024/01/15")
"2024-01-15"
iex> LLMDB.Normalize.normalize_date("invalid-date")
"invalid-date"
iex> LLMDB.Normalize.normalize_date(nil)
nil
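The behavior shown in these examples could be approximated as follows. This is a minimal sketch under the assumption that only "-" and "/" separators need handling; the module name DateSketch is hypothetical, and the real function may accept more formats.

```elixir
defmodule DateSketch do
  # Hypothetical stand-in for normalize_date/1.
  def normalize_date(date) when is_binary(date) do
    # Treat "/" as an alternative separator, then validate as an ISO 8601 date.
    candidate = String.replace(date, "/", "-")

    case Date.from_iso8601(candidate) do
      {:ok, parsed} -> Date.to_iso8601(parsed)
      # Unparseable input is returned unchanged rather than raising.
      {:error, _reason} -> date
    end
  end

  # Non-string input (including nil) is passed through untouched.
  def normalize_date(other), do: other
end

DateSketch.normalize_date("2024/01/15")
# => "2024-01-15"
```

Returning the input unchanged on failure keeps the function total, so callers never have to handle an error tuple for a purely cosmetic field.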
@spec normalize_model_identity(map(), keyword()) :: {:ok, {atom(), String.t()}} | {:error, term()}
Normalizes a model's identity to a {provider_atom, model_id} tuple.
Extracts the provider (as an atom) and id from a model map.
Examples
iex> LLMDB.Normalize.normalize_model_identity(%{provider: "google-vertex", id: "gemini-pro"})
{:ok, {:google_vertex, "gemini-pro"}}
iex> LLMDB.Normalize.normalize_model_identity(%{provider: :openai, id: "gpt-4"})
{:ok, {:openai, "gpt-4"}}
iex> LLMDB.Normalize.normalize_model_identity(%{provider: "openai"})
{:error, :missing_id}
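The extraction and error tagging can be illustrated with a small `with` chain. This sketch is hypothetical: the module IdentitySketch is not part of LLMDB, it does not convert string providers to atoms, and the :missing_provider reason is an assumption (only :missing_id appears in the examples above).

```elixir
defmodule IdentitySketch do
  # Hypothetical stand-in for normalize_model_identity/2. Provider strings are
  # not converted here; only the extraction and error tagging are shown.
  def identity(model) when is_map(model) do
    with {:provider, {:ok, provider}} <- {:provider, Map.fetch(model, :provider)},
         {:id, {:ok, id}} <- {:id, Map.fetch(model, :id)} do
      {:ok, {provider, id}}
    else
      # :missing_provider is an assumed reason, chosen for this sketch.
      {:provider, :error} -> {:error, :missing_provider}
      {:id, :error} -> {:error, :missing_id}
    end
  end
end

IdentitySketch.identity(%{provider: :openai, id: "gpt-4"})
# => {:ok, {:openai, "gpt-4"}}

IdentitySketch.identity(%{provider: :openai})
# => {:error, :missing_id}
```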
Normalizes a list of model maps.
Applies normalize_provider_id/2 to the :provider field of each model map and ensures :id is present.
Examples
iex> LLMDB.Normalize.normalize_models([%{provider: "google-vertex", id: "gemini-pro"}])
[%{provider: :google_vertex, id: "gemini-pro"}]
@spec normalize_provider_id(binary() | atom(), keyword()) :: {:ok, atom()} | {:error, :bad_provider}
Normalizes a provider ID to an atom.
Converts binary provider IDs to atoms, replacing hyphens with underscores. Uses String.to_existing_atom/1 so that arbitrary runtime input cannot create new atoms. During the activation task, unsafe conversion (String.to_atom/1) is allowed.
Examples
iex> LLMDB.Normalize.normalize_provider_id("google-vertex")
{:ok, :google_vertex}
iex> LLMDB.Normalize.normalize_provider_id(:openai)
{:ok, :openai}
iex> LLMDB.Normalize.normalize_provider_id("maliciousaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa")
{:error, :bad_provider}
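The safe and unsafe conversion paths described above might look roughly like this. It is a sketch, not the library's code: the module ProviderSketch and the :unsafe option name are assumptions, since the keyword options accepted by normalize_provider_id/2 are not documented here.

```elixir
defmodule ProviderSketch do
  # Hypothetical stand-in for normalize_provider_id/2. The `:unsafe` option
  # name is an assumption made for this sketch.
  def normalize(id, opts \\ [])

  # Atoms are already normalized.
  def normalize(id, _opts) when is_atom(id), do: {:ok, id}

  def normalize(id, opts) when is_binary(id) do
    name = String.replace(id, "-", "_")

    try do
      if Keyword.get(opts, :unsafe, false) do
        # May create a new atom; acceptable only for trusted build-time input.
        {:ok, String.to_atom(name)}
      else
        # Raises ArgumentError unless the atom already exists.
        {:ok, String.to_existing_atom(name)}
      end
    rescue
      # Unknown atoms raise ArgumentError; names over the 255-character atom
      # limit raise SystemLimitError on the unsafe path.
      _ in [ArgumentError, SystemLimitError] -> {:error, :bad_provider}
    end
  end

  def normalize(_id, _opts), do: {:error, :bad_provider}
end

ProviderSketch.normalize("google-vertex", unsafe: true)
# => {:ok, :google_vertex}
```

Atoms are never garbage collected, so converting untrusted strings with String.to_atom/1 can eventually exhaust the atom table. Restricting the runtime path to existing atoms avoids that.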
Normalizes a list of provider maps.
Applies normalize_provider_id/2 to the :id field of each provider map.
Examples
iex> LLMDB.Normalize.normalize_providers([%{id: "google-vertex"}, %{id: :openai}])
[%{id: :google_vertex}, %{id: :openai}]