Model Spec Formats

Model specifications in LLMDB can be expressed in multiple formats to suit different use cases. This guide covers the supported formats and when to use each one.

Overview

A model spec uniquely identifies a model by combining a provider identifier with a model ID. LLMDB supports three formats:

  1. Colon format ("provider:model") - Traditional, human-readable
  2. @ format ("model@provider") - Filesystem-safe, email-like
  3. Tuple format ({:provider, "model_id"}) - Internal representation

All three formats can be used interchangeably throughout the API.
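
For example, all three of the following denote the same model (a quick sketch; parsing and formatting are covered in detail below):

# Both string formats parse to the tuple representation
{:ok, {:openai, "gpt-4o-mini"}} = LLMDB.parse("openai:gpt-4o-mini")
{:ok, {:openai, "gpt-4o-mini"}} = LLMDB.parse("gpt-4o-mini@openai")

# The tuple format is the parsed representation itself
spec = {:openai, "gpt-4o-mini"}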

Colon Format (Default)

"openai:gpt-4o-mini"
"anthropic:claude-3-5-sonnet-20241022"
"google-vertex:gemini-1.5-pro"

Syntax

  • Provider comes first, followed by a colon, then the model ID
  • Provider names with hyphens are normalized to underscores when parsed (e.g., google-vertex is parsed as :google_vertex)
  • Model IDs can contain colons (e.g., "bedrock:anthropic.claude-opus-4:0"); see the sketch below
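
Both rules in action (a minimal sketch, assuming the first colon is the separator, as the Bedrock example implies):

# Hyphenated provider normalized to an underscored atom
{:ok, {:google_vertex, "gemini-1.5-pro"}} = LLMDB.parse("google-vertex:gemini-1.5-pro")

# Only the first colon splits provider from model ID
{:ok, {:bedrock, "anthropic.claude-opus-4:0"}} = LLMDB.parse("bedrock:anthropic.claude-opus-4:0")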

When to Use

  • Default choice for most cases
  • Configuration files and user input
  • Logs and error messages
  • Documentation and examples

Parsing

{:ok, {:openai, "gpt-4o-mini"}} = LLMDB.parse("openai:gpt-4o-mini")

Formatting

"openai:gpt-4o-mini" = LLMDB.format({:openai, "gpt-4o-mini"})
# or explicitly
"openai:gpt-4o-mini" = LLMDB.format({:openai, "gpt-4o-mini"}, :provider_colon_model)

@ Format (Filename-Safe)

"gpt-4o-mini@openai"
"claude-3-5-sonnet-20241022@anthropic"
"gemini-1.5-pro@google_vertex"

Syntax

  • Model ID comes first, followed by an @ symbol, then the provider
  • Email-like semantics: model@provider
  • No colons anywhere in the spec

When to Use

  • Filenames: Template files, cache files, logs
    template_file = "system-prompt-#{LLMDB.format(spec, :filename_safe)}.liquid"
    # => "system-prompt-gpt-4o-mini@openai.liquid"
  • CI/CD artifacts: Build artifacts, test results, benchmark data
    artifact_path = "benchmarks/#{LLMDB.format(spec, :filename_safe)}/#{date}.json"
    # => "benchmarks/gpt-4o-mini@openai/2025-11-07.json"
  • URLs and paths: S3 keys, API endpoints, file paths
  • Cross-platform compatibility: Windows, macOS, and Linux all accept @ in filenames

Parsing

{:ok, {:openai, "gpt-4o-mini"}} = LLMDB.parse("gpt-4o-mini@openai")

Formatting

"gpt-4o-mini@openai" = LLMDB.format({:openai, "gpt-4o-mini"}, :filename_safe)
# or
"gpt-4o-mini@openai" = LLMDB.format({:openai, "gpt-4o-mini"}, :model_at_provider)

Tuple Format (Internal)

{:openai, "gpt-4o-mini"}
{:anthropic, "claude-3-5-sonnet-20241022"}
{:google_vertex, "gemini-1.5-pro"}

Syntax

  • Two-element tuple: {provider_atom, model_id_string}
  • Provider is always an atom with underscores (not hyphens)
  • Model ID is always a string

When to Use

  • Internal application state
  • Pattern matching (see the sketch below)
  • Function arguments when provider is already known
  • Performance-critical code (avoids parsing overhead)
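
A minimal sketch of the pattern-matching case; the adapter modules here are hypothetical:

defmodule MyApp.Adapters do
  # Dispatch on the provider atom directly; no string parsing needed
  def adapter_for({:openai, _model_id}), do: MyApp.OpenAIAdapter
  def adapter_for({:anthropic, _model_id}), do: MyApp.AnthropicAdapter
  def adapter_for({_provider, _model_id}), do: MyApp.DefaultAdapter
end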

Conversion

# Parse to tuple
{:openai, "gpt-4o-mini"} = LLMDB.parse!("openai:gpt-4o-mini")

# Format from tuple
"openai:gpt-4o-mini" = LLMDB.format({:openai, "gpt-4o-mini"})

Format Conversion

Use LLMDB.build/2 to convert between formats:

# Colon to @
"gpt-4@openai" = LLMDB.build("openai:gpt-4", format: :filename_safe)

# @ to colon
"openai:gpt-4" = LLMDB.build("gpt-4@openai", format: :provider_colon_model)

# Tuple to @
"gpt-4@openai" = LLMDB.build({:openai, "gpt-4"}, format: :model_at_provider)

Automatic Format Detection

All parsing functions automatically detect which format you're using:

# Both work seamlessly
{:ok, model} = LLMDB.model("openai:gpt-4o-mini")
{:ok, model} = LLMDB.model("gpt-4o-mini@openai")

# Parsing detects format automatically
{:ok, spec} = LLMDB.parse("openai:gpt-4")  # detects colon format
{:ok, spec} = LLMDB.parse("gpt-4@openai")  # detects @ format

Ambiguous Input

If a spec contains both : and @, you must specify the format explicitly:

# This is ambiguous - error!
{:error, :ambiguous_format} = LLMDB.parse("provider:model@test")

# Specify the format explicitly
{:ok, {:provider, "model@test"}} = LLMDB.parse("provider:model@test", format: :colon)
{:ok, {:test, "provider:model"}} = LLMDB.parse("provider:model@test", format: :at)

Validation Rules

Common Rules (Both Formats)

  • Provider and model segments cannot be empty
  • Leading/trailing whitespace is trimmed
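
For example (the error reasons are matched loosely here, since the exact error atoms aren't specified above):

{:ok, {:openai, "gpt-4"}} = LLMDB.parse("  openai:gpt-4  ")  # whitespace trimmed
{:error, _reason} = LLMDB.parse("openai:")                   # empty model segment
{:error, _reason} = LLMDB.parse(":gpt-4")                    # empty provider segment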

Colon Format Rules

  • Provider cannot contain : or @
  • Model ID can contain : (for Bedrock models like "anthropic.claude-opus-4:0")
  • Model ID cannot contain @ (unless the format is passed explicitly, as in the Ambiguous Input example above)

@ Format Rules

  • Provider cannot contain : or @
  • Model ID can contain @ (e.g., "model@test@openai" → provider is openai, model is "model@test")
  • Model ID cannot contain : (unless the format is passed explicitly, as in the Ambiguous Input example above)
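
A sketch of the multi-@ rule, assuming the last @ is the separator as the example above implies:

{:ok, {:openai, "model@test"}} = LLMDB.parse("model@test@openai")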

Configuration

Set the default output format in your config:

# config/config.exs
config :llm_db, model_spec_format: :provider_colon_model  # default

# or, to make the @ format the default:
config :llm_db, model_spec_format: :model_at_provider

Per-call overrides always take precedence:

# Even if config says :model_at_provider, this returns colon format
"openai:gpt-4" = LLMDB.format(spec, :provider_colon_model)

Use Case Examples

Template Files

defmodule MyApp.PromptLoader do
  def load(model_spec, template_name) do
    # Use @ format for filename safety
    model_str = LLMDB.format(model_spec, :filename_safe)
    path = Path.join(["templates", "#{template_name}-#{model_str}.liquid"])
    
    File.read!(path)
    # Reads: "templates/system-prompt-gpt-4o-mini@openai.liquid"
  end
end

Oban Job Arguments

defmodule MyApp.LLMWorker do
  use Oban.Worker
  
  # Named build_job/2 so it doesn't clash with the new/2 that
  # use Oban.Worker generates for this module
  def build_job(model_spec, prompt) do
    %{
      # Store as a filename-safe string
      model: LLMDB.format(model_spec, :filename_safe),
      prompt: prompt
    }
    |> new()
  end
  
  @impl Oban.Worker
  def perform(%Oban.Job{args: %{"model" => model_str, "prompt" => prompt}}) do
    # Parse from either format
    {:ok, {provider, model_id}} = LLMDB.parse(model_str)
    
    # Use the spec
    {:ok, model} = LLMDB.model(provider, model_id)
    # ...
  end
end

Cache Keys

defmodule MyApp.Cache do
  def cache_key(model_spec, input_hash) do
    # Use @ format for S3/filesystem compatibility
    model_str = LLMDB.format(model_spec, :filename_safe)
    "llm-cache/#{model_str}/#{input_hash}.json"
    # => "llm-cache/gpt-4o-mini@openai/abc123.json"
  end
end

CI Artifacts

defmodule MyApp.Benchmark do
  def artifact_path(model_spec, timestamp) do
    model_str = LLMDB.format(model_spec, :filename_safe)
    Path.join([
      "benchmark-results",
      model_str,
      "#{timestamp}.json"
    ])
    # => "benchmark-results/gpt-4o-mini@openai/2025-11-07T10:30:00Z.json"
  end
end

Migration Guide

If you want to adopt the @ format for existing stored specs, there are three options.

Option 1: No Migration

No migration is needed; both formats parse interchangeably, so just use the new format going forward:

# Old data uses colon format
old_spec = "openai:gpt-4"
{:ok, model} = LLMDB.model(old_spec)  # still works

# New data uses @ format
new_spec = LLMDB.format({:openai, "gpt-4"}, :filename_safe)
{:ok, model} = LLMDB.model(new_spec)  # also works

Option 2: Migrate Stored Data

For databases or files storing specs as strings:

defmodule MyApp.MigrateSpecs do
  def run do
    MyApp.Repo.all(MyApp.Record)
    |> Enum.each(fn record ->
      # Parse old format
      {:ok, spec} = LLMDB.parse(record.model_spec)
      
      # Format in new format
      new_spec = LLMDB.format(spec, :model_at_provider)
      
      # Update record
      record
      |> Ecto.Changeset.change(model_spec: new_spec)
      |> MyApp.Repo.update!()
    end)
  end
end

Option 3: Set Global Default

Change the default output format:

# config/config.exs
config :llm_db,
  model_spec_format: :model_at_provider

Now all LLMDB.format/1 calls (without explicit format) return @ format:

"gpt-4@openai" = LLMDB.format({:openai, "gpt-4"})

Best Practices

  1. Use colon format by default - It's more familiar and readable
  2. Use @ format for filenames - Avoids cross-platform issues
  3. Use tuples internally - Skip parsing overhead when provider is known
  4. Let auto-detection work - Don't specify :format unless dealing with ambiguous input
  5. Document your choice - Make it clear which format your system expects
  6. Be consistent - Pick a format for each context (e.g., colon for logs, @ for filenames) and stick with it

Summary

Format   Syntax                  Use Case                           Example
Colon    "provider:model"        Default, human-readable            "openai:gpt-4o-mini"
@        "model@provider"        Filenames, URLs, cross-platform    "gpt-4o-mini@openai"
Tuple    {:provider, "model"}    Internal, performance-critical     {:openai, "gpt-4o-mini"}

All three formats are fully supported and can be used interchangeably throughout the LLMDB API.