ReqLLM.Providers.Meta (ReqLLM v1.0.0)


Generic Meta Llama provider implementing Meta's native prompt format.

Handles Meta's Llama models (Llama 3, 3.1, 3.2, 3.3, 4) using the native Llama prompt format and request/response structure.

Usage Note

Most cloud providers and self-hosted deployments wrap Llama models in OpenAI-compatible APIs and should delegate to ReqLLM.Providers.OpenAI instead of this module:

  • Azure AI Foundry - Uses OpenAI-compatible API
  • Google Cloud Vertex AI - Uses OpenAI-compatible API
  • vLLM (self-hosted) - Uses OpenAI-compatible API
  • Ollama (self-hosted) - Uses OpenAI-compatible API
  • llama.cpp (self-hosted) - Uses OpenAI-compatible API

This module is for providers that use Meta's native format with prompt, max_gen_len, generation, etc. Currently this is primarily Amazon Bedrock (see ReqLLM.Providers.AmazonBedrock.Meta).

Native Request Format

Llama's native format uses a single prompt string with special tokens:

  • prompt - Formatted text with special tokens (required)
  • max_gen_len - Maximum tokens to generate
  • temperature - Sampling temperature
  • top_p - Nucleus sampling parameter

Native Response Format

  • generation - The generated text
  • prompt_token_count - Input token count
  • generation_token_count - Output token count
  • stop_reason - Why generation stopped
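
A corresponding native response body (values illustrative) has this shape:

%{
  "generation" => "Hello! How can I help?",
  "prompt_token_count" => 10,
  "generation_token_count" => 5,
  "stop_reason" => "stop"
}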

Llama Prompt Format

Llama 3+ uses a structured prompt format with special tokens:

  • System messages: <|start_header_id|>system<|end_header_id|>
  • User messages: <|start_header_id|>user<|end_header_id|>
  • Assistant messages: <|start_header_id|>assistant<|end_header_id|>
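
As a sketch following the standard Llama 3 chat template, a system-plus-user exchange is rendered into a single prompt string; the template also appends an assistant header before generation:

"<|begin_of_text|>" <>
  "<|start_header_id|>system<|end_header_id|>\n\nYou are helpful<|eot_id|>" <>
  "<|start_header_id|>user<|end_header_id|>\n\nHello<|eot_id|>" <>
  "<|start_header_id|>assistant<|end_header_id|>\n\n"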

Cloud Provider Integration

Cloud providers using the native format should wrap this module's functions with their specific auth/endpoint handling. See ReqLLM.Providers.AmazonBedrock.Meta as an example.
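
A hypothetical wrapper (the module name, endpoint URL, and token handling below are illustrative assumptions, not part of this library) might compose the helpers like this:

defmodule MyApp.Providers.MyCloud.Meta do
  # Sketch only: wraps ReqLLM.Providers.Meta with provider-specific
  # auth and endpoint handling.

  def generate(context, opts \\ []) do
    # Build the native Llama request body from a ReqLLM context
    body = ReqLLM.Providers.Meta.format_request(context, opts)

    # Provider-specific concerns: endpoint, auth headers, retries, etc.
    response =
      Req.post!("https://example.invalid/llama/invoke",
        json: body,
        headers: [{"authorization", "Bearer " <> fetch_token!()}]
      )

    # Convert the native response back into a ReqLLM.Response
    ReqLLM.Providers.Meta.parse_response(response.body, opts)
  end

  defp fetch_token!, do: System.fetch_env!("MY_CLOUD_API_TOKEN")
end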

Summary

Functions

  • extract_usage(body) - Extracts usage metadata from the response body.
  • format_llama_prompt(messages) - Formats messages into Llama 3 prompt format.
  • format_request(context, opts \\ []) - Formats a ReqLLM context into Meta Llama request format.
  • parse_response(body, opts) - Parses Meta Llama response into ReqLLM format.
  • parse_stop_reason(arg1) - Parses stop reason from Meta's response format.

Functions

extract_usage(body)

Extracts usage metadata from the response body.

Looks for prompt_token_count and generation_token_count fields.

Examples

body = %{
  "prompt_token_count" => 10,
  "generation_token_count" => 5
}

ReqLLM.Providers.Meta.extract_usage(body)
# => {:ok, %{input_tokens: 10, output_tokens: 5, total_tokens: 15, ...}}

format_llama_prompt(messages)

Formats messages into Llama 3 prompt format.

Format: <|begin_of_text|><|start_header_id|>role<|end_header_id|>\n\ncontent<|eot_id|>

Examples

messages = [
  %{role: :system, content: "You are helpful"},
  %{role: :user, content: "Hello"}
]

ReqLLM.Providers.Meta.format_llama_prompt(messages)
# => "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are helpful<|eot_id|>..."

format_request(context, opts \\ [])

Formats a ReqLLM context into Meta Llama request format.

Converts structured messages into Llama 3's prompt format and returns a map with the prompt and optional parameters.

Options

  • :max_tokens - Maximum tokens to generate (mapped to max_gen_len)
  • :temperature - Sampling temperature
  • :top_p - Nucleus sampling parameter

Examples

context = %ReqLLM.Context{
  messages: [
    %{role: :user, content: "Hello!"}
  ]
}

ReqLLM.Providers.Meta.format_request(context, max_tokens: 100)
# => %{
#   "prompt" => "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nHello!<|eot_id|>...",
#   "max_gen_len" => 100
# }

parse_response(body, opts)

Parses Meta Llama response into ReqLLM format.

Expects a response body with:

  • "generation" - The generated text
  • "prompt_token_count" - Input token count (optional)
  • "generation_token_count" - Output token count (optional)
  • "stop_reason" - Why generation stopped (optional)

Examples

body = %{
  "generation" => "Hello! How can I help?",
  "prompt_token_count" => 10,
  "generation_token_count" => 5,
  "stop_reason" => "stop"
}

ReqLLM.Providers.Meta.parse_response(body, model: "meta.llama3")
# => {:ok, %ReqLLM.Response{...}}

parse_stop_reason(arg1)

Parses stop reason from Meta's response format.

Maps Meta's stop reasons to ReqLLM's standard finish reasons:

  • "stop":stop
  • "length":length
  • anything else → :stop
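
Examples

Illustrative calls based on the mapping above (assuming the raw stop_reason string is passed directly):

ReqLLM.Providers.Meta.parse_stop_reason("stop")
# => :stop

ReqLLM.Providers.Meta.parse_stop_reason("length")
# => :length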