ReqLLM.Providers.Meta (ReqLLM v1.0.0-rc.8)


Generic Meta Llama provider implementing Meta's native prompt format.

Handles Meta's Llama models (Llama 3, 3.1, 3.2, 3.3, 4) using the native Llama prompt format and request/response structure.

Usage Note

Most cloud providers and self-hosted deployments wrap Llama models in OpenAI-compatible APIs and should delegate to ReqLLM.Providers.OpenAI instead of this module:

  • Azure AI Foundry - Uses OpenAI-compatible API
  • Google Cloud Vertex AI - Uses OpenAI-compatible API
  • vLLM (self-hosted) - Uses OpenAI-compatible API
  • Ollama (self-hosted) - Uses OpenAI-compatible API
  • llama.cpp (self-hosted) - Uses OpenAI-compatible API

This module is for providers that use Meta's native format with prompt, max_gen_len, generation, etc. Currently this is primarily Amazon Bedrock (see ReqLLM.Providers.AmazonBedrock.Meta).

Native Request Format

Llama's native format uses a single prompt string with special tokens (an illustrative request body follows this list):

  • prompt - Formatted text with special tokens (required)
  • max_gen_len - Maximum tokens to generate
  • temperature - Sampling temperature
  • top_p - Nucleus sampling parameter
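
For illustration, a complete request body with every optional parameter set might look like the map below; the field names follow the list above, while the concrete values are placeholders.

%{
  "prompt" => "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nHello!<|eot_id|>...",
  "max_gen_len" => 512,
  "temperature" => 0.7,
  "top_p" => 0.9
}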

Native Response Format

  • generation - The generated text
  • prompt_token_count - Input token count
  • generation_token_count - Output token count
  • stop_reason - Why generation stopped

Llama Prompt Format

Llama 3+ uses a structured prompt format with special tokens (a fully assembled prompt is shown after this list):

  • System messages: <|start_header_id|>system<|end_header_id|>
  • User messages: <|start_header_id|>user<|end_header_id|>
  • Assistant messages: <|start_header_id|>assistant<|end_header_id|>
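
Assembled, a short system + user exchange becomes a single string. The sketch below is a hand-written illustration of the Llama 3 convention (the trailing assistant header primes the model to respond); it is not necessarily the exact output of this module.

"<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are helpful<|eot_id|>" <>
"<|start_header_id|>user<|end_header_id|>\n\nHello<|eot_id|>" <>
"<|start_header_id|>assistant<|end_header_id|>\n\n"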

Cloud Provider Integration

Cloud providers using the native format should wrap this module's functions with their specific auth/endpoint handling. See ReqLLM.Providers.AmazonBedrock.Meta as an example.
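
A minimal sketch of that wrapping pattern is shown below. The module name, endpoint URL, and bearer-token auth are hypothetical placeholders; only format_request/2, parse_response/2, and the Req HTTP client are taken as given.

defmodule MyCloud.Meta do
  # Hypothetical endpoint - replace with the provider's real invoke URL.
  @endpoint "https://example.invalid/llama/invoke"

  def generate(context, opts \\ []) do
    # Build the native Llama request body from a ReqLLM context.
    body = ReqLLM.Providers.Meta.format_request(context, opts)

    # Provider-specific transport and auth live in the wrapper.
    %{status: 200, body: response_body} =
      Req.post!(@endpoint,
        json: body,
        auth: {:bearer, System.fetch_env!("MYCLOUD_API_KEY")}
      )

    # Convert the native response back into ReqLLM's response format.
    ReqLLM.Providers.Meta.parse_response(response_body, opts)
  end
end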

Summary

Functions

  • extract_usage(body) - Extracts usage metadata from the response body.
  • format_llama_prompt(messages) - Formats messages into Llama 3 prompt format.
  • format_request(context, opts \\ []) - Formats a ReqLLM context into Meta Llama request format.
  • parse_response(body, opts) - Parses Meta Llama response into ReqLLM format.
  • parse_stop_reason(arg1) - Parses stop reason from Meta's response format.

Functions

extract_usage(body)

Extracts usage metadata from the response body.

Looks for prompt_token_count and generation_token_count fields.

Examples

body = %{
  "prompt_token_count" => 10,
  "generation_token_count" => 5
}

ReqLLM.Providers.Meta.extract_usage(body)
# => {:ok, %{input_tokens: 10, output_tokens: 5, total_tokens: 15, ...}}

format_llama_prompt(messages)

Formats messages into Llama 3 prompt format.

Format: <|begin_of_text|><|start_header_id|>role<|end_header_id|>\n\ncontent<|eot_id|>

Examples

messages = [
  %{role: :system, content: "You are helpful"},
  %{role: :user, content: "Hello"}
]

ReqLLM.Providers.Meta.format_llama_prompt(messages)
# => "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are helpful<|eot_id|>..."

format_request(context, opts \\ [])

Formats a ReqLLM context into Meta Llama request format.

Converts structured messages into Llama 3's prompt format and returns a map with the prompt and optional parameters.

Options

  • :max_tokens - Maximum tokens to generate (mapped to max_gen_len)
  • :temperature - Sampling temperature
  • :top_p - Nucleus sampling parameter

Examples

context = %ReqLLM.Context{
  messages: [
    %{role: :user, content: "Hello!"}
  ]
}

ReqLLM.Providers.Meta.format_request(context, max_tokens: 100)
# => %{
#   "prompt" => "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nHello!<|eot_id|>...",
#   "max_gen_len" => 100
# }

parse_response(body, opts)

Parses Meta Llama response into ReqLLM format.

Expects a response body with:

  • "generation" - The generated text
  • "prompt_token_count" - Input token count (optional)
  • "generation_token_count" - Output token count (optional)
  • "stop_reason" - Why generation stopped (optional)

Examples

body = %{
  "generation" => "Hello! How can I help?",
  "prompt_token_count" => 10,
  "generation_token_count" => 5,
  "stop_reason" => "stop"
}

ReqLLM.Providers.Meta.parse_response(body, model: "meta.llama3")
# => {:ok, %ReqLLM.Response{...}}

parse_stop_reason(arg1)

Parses stop reason from Meta's response format.

Maps Meta's stop reasons to ReqLLM's standard finish reasons:

  • "stop":stop
  • "length":length
  • anything else → :stop
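
Examples

A usage sketch, assuming the mapped atom is returned directly:

ReqLLM.Providers.Meta.parse_stop_reason("stop")
# => :stop

ReqLLM.Providers.Meta.parse_stop_reason("length")
# => :length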