ReqLLM.Providers.Meta (ReqLLM v1.0.0-rc.8)


Generic Meta Llama provider implementing Meta's native prompt format.

Handles Meta's Llama models (Llama 3, 3.1, 3.2, 3.3, 4) using the native Llama prompt format and request/response structure.

Usage Note

Most cloud providers and self-hosted deployments wrap Llama models in OpenAI-compatible APIs and should delegate to ReqLLM.Providers.OpenAI instead of this module:

  • Azure AI Foundry - Uses OpenAI-compatible API
  • Google Cloud Vertex AI - Uses OpenAI-compatible API
  • vLLM (self-hosted) - Uses OpenAI-compatible API
  • Ollama (self-hosted) - Uses OpenAI-compatible API
  • llama.cpp (self-hosted) - Uses OpenAI-compatible API

This module is for providers that use Meta's native format with prompt, max_gen_len, generation, etc. Currently this is primarily Amazon Bedrock (see ReqLLM.Providers.AmazonBedrock.Meta).

Native Request Format

Llama's native format uses a single prompt string with special tokens (an illustrative request body follows this list):

  • prompt - Formatted text with special tokens (required)
  • max_gen_len - Maximum tokens to generate
  • temperature - Sampling temperature
  • top_p - Nucleus sampling parameter
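
For illustration, a complete request body with every optional parameter set might look like the map below; the field names follow the list above, while the concrete values are placeholders.

%{
  "prompt" => "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nHello!<|eot_id|>...",
  "max_gen_len" => 512,
  "temperature" => 0.7,
  "top_p" => 0.9
}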

Native Response Format

  • generation - The generated text
  • prompt_token_count - Input token count
  • generation_token_count - Output token count
  • stop_reason - Why generation stopped

Llama Prompt Format

Llama 3+ uses a structured prompt format with special tokens (a fully assembled prompt is shown after this list):

  • System messages: <|start_header_id|>system<|end_header_id|>
  • User messages: <|start_header_id|>user<|end_header_id|>
  • Assistant messages: <|start_header_id|>assistant<|end_header_id|>
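
Assembled, a short system + user exchange becomes a single string. The sketch below is a hand-written illustration of the Llama 3 convention (the trailing assistant header primes the model to respond); it is not necessarily the exact output of this module.

"<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are helpful<|eot_id|>" <>
"<|start_header_id|>user<|end_header_id|>\n\nHello<|eot_id|>" <>
"<|start_header_id|>assistant<|end_header_id|>\n\n"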

Cloud Provider Integration

Cloud providers using the native format should wrap this module's functions with their specific auth/endpoint handling. See ReqLLM.Providers.AmazonBedrock.Meta as an example.
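
A minimal sketch of that wrapping pattern is shown below. The module name, endpoint URL, and bearer-token auth are hypothetical placeholders; only format_request/2, parse_response/2, and the Req HTTP client are taken as given.

defmodule MyCloud.Meta do
  # Hypothetical endpoint - replace with the provider's real invoke URL.
  @endpoint "https://example.invalid/llama/invoke"

  def generate(context, opts \\ []) do
    # Build the native Llama request body from a ReqLLM context.
    body = ReqLLM.Providers.Meta.format_request(context, opts)

    # Provider-specific transport and auth live in the wrapper.
    %{status: 200, body: response_body} =
      Req.post!(@endpoint,
        json: body,
        auth: {:bearer, System.fetch_env!("MYCLOUD_API_KEY")}
      )

    # Convert the native response back into ReqLLM's response format.
    ReqLLM.Providers.Meta.parse_response(response_body, opts)
  end
end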

Summary

Functions

  • extract_usage(body) - Extracts usage metadata from the response body.
  • format_llama_prompt(messages) - Formats messages into Llama 3 prompt format.
  • format_request(context, opts \\ []) - Formats a ReqLLM context into Meta Llama request format.
  • parse_response(body, opts) - Parses Meta Llama response into ReqLLM format.
  • parse_stop_reason(arg1) - Parses stop reason from Meta's response format.

Functions

extract_usage(body)

Extracts usage metadata from the response body.

Looks for prompt_token_count and generation_token_count fields.

Examples

body = %{
  "prompt_token_count" => 10,
  "generation_token_count" => 5
}

ReqLLM.Providers.Meta.extract_usage(body)
# => {:ok, %{input_tokens: 10, output_tokens: 5, total_tokens: 15, ...}}

format_llama_prompt(messages)

Formats messages into Llama 3 prompt format.

Format: <|begin_of_text|><|start_header_id|>role<|end_header_id|>\n\ncontent<|eot_id|>

Examples

messages = [
  %{role: :system, content: "You are helpful"},
  %{role: :user, content: "Hello"}
]

ReqLLM.Providers.Meta.format_llama_prompt(messages)
# => "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are helpful<|eot_id|>..."

format_request(context, opts \\ [])

Formats a ReqLLM context into Meta Llama request format.

Converts structured messages into Llama 3's prompt format and returns a map with the prompt and optional parameters.

Options

  • :max_tokens - Maximum tokens to generate (mapped to max_gen_len)
  • :temperature - Sampling temperature
  • :top_p - Nucleus sampling parameter

Examples

context = %ReqLLM.Context{
  messages: [
    %{role: :user, content: "Hello!"}
  ]
}

ReqLLM.Providers.Meta.format_request(context, max_tokens: 100)
# => %{
#   "prompt" => "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nHello!<|eot_id|>...",
#   "max_gen_len" => 100
# }

parse_response(body, opts)

Parses Meta Llama response into ReqLLM format.

Expects a response body with:

  • "generation" - The generated text
  • "prompt_token_count" - Input token count (optional)
  • "generation_token_count" - Output token count (optional)
  • "stop_reason" - Why generation stopped (optional)

Examples

body = %{
  "generation" => "Hello! How can I help?",
  "prompt_token_count" => 10,
  "generation_token_count" => 5,
  "stop_reason" => "stop"
}

ReqLLM.Providers.Meta.parse_response(body, model: "meta.llama3")
# => {:ok, %ReqLLM.Response{...}}

parse_stop_reason(arg1)

Parses stop reason from Meta's response format.

Maps Meta's stop reasons to ReqLLM's standard finish reasons:

  • "stop":stop
  • "length":length
  • anything else → :stop
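
Examples

A usage sketch, assuming the mapped atom is returned directly:

ReqLLM.Providers.Meta.parse_stop_reason("stop")
# => :stop

ReqLLM.Providers.Meta.parse_stop_reason("length")
# => :length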