ReqLLM.Providers.Meta (ReqLLM v1.0.0-rc.8)
Generic Meta Llama provider implementing Meta's native prompt format.
Handles Meta's Llama models (Llama 3, 3.1, 3.2, 3.3, 4) using the native Llama prompt format and request/response structure.
Usage Note
Most cloud providers and self-hosted deployments wrap Llama models in
OpenAI-compatible APIs and should delegate to ReqLLM.Providers.OpenAI
instead of this module:
- Azure AI Foundry - Uses OpenAI-compatible API
- Google Cloud Vertex AI - Uses OpenAI-compatible API
- vLLM (self-hosted) - Uses OpenAI-compatible API
- Ollama (self-hosted) - Uses OpenAI-compatible API
- llama.cpp (self-hosted) - Uses OpenAI-compatible API
This module is for providers that use Meta's native format with
prompt, max_gen_len, generation, etc. Currently this is primarily:
- AWS Bedrock - Uses native Meta format via ReqLLM.Providers.AmazonBedrock.Meta
Native Request Format
Llama's native format uses a single prompt string with special tokens:
- prompt - Formatted text with special tokens (required)
- max_gen_len - Maximum tokens to generate
- temperature - Sampling temperature
- top_p - Nucleus sampling parameter
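For illustration, a complete native request body (field values are examples only) might look like:
%{
  "prompt" => "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nHello!<|eot_id|>...",
  "max_gen_len" => 512,
  "temperature" => 0.7,
  "top_p" => 0.9
}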
Native Response Format
- generation - The generated text
- prompt_token_count - Input token count
- generation_token_count - Output token count
- stop_reason - Why generation stopped
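A corresponding native response body, with illustrative values, looks like:
%{
  "generation" => "Hello! How can I help?",
  "prompt_token_count" => 10,
  "generation_token_count" => 5,
  "stop_reason" => "stop"
}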
Llama Prompt Format
Llama 3+ uses a structured prompt format with special tokens:
- System messages: <|start_header_id|>system<|end_header_id|>
- User messages: <|start_header_id|>user<|end_header_id|>
- Assistant messages: <|start_header_id|>assistant<|end_header_id|>
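Putting these tokens together, a short system-plus-user conversation renders to a single prompt string along the lines of:
"<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are helpful<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nHello<|eot_id|>..."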
Cloud Provider Integration
Cloud providers using the native format should wrap this module's functions
with their specific auth/endpoint handling. See ReqLLM.Providers.AmazonBedrock.Meta
as an example.
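As a rough sketch (the module and HTTP helper names below are hypothetical, not part of ReqLLM), a native-format provider could delegate to this module like so:
defmodule MyCloud.Meta do
  alias ReqLLM.Providers.Meta

  def generate(context, opts) do
    # Build the native Meta request body from a ReqLLM context.
    body = Meta.format_request(context, opts)

    # Provider-specific concerns (endpoint, auth/signing, retries) live in post/2,
    # which is assumed to return the decoded JSON response body.
    with {:ok, response_body} <- post(body, opts) do
      Meta.parse_response(response_body, opts)
    end
  end

  defp post(_body, _opts), do: {:error, :not_implemented}
end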
Summary
Functions
Extracts usage metadata from the response body.
Formats messages into Llama 3 prompt format.
Formats a ReqLLM context into Meta Llama request format.
Parses Meta Llama response into ReqLLM format.
Parses stop reason from Meta's response format.
Functions
Extracts usage metadata from the response body.
Looks for prompt_token_count and generation_token_count fields.
Examples
body = %{
  "prompt_token_count" => 10,
  "generation_token_count" => 5
}
ReqLLM.Providers.Meta.extract_usage(body)
# => {:ok, %{input_tokens: 10, output_tokens: 5, total_tokens: 15, ...}}
Formats messages into Llama 3 prompt format.
Format: <|begin_of_text|><|start_header_id|>role<|end_header_id|>\n\ncontent<|eot_id|>
Examples
messages = [
  %{role: :system, content: "You are helpful"},
  %{role: :user, content: "Hello"}
]
ReqLLM.Providers.Meta.format_llama_prompt(messages)
# => "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are helpful<|eot_id|>..."Formats a ReqLLM context into Meta Llama request format.
Converts structured messages into Llama 3's prompt format and returns a map with the prompt and optional parameters.
Options
- :max_tokens - Maximum tokens to generate (mapped to max_gen_len)
- :temperature - Sampling temperature
- :top_p - Nucleus sampling parameter
Examples
context = %ReqLLM.Context{
  messages: [
    %{role: :user, content: "Hello!"}
  ]
}
ReqLLM.Providers.Meta.format_request(context, max_tokens: 100)
# => %{
#   "prompt" => "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nHello!<|eot_id|>...",
#   "max_gen_len" => 100
# }
Parses Meta Llama response into ReqLLM format.
Expects a response body with:
- "generation"- The generated text
- "prompt_token_count"- Input token count (optional)
- "generation_token_count"- Output token count (optional)
- "stop_reason"- Why generation stopped (optional)
Examples
body = %{
  "generation" => "Hello! How can I help?",
  "prompt_token_count" => 10,
  "generation_token_count" => 5,
  "stop_reason" => "stop"
}
ReqLLM.Providers.Meta.parse_response(body, model: "meta.llama3")
# => {:ok, %ReqLLM.Response{...}}
Parses stop reason from Meta's response format.
Maps Meta's stop reasons to ReqLLM's standard finish reasons:
- "stop"→- :stop
- "length"→- :length
- anything else → :stop
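Assuming this is exposed as parse_stop_reason/1 (the exact function name is not shown in this page), the mapping can be exercised like:
ReqLLM.Providers.Meta.parse_stop_reason("stop")
# => :stop
ReqLLM.Providers.Meta.parse_stop_reason("length")
# => :length
ReqLLM.Providers.Meta.parse_stop_reason("unknown_reason")
# => :stop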