# OpenAI Provider Guide


The OpenAI provider supports GPT models including GPT-4, GPT-3.5, and reasoning models like o1, o3, and GPT-5.

## Configuration

Set your OpenAI API key:

```
# Add to .env file (automatically loaded)
OPENAI_API_KEY=sk-...
```

Or use in-memory storage:

```elixir
ReqLLM.put_key(:openai_api_key, "sk-...")
```
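
Once a key is available from either source, subsequent calls pick it up automatically. A minimal check, assuming a valid key is configured:

```elixir
# No explicit credentials needed once the key is set via .env or put_key/2
{:ok, response} = ReqLLM.generate_text("openai:gpt-4o-mini", "Say hello")
```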

## Dual API Architecture

The OpenAI provider uses two specialized APIs:

### Chat Completions API (ChatAPI)

For standard GPT models:

- gpt-4o
- gpt-4o-mini
- gpt-4-turbo
- gpt-3.5-turbo

### Responses API (ResponsesAPI)

For reasoning models with extended thinking:

- o1, o1-mini, o1-preview
- o3, o3-mini
- o4-mini
- gpt-4.1, gpt-5

ReqLLM automatically routes to the correct API based on model metadata.

## Basic Usage

```elixir
# Standard chat model
{:ok, response} = ReqLLM.generate_text(
  "openai:gpt-4o",
  "Explain async/await in JavaScript"
)

# Reasoning model
{:ok, response} = ReqLLM.generate_text(
  "openai:o1",
  "Solve this complex math problem step by step"
)

# Access reasoning tokens for reasoning models
reasoning_tokens = response.usage.reasoning_tokens
```

## Provider-Specific Options

### Embedding Dimensions

Control dimensions for embedding models:

```elixir
{:ok, embedding} = ReqLLM.embed(
  "openai:text-embedding-3-small",
  "Hello world",
  provider_options: [dimensions: 512]
)
```

### Encoding Format

Specify the format for embedding output:

```elixir
{:ok, embedding} = ReqLLM.embed(
  "openai:text-embedding-3-small",
  "Hello world",
  provider_options: [encoding_format: "base64"]
)
```

### Max Completion Tokens

Required for reasoning models (o1, o3, gpt-5):

```elixir
{:ok, response} = ReqLLM.generate_text(
  "openai:o1",
  "Complex problem",
  provider_options: [max_completion_tokens: 4000]
)
```

**Note**: Reasoning models use `max_completion_tokens` instead of `max_tokens`. ReqLLM handles this translation automatically when you use `max_tokens`.

### Structured Output Mode

Control how structured outputs are generated:

```elixir
schema = [
  name: [type: :string, required: true],
  age: [type: :integer, required: true]
]

{:ok, response} = ReqLLM.generate_object(
  "openai:gpt-4o",
  "Generate a person",
  schema,
  provider_options: [openai_structured_output_mode: :json_schema]
)
```

Modes:

- `:auto` - Use `json_schema` when supported, else strict tools (default)
- `:json_schema` - Force `response_format` with `json_schema` (requires model support)
- `:tool_strict` - Force `strict: true` on function tools
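
For example, to force strict function tools instead of `response_format`, swap the mode value; a minimal sketch reusing the `schema` defined above:

```elixir
# Force strict: true function tools rather than response_format (sketch)
{:ok, response} = ReqLLM.generate_object(
  "openai:gpt-4o",
  "Generate a person",
  schema,
  provider_options: [openai_structured_output_mode: :tool_strict]
)
```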

### Response Format

Custom response format configuration:

```elixir
{:ok, response} = ReqLLM.generate_text(
  "openai:gpt-4o",
  "Return JSON",
  provider_options: [
    response_format: %{
      type: "json_schema",
      json_schema: %{
        name: "person",
        schema: %{type: "object", properties: %{name: %{type: "string"}}}
      }
    }
  ]
)
```

### Parallel Tool Calls

Control parallel tool call behavior:

```elixir
{:ok, response} = ReqLLM.generate_text(
  "openai:gpt-4o",
  "What's the weather in NYC and LA?",
  tools: [weather_tool],
  provider_options: [openai_parallel_tool_calls: false]
)
```

### Reasoning Effort (Responses API Only)

Control reasoning effort for o1/o3/GPT-5 models:

```elixir
{:ok, response} = ReqLLM.generate_text(
  "openai:o1",
  "Difficult problem",
  provider_options: [reasoning_effort: :high]
)
```

Levels: `:minimal`, `:low`, `:medium`, `:high`

### Seed for Reproducibility

Set a seed for deterministic outputs:

```elixir
{:ok, response} = ReqLLM.generate_text(
  "openai:gpt-4o",
  "Random story",
  provider_options: [seed: 42]
)
```

### Logprobs

Request log probabilities:

```elixir
{:ok, response} = ReqLLM.generate_text(
  "openai:gpt-4o",
  "Hello",
  provider_options: [
    logprobs: true,
    top_logprobs: 3
  ]
)
```

### User Identifier

Track usage by user:

```elixir
{:ok, response} = ReqLLM.generate_text(
  "openai:gpt-4o",
  "Hello",
  provider_options: [user: "user_123"]
)
```

## Streaming

```elixir
{:ok, stream_response} = ReqLLM.stream_text(
  "openai:gpt-4o",
  "Write a story"
)

# Stream tokens in real-time
ReqLLM.StreamResponse.tokens(stream_response)
|> Stream.each(&IO.write/1)
|> Stream.run()

# Get usage after streaming completes
usage = ReqLLM.StreamResponse.usage(stream_response)
```

## Tool Calling

```elixir
tools = [
  ReqLLM.tool(
    name: "get_weather",
    description: "Get weather for a location",
    parameter_schema: [
      location: [type: :string, required: true]
    ],
    callback: {WeatherAPI, :fetch}
  )
]

{:ok, response} = ReqLLM.generate_text(
  "openai:gpt-4o",
  "What's the weather in Seattle?",
  tools: tools
)
```

## Embeddings

```elixir
# Single embedding
{:ok, embedding} = ReqLLM.embed(
  "openai:text-embedding-3-small",
  "Hello world"
)

# Multiple texts
{:ok, embeddings} = ReqLLM.embed(
  "openai:text-embedding-3-small",
  ["Hello", "World", "Elixir"]
)
```

## Vision (Multimodal)

```elixir
import ReqLLM.Context
alias ReqLLM.Message.ContentPart

context = Context.new([
  user([
    ContentPart.text("Describe this image"),
    ContentPart.image_url("https://example.com/photo.jpg")
  ])
])

{:ok, response} = ReqLLM.generate_text("openai:gpt-4o", context)
```

## Complete Example

```elixir
import ReqLLM.Context

context = Context.new([
  system("You are a helpful coding assistant"),
  user("Explain tail recursion in Elixir")
])

{:ok, response} = ReqLLM.generate_text(
  "openai:gpt-4o",
  context,
  temperature: 0.7,
  max_tokens: 500,
  provider_options: [
    seed: 42,
    logprobs: true,
    top_logprobs: 2
  ]
)

text = ReqLLM.Response.text(response)
usage = response.usage

IO.puts(text)
IO.puts("Cost: $#{usage.total_cost}")
```

## Reasoning Models Example

```elixir
{:ok, response} = ReqLLM.generate_text(
  "openai:o1",
  "Solve: If a train travels 120 km in 2 hours, how far will it travel in 5 hours at the same speed?",
  max_tokens: 2000,  # Automatically translated to max_completion_tokens
  provider_options: [reasoning_effort: :medium]
)

# Reasoning models use thinking tokens
IO.puts("Reasoning tokens: #{response.usage.reasoning_tokens}")
IO.puts("Output tokens: #{response.usage.output_tokens}")
```

## Usage Metrics

OpenAI models provide comprehensive usage data:

```elixir
%{
  input_tokens: 10,
  output_tokens: 50,
  reasoning_tokens: 100,  # For reasoning models (o1, o3, gpt-5)
  cached_tokens: 5,       # Cached input tokens
  total_tokens: 60,
  input_cost: 0.00005,
  output_cost: 0.00075,
  total_cost: 0.0008
}
```

## Error Handling

```elixir
case ReqLLM.generate_text("openai:gpt-4o", "Hello") do
  {:ok, response} -> handle_success(response)
  {:error, error} -> IO.puts("Error: #{error.message}")
end
```

## Resources

- OpenAI API Documentation
- Model Overview
- Pricing