Custom LLM Endpoint Guide
This guide explains how to integrate custom LLM endpoints with GettextTranslator using LangChain 0.4.0.
Table of Contents
- Overview
- Supported Models
- Response Format Requirements
- Custom Adapter Implementation
- Configuration
- Testing
- Common Issues
- Examples
Overview
GettextTranslator uses LangChain to communicate with LLM providers. As of version 0.5.0, the library uses LangChain 0.4.0, which introduced breaking changes in how messages are structured.
Key Change in LangChain 0.4.0:
- Message content is now a list of ContentPart structs instead of plain strings
- All custom endpoints must return responses in this format
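For illustration, here is a minimal sketch of what that change looks like in practice (the full requirements are covered in the Response Format section below):

# LangChain 0.3.x style (a plain string) is no longer valid:
%LangChain.Message{role: :assistant, content: "Hola"}

# LangChain 0.4.0 expects a list of ContentPart structs:
%LangChain.Message{
  role: :assistant,
  content: [%LangChain.Message.ContentPart{type: :text, content: "Hola"}]
}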
Supported Models
LangChain 0.4.0 officially supports:
| Provider | Module | Status |
|---|---|---|
| OpenAI | LangChain.ChatModels.ChatOpenAI | ✅ Fully Supported |
| Anthropic | LangChain.ChatModels.ChatAnthropic | ✅ Fully Supported |
| Google Gemini | LangChain.ChatModels.ChatGoogleAI | ✅ Fully Supported |
| Google Vertex AI | LangChain.ChatModels.ChatVertexAI | ✅ Fully Supported |
| Ollama | LangChain.ChatModels.ChatOllamaAI | ⚠️ May not work |
| Others | Custom implementation | ⚠️ Requires custom adapter |
Important: If you're using Ollama or other unsupported models, they may not function correctly with LangChain 0.4.0. Consider:
- Using GettextTranslator 0.4.5 (with LangChain 0.3.3), as shown in the snippet after this list
- Switching to a supported provider
- Implementing a custom adapter (see below)
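If you stay on the older release, pinning the dependency is usually enough. A minimal mix.exs sketch, assuming the package is published on Hex as :gettext_translator:

# mix.exs (sketch): pin the pre-0.5.0 release, which still uses LangChain 0.3.3
defp deps do
  [
    {:gettext_translator, "~> 0.4.5"}
  ]
end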
Response Format Requirements
Synchronous (Non-Streaming) Response
Your LLM endpoint must return a tuple {:ok, updated_chain} where the chain contains the response message.
Required Structure:
{:ok, %LangChain.Chains.LLMChain{
last_message: %LangChain.Message{
role: :assistant,
content: [
%LangChain.Message.ContentPart{
type: :text,
content: "Translated text here"
}
],
status: :complete,
index: 0
},
messages: [
# All messages in the conversation
]
}}

Critical Requirements:

Content as List: content MUST be a list of ContentPart structs

# ✅ CORRECT
content: [%ContentPart{type: :text, content: "text"}]

# ❌ WRONG
content: "text"

ContentPart Structure: Each part must have type and content

%LangChain.Message.ContentPart{
  type: :text,       # Required: :text, :image, :tool_call, etc.
  content: "string"  # Required: the actual content
}

Message Fields:
- role: Must be :assistant for LLM responses
- status: Should be :complete when done
- content: List of ContentPart structs (never a string)
Streaming Response
For streaming responses, emit deltas via callbacks.
Delta Structure:
# Each delta emitted
%LangChain.MessageDelta{
role: :assistant,
content: [
%LangChain.Message.ContentPart{
type: :text,
content: "Partial text chunk"
}
],
status: :incomplete # or :complete for the final delta
}

Streaming Requirements:

The on_llm_new_delta callback receives a list of deltas:

def handle_delta(deltas) when is_list(deltas) do
  # Process list of MessageDelta structs
end

Merge deltas using LLMChain.merge_deltas/2:

updated_chain = LLMChain.merge_deltas(current_chain, deltas)

Access merged content via MessageDelta.merged_content:

text = updated_chain.delta.merged_content

Final delta must have status: :complete
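Putting those pieces together, here is a minimal sketch of a delta handler; it assumes the caller already holds an %LLMChain{} and uses LLMChain.merge_deltas/2 and merged_content exactly as described above.

defmodule MyApp.StreamAccumulator do
  # Sketch only: accumulates streamed deltas as described in this guide.
  alias LangChain.Chains.LLMChain

  def handle_delta(chain, deltas) when is_list(deltas) do
    # Fold this batch of MessageDelta structs into the chain
    updated_chain = LLMChain.merge_deltas(chain, deltas)

    # Inspect the text merged so far (via merged_content, per this guide)
    IO.inspect(updated_chain.delta.merged_content, label: "merged so far")

    updated_chain
  end
end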
Error Response
On error, return a three-element tuple:
{:error, updated_chain, reason}

Components:
- updated_chain: The chain state at the time of error (can be nil)
- reason: Error description (string, atom, or structured error)
Example:
{:error, chain, "API rate limit exceeded"}

GettextTranslator handles errors gracefully:
- Logs the error with context
- Returns an empty translation {:ok, ""}
- Allows the translation process to continue
Custom Adapter Implementation
Minimal Adapter Example
defmodule MyApp.CustomLLMAdapter do
@moduledoc """
Custom LLM adapter for GettextTranslator using LangChain 0.4.0.
"""
use LangChain.ChatModels.ChatModel
alias LangChain.Message
alias LangChain.Message.ContentPart
defstruct [
:model,
:temperature,
:endpoint,
:api_key
]
@type t :: %__MODULE__{
model: String.t(),
temperature: float(),
endpoint: String.t(),
api_key: String.t()
}
@doc """
Creates a new instance of the custom LLM adapter.
## Examples
iex> MyApp.CustomLLMAdapter.new!(%{
...> model: "custom-model-v1",
...> temperature: 0.0,
...> endpoint: "https://api.example.com/v1/chat",
...> api_key: "sk-..."
...> })
"""
@impl true
def new(attrs \\ %{}) do
%__MODULE__{
model: attrs[:model] || "default-model",
temperature: attrs[:temperature] || 0.7,
endpoint: attrs[:endpoint] || "https://api.example.com/v1/chat",
api_key: attrs[:api_key]
}
|> validate()
end
@impl true
def new!(attrs \\ %{}) do
case new(attrs) do
{:ok, adapter} -> adapter
{:error, reason} -> raise ArgumentError, reason
end
end
defp validate(adapter) do
cond do
is_nil(adapter.api_key) ->
{:error, "API key is required"}
is_nil(adapter.endpoint) ->
{:error, "Endpoint URL is required"}
true ->
{:ok, adapter}
end
end
@doc """
Sends messages to the LLM and returns the response.
This is the main function called by LangChain to get completions.
"""
@impl true
def call(adapter, messages, _functions \\ []) do
# Build the request payload
payload = build_payload(adapter, messages)
# Make the API request
case make_api_request(adapter, payload) do
{:ok, response_text} ->
# Convert to LangChain 0.4.0 format
message = %Message{
role: :assistant,
content: [
%ContentPart{
type: :text,
content: response_text
}
],
status: :complete
}
{:ok, message}
{:error, reason} ->
{:error, nil, reason}
end
end
defp build_payload(adapter, messages) do
%{
model: adapter.model,
temperature: adapter.temperature,
messages: Enum.map(messages, &message_to_api_format/1)
}
end
defp message_to_api_format(%Message{role: role, content: content}) do
%{
role: to_string(role),
content: ContentPart.parts_to_string(content)
}
end
defp make_api_request(adapter, payload) do
headers = [
{"Authorization", "Bearer #{adapter.api_key}"},
{"Content-Type", "application/json"}
]
body = Jason.encode!(payload)
case HTTPoison.post(adapter.endpoint, body, headers) do
{:ok, %{status_code: 200, body: response_body}} ->
case Jason.decode(response_body) do
{:ok, %{"choices" => [%{"message" => %{"content" => content}} | _]}} ->
{:ok, content}
{:error, _} = error ->
error
end
{:ok, %{status_code: status_code, body: body}} ->
{:error, "API returned status #{status_code}: #{body}"}
{:error, %HTTPoison.Error{reason: reason}} ->
{:error, "HTTP request failed: #{inspect(reason)}"}
end
end
end

Streaming Adapter Example
defmodule MyApp.StreamingLLMAdapter do
use LangChain.ChatModels.ChatModel
alias LangChain.Message
alias LangChain.Message.ContentPart
alias LangChain.MessageDelta
alias LangChain.Chains.LLMChain
# ... struct and new/new! implementations ...
@impl true
def call(adapter, messages, _functions \\ []) do
# For streaming, we need to handle Server-Sent Events (SSE)
payload = build_payload(adapter, messages, stream: true)
# Initialize the delta accumulator
delta_acc = %MessageDelta{
role: :assistant,
content: [],
status: :incomplete
}
case stream_api_request(adapter, payload, delta_acc) do
{:ok, final_delta} ->
# Convert final delta to message
message = MessageDelta.to_message(final_delta)
{:ok, message}
{:error, reason} ->
{:error, nil, reason}
end
end
defp stream_api_request(adapter, payload, delta_acc) do
headers = [
{"Authorization", "Bearer #{adapter.api_key}"},
{"Content-Type", "application/json"},
{"Accept", "text/event-stream"}
]
body = Jason.encode!(payload)
# Use streaming HTTP client
case HTTPoison.post(adapter.endpoint, body, headers, stream_to: self(), async: :once) do
{:ok, %HTTPoison.AsyncResponse{id: ref}} ->
receive_stream(ref, delta_acc)
{:error, reason} ->
{:error, reason}
end
end
defp receive_stream(ref, delta_acc) do
receive do
%HTTPoison.AsyncChunk{id: ^ref, chunk: chunk} ->
# Parse SSE chunk
case parse_sse_chunk(chunk) do
{:ok, content_chunk} ->
# Create a delta for this chunk
new_delta = %MessageDelta{
role: :assistant,
content: [%ContentPart{type: :text, content: content_chunk}],
status: :incomplete
}
# Merge with accumulator
updated_acc = merge_message_deltas(delta_acc, new_delta)
# Request next chunk
HTTPoison.stream_next(%HTTPoison.AsyncResponse{id: ref})
receive_stream(ref, updated_acc)
{:done} ->
# Stream complete
{:ok, %{delta_acc | status: :complete}}
{:error, reason} ->
{:error, reason}
end
%HTTPoison.AsyncEnd{id: ^ref} ->
{:ok, %{delta_acc | status: :complete}}
%HTTPoison.Error{id: ^ref, reason: reason} ->
{:error, reason}
after
30_000 ->
{:error, "Stream timeout"}
end
end
defp parse_sse_chunk(chunk) do
# Parse Server-Sent Events format
# Example: "data: {\"choices\":[{\"delta\":{\"content\":\"text\"}}]}\n\n"
case String.trim(chunk) do
"data: [DONE]" ->
{:done}
"data: " <> json_data ->
case Jason.decode(json_data) do
{:ok, %{"choices" => [%{"delta" => %{"content" => content}} | _]}} ->
{:ok, content}
_ ->
{:ok, ""}
end
_ ->
{:ok, ""}
end
end
defp merge_message_deltas(acc, new_delta) do
# Merge content lists
merged_content = acc.content ++ new_delta.content
%{acc | content: merged_content}
end
end

Configuration
Basic Configuration
# config/config.exs
config :gettext_translator, GettextTranslator,
endpoint: MyApp.CustomLLMAdapter,
endpoint_model: "custom-model-v1",
endpoint_temperature: 0,
endpoint_config: %{
"api_key" => System.get_env("CUSTOM_LLM_API_KEY"),
"endpoint" => "https://api.example.com/v1/chat"
},
persona: "You are a professional translator. Translate accurately while preserving meaning and length.",
style: "Casual, using simple language",
ignored_languages: ["en"]Runtime Configuration
The endpoint_config map is dynamically applied at runtime. GettextTranslator converts config keys to LangChain application environment variables:
# This config map:
endpoint_config: %{
"api_key" => "sk-...",
"custom_setting" => "value"
}
# Becomes:
Application.put_env(:langchain, :api_key, "sk-...")
Application.put_env(:langchain, :custom_setting, "value")

This allows you to configure any LangChain-compatible adapter without code changes.
Environment-Specific Configuration
# config/dev.exs
config :gettext_translator, GettextTranslator,
endpoint: LangChain.ChatModels.ChatOllamaAI,
endpoint_model: "llama3.2:latest",
endpoint_temperature: 0,
endpoint_config: %{}, # Local Ollama, no config needed
persona: "You are a professional translator.",
style: "Casual",
ignored_languages: ["en"]
# config/prod.exs
config :gettext_translator, GettextTranslator,
endpoint: LangChain.ChatModels.ChatOpenAI,
endpoint_model: "gpt-4",
endpoint_temperature: 0,
endpoint_config: %{
"openai_key" => System.get_env("OPENAI_API_KEY")
},
persona: "You are a professional translator.",
style: "Casual",
ignored_languages: ["en"]Testing Your Custom Endpoint
1. Unit Test Your Adapter
# test/my_app/custom_llm_adapter_test.exs
defmodule MyApp.CustomLLMAdapterTest do
use ExUnit.Case
alias MyApp.CustomLLMAdapter
alias LangChain.Message
alias LangChain.Message.ContentPart
test "new/1 creates adapter with valid config" do
{:ok, adapter} = CustomLLMAdapter.new(%{
model: "test-model",
api_key: "test-key",
endpoint: "https://test.com"
})
assert adapter.model == "test-model"
assert adapter.api_key == "test-key"
end
test "new/1 validates required fields" do
assert {:error, _} = CustomLLMAdapter.new(%{model: "test"})
end
test "call/2 returns properly formatted message" do
adapter = CustomLLMAdapter.new!(%{
model: "test-model",
api_key: "test-key",
endpoint: "https://test.com"
})
messages = [
Message.new_user!("Translate 'hello' to Spanish")
]
# Mock the API response
# ... your mocking logic ...
assert {:ok, response} = CustomLLMAdapter.call(adapter, messages)
assert %Message{} = response
assert response.role == :assistant
assert is_list(response.content)
assert [%ContentPart{type: :text, content: text}] = response.content
assert is_binary(text)
end
end

2. Integration Test with GettextTranslator
# test/integration/translation_test.exs
defmodule GettextTranslatorIntegrationTest do
use ExUnit.Case
test "translates with custom adapter" do
provider = %{
ignored_languages: ["en"],
persona: "Professional translator",
style: "Casual",
endpoint: %{
config: %{
"api_key" => "test-key"
},
adapter: MyApp.CustomLLMAdapter,
model: "test-model",
temperature: 0
}
}
opts = %{
language_code: "es",
message: "Hello, world!"
}
assert {:ok, translation} = GettextTranslator.Processor.LLM.translate(provider, opts)
assert is_binary(translation)
assert translation != ""
end
end

3. Manual Testing
# Run translation with your custom adapter
mix gettext_translator.run
# Check logs for errors
tail -f log/dev.log | grep -i "error\|translat"
4. Verify Response Format
Add logging to verify your adapter returns the correct format:
def call(adapter, messages, _functions) do
  require Logger

  case make_api_request(adapter, messages) do
    {:ok, response_text} ->
      message = %Message{
        role: :assistant,
        content: [%ContentPart{type: :text, content: response_text}],
        status: :complete
      }

      # Debug logging
      Logger.debug("Adapter response: #{inspect(message)}")
      {:ok, message}

    {:error, reason} ->
      Logger.error("Adapter error: #{inspect(reason)}")
      {:error, nil, reason}
  end
end

Common Issues
Issue 1: "content is not a list"
Error:
** (FunctionClauseError) no function clause matching in ContentPart.parts_to_string/1

Cause: Your adapter is returning content as a string instead of a list of ContentPart structs.
Solution:
# ❌ WRONG
content: "translated text"
# ✅ CORRECT
content: [%ContentPart{type: :text, content: "translated text"}]

Issue 2: "undefined function ContentPart.new/1"
Error:
** (UndefinedFunctionError) function LangChain.Message.ContentPart.new/1 is undefined

Cause: Trying to use ContentPart.new/1 which doesn't exist.
Solution: Use struct syntax instead:
# ❌ WRONG
ContentPart.new(%{type: :text, content: "text"})
# ✅ CORRECT
%ContentPart{type: :text, content: "text"}

Issue 3: "pattern match failed on {:ok, result}"
Error:
** (MatchError) no match of right hand side value: {:ok, %Message{...}}

Cause: GettextTranslator expects {:ok, %{last_message: %Message{}}} but your adapter returns {:ok, %Message{}}.
Solution: Make sure your LLMChain implementation wraps the message properly:
# Your adapter's call/2 should return:
{:ok, %Message{...}}
# LLMChain will wrap it as:
{:ok, %LLMChain{last_message: %Message{...}}}

Issue 4: Empty translations returned
Symptoms: Translations complete but return empty strings.
Possible Causes:
- API errors being silently caught
- Response parsing errors
- Content extraction failing
Debug Steps:
# Add detailed logging in your adapter
def call(adapter, messages, _functions) do
  require Logger
  Logger.debug("Sending messages: #{inspect(messages)}")
  payload = build_payload(adapter, messages)

  case make_api_request(adapter, payload) do
    {:ok, response_text} ->
      Logger.debug("Received response: #{inspect(response_text)}")
      # ... rest of code

    {:error, reason} ->
      Logger.error("API error: #{inspect(reason)}")
      {:error, nil, reason}
  end
end

Issue 5: Streaming not working
Symptoms: Streaming responses timeout or fail.
Checklist:
- [ ] Endpoint supports Server-Sent Events (SSE)
- [ ] Accept: text/event-stream header is set
- [ ] SSE parsing handles the data: prefix correctly (see the test sketch below)
- [ ] Stream timeout is sufficient (30+ seconds)
- [ ] Deltas are properly merged
- [ ] Final delta has status: :complete
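If the SSE parsing item is in doubt, a couple of focused tests against the parse_sse_chunk/1 helper from the streaming adapter above can confirm the data: prefix and the [DONE] sentinel are handled. A sketch, assuming the helper is made public (or exercised through a public wrapper) for testing:

defmodule MyApp.StreamingLLMAdapterSSETest do
  use ExUnit.Case
  # Sketch only: exercises the parse_sse_chunk/1 helper shown earlier,
  # assuming it is exposed for testing.

  test "parses a content chunk" do
    chunk = ~s(data: {"choices":[{"delta":{"content":"Hola"}}]}\n\n)
    assert {:ok, "Hola"} = MyApp.StreamingLLMAdapter.parse_sse_chunk(chunk)
  end

  test "detects the end of the stream" do
    assert {:done} = MyApp.StreamingLLMAdapter.parse_sse_chunk("data: [DONE]\n\n")
  end
end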
Examples
Example 1: OpenAI-Compatible Endpoint
Many providers offer OpenAI-compatible APIs. You can use them with the built-in adapter:
config :gettext_translator, GettextTranslator,
endpoint: LangChain.ChatModels.ChatOpenAI,
endpoint_model: "your-model-name",
endpoint_temperature: 0,
endpoint_config: %{
"openai_key" => System.get_env("API_KEY"),
"openai_endpoint" => "https://your-provider.com/v1/chat/completions"
},
persona: "Professional translator",
style: "Casual",
ignored_languages: ["en"]Example 2: Azure OpenAI
config :gettext_translator, GettextTranslator,
endpoint: LangChain.ChatModels.ChatOpenAI,
endpoint_model: "gpt-4",
endpoint_temperature: 0,
endpoint_config: %{
"openai_key" => System.get_env("AZURE_OPENAI_KEY"),
"openai_endpoint" => "https://your-resource.openai.azure.com/openai/deployments/your-deployment/chat/completions?api-version=2023-05-15"
},
persona: "Professional translator",
style: "Casual",
ignored_languages: ["en"]Example 3: Local LLM with Ollama
# Note: Ollama support may be limited in LangChain 0.4.0
config :gettext_translator, GettextTranslator,
endpoint: LangChain.ChatModels.ChatOllamaAI,
endpoint_model: "llama3.2:latest",
endpoint_temperature: 0,
endpoint_config: %{
# Empty if using default local endpoint
},
persona: "Professional translator",
style: "Casual",
ignored_languages: ["en"]Example 4: Multiple Providers (Environment-Based)
# config/config.exs
# Helper functions cannot be defined at the top level of a config file,
# so resolve the provider with plain variables and case expressions.
provider = System.get_env("LLM_PROVIDER", "ollama")

endpoint_module =
  case provider do
    "openai" -> LangChain.ChatModels.ChatOpenAI
    "anthropic" -> LangChain.ChatModels.ChatAnthropic
    "gemini" -> LangChain.ChatModels.ChatGoogleAI
    "ollama" -> LangChain.ChatModels.ChatOllamaAI
    _ -> LangChain.ChatModels.ChatOpenAI
  end

endpoint_config =
  case provider do
    "openai" -> %{"openai_key" => System.get_env("OPENAI_API_KEY")}
    "anthropic" -> %{"anthropic_key" => System.get_env("ANTHROPIC_API_KEY")}
    "gemini" -> %{"google_ai_key" => System.get_env("GOOGLE_AI_KEY")}
    _ -> %{}
  end

config :gettext_translator, GettextTranslator,
  endpoint: endpoint_module,
  endpoint_model: System.get_env("LLM_MODEL", "llama3.2:latest"),
  endpoint_temperature: 0,
  endpoint_config: endpoint_config,
  persona: "Professional translator",
  style: "Casual",
  ignored_languages: ["en"]

Building Your Own LLM Gateway
If you're running LLMs locally (e.g., in Docker) and want to provide an HTTP API that works with LangChain:
📘 See LLM_GATEWAY_EXAMPLE.md for a complete, production-ready Elixir implementation that includes:
- OpenAI-compatible HTTP API endpoint
- Queue system with GenStage for backpressure control
- Support for multiple LLM backends (Ollama, vLLM, TGI)
- API key authentication and rate limiting
- Both streaming (SSE) and synchronous responses
- Complete Docker deployment setup
This gateway sits between LangChain clients and your local LLM, handling queuing, authentication, and protocol translation.
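Once a gateway like that is running, pointing GettextTranslator at it is just the OpenAI-compatible configuration from Example 1. A sketch, using a placeholder local URL and API key variable:

# Sketch: reuse the built-in OpenAI adapter against a local gateway.
# The URL and env var name below are placeholders, not part of the gateway doc.
config :gettext_translator, GettextTranslator,
  endpoint: LangChain.ChatModels.ChatOpenAI,
  endpoint_model: "llama3.2:latest",
  endpoint_temperature: 0,
  endpoint_config: %{
    "openai_key" => System.get_env("GATEWAY_API_KEY"),
    "openai_endpoint" => "http://localhost:4001/v1/chat/completions"
  },
  persona: "Professional translator",
  style: "Casual",
  ignored_languages: ["en"]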
Additional Resources
- LangChain Elixir Documentation
- LangChain 0.4.0 Changelog
- GettextTranslator Repository
- OpenAI API Documentation
- Anthropic API Documentation
- Ollama API Documentation
- vLLM Documentation
- Text Generation Inference
Support
If you encounter issues with custom endpoints:
- Check this guide for common issues
- Enable debug logging in your adapter
- Verify response format matches LangChain 0.4.0 requirements
- Open an issue on GitHub with:
  - Your adapter code (sanitized)
  - Error messages and stack traces
  - LangChain version
  - Example request/response payloads