Elixir client for the Hugging Face Inference API.
Mirrors the full feature set of the @huggingface/inference npm package:
- 24+ inference providers (Groq, Together, Replicate, Fal.ai, Nebius, …)
- 30+ ML tasks (chat completion, image generation, ASR, embeddings, …)
- Streaming via Server-Sent Events
- Automatic provider routing via the HF Hub API
Quick start
# Create a client
client = HuggingfaceClient.client("hf_your_token")
# Chat completion
{:ok, resp} = HuggingfaceClient.chat_completion(client, %{
model: "meta-llama/Llama-3.1-8B-Instruct",
messages: [%{role: "user", content: "Hello!"}]
})
IO.puts(resp["choices"] |> hd() |> get_in(["message", "content"]))
# Streaming chat completion
{:ok, stream} = HuggingfaceClient.chat_completion_stream(client, %{
model: "meta-llama/Llama-3.1-8B-Instruct",
messages: [%{role: "user", content: "Tell me a story"}]
})
Enum.each(stream, fn chunk ->
IO.write(get_in(chunk, ["choices", Access.at(0), "delta", "content"]) || "")
end)
# Use a different provider
{:ok, resp} = HuggingfaceClient.chat_completion(client, %{
model: "meta-llama/Llama-3.1-8B-Instruct",
provider: "groq",
messages: [%{role: "user", content: "Hi from Groq!"}]
})
# Text-to-image
{:ok, image_bytes} = HuggingfaceClient.text_to_image(client, %{
model: "stabilityai/stable-diffusion-2",
inputs: "a scenic mountain lake at sunset"
})
File.write!("output.png", image_bytes)
# Embeddings
{:ok, embeddings} = HuggingfaceClient.feature_extraction(client, %{
model: "sentence-transformers/all-MiniLM-L6-v2",
inputs: ["Hello world", "Bonjour le monde"]
})

Using a dedicated endpoint
endpoint_client = HuggingfaceClient.endpoint_client(
"hf_token",
"https://xyz.eu-west-1.aws.endpoints.huggingface.cloud/my-model"
)
{:ok, resp} = HuggingfaceClient.text_generation(endpoint_client, %{
inputs: "The answer is"
})

Configuration
config :huggingface_client,
hub_url: "https://huggingface.co", # default
router_url: "https://router.huggingface.co", # default
finch_opts: [
pools: %{
"https://api.groq.com" => [size: 25]
}
]
Summary
Functions
Applies a Jinja2 chat template to a list of messages.
Audio classification. Returns label + score pairs.
Audio-to-audio transformation (source separation, enhancement).
Automatic speech recognition / transcription.
Returns the list of providers that currently support a model.
Chat completion (OpenAI-compatible /v1/chat/completions).
Streaming chat completion. Returns {:ok, stream} where each element is a delta chunk.
Creates a new inference client with the given access token.
Collects all content tokens from a streaming chat completion into a single string.
Returns a lazy stream of plain content token strings from a chat completion stream.
Estimates monocular depth from an image. Returns a depth map useful for 3D reconstruction, AR, robotics.
Document question answering from scanned documents.
Creates a client tied to a specific inference endpoint.
Dense embedding / feature extraction.
Fill-mask (masked language modelling).
Image classification. Returns [%{"label" => ..., "score" => ...}].
Image segmentation. Returns list of %{label, mask, score} maps.
Image + text → image (multimodal). Takes an image input and text prompt, returns a generated image.
Multimodal vision-language: image + text prompt → text response (GPT-4V style).
Used for visual QA, chart understanding, document analysis, multi-turn vision conversations.
Different from image_to_text/2 which captions without a text prompt.
Image + text → video (multimodal). Takes an image input and text prompt, returns a generated video.
Image-to-image transformation (style transfer, inpainting, super-resolution).
Image captioning / image-to-text.
Animate a still image into a short video clip.
Generates segmentation masks (SAM / segment-anything style). Returns masks for all objects detected in an image.
Fetches model metadata and available inference providers from the Hub.
Object detection with bounding boxes.
Extractive question answering.
Renders a Jinja2 template string with the given variables.
Raw inference request — sends inputs directly to a provider with no task-level
validation or response transformation.
Sentence similarity scoring.
Abstractive summarisation.
Table question answering (TAPAS / TaBERT).
Tabular data classification. Returns predicted class indices.
Tabular data regression. Returns predicted float values.
Text classification (sentiment analysis, topic, etc.).
Text generation (completion, non-chat).
Streaming text generation. Returns {:ok, stream} of delta chunks.
Text-to-audio generation (music, effects). Returns audio bytes.
Text-to-image generation.
Text-to-speech synthesis. Returns audio bytes.
Text-to-video generation. Returns video bytes.
Token / entity classification (NER).
Neural machine translation.
Classifies a video clip into predefined categories.
Visual question answering — answer questions about an image.
Zero-shot text classification.
Zero-shot image classification with candidate labels.
Functions
@spec apply_chat_template(String.t(), [map()], map()) :: {:ok, String.t()} | {:error, Exception.t()}
Applies a Jinja2 chat template to a list of messages.
Delegates to HuggingfaceClient.Jinja.apply_chat_template/3.
Most HuggingFace models include a chat template in their
tokenizer_config.json under the "chat_template" key.
Example
template = ~s({% for m in messages %}<{{ m["role"] }}>{{ m["content"] }}</{{ m["role"] }}>{% endfor %})
{:ok, text} = HuggingfaceClient.apply_chat_template(template, [
%{"role" => "user", "content" => "Hello!"}
])
@spec audio_classification(HuggingfaceClient.Client.t(), map()) :: {:ok, list()} | {:error, Exception.t()}
Audio classification. Returns label + score pairs.
@spec audio_to_audio(HuggingfaceClient.Client.t(), map()) :: {:ok, list()} | {:error, Exception.t()}
Audio-to-audio transformation (source separation, enhancement).
@spec automatic_speech_recognition(HuggingfaceClient.Client.t(), map()) :: {:ok, map()} | {:error, Exception.t()}
Automatic speech recognition / transcription.
@spec available_providers(String.t(), keyword()) :: {:ok, [String.t()]} | {:error, Exception.t()}
Returns the list of providers that currently support a model.
Delegates to HuggingfaceClient.Inference.ModelInfo.available_providers/2.
Example
{:ok, providers} = HuggingfaceClient.available_providers("meta-llama/Llama-3.1-8B-Instruct")
# => ["groq", "together", "nebius", ...]
@spec chat_completion(HuggingfaceClient.Client.t(), map()) :: {:ok, map()} | {:error, Exception.t()}
Chat completion (OpenAI-compatible /v1/chat/completions).
Supports all providers that expose the chat completions API.
With provider: "auto" (the default), requests are routed to the first available provider
for the model, ordered by the user's preferences at https://hf.co/settings/inference-providers.
Arguments
- :model - HuggingFace model ID (e.g. "meta-llama/Llama-3.1-8B-Instruct")
- :messages - list of %{role: string, content: string} maps
- :provider - override the provider (e.g. "groq", "together")
- :max_tokens - maximum output tokens
- :temperature - sampling temperature
- :tools - list of function tool definitions
- any other OpenAI chat-completion parameters
Returns
{:ok, %{"choices" => [...], "model" => ..., ...}} or {:error, exception}.
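The :tools argument follows the OpenAI function-tool schema. A minimal sketch (the get_weather tool and its parameters are hypothetical, and whether the model emits a tool call depends on the model and provider):

```elixir
{:ok, resp} = HuggingfaceClient.chat_completion(client, %{
  model: "meta-llama/Llama-3.1-8B-Instruct",
  messages: [%{role: "user", content: "What is the weather in Paris?"}],
  tools: [
    %{
      type: "function",
      function: %{
        name: "get_weather",
        description: "Look up the current weather for a city",
        parameters: %{
          type: "object",
          properties: %{city: %{type: "string"}},
          required: ["city"]
        }
      }
    }
  ]
})

# If the model decides to call a tool, the call appears on the first
# choice's message under "tool_calls"; otherwise this is nil.
tool_calls = resp["choices"] |> hd() |> get_in(["message", "tool_calls"])
```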
@spec chat_completion_stream(HuggingfaceClient.Client.t(), map()) :: {:ok, Enumerable.t()} | {:error, Exception.t()}
Streaming chat completion. Returns {:ok, stream} where each element is a delta chunk.
Example
{:ok, stream} = HuggingfaceClient.chat_completion_stream(client, %{
model: "meta-llama/Llama-3.1-8B-Instruct",
messages: [%{role: "user", content: "Write a haiku"}]
})
Enum.each(stream, fn chunk ->
IO.write(get_in(chunk, ["choices", Access.at(0), "delta", "content"]) || "")
end)
@spec client(String.t() | nil, keyword()) :: HuggingfaceClient.Client.t()
Creates a new inference client with the given access token.
Options
- :provider - Default inference provider. The default value is nil.
- :bill_to - HF organisation to bill requests to. The default value is nil.
- :endpoint_url - Custom endpoint URL. Overrides provider-based routing. The default value is nil.
- :retry_on_503 (boolean/0) - Automatically retry once on HTTP 503 responses. The default value is true.
- :req_opts (keyword/0) - Extra options forwarded to Req. The default value is [].
Examples
client = HuggingfaceClient.client("hf_your_token")
client = HuggingfaceClient.client("hf_token", provider: "groq", bill_to: "my-org")
@spec collect_content(Enumerable.t()) :: String.t()
Collects all content tokens from a streaming chat completion into a single string.
Delegates to HuggingfaceClient.Inference.StreamHelpers.collect_content/1.
Example
{:ok, stream} = HuggingfaceClient.chat_completion_stream(client, args)
text = HuggingfaceClient.collect_content(stream)
@spec content_stream(Enumerable.t()) :: Enumerable.t()
Returns a lazy stream of plain content token strings from a chat completion stream.
Delegates to HuggingfaceClient.Inference.StreamHelpers.content_stream/1.
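A short sketch of piping a chat completion stream through content_stream/1 so each element is a plain string rather than a delta chunk map:

```elixir
{:ok, stream} = HuggingfaceClient.chat_completion_stream(client, %{
  model: "meta-llama/Llama-3.1-8B-Instruct",
  messages: [%{role: "user", content: "Hello!"}]
})

# content_stream/1 lazily extracts the content token from each chunk,
# so downstream code only sees binaries.
stream
|> HuggingfaceClient.content_stream()
|> Enum.each(&IO.write/1)
```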
@spec depth_estimation(HuggingfaceClient.Client.t(), map()) :: {:ok, term()} | {:error, Exception.t()}
Estimates monocular depth from an image. Returns a depth map useful for 3D reconstruction, AR, robotics.
Example
{:ok, result} = HuggingfaceClient.depth_estimation(client, %{
image: "https://example.com/scene.jpg"
})
@spec document_question_answering(HuggingfaceClient.Client.t(), map()) :: {:ok, map()} | {:error, Exception.t()}
Document question answering from scanned documents.
@spec endpoint_client(String.t() | nil, String.t(), keyword()) :: HuggingfaceClient.Client.t()
Creates a client tied to a specific inference endpoint.
Examples
client = HuggingfaceClient.endpoint_client(
"hf_token",
"https://xyz.eu-west-1.aws.endpoints.huggingface.cloud/gpt2"
)
@spec feature_extraction(HuggingfaceClient.Client.t(), map()) :: {:ok, list()} | {:error, Exception.t()}
Dense embedding / feature extraction.
Returns {:ok, [[float], ...]} — a list of embedding vectors.
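Since the result is a plain list of float vectors, similarity can be computed directly. A sketch using cosine similarity over two embeddings (the arithmetic below is ordinary Elixir, not part of this library's API):

```elixir
{:ok, [a, b]} = HuggingfaceClient.feature_extraction(client, %{
  model: "sentence-transformers/all-MiniLM-L6-v2",
  inputs: ["Hello world", "Bonjour le monde"]
})

# Cosine similarity: dot product divided by the product of the norms.
dot = Enum.zip(a, b) |> Enum.reduce(0.0, fn {x, y}, acc -> acc + x * y end)
norm = fn v -> :math.sqrt(Enum.reduce(v, 0.0, fn x, acc -> acc + x * x end)) end
cosine = dot / (norm.(a) * norm.(b))
```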
@spec fill_mask(HuggingfaceClient.Client.t(), map()) :: {:ok, list()} | {:error, Exception.t()}
Fill-mask (masked language modelling).
@spec image_classification(HuggingfaceClient.Client.t(), map()) :: {:ok, list()} | {:error, Exception.t()}
Image classification. Returns [%{"label" => ..., "score" => ...}].
@spec image_segmentation(HuggingfaceClient.Client.t(), map()) :: {:ok, list()} | {:error, Exception.t()}
Image segmentation. Returns list of %{label, mask, score} maps.
@spec image_text_to_image(HuggingfaceClient.Client.t(), map()) :: {:ok, binary()} | {:error, Exception.t()}
Image + text → image (multimodal). Takes an image input and text prompt, returns a generated image.
Recommended model: black-forest-labs/FLUX.1-dev
@spec image_text_to_text(HuggingfaceClient.Client.t(), map()) :: {:ok, term()} | {:error, Exception.t()}
Multimodal vision-language: image + text prompt → text response (GPT-4V style).
Used for visual QA, chart understanding, document analysis, multi-turn vision conversations.
Different from image_to_text/2 which captions without a text prompt.
Example
client = HuggingfaceClient.client(token, model: "llava-hf/llava-1.5-7b-hf")
{:ok, resp} = HuggingfaceClient.image_text_to_text(client, %{
image: "https://example.com/chart.png",
prompt: "What does this chart show?"
})
@spec image_text_to_video(HuggingfaceClient.Client.t(), map()) :: {:ok, binary()} | {:error, Exception.t()}
Image + text → video (multimodal). Takes an image input and text prompt, returns a generated video.
Recommended model: Lightricks/LTX-Video
@spec image_to_image(HuggingfaceClient.Client.t(), map()) :: {:ok, binary()} | {:error, Exception.t()}
Image-to-image transformation (style transfer, inpainting, super-resolution).
@spec image_to_text(HuggingfaceClient.Client.t(), map()) :: {:ok, map()} | {:error, Exception.t()}
Image captioning / image-to-text.
@spec image_to_video(HuggingfaceClient.Client.t(), map()) :: {:ok, binary()} | {:error, Exception.t()}
Animate a still image into a short video clip.
@spec mask_generation(HuggingfaceClient.Client.t(), map()) :: {:ok, term()} | {:error, Exception.t()}
Generates segmentation masks (SAM / segment-anything style). Returns masks for all objects detected in an image.
Example
{:ok, masks} = HuggingfaceClient.mask_generation(client, %{
image: "https://example.com/photo.jpg"
})
@spec model_info(String.t(), keyword()) :: {:ok, HuggingfaceClient.Inference.ModelInfo.t()} | {:error, Exception.t()}
Fetches model metadata and available inference providers from the Hub.
Delegates to HuggingfaceClient.Inference.ModelInfo.fetch/2.
Example
{:ok, info} = HuggingfaceClient.model_info("meta-llama/Llama-3.1-8B-Instruct",
access_token: "hf_..."
)
IO.inspect(info.providers)
@spec object_detection(HuggingfaceClient.Client.t(), map()) :: {:ok, list()} | {:error, Exception.t()}
Object detection with bounding boxes.
@spec question_answering(HuggingfaceClient.Client.t(), map()) :: {:ok, map()} | {:error, Exception.t()}
Extractive question answering.
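A sketch assuming the standard Hugging Face question-answering input shape (a question plus a context passage; the model ID is illustrative):

```elixir
{:ok, answer} = HuggingfaceClient.question_answering(client, %{
  model: "deepset/roberta-base-squad2",
  inputs: %{
    question: "Where does Sarah live?",
    context: "My name is Sarah and I live in London."
  }
})

# The answer span is extracted verbatim from the context.
IO.puts(answer["answer"])
```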
@spec render_template(String.t(), map()) :: {:ok, String.t()} | {:error, Exception.t()}
Renders a Jinja2 template string with the given variables.
Delegates to HuggingfaceClient.Jinja.render/2.
Example
{:ok, text} = HuggingfaceClient.render_template(
"Hello, {{ name }}!",
%{"name" => "World"}
)
@spec request(HuggingfaceClient.Client.t(), map()) :: {:ok, term()} | {:error, Exception.t()}
Raw inference request — sends inputs directly to a provider with no task-level
validation or response transformation.
Useful for:
- Custom fine-tuned models with non-standard I/O
- Providers not yet covered by a dedicated task
- Debugging raw provider responses
Example
{:ok, result} = HuggingfaceClient.request(client, %{
model: "my-user/my-custom-model",
inputs: "some raw text",
parameters: %{custom_param: 42}
})
@spec sentence_similarity(HuggingfaceClient.Client.t(), map()) :: {:ok, list()} | {:error, Exception.t()}
Sentence similarity scoring.
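A sketch assuming the standard Hugging Face sentence-similarity input shape (a source_sentence compared against a list of candidate sentences, yielding one score per candidate):

```elixir
{:ok, scores} = HuggingfaceClient.sentence_similarity(client, %{
  model: "sentence-transformers/all-MiniLM-L6-v2",
  inputs: %{
    source_sentence: "That is a happy person",
    sentences: ["That is a happy dog", "Today is a sunny day"]
  }
})

# scores is a list of floats, one per candidate sentence.
```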
@spec summarization(HuggingfaceClient.Client.t(), map()) :: {:ok, map()} | {:error, Exception.t()}
Abstractive summarisation.
@spec table_question_answering(HuggingfaceClient.Client.t(), map()) :: {:ok, map()} | {:error, Exception.t()}
Table question answering (TAPAS / TaBERT).
@spec tabular_classification(HuggingfaceClient.Client.t(), map()) :: {:ok, list()} | {:error, Exception.t()}
Tabular data classification. Returns predicted class indices.
@spec tabular_regression(HuggingfaceClient.Client.t(), map()) :: {:ok, list()} | {:error, Exception.t()}
Tabular data regression. Returns predicted float values.
@spec text_classification(HuggingfaceClient.Client.t(), map()) :: {:ok, list()} | {:error, Exception.t()}
Text classification (sentiment analysis, topic, etc.).
Returns {:ok, [%{"label" => ..., "score" => ...}]}.
@spec text_generation(HuggingfaceClient.Client.t(), map()) :: {:ok, map()} | {:error, Exception.t()}
Text generation (completion, non-chat).
Returns {:ok, %{"generated_text" => string}}.
@spec text_generation_stream(HuggingfaceClient.Client.t(), map()) :: {:ok, Enumerable.t()} | {:error, Exception.t()}
Streaming text generation. Returns {:ok, stream} of delta chunks.
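A sketch of consuming the stream. The chunk shape shown mirrors the common text-generation-inference streaming format ("token" => %{"text" => ...}); the exact keys may vary by provider:

```elixir
{:ok, stream} = HuggingfaceClient.text_generation_stream(client, %{
  model: "meta-llama/Llama-3.1-8B-Instruct",
  inputs: "Once upon a time"
})

Enum.each(stream, fn chunk ->
  IO.write(get_in(chunk, ["token", "text"]) || "")
end)
```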
@spec text_to_audio(HuggingfaceClient.Client.t(), map()) :: {:ok, binary()} | {:error, Exception.t()}
Text-to-audio generation (music, effects). Returns audio bytes.
@spec text_to_image(HuggingfaceClient.Client.t(), map()) :: {:ok, binary()} | {:error, Exception.t()}
Text-to-image generation.
Returns {:ok, binary} (raw image bytes) by default.
Pass output_type: :url for a URL string (where provider supports it).
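Both output modes side by side (raw bytes by default, URL where the provider supports it):

```elixir
# Default: raw image bytes, written straight to disk.
{:ok, png} = HuggingfaceClient.text_to_image(client, %{
  model: "stabilityai/stable-diffusion-2",
  inputs: "a watercolor fox"
})
File.write!("fox.png", png)

# URL output instead of bytes, where the provider supports it.
{:ok, url} = HuggingfaceClient.text_to_image(client, %{
  model: "stabilityai/stable-diffusion-2",
  inputs: "a watercolor fox",
  output_type: :url
})
```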
@spec text_to_speech(HuggingfaceClient.Client.t(), map()) :: {:ok, binary()} | {:error, Exception.t()}
Text-to-speech synthesis. Returns audio bytes.
@spec text_to_video(HuggingfaceClient.Client.t(), map()) :: {:ok, binary()} | {:error, Exception.t()}
Text-to-video generation. Returns video bytes.
@spec token_classification(HuggingfaceClient.Client.t(), map()) :: {:ok, list()} | {:error, Exception.t()}
Token / entity classification (NER).
@spec translation(HuggingfaceClient.Client.t(), map()) :: {:ok, map()} | {:error, Exception.t()}
Neural machine translation.
@spec video_classification(HuggingfaceClient.Client.t(), map()) :: {:ok, term()} | {:error, Exception.t()}
Classifies a video clip into predefined categories.
Example
{:ok, results} = HuggingfaceClient.video_classification(client, %{
video: File.read!("action.mp4"),
top_k: 5
})
@spec visual_question_answering(HuggingfaceClient.Client.t(), map()) :: {:ok, map()} | {:error, Exception.t()}
Visual question answering — answer questions about an image.
@spec zero_shot_classification(HuggingfaceClient.Client.t(), map()) :: {:ok, list()} | {:error, Exception.t()}
Zero-shot text classification.
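A sketch assuming the standard Hugging Face zero-shot parameters (candidate_labels passed under parameters; the model ID is illustrative):

```elixir
{:ok, result} = HuggingfaceClient.zero_shot_classification(client, %{
  model: "facebook/bart-large-mnli",
  inputs: "The new GPU doubles training throughput",
  parameters: %{candidate_labels: ["hardware", "sports", "politics"]}
})

# result pairs each candidate label with a score.
```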
@spec zero_shot_image_classification(HuggingfaceClient.Client.t(), map()) :: {:ok, list()} | {:error, Exception.t()}
Zero-shot image classification with candidate labels.