# `HuggingfaceClient.Inference.Inference`
[🔗](https://github.com/huggingface/huggingface_client/blob/v0.1.0/lib/huggingface_client/inference/inference.ex#L1)

Elixir client for the Hugging Face Inference API.

Mirrors the full feature set of the `@huggingface/inference` npm package:
- **24+ inference providers** (Groq, Together, Replicate, Fal.ai, Nebius, …)
- **30+ ML tasks** (chat completion, image generation, ASR, embeddings, …)
- **Streaming** via Server-Sent Events
- **Automatic provider routing** via the HF Hub API

## Quick start

    # Create a client
    client = HuggingfaceClient.client("hf_your_token")

    # Chat completion
    {:ok, resp} = HuggingfaceClient.chat_completion(client, %{
      model: "meta-llama/Llama-3.1-8B-Instruct",
      messages: [%{role: "user", content: "Hello!"}]
    })
    IO.puts(get_in(resp, ["choices", Access.at(0), "message", "content"]))

    # Streaming chat completion
    {:ok, stream} = HuggingfaceClient.chat_completion_stream(client, %{
      model: "meta-llama/Llama-3.1-8B-Instruct",
      messages: [%{role: "user", content: "Tell me a story"}]
    })
    Enum.each(stream, fn chunk ->
      IO.write(get_in(chunk, ["choices", Access.at(0), "delta", "content"]) || "")
    end)

    # Use a different provider
    {:ok, resp} = HuggingfaceClient.chat_completion(client, %{
      model: "meta-llama/Llama-3.1-8B-Instruct",
      provider: "groq",
      messages: [%{role: "user", content: "Hi from Groq!"}]
    })

    # Text-to-image
    {:ok, image_bytes} = HuggingfaceClient.text_to_image(client, %{
      model: "stabilityai/stable-diffusion-2",
      inputs: "a scenic mountain lake at sunset"
    })
    File.write!("output.png", image_bytes)

    # Embeddings
    {:ok, embeddings} = HuggingfaceClient.feature_extraction(client, %{
      model: "sentence-transformers/all-MiniLM-L6-v2",
      inputs: ["Hello world", "Bonjour le monde"]
    })

## Using a dedicated endpoint

    endpoint_client = HuggingfaceClient.endpoint_client(
      "hf_token",
      "https://xyz.eu-west-1.aws.endpoints.huggingface.cloud/my-model"
    )
    {:ok, resp} = HuggingfaceClient.text_generation(endpoint_client, %{
      inputs: "The answer is"
    })

## Configuration

    config :huggingface_client,
      hub_url: "https://huggingface.co",      # default
      router_url: "https://router.huggingface.co",  # default
      finch_opts: [
        pools: %{
          "https://api.groq.com" => [size: 25]
        }
      ]

# `apply_chat_template`

```elixir
@spec apply_chat_template(String.t(), [map()], map()) ::
  {:ok, String.t()} | {:error, Exception.t()}
```

Applies a Jinja2 chat template to a list of messages.

Delegates to `HuggingfaceClient.Jinja.apply_chat_template/3`.

Most HuggingFace models include a chat template in their
`tokenizer_config.json` under the `"chat_template"` key.

## Example

    template = ~s({% for m in messages %}<{{ m["role"] }}>{{ m["content"] }}</{{ m["role"] }}>{% endfor %})

    {:ok, text} = HuggingfaceClient.apply_chat_template(template, [
      %{"role" => "user", "content" => "Hello!"}
    ])

# `audio_classification`

```elixir
@spec audio_classification(HuggingfaceClient.Client.t(), map()) ::
  {:ok, list()} | {:error, Exception.t()}
```

Audio classification. Returns label + score pairs.

# `audio_to_audio`

```elixir
@spec audio_to_audio(HuggingfaceClient.Client.t(), map()) ::
  {:ok, list()} | {:error, Exception.t()}
```

Audio-to-audio transformation (source separation, enhancement).

# `automatic_speech_recognition`

```elixir
@spec automatic_speech_recognition(HuggingfaceClient.Client.t(), map()) ::
  {:ok, map()} | {:error, Exception.t()}
```

Automatic speech recognition / transcription.

# `available_providers`

```elixir
@spec available_providers(
  String.t(),
  keyword()
) :: {:ok, [String.t()]} | {:error, Exception.t()}
```

Returns the list of providers that currently support a model.

Delegates to `HuggingfaceClient.Inference.ModelInfo.available_providers/2`.

## Example

    {:ok, providers} = HuggingfaceClient.available_providers("meta-llama/Llama-3.1-8B-Instruct")
    # => ["groq", "together", "nebius", ...]

# `chat_completion`

```elixir
@spec chat_completion(HuggingfaceClient.Client.t(), map()) ::
  {:ok, map()} | {:error, Exception.t()}
```

Chat completion (OpenAI-compatible `/v1/chat/completions`).

Supports all providers that expose the chat completions API.
With `provider: "auto"` (the default), requests are routed to the first available
provider for the model, ordered by the user's preference at https://hf.co/settings/inference-providers.

## Arguments

- `:model` — HuggingFace model ID (e.g. `"meta-llama/Llama-3.1-8B-Instruct"`)
- `:messages` — list of `%{role: string, content: string}` maps
- `:provider` — override the provider (e.g. `"groq"`, `"together"`)
- `:max_tokens` — maximum output tokens
- `:temperature` — sampling temperature
- `:tools` — list of function tool definitions
- any other OpenAI chat-completion parameters

## Returns

`{:ok, %{"choices" => [...], "model" => ..., ...}}` or `{:error, exception}`.
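
## Example

A sketch of tool calling. The tool definition follows the OpenAI function-tool
schema, and the `get_weather` tool here is purely illustrative:

    {:ok, resp} = HuggingfaceClient.chat_completion(client, %{
      model: "meta-llama/Llama-3.1-8B-Instruct",
      messages: [%{role: "user", content: "What is the weather in Paris?"}],
      tools: [%{
        type: "function",
        function: %{
          name: "get_weather",
          description: "Get the current weather for a city",
          parameters: %{
            type: "object",
            properties: %{city: %{type: "string"}},
            required: ["city"]
          }
        }
      }]
    })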

# `chat_completion_stream`

```elixir
@spec chat_completion_stream(HuggingfaceClient.Client.t(), map()) ::
  {:ok, Enumerable.t()} | {:error, Exception.t()}
```

Streaming chat completion. Returns `{:ok, stream}` where each element is a delta chunk.

## Example

    {:ok, stream} = HuggingfaceClient.chat_completion_stream(client, %{
      model: "meta-llama/Llama-3.1-8B-Instruct",
      messages: [%{role: "user", content: "Write a haiku"}]
    })
    Enum.each(stream, fn chunk ->
      IO.write(get_in(chunk, ["choices", Access.at(0), "delta", "content"]) || "")
    end)

# `client`

```elixir
@spec client(
  String.t() | nil,
  keyword()
) :: HuggingfaceClient.Client.t()
```

Creates a new inference client with the given access token.

## Options

* `:provider` - Default inference provider. The default value is `nil`.

* `:bill_to` - HF organisation to bill requests to. The default value is `nil`.

* `:endpoint_url` - Custom endpoint URL. Overrides provider-based routing. The default value is `nil`.

* `:retry_on_503` (`t:boolean/0`) - Automatically retry once on HTTP 503 responses. The default value is `true`.

* `:req_opts` (`t:keyword/0`) - Extra options forwarded to Req. The default value is `[]`.

## Examples

    client = HuggingfaceClient.client("hf_your_token")
    client = HuggingfaceClient.client("hf_token", provider: "groq", bill_to: "my-org")

# `collect_content`

```elixir
@spec collect_content(Enumerable.t()) :: String.t()
```

Collects all content tokens from a streaming chat completion into a single string.

Delegates to `HuggingfaceClient.Inference.StreamHelpers.collect_content/1`.

## Example

    {:ok, stream} = HuggingfaceClient.chat_completion_stream(client, args)
    text = HuggingfaceClient.collect_content(stream)

# `content_stream`

```elixir
@spec content_stream(Enumerable.t()) :: Enumerable.t()
```

Returns a lazy stream of plain content token strings from a chat completion stream.

Delegates to `HuggingfaceClient.Inference.StreamHelpers.content_stream/1`.
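
## Example

Combined with `chat_completion_stream/2`, it streams tokens straight to stdout:

    {:ok, stream} = HuggingfaceClient.chat_completion_stream(client, args)

    stream
    |> HuggingfaceClient.content_stream()
    |> Enum.each(&IO.write/1)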

# `depth_estimation`

```elixir
@spec depth_estimation(HuggingfaceClient.Client.t(), map()) ::
  {:ok, term()} | {:error, Exception.t()}
```

Estimates monocular depth from an image.
Returns a depth map useful for 3D reconstruction, AR, and robotics.

## Example

    {:ok, result} = HuggingfaceClient.depth_estimation(client, %{
      image: "https://example.com/scene.jpg"
    })

# `document_question_answering`

```elixir
@spec document_question_answering(HuggingfaceClient.Client.t(), map()) ::
  {:ok, map()} | {:error, Exception.t()}
```

Document question answering from scanned documents.

# `endpoint_client`

```elixir
@spec endpoint_client(String.t() | nil, String.t(), keyword()) ::
  HuggingfaceClient.Client.t()
```

Creates a client tied to a specific inference endpoint.

## Examples

    client = HuggingfaceClient.endpoint_client(
      "hf_token",
      "https://xyz.eu-west-1.aws.endpoints.huggingface.cloud/gpt2"
    )

# `feature_extraction`

```elixir
@spec feature_extraction(HuggingfaceClient.Client.t(), map()) ::
  {:ok, list()} | {:error, Exception.t()}
```

Dense embedding / feature extraction.

Returns `{:ok, [[float], ...]}` — a list of embedding vectors.
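
## Example

Cosine similarity between two returned vectors can be computed in plain Elixir:

    {:ok, [a, b]} = HuggingfaceClient.feature_extraction(client, %{
      model: "sentence-transformers/all-MiniLM-L6-v2",
      inputs: ["Hello world", "Bonjour le monde"]
    })

    dot = Enum.zip(a, b) |> Enum.reduce(0.0, fn {x, y}, acc -> acc + x * y end)
    norm = fn v -> :math.sqrt(Enum.reduce(v, 0.0, fn x, acc -> acc + x * x end)) end
    similarity = dot / (norm.(a) * norm.(b))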

# `fill_mask`

```elixir
@spec fill_mask(HuggingfaceClient.Client.t(), map()) ::
  {:ok, list()} | {:error, Exception.t()}
```

Fill-mask (masked language modelling).
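
## Example

The mask token is model-specific (`[MASK]` for BERT-style models, `<mask>` for
RoBERTa-style models). A sketch, assuming the response follows the usual
HF Inference API shape:

    {:ok, candidates} = HuggingfaceClient.fill_mask(client, %{
      model: "google-bert/bert-base-uncased",
      inputs: "The capital of France is [MASK]."
    })
    # candidates: [%{"token_str" => ..., "score" => ..., ...}, ...]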

# `image_classification`

```elixir
@spec image_classification(HuggingfaceClient.Client.t(), map()) ::
  {:ok, list()} | {:error, Exception.t()}
```

Image classification. Returns `[%{"label" => ..., "score" => ...}]`.

# `image_segmentation`

```elixir
@spec image_segmentation(HuggingfaceClient.Client.t(), map()) ::
  {:ok, list()} | {:error, Exception.t()}
```

Image segmentation. Returns list of `%{label, mask, score}` maps.

# `image_text_to_image`

```elixir
@spec image_text_to_image(HuggingfaceClient.Client.t(), map()) ::
  {:ok, binary()} | {:error, Exception.t()}
```

Image + text → image (multimodal). Takes an image input and text prompt, returns a generated image.

Recommended model: `black-forest-labs/FLUX.1-dev`

# `image_text_to_text`

```elixir
@spec image_text_to_text(HuggingfaceClient.Client.t(), map()) ::
  {:ok, term()} | {:error, Exception.t()}
```

Multimodal vision-language: image + text prompt → text response (GPT-4V style).
Used for visual QA, chart understanding, document analysis, multi-turn vision conversations.
Different from `image_to_text/2` which captions without a text prompt.

## Example

    client = HuggingfaceClient.client(token)
    {:ok, resp} = HuggingfaceClient.image_text_to_text(client, %{
      model: "llava-hf/llava-1.5-7b-hf",
      image: "https://example.com/chart.png",
      prompt: "What does this chart show?"
    })

# `image_text_to_video`

```elixir
@spec image_text_to_video(HuggingfaceClient.Client.t(), map()) ::
  {:ok, binary()} | {:error, Exception.t()}
```

Image + text → video (multimodal). Takes an image input and text prompt, returns a generated video.

Recommended model: `Lightricks/LTX-Video`

# `image_to_image`

```elixir
@spec image_to_image(HuggingfaceClient.Client.t(), map()) ::
  {:ok, binary()} | {:error, Exception.t()}
```

Image-to-image transformation (style transfer, inpainting, super-resolution).

# `image_to_text`

```elixir
@spec image_to_text(HuggingfaceClient.Client.t(), map()) ::
  {:ok, map()} | {:error, Exception.t()}
```

Image captioning / image-to-text.

# `image_to_video`

```elixir
@spec image_to_video(HuggingfaceClient.Client.t(), map()) ::
  {:ok, binary()} | {:error, Exception.t()}
```

Animate a still image into a short video clip.

# `mask_generation`

```elixir
@spec mask_generation(HuggingfaceClient.Client.t(), map()) ::
  {:ok, term()} | {:error, Exception.t()}
```

Generates segmentation masks (SAM / segment-anything style).
Returns masks for all objects detected in an image.

## Example

    {:ok, masks} = HuggingfaceClient.mask_generation(client, %{
      image: "https://example.com/photo.jpg"
    })

# `model_info`

```elixir
@spec model_info(
  String.t(),
  keyword()
) :: {:ok, HuggingfaceClient.Inference.ModelInfo.t()} | {:error, Exception.t()}
```

Fetches model metadata and available inference providers from the Hub.

Delegates to `HuggingfaceClient.Inference.ModelInfo.fetch/2`.

## Example

    {:ok, info} = HuggingfaceClient.model_info("meta-llama/Llama-3.1-8B-Instruct",
      access_token: "hf_..."
    )
    IO.inspect(info.providers)

# `object_detection`

```elixir
@spec object_detection(HuggingfaceClient.Client.t(), map()) ::
  {:ok, list()} | {:error, Exception.t()}
```

Object detection with bounding boxes.

# `question_answering`

```elixir
@spec question_answering(HuggingfaceClient.Client.t(), map()) ::
  {:ok, map()} | {:error, Exception.t()}
```

Extractive question answering.
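
## Example

A sketch, assuming the payload follows the HF Inference API convention of a
`question`/`context` map under `inputs`:

    {:ok, answer} = HuggingfaceClient.question_answering(client, %{
      model: "deepset/roberta-base-squad2",
      inputs: %{
        question: "Where does Sarah live?",
        context: "My name is Sarah and I live in London."
      }
    })
    # answer: %{"answer" => ..., "score" => ..., "start" => ..., "end" => ...}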

# `render_template`

```elixir
@spec render_template(String.t(), map()) ::
  {:ok, String.t()} | {:error, Exception.t()}
```

Renders a Jinja2 template string with the given variables.

Delegates to `HuggingfaceClient.Jinja.render/2`.

## Example

    {:ok, text} = HuggingfaceClient.render_template(
      "Hello, {{ name }}!",
      %{"name" => "World"}
    )

# `request`

```elixir
@spec request(HuggingfaceClient.Client.t(), map()) ::
  {:ok, term()} | {:error, Exception.t()}
```

Raw inference request — sends `inputs` directly to a provider with no task-level
validation or response transformation.

Useful for:
- Custom fine-tuned models with non-standard I/O
- Providers not yet covered by a dedicated task
- Debugging raw provider responses

## Example

    {:ok, result} = HuggingfaceClient.request(client, %{
      model: "my-user/my-custom-model",
      inputs: "some raw text",
      parameters: %{custom_param: 42}
    })

# `sentence_similarity`

```elixir
@spec sentence_similarity(HuggingfaceClient.Client.t(), map()) ::
  {:ok, list()} | {:error, Exception.t()}
```

Sentence similarity scoring.
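
## Example

A sketch, assuming the HF Inference API convention of a `source_sentence` plus
candidate `sentences` under `inputs`; the result is one score per candidate:

    {:ok, scores} = HuggingfaceClient.sentence_similarity(client, %{
      model: "sentence-transformers/all-MiniLM-L6-v2",
      inputs: %{
        source_sentence: "That is a happy person",
        sentences: ["That is a happy dog", "Today is a sunny day"]
      }
    })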

# `summarization`

```elixir
@spec summarization(HuggingfaceClient.Client.t(), map()) ::
  {:ok, map()} | {:error, Exception.t()}
```

Abstractive summarisation.

# `table_question_answering`

```elixir
@spec table_question_answering(HuggingfaceClient.Client.t(), map()) ::
  {:ok, map()} | {:error, Exception.t()}
```

Table question answering (TAPAS / TaBERT).

# `tabular_classification`

```elixir
@spec tabular_classification(HuggingfaceClient.Client.t(), map()) ::
  {:ok, list()} | {:error, Exception.t()}
```

Tabular data classification. Returns predicted class indices.

# `tabular_regression`

```elixir
@spec tabular_regression(HuggingfaceClient.Client.t(), map()) ::
  {:ok, list()} | {:error, Exception.t()}
```

Tabular data regression. Returns predicted float values.

# `text_classification`

```elixir
@spec text_classification(HuggingfaceClient.Client.t(), map()) ::
  {:ok, list()} | {:error, Exception.t()}
```

Text classification (sentiment analysis, topic, etc.).

Returns `{:ok, [%{"label" => ..., "score" => ...}]}` — a list of label/score maps.

# `text_generation`

```elixir
@spec text_generation(HuggingfaceClient.Client.t(), map()) ::
  {:ok, map()} | {:error, Exception.t()}
```

Text generation (completion, non-chat).

Returns `{:ok, %{"generated_text" => string}}`.
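
## Example

A sketch; the `parameters` keys follow the usual HF text-generation conventions:

    {:ok, %{"generated_text" => text}} = HuggingfaceClient.text_generation(client, %{
      model: "openai-community/gpt2",
      inputs: "The answer to the universe is",
      parameters: %{max_new_tokens: 20, temperature: 0.7}
    })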

# `text_generation_stream`

```elixir
@spec text_generation_stream(HuggingfaceClient.Client.t(), map()) ::
  {:ok, Enumerable.t()} | {:error, Exception.t()}
```

Streaming text generation. Returns `{:ok, stream}` of delta chunks.

# `text_to_audio`

```elixir
@spec text_to_audio(HuggingfaceClient.Client.t(), map()) ::
  {:ok, binary()} | {:error, Exception.t()}
```

Text-to-audio generation (music, effects). Returns audio bytes.

# `text_to_image`

```elixir
@spec text_to_image(HuggingfaceClient.Client.t(), map()) ::
  {:ok, binary()} | {:error, Exception.t()}
```

Text-to-image generation.

Returns `{:ok, binary}` (raw image bytes) by default.
Pass `output_type: :url` for a URL string (where provider supports it).

# `text_to_speech`

```elixir
@spec text_to_speech(HuggingfaceClient.Client.t(), map()) ::
  {:ok, binary()} | {:error, Exception.t()}
```

Text-to-speech synthesis. Returns audio bytes.
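
## Example

A sketch; the audio format (WAV, FLAC, MP3, …) depends on the model and
provider, so the `.wav` extension below is an assumption:

    {:ok, audio} = HuggingfaceClient.text_to_speech(client, %{
      model: "espnet/kan-bayashi_ljspeech_vits",
      inputs: "Hello from Elixir!"
    })
    File.write!("speech.wav", audio)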

# `text_to_video`

```elixir
@spec text_to_video(HuggingfaceClient.Client.t(), map()) ::
  {:ok, binary()} | {:error, Exception.t()}
```

Text-to-video generation. Returns video bytes.

# `token_classification`

```elixir
@spec token_classification(HuggingfaceClient.Client.t(), map()) ::
  {:ok, list()} | {:error, Exception.t()}
```

Token / entity classification (NER).
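
## Example

A sketch, assuming the usual HF NER response fields (`entity_group`, `word`,
`score`, character offsets):

    {:ok, entities} = HuggingfaceClient.token_classification(client, %{
      model: "dslim/bert-base-NER",
      inputs: "Sarah lives in London"
    })
    # entities: [%{"entity_group" => ..., "word" => ..., "score" => ...}, ...]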

# `translation`

```elixir
@spec translation(HuggingfaceClient.Client.t(), map()) ::
  {:ok, map()} | {:error, Exception.t()}
```

Neural machine translation.

# `video_classification`

```elixir
@spec video_classification(HuggingfaceClient.Client.t(), map()) ::
  {:ok, term()} | {:error, Exception.t()}
```

Classifies a video clip into predefined categories.

## Example

    {:ok, results} = HuggingfaceClient.video_classification(client, %{
      video: File.read!("action.mp4"),
      top_k: 5
    })

# `visual_question_answering`

```elixir
@spec visual_question_answering(HuggingfaceClient.Client.t(), map()) ::
  {:ok, map()} | {:error, Exception.t()}
```

Visual question answering — answer questions about an image.

# `zero_shot_classification`

```elixir
@spec zero_shot_classification(HuggingfaceClient.Client.t(), map()) ::
  {:ok, list()} | {:error, Exception.t()}
```

Zero-shot text classification.
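
## Example

A sketch, assuming `candidate_labels` is passed under `parameters` as in the
HF Inference API:

    {:ok, result} = HuggingfaceClient.zero_shot_classification(client, %{
      model: "facebook/bart-large-mnli",
      inputs: "I just bought tickets to see my favourite band!",
      parameters: %{candidate_labels: ["music", "travel", "finance"]}
    })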

# `zero_shot_image_classification`

```elixir
@spec zero_shot_image_classification(HuggingfaceClient.Client.t(), map()) ::
  {:ok, list()} | {:error, Exception.t()}
```

Zero-shot image classification with candidate labels.

---

*Consult [api-reference.md](api-reference.md) for the complete listing.*
