Image classification — what's in this image?
Pass a Vix.Vips.Image.t/0 to classify/2 or labels/2 and
get back human-readable labels like "sports car" or
"Blenheim spaniel". Pass it to embed/2 to get a fixed-size
feature vector you can use for similarity search or downstream
learning.
Quick start
iex> puppy = Image.open!("./test/support/images/puppy.webp")
iex> [label | _rest] = Image.Classification.labels(puppy)
iex> label
"Blenheim spaniel"
Default models
The defaults are chosen for permissive licensing (Apache 2.0), reasonable size (<400 MB), and broad applicability:
Classification: facebook/convnext-tiny-224. ~110 MB, ~82.1% top-1 ImageNet, Apache 2.0. Returns one of 1000 ImageNet labels with a confidence score.
Embedding: facebook/dinov2-base. ~340 MB, Apache 2.0. Returns a 768-dim feature vector. Useful for "find similar images", clustering, or as input to a custom classifier.
Every default can be overridden through application configuration or
through the options passed to classifier/1 and embedder/1; see the Configuration section below.
Configuration
The classifier and the embedder are configured independently. The defaults are:
# config/runtime.exs
config :image_vision, :classifier,
  model: {:hf, "facebook/convnext-tiny-224"},
  featurizer: {:hf, "facebook/convnext-tiny-224"},
  model_options: [],
  featurizer_options: [],
  batch_size: 10,
  name: Image.Classification.Server,
  autostart: true

config :image_vision, :embedder,
  model: {:hf, "facebook/dinov2-base"},
  featurizer: {:hf, "facebook/dinov2-base"},
  model_options: [],
  featurizer_options: [],
  batch_size: 10,
  name: Image.Classification.EmbeddingServer,
  autostart: false
Servings and supervision
Bumblebee servings are heavyweight processes — a model load can take several seconds and consume hundreds of megabytes. Each classification or embedding entry point runs against a named serving process so the model loads once and is reused.
By default the classifier serving is autostarted by
ImageVision.Supervisor when the :image_vision application
starts. The embedding serving is not autostarted (most apps don't
need it).
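If your application does need embeddings at startup, the autostart flag from the default configuration can be flipped. This is a minimal sketch; it assumes that keys set in your own config are merged over the defaults, which the text above does not state explicitly.

```elixir
# config/runtime.exs
import Config

# Assumption: only the keys given here need to be listed; the remaining
# embedder settings are taken from the defaults shown above.
config :image_vision, :embedder, autostart: true
```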
To run a serving in your own supervision tree, set
autostart: false and use classifier/1 or embedder/1 to get
a child spec:
# application.ex
def start(_type, _args) do
  children = [
    Image.Classification.classifier(),
    Image.Classification.embedder(model: {:hf, "facebook/dinov2-large"})
  ]

  Supervisor.start_link(children, strategy: :one_for_one)
end
Optional dependency
This module is only available when Bumblebee,
Nx, and an Nx compiler such as
EXLA are configured in your
application's mix.exs.
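A dependency list along these lines would satisfy that requirement. Treat it as a sketch: the :image_vision package name is inferred from the config key used above, and the version requirements are placeholders, so check Hex for the current releases before copying it.

```elixir
# mix.exs (sketch; package name and version requirements are assumptions)
defp deps do
  [
    {:image_vision, ">= 0.0.0"},
    {:bumblebee, ">= 0.0.0"},
    {:nx, ">= 0.0.0"},
    # EXLA is one possible Nx compiler; another compiler could be used instead.
    {:exla, ">= 0.0.0"}
  ]
end
```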
Summary
Functions
classifier/1: Returns a child spec suitable for starting an image classification process as part of a supervision tree.
classify/2: Classifies an image and returns the full prediction map.
embed/2: Computes a feature vector embedding of an image.
embedder/1: Returns a child spec suitable for starting an image embedding process as part of a supervision tree.
labels/2: Classifies an image and returns the labels that meet a minimum confidence score.
Functions
@spec classifier(configuration :: Keyword.t()) :: {Nx.Serving, Keyword.t()} | {:error, Image.error()}
Returns a child spec suitable for starting an image classification process as part of a supervision tree.
Arguments
configuration is a keyword list merged over the default configuration.
Options
:model is any image classification model supported by Bumblebee. The default is {:hf, "facebook/convnext-tiny-224"}.
:featurizer is any image featurizer supported by Bumblebee. The default is {:hf, "facebook/convnext-tiny-224"}.
:model_options is a keyword list of options passed to Bumblebee.load_model/2. The default is [].
:featurizer_options is a keyword list of options passed to Bumblebee.load_featurizer/2. The default is [].
:name is the name of the serving process. The default is Image.Classification.Server.
:batch_size is the maximum batch size, passed to Bumblebee.Vision.image_classification/3. The default is 10.
Returns
A child spec tuple suitable for Supervisor.start_link/2, or {:error, reason} if the model could not be loaded.
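The :name option pairs with the :server option of classify/2 and labels/2. The sketch below starts a classifier under a custom name and then targets it explicitly. MyApp.Classifier is a hypothetical name, and the example assumes the default serving has been disabled with autostart: false so the model is not loaded twice.

```elixir
# application.ex (sketch). MyApp.Classifier is a hypothetical process name.
children = [
  Image.Classification.classifier(name: MyApp.Classifier, batch_size: 4)
]

Supervisor.start_link(children, strategy: :one_for_one)

# Later, point classify/2 at that serving through the :server option.
image = Image.open!("./test/support/images/puppy.webp")
Image.Classification.classify(image, server: MyApp.Classifier)
```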
@spec classify(image :: Vix.Vips.Image.t(), Keyword.t()) :: %{predictions: [%{label: String.t(), score: float()}]} | {:error, Image.error()}
Classifies an image and returns the full prediction map.
Arguments
image is any Vix.Vips.Image.t/0.
options is a keyword list of options.
Options
:backend is any valid Nx backend. The default is Nx.default_backend/0.
:server is the name of the serving process. The default is Image.Classification.Server.
Returns
A map of the form %{predictions: [%{label: String.t(), score: float()}]}, or {:error, reason}.
Examples
iex> puppy = Image.open!("./test/support/images/puppy.webp")
iex> %{predictions: [%{label: _label, score: _score} | _rest]} =
...> Image.Classification.classify(puppy)
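Because classify/2 can return either a prediction map or an error tuple, callers usually want to handle both. The following is a minimal sketch of one way to pull out the best prediction. PredictionSketch is a hypothetical helper, not part of this module, and the sample scores are made up.

```elixir
defmodule PredictionSketch do
  # Hypothetical helper, not part of Image.Classification.
  # Accepts either documented return shape of classify/2.
  def top_label(%{predictions: []}), do: :none

  def top_label(%{predictions: predictions}) do
    # Pick the highest-scoring entry without assuming the list is sorted.
    best = Enum.max_by(predictions, & &1.score)
    {:ok, best.label, best.score}
  end

  def top_label({:error, reason}), do: {:error, reason}
end

# Sample data shaped like the documented return value; the scores are made up.
result = %{
  predictions: [
    %{label: "Blenheim spaniel", score: 0.91},
    %{label: "papillon", score: 0.04}
  ]
}

{:ok, label, score} = PredictionSketch.top_label(result)
IO.puts("#{label} (#{Float.round(score * 100, 1)}%)")
```

In real code, result would be the value returned by Image.Classification.classify/2.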
@spec embed(image :: Vix.Vips.Image.t(), options :: Keyword.t()) :: Nx.Tensor.t() | {:error, Image.error()}
Computes a feature vector embedding of an image.
Embeddings are fixed-size dense vectors. Two images with similar visual content will have similar embeddings, making this useful for similarity search, clustering, or as input to a downstream classifier.
Arguments
image is any Vix.Vips.Image.t/0.
options is a keyword list of options.
Options
:backend is any valid Nx backend. The default is Nx.default_backend/0.
:server is the name of the embedding serving process. The default is Image.Classification.EmbeddingServer.
Returns
An Nx.Tensor of shape {embedding_size} (e.g. {768} for DINOv2-base), or {:error, reason}.
Examples
iex> puppy = Image.open!("./test/support/images/puppy.webp")
iex> embedding = Image.Classification.embed(puppy)
iex> Nx.shape(embedding)
{768}
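One common way to compare embeddings for similarity search is cosine similarity. The sketch below is an illustration only: SimilaritySketch is a hypothetical helper, it works on plain lists of floats (a tensor can be converted with Nx.to_flat_list/1), and the tiny vectors stand in for real embeddings.

```elixir
defmodule SimilaritySketch do
  # Hypothetical helper, not part of Image.Classification.
  # Returns a value in [-1.0, 1.0]; raises ArithmeticError for a zero vector.
  def cosine_similarity(a, b) when length(a) == length(b) do
    dot = a |> Enum.zip(b) |> Enum.reduce(0.0, fn {x, y}, acc -> acc + x * y end)
    dot / (norm(a) * norm(b))
  end

  # Euclidean length of a vector.
  defp norm(v), do: v |> Enum.reduce(0.0, fn x, acc -> acc + x * x end) |> :math.sqrt()
end

# Tiny made-up vectors stand in for real 768-element embeddings.
query = [1.0, 0.0, 1.0]
candidates = %{"a.jpg" => [1.0, 0.0, 1.0], "b.jpg" => [0.0, 1.0, 0.0]}

# Pick the candidate whose vector is closest to the query.
{best, _vector} =
  Enum.max_by(candidates, fn {_name, vector} ->
    SimilaritySketch.cosine_similarity(query, vector)
  end)

IO.puts(best)
```

For large collections, computing the similarity with Nx operations on the tensors directly would avoid the list conversion; the list version is used here only to keep the example dependency-free.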
@spec embedder(configuration :: Keyword.t()) :: {Nx.Serving, Keyword.t()} | {:error, Image.error()}
Returns a child spec suitable for starting an image embedding process as part of a supervision tree.
Embeddings are fixed-size feature vectors useful for similarity search, clustering, or as input to a downstream classifier.
Arguments
configuration is a keyword list merged over the default configuration.
Options
:model is any image embedding model supported by Bumblebee. The default is {:hf, "facebook/dinov2-base"}.
:featurizer is any image featurizer supported by Bumblebee. The default is {:hf, "facebook/dinov2-base"}.
:model_options is a keyword list of options passed to Bumblebee.load_model/2. The default is [].
:featurizer_options is a keyword list of options passed to Bumblebee.load_featurizer/2. The default is [].
:name is the name of the serving process. The default is Image.Classification.EmbeddingServer.
:batch_size is the maximum batch size. The default is 10.
Returns
A child spec tuple suitable for Supervisor.start_link/2, or {:error, reason} if the model could not be loaded.
@spec labels(image :: Vix.Vips.Image.t(), options :: Keyword.t()) :: [String.t()] | {:error, Image.error()}
Classifies an image and returns the labels that meet a minimum confidence score.
Arguments
image is any Vix.Vips.Image.t/0.
options is a keyword list of options.
Options
:backend is any valid Nx backend. The default is Nx.default_backend/0.
:min_score is the minimum score, a float between 0.0 and 1.0, that a label must meet to be returned. The default is 0.5.
:server is the name of the serving process. The default is Image.Classification.Server.
Returns
A list of labels, which may be empty if no prediction meets :min_score, or {:error, reason}.
Examples
iex> car = Image.open!("./test/support/images/lamborghini-forsennato-concept.jpg")
iex> Image.Classification.labels(car)
["sports car", "sport car"]
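The effect of :min_score can be pictured as a filter over the prediction map that classify/2 returns. The sketch below illustrates that rule only; LabelFilterSketch is a hypothetical helper, the scores are made up, and the library's own implementation may differ, for example in how it orders or splits labels.

```elixir
defmodule LabelFilterSketch do
  # Hypothetical helper illustrating the :min_score rule.
  # Keeps the labels whose score is at least min_score (default 0.5).
  def labels(%{predictions: predictions}, min_score \\ 0.5) do
    for %{label: label, score: score} <- predictions, score >= min_score, do: label
  end
end

# Made-up scores for illustration.
result = %{
  predictions: [
    %{label: "sports car", score: 0.72},
    %{label: "convertible", score: 0.18}
  ]
}

# With the default threshold only the first label qualifies.
confident = LabelFilterSketch.labels(result)

# A stricter threshold leaves nothing, so the list is empty.
strict = LabelFilterSketch.labels(result, 0.9)

IO.inspect({confident, strict})
```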