Object detection — where are the objects in this image?
Pass a Vix.Vips.Image.t/0 to detect/2 and get back a list
of detected objects with their class labels, confidence scores,
and bounding boxes.
Quick start
iex> car = Image.open!("./test/support/images/lamborghini-forsennato-concept.jpg")
iex> [%{label: _, score: _, box: _} | _] = Image.Detection.detect(car)
Default model
The default is RT-DETR —
a real-time, transformer-based detector that beats YOLOv8 on
COCO and is Apache 2.0 licensed (unlike YOLOv8/11 which are
AGPL). The ONNX export is hosted at
onnx-community/rtdetr_r50vd and is downloaded on first call
via ImageVision.ModelCache.
- Model: onnx-community/rtdetr_r50vd/onnx/model.onnx (~175 MB).
- Classes: 80 standard COCO classes (person, bicycle, car, …).
- Output: per-query class scores (sigmoid) and cxcywh bounding boxes. RT-DETR is NMS-free by design, so no Non-Maximum Suppression post-processing is required.
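To make the output format concrete, here is a minimal sketch of how a centre-based cxcywh box and a raw class logit could be turned into the pixel box and score that detections carry. The `BoxSketch` module is hypothetical and not part of Image.Detection, and the sketch assumes the model emits box values normalised to the range 0.0..1.0, which this page does not state.

```elixir
defmodule BoxSketch do
  @moduledoc "Hypothetical helper for illustration; not part of Image.Detection."

  # Converts a {cx, cy, w, h} box (assumed normalised to 0.0..1.0) into a
  # pixel {x, y, width, height} tuple whose (x, y) is the top-left corner.
  def cxcywh_to_xywh({cx, cy, w, h}, image_width, image_height) do
    width = max(round(w * image_width), 1)
    height = max(round(h * image_height), 1)
    x = max(round((cx - w / 2) * image_width), 0)
    y = max(round((cy - h / 2) * image_height), 0)
    {x, y, width, height}
  end

  # Logistic sigmoid: maps a raw class logit to a score in 0.0..1.0.
  def sigmoid(logit), do: 1.0 / (1.0 + :math.exp(-logit))
end

BoxSketch.cxcywh_to_xywh({0.5, 0.5, 0.5, 0.5}, 640, 480)
```

The clamping with `max/2` keeps the result consistent with a box type of non-negative offsets and positive sizes; the real post-processing may round or clip differently.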
Drawing detections
Use draw_bbox_with_labels/2 to overlay detections on the
original image:
image
|> Image.Detection.detect()
|> Image.Detection.draw_bbox_with_labels(image)
Optional dependency
This module is only available when Ortex
is configured in your application's mix.exs.
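As a rough illustration, adding Ortex to a project's dependencies might look like the fragment below. The version requirement is an assumption rather than something this page specifies, and the package that provides Image.Detection itself must also be listed; its name is not given here.

```elixir
# mix.exs (fragment). The "~> 0.1" requirement is an assumption;
# check the package's own installation notes for the supported range.
defp deps do
  [
    {:ortex, "~> 0.1"}
  ]
end
```

After editing mix.exs, run `mix deps.get` to fetch the dependency.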
Summary
Types
detection()
A single detected object.
Functions
classes()
Returns the list of class labels the default model can detect.
detect(image, options)
Detects objects in an image and returns a list of detections sorted by descending confidence.
draw_bbox_with_labels(detections, image, options)
Draws bounding boxes with class labels onto an image.
Types
@type detection() :: %{ label: String.t(), score: float(), box: {non_neg_integer(), non_neg_integer(), pos_integer(), pos_integer()} }
A single detected object.
- :label is one of the 80 COCO class names, e.g. "person".
- :score is the confidence score, a float in [0.0, 1.0].
- :box is {x, y, width, height} in pixel coordinates of the original image. (x, y) is the top-left corner.
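Because a detection is a plain map, it can be processed with ordinary Enum functions. The sketch below uses a hypothetical `DetectionSketch` module and made-up sample values to show two common manipulations: converting a box to corner coordinates and counting detections per label.

```elixir
defmodule DetectionSketch do
  # Hypothetical helpers operating on maps shaped like detection().

  # Converts {x, y, width, height} into {left, top, right, bottom}.
  def corners(%{box: {x, y, width, height}}), do: {x, y, x + width, y + height}

  # Counts how many detections were returned for each class label.
  def count_by_label(detections), do: Enum.frequencies_by(detections, & &1.label)
end

# Sample data invented for illustration; not real model output.
detections = [
  %{label: "car", score: 0.97, box: {40, 120, 600, 260}},
  %{label: "person", score: 0.81, box: {10, 90, 60, 180}},
  %{label: "car", score: 0.64, box: {500, 140, 120, 80}}
]

DetectionSketch.corners(hd(detections))
DetectionSketch.count_by_label(detections)
```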
Functions
@spec classes() :: [String.t()]
Returns the list of class labels the default model can detect.
Returns
- A list of 80 COCO class names as binaries, in the order used by RT-DETR.
@spec detect(image :: Vix.Vips.Image.t(), options :: Keyword.t()) :: [detection()]
Detects objects in an image and returns a list of detections sorted by descending confidence.
Arguments
- image is any Vix.Vips.Image.t/0.
- options is a keyword list of options.
Options
- :min_score is the minimum confidence score, a float in [0.0, 1.0], that a detection must meet to be returned. The default is 0.5.
- :repo is the HuggingFace repository for the model. The default is "onnx-community/rtdetr_r50vd".
- :filename is the ONNX file path within the repository. The default is "onnx/model.onnx". Use "onnx/model_quantized.onnx" (~45 MB INT8) for a much smaller model with some accuracy loss.
Returns
- A list of detection/0 maps, sorted by descending :score.
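The effect of :min_score and the result ordering can be pictured with a small stand-in. This is a sketch of the documented behaviour only, using a hypothetical `MinScoreSketch` module; it is not the library's implementation, and it assumes a detection "meets" the threshold when its score is greater than or equal to it.

```elixir
defmodule MinScoreSketch do
  # Hypothetical stand-in for the filtering and ordering described above.
  def filter_and_sort(detections, min_score \\ 0.5) do
    detections
    # Keep detections whose score meets the threshold (assumed inclusive).
    |> Enum.filter(&(&1.score >= min_score))
    # Highest confidence first.
    |> Enum.sort_by(& &1.score, :desc)
  end
end

MinScoreSketch.filter_and_sort([
  %{label: "dog", score: 0.4, box: {0, 0, 10, 10}},
  %{label: "person", score: 0.6, box: {5, 5, 20, 40}},
  %{label: "car", score: 0.9, box: {30, 10, 80, 50}}
])
```

With the real function, the documented options can be combined in one call, for example `Image.Detection.detect(image, min_score: 0.7, filename: "onnx/model_quantized.onnx")`.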
Examples
iex> car = Image.open!("./test/support/images/lamborghini-forsennato-concept.jpg")
iex> [%{label: _, score: _, box: _} | _] =
...> Image.Detection.detect(car, min_score: 0.5)
@spec draw_bbox_with_labels([detection()], Vix.Vips.Image.t(), Keyword.t()) :: Vix.Vips.Image.t()
Draws bounding boxes with class labels onto an image.
Builds an SVG overlay — one box and label per detection — and composites it onto the image. Each distinct class label gets a consistent colour so multiple detections of the same class are easy to identify at a glance.
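For intuition, the markup-building half of that process might resemble the sketch below. The `OverlaySketch` module, the element layout, and the single fixed colour are assumptions made for illustration; the actual overlay produced by the library may be structured differently, and the compositing step is not shown.

```elixir
defmodule OverlaySketch do
  # Hypothetical sketch: one <rect> and one <text> per detection.
  # Labels are inserted unescaped, which is only safe for plain words
  # such as the COCO class names.
  def svg(detections, width, height, colour \\ "red") do
    shapes =
      Enum.map_join(detections, "\n", fn %{label: label, box: {x, y, w, h}} ->
        # Place the label just above the box, but never off the top edge.
        label_y = max(y - 4, 13)

        ~s(<rect x="#{x}" y="#{y}" width="#{w}" height="#{h}" fill="none" stroke="#{colour}" stroke-width="2"/>) <>
          ~s(<text x="#{x}" y="#{label_y}" font-size="13" fill="#{colour}">#{label}</text>)
      end)

    ~s(<svg xmlns="http://www.w3.org/2000/svg" width="#{width}" height="#{height}">\n#{shapes}\n</svg>)
  end
end

OverlaySketch.svg([%{label: "car", score: 0.9, box: {40, 120, 600, 260}}], 1024, 768)
```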
Arguments
- detections is the list returned from detect/2.
- image is the image upon which detection was run.
- options is a keyword list of options.
Options
- :opacity is the opacity of the label background, a float in [0.0, 1.0]. The default is 0.85. Use 1.0 for fully opaque label backgrounds.
- :stroke_width is the bounding box stroke width in pixels. The default is 2.
- :font_size is the label text size in pixels. The default is 13.
- :palette is a list of CSS colour strings used to assign colours to labels. Cycles if there are more labels than colours. The default is a 10-colour high-contrast palette.
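The cycling behaviour of :palette can be sketched as follows. The `PaletteSketch` module and the two-colour palette are hypothetical; the sketch assumes colours are assigned to distinct labels in order of first appearance, which this page does not specify.

```elixir
defmodule PaletteSketch do
  # Hypothetical helper: maps each distinct label to a colour, starting
  # again from the first colour once the palette is exhausted.
  def assign_colours(detections, palette) do
    detections
    |> Enum.map(& &1.label)
    |> Enum.uniq()
    |> Enum.zip(Stream.cycle(palette))
    |> Map.new()
  end
end

PaletteSketch.assign_colours(
  [%{label: "car"}, %{label: "person"}, %{label: "car"}, %{label: "dog"}],
  ["red", "blue"]
)
```

Here three distinct labels share two colours, so "dog" wraps around to the first colour while both "car" detections keep the same one.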
Returns
- The annotated Vix.Vips.Image.t/0.
Examples
iex> car = Image.open!("./test/support/images/lamborghini-forsennato-concept.jpg")
iex> annotated =
...> car
...> |> Image.Detection.detect()
...> |> Image.Detection.draw_bbox_with_labels(car)
iex> match?(%Vix.Vips.Image{}, annotated)
true