# `Image.Detection`
[🔗](https://github.com/elixir-image/image_vision/blob/v0.2.0/lib/detection.ex#L2)

Object detection — where are the objects in this image?

Pass a `t:Vix.Vips.Image.t/0` to `detect/2` and get back a list
of detected objects with their class labels, confidence scores,
and bounding boxes.

## Quick start

    iex> car = Image.open!("./test/support/images/lamborghini-forsennato-concept.jpg")
    iex> [%{label: _, score: _, box: _} | _] = Image.Detection.detect(car)

## Default model

The default is [RT-DETR](https://huggingface.co/onnx-community/rtdetr_r50vd),
a real-time, transformer-based detector that outperforms YOLOv8 on
the COCO benchmark and is **Apache 2.0 licensed** (unlike YOLOv8 and
YOLO11, which are AGPL licensed). The ONNX export is hosted at
`onnx-community/rtdetr_r50vd` and is downloaded on the first call
via `ImageVision.ModelCache`.

* Model: `onnx-community/rtdetr_r50vd` / `onnx/model.onnx` (~175 MB).

* Classes: 80 standard COCO classes (`person`, `bicycle`, `car`, …).

* Output: per-query class scores (sigmoid) and `cxcywh` bounding
  boxes. RT-DETR is NMS-free by design — no Non-Maximum
  Suppression post-processing is required.
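
The raw output format described above differs from the `{x, y, width, height}` boxes that `detect/2` returns, so some conversion has to happen in between. The sketch below illustrates that kind of conversion. It is not the library's implementation: the `BoxSketch` module is hypothetical, and it assumes the model emits logits and `cxcywh` boxes normalised to `0.0..1.0`.

```elixir
defmodule BoxSketch do
  # Hypothetical helper module, for illustration only.

  # Maps a raw class logit to a score in 0.0..1.0.
  def sigmoid(x), do: 1 / (1 + :math.exp(-x))

  # Converts a {cx, cy, w, h} box (assumed to be normalised to 0.0..1.0)
  # into {x, y, width, height} pixel coordinates with a top-left origin.
  def cxcywh_to_xywh({cx, cy, w, h}, image_width, image_height) do
    width = max(round(w * image_width), 1)
    height = max(round(h * image_height), 1)
    x = max(round((cx - w / 2) * image_width), 0)
    y = max(round((cy - h / 2) * image_height), 0)
    {x, y, width, height}
  end
end

# A box centred in an 800x600 image that covers half of each dimension.
BoxSketch.cxcywh_to_xywh({0.5, 0.5, 0.5, 0.5}, 800, 600)
# => {200, 150, 400, 300}
```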

## Drawing detections

Use `draw_bbox_with_labels/2` to overlay detections on the
original image:

    image
    |> Image.Detection.detect()
    |> Image.Detection.draw_bbox_with_labels(image)

## Optional dependency

This module is only available when [Ortex](https://hex.pm/packages/ortex)
is listed as a dependency in your application's `mix.exs`.
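
A dependency entry might look like the fragment below. Both the `:image_vision` package name and the version requirements are assumptions made for illustration; check Hex for the current names and releases before copying them.

```elixir
# In mix.exs. Package name and version requirements are placeholders.
defp deps do
  [
    {:image_vision, "~> 0.2"},
    {:ortex, "~> 0.1"}
  ]
end
```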

# `detection`

```elixir
@type detection() :: %{
  label: String.t(),
  score: float(),
  box: {non_neg_integer(), non_neg_integer(), pos_integer(), pos_integer()}
}
```

A single detected object.

* `:label` is one of the 80 COCO class names, e.g. `"person"`.

* `:score` is the confidence score, a float in `[0.0, 1.0]`.

* `:box` is `{x, y, width, height}` in pixel coordinates of the
  original image. `(x, y)` is the top-left corner.
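
Because `:box` stores a top-left corner plus a size, callers that need corner coordinates, an area, or a centre point have to derive them. A minimal sketch, using a hypothetical `DetectionBox` helper module that is not part of the library:

```elixir
defmodule DetectionBox do
  # Hypothetical helpers for a detection map, for illustration only.

  # Returns {left, top, right, bottom}; right and bottom are exclusive.
  def corners(%{box: {x, y, w, h}}), do: {x, y, x + w, y + h}

  # Box area in square pixels.
  def area(%{box: {_x, _y, w, h}}), do: w * h

  # Centre point of the box as floats.
  def center(%{box: {x, y, w, h}}), do: {x + w / 2, y + h / 2}
end

detection = %{label: "car", score: 0.93, box: {40, 60, 200, 100}}

DetectionBox.corners(detection)
# => {40, 60, 240, 160}
DetectionBox.area(detection)
# => 20000
```

The same four values can be passed to any crop function that takes left, top, width and height.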

# `classes`

```elixir
@spec classes() :: [String.t()]
```

Returns the list of class labels the default model can detect.

### Returns

* A list of 80 COCO class names as binaries, in the order used
  by RT-DETR.

# `detect`

```elixir
@spec detect(image :: Vix.Vips.Image.t(), options :: Keyword.t()) :: [detection()]
```

Detects objects in an image and returns a list of detections
sorted by descending confidence.

### Arguments

* `image` is any `t:Vix.Vips.Image.t/0`.

* `options` is a keyword list of options.

### Options

* `:min_score` is the minimum confidence score, a float in
  `[0.0, 1.0]`, that a detection must meet to be returned. The
  default is `0.5`.

* `:repo` is the HuggingFace repository for the model. The
  default is `"onnx-community/rtdetr_r50vd"`.

* `:filename` is the ONNX file path within the repository. The
  default is `"onnx/model.onnx"`. Use `"onnx/model_quantized.onnx"`
  (~45 MB INT8) for a much smaller model with some accuracy loss.
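
One way to keep these options manageable is to name a few option sets and pass the chosen one to `detect/2`. The `DetectProfiles` module and its profile names below are hypothetical; only the option keys and file names come from the list above.

```elixir
defmodule DetectProfiles do
  # Hypothetical helper: maps a profile name to options for detect/2.

  # Full-size default model with the default threshold.
  def options(:full), do: [min_score: 0.5]

  # Smaller INT8 model; expect some accuracy loss.
  def options(:compact),
    do: [filename: "onnx/model_quantized.onnx", min_score: 0.5]
end

# Usage, which requires the library and a model download:
#
#     image = Image.open!("photo.jpg")
#     Image.Detection.detect(image, DetectProfiles.options(:compact))
```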

### Returns

* A list of `t:detection/0` maps, sorted by descending `:score`.

### Examples

    iex> car = Image.open!("./test/support/images/lamborghini-forsennato-concept.jpg")
    iex> [%{label: _, score: _, box: _} | _] =
    ...>   Image.Detection.detect(car, min_score: 0.5)
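
Since the result is a plain list of maps, ordinary `Enum` functions are enough for most post-processing. The sketch below runs on hand-written sample data shaped like `t:detection/0`; the `DetectionSummary` module is hypothetical and not part of the library.

```elixir
defmodule DetectionSummary do
  # Hypothetical helpers, for illustration only.

  # Keeps only detections whose label is in `labels`.
  def only(detections, labels),
    do: Enum.filter(detections, &(&1.label in labels))

  # Counts detections per label.
  def counts(detections), do: Enum.frequencies_by(detections, & &1.label)

  # The list is sorted by descending score, so the first element is
  # the most confident detection. Returns nil for an empty list.
  def best(detections), do: List.first(detections)
end

# Sample data in the documented shape, sorted by descending score.
detections = [
  %{label: "car", score: 0.97, box: {12, 40, 300, 180}},
  %{label: "person", score: 0.81, box: {330, 20, 60, 170}},
  %{label: "car", score: 0.64, box: {400, 90, 120, 80}}
]

DetectionSummary.counts(detections)
# => %{"car" => 2, "person" => 1}
```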

# `draw_bbox_with_labels`

```elixir
@spec draw_bbox_with_labels([detection()], Vix.Vips.Image.t(), Keyword.t()) ::
  Vix.Vips.Image.t()
```

Draws bounding boxes with class labels onto an image.

Builds an SVG overlay — one box and label per detection — and
composites it onto the image. Each distinct class label gets a
consistent colour so multiple detections of the same class are
easy to identify at a glance.
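
To make the overlay idea concrete, here is a minimal sketch of building such an SVG as a string. It is not the library's implementation: the `OverlaySketch` module is hypothetical, it uses a single colour for every label, and it assumes labels need no XML escaping (true for plain words such as the COCO class names).

```elixir
defmodule OverlaySketch do
  # Hypothetical module, for illustration only.

  # Builds an SVG the size of the image with one <rect> and one <text>
  # element per detection.
  def svg(detections, width, height, colour \\ "red") do
    shapes =
      Enum.map_join(detections, "\n", fn %{label: label, box: {x, y, w, h}} ->
        # Keep the label inside the canvas when the box touches the top edge.
        label_y = max(y - 4, 12)

        ~s(<rect x="#{x}" y="#{y}" width="#{w}" height="#{h}" ) <>
          ~s(fill="none" stroke="#{colour}" stroke-width="2"/>) <>
          ~s(<text x="#{x}" y="#{label_y}" font-size="13" fill="#{colour}">#{label}</text>)
      end)

    ~s(<svg xmlns="http://www.w3.org/2000/svg" width="#{width}" height="#{height}">\n#{shapes}\n</svg>)
  end
end

OverlaySketch.svg([%{label: "car", score: 0.9, box: {10, 20, 30, 40}}], 100, 80)
```

The resulting string would then be rasterised and composited over the source image, which is the step `draw_bbox_with_labels/2` performs for you.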

### Arguments

* `detections` is the list returned from `detect/2`.

* `image` is the image upon which detection was run.

* `options` is a keyword list of options.

### Options

* `:opacity` is the opacity of the label background, a float in
  `[0.0, 1.0]`. The default is `0.85`. Use `1.0` for fully
  opaque label backgrounds.

* `:stroke_width` is the bounding box stroke width in pixels.
  The default is `2`.

* `:font_size` is the label text size in pixels. The default
  is `13`.

* `:palette` is a list of CSS colour strings used to assign
  colours to labels. The palette cycles if there are more distinct
  labels than colours. The default is a 10-colour high-contrast palette.
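
The cycling behaviour of `:palette` can be pictured as indexing into the list modulo its length. The sketch below shows one way to do that. The `PaletteSketch` module is hypothetical, and the order in which the library actually assigns colours to labels is an assumption here (first appearance in the detection list).

```elixir
defmodule PaletteSketch do
  # Hypothetical module, for illustration only.

  # Maps each distinct label to a colour, wrapping around when the
  # palette runs out. Assumes `palette` is a non-empty list.
  def assign(detections, palette) do
    detections
    |> Enum.map(& &1.label)
    |> Enum.uniq()
    |> Enum.with_index()
    |> Map.new(fn {label, index} ->
      {label, Enum.at(palette, rem(index, length(palette)))}
    end)
  end
end

detections = [
  %{label: "car", score: 0.97, box: {12, 40, 300, 180}},
  %{label: "person", score: 0.81, box: {330, 20, 60, 170}},
  %{label: "car", score: 0.64, box: {400, 90, 120, 80}},
  %{label: "dog", score: 0.55, box: {50, 200, 90, 70}}
]

PaletteSketch.assign(detections, ["red", "blue"])
# => %{"car" => "red", "dog" => "red", "person" => "blue"}
```

A custom palette and other options are passed as the third argument, for example `Image.Detection.draw_bbox_with_labels(detections, image, palette: ["red", "blue"], stroke_width: 3)`.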

### Returns

* The annotated `t:Vix.Vips.Image.t/0`.

### Examples

    iex> car = Image.open!("./test/support/images/lamborghini-forsennato-concept.jpg")
    iex> annotated =
    ...>   car
    ...>   |> Image.Detection.detect()
    ...>   |> Image.Detection.draw_bbox_with_labels(car)
    iex> match?(%Vix.Vips.Image{}, annotated)
    true
