# `Image.Segmentation`
[🔗](https://github.com/elixir-image/image_vision/blob/v0.2.0/lib/segmentation.ex#L2)

Image segmentation — which pixels belong to which object?

Two complementary entry points cover different use cases:

* `segment/2` — **promptable segmentation** via SAM 2. Click a
  point or draw a box and get back a precise mask for that object.
  Great for "cut out this foreground" or "mask this product".

* `segment_panoptic/2` — **class-labeled segmentation** via
  DETR-panoptic. Every region in the image gets a class label
  (`person`, `car`, `sky`, `road`…). Great for "what's in this
  image and where?"

## Quick start

    # Mask the object at the centre of the image
    iex> image = Image.open!("photo.jpg")
    iex> %{mask: mask, score: _} = Image.Segmentation.segment(image)

    # Mask the object at a specific point
    iex> %{mask: mask} = Image.Segmentation.segment(image, prompt: {:point, 320, 240})

    # Class-label every region
    iex> segments = Image.Segmentation.segment_panoptic(image)
    iex> Enum.map(segments, & &1.label)
    ["person", "car", "road", "sky"]

## Composing results with an image

    # Make the masked object the only visible content (alpha mask)
    iex> {:ok, cutout} = Image.Segmentation.apply_mask(image, mask)

    # Colour-coded overlay of all segments
    iex> overlay = Image.Segmentation.compose_overlay(image, segments)

## Default models

* **Promptable** — `SharpAI/sam2-hiera-tiny-onnx` (SAM 2 Tiny,
  Apache 2.0, encoder ~128 MB + decoder ~20 MB). Downloaded on
  first call via `ImageVision.ModelCache`.

* **Class-labeled** — `Xenova/detr-resnet-50-panoptic` (DETR
  ResNet-50, Apache 2.0, ~172 MB). 250 COCO panoptic classes
  covering everyday things and stuff.

Both can be overridden via options — see `segment/2` and
`segment_panoptic/2` for details.

## Optional dependency

This module is only available when
[Ortex](https://hex.pm/packages/ortex) is listed as a dependency in
your application's `mix.exs`.
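
As a minimal sketch of that dependency setup: the package name `:image_vision` is assumed from the repository name, and both version requirements are placeholders rather than tested constraints, so check Hex for the current releases before copying this.

```elixir
# mix.exs (sketch; package name and version requirements are assumptions)
defp deps do
  [
    # Assumed Hex package name, taken from the repository name
    {:image_vision, "~> 0.2"},
    # ONNX runtime bindings; Image.Segmentation is only available when present
    {:ortex, "~> 0.1"}
  ]
end
```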

# `mask_result`

```elixir
@type mask_result() :: %{score: float(), mask: Vix.Vips.Image.t()}
```

A mask returned by `segment/2`.

* `:score` — SAM IoU prediction score.
* `:mask` — single-band `t:Vix.Vips.Image.t/0` in original image
  dimensions; white pixels are the segmented object.

# `segment`

```elixir
@type segment() :: %{label: String.t(), score: float(), mask: Vix.Vips.Image.t()}
```

A segmented region returned by `segment_panoptic/2`.

* `:label` — COCO panoptic class name, e.g. `"person"` or `"road"`.
* `:score` — confidence score in `[0.0, 1.0]`.
* `:mask` — single-band `t:Vix.Vips.Image.t/0`; white (255) pixels
  belong to this segment, black (0) do not.

# `apply_mask`

```elixir
@spec apply_mask(Vix.Vips.Image.t(), Vix.Vips.Image.t()) ::
  {:ok, Vix.Vips.Image.t()} | {:error, Image.error()}
```

Applies a mask as the alpha channel of an image.

White pixels in the mask become fully opaque; black pixels become
fully transparent. The result is an RGBA image suitable for
compositing or exporting with transparency.

### Arguments

* `image` is any `t:Vix.Vips.Image.t/0`.

* `mask` is a single-band `t:Vix.Vips.Image.t/0` of the same
  dimensions, such as the `:mask` field of `t:mask_result/0` or
  `t:segment/0`.

### Returns

* `{:ok, image}` — an RGBA `t:Vix.Vips.Image.t/0`, or

* `{:error, reason}`.

### Examples

    iex> image = Image.open!("./test/support/images/puppy.webp")
    iex> %{mask: mask} = Image.Segmentation.segment(image)
    iex> {:ok, cutout} = Image.Segmentation.apply_mask(image, mask)
    iex> Image.bands(cutout)
    4

# `compose_overlay`

```elixir
@spec compose_overlay(Vix.Vips.Image.t(), [segment() | mask_result()], Keyword.t()) ::
  Vix.Vips.Image.t()
```

Overlays colour-coded segment masks on an image.

Each segment gets a distinct colour. Useful for visualising the
output of `segment_panoptic/2`.

### Arguments

* `image` is any `t:Vix.Vips.Image.t/0`.

* `segments` is the list returned by `segment_panoptic/2`, or any
  list of maps with a `:mask` key (the spec also accepts
  `t:mask_result/0` maps, which carry no `:label`).

* `options` is a keyword list of options.

### Options

* `:alpha` — opacity of the overlay as a float in `[0.0, 1.0]`.
  The default is `0.5`.

### Returns

* The annotated `t:Vix.Vips.Image.t/0`.

### Examples

    iex> image = Image.open!("./test/support/images/puppy.webp")
    iex> segments = Image.Segmentation.segment_panoptic(image)
    iex> overlay = Image.Segmentation.compose_overlay(image, segments)
    iex> match?(%Vix.Vips.Image{}, overlay)
    true

# `segment`

```elixir
@spec segment(Vix.Vips.Image.t(), Keyword.t()) :: mask_result() | [mask_result()]
```

Segments an object in an image using SAM 2.

Accepts an optional point or box prompt to select which object to
segment. With no prompt, the centre of the image is used.

### Arguments

* `image` is any `t:Vix.Vips.Image.t/0`.

* `options` is a keyword list of options.

### Options

* `:prompt` selects what to segment:
  * `:auto` — segment the object at the image centre (default).
  * `{:point, x, y}` — segment the object at pixel `(x, y)`.
  * `{:box, x, y, w, h}` — segment the object inside the box.
  * A list of `{:point, x, y}` tuples for multi-point prompting.

* `:multimask` — when `true`, returns all three SAM candidate masks
  as a list sorted by descending score. When `false` (default),
  returns only the best mask as a single `t:mask_result/0`.

* `:min_score` — minimum IoU score to return when `:multimask` is
  `true`. The default is `0.0`.

* `:repo` — HuggingFace repo for the SAM 2 ONNX models. Default
  is `"SharpAI/sam2-hiera-tiny-onnx"`.

* `:encoder_file` — encoder ONNX filename within the repo. Default
  is `"encoder.onnx"`.

* `:decoder_file` — decoder ONNX filename within the repo. Default
  is `"decoder.onnx"`.
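
Bounding boxes from other tools often arrive as two corner points rather than as an origin plus a size. The sketch below converts corners into the `{:box, x, y, w, h}` prompt shape listed above. `BoxPromptSketch` is a hypothetical helper, not part of this library, and it assumes that `(x, y)` is the top-left corner and that `w` and `h` are the width and height in pixels, which this page does not state explicitly.

```elixir
defmodule BoxPromptSketch do
  @moduledoc false

  # Hypothetical helper: builds a {:box, x, y, w, h} prompt from two
  # opposite corners, given in either order.
  # Assumption: (x, y) is the top-left corner; w and h are pixel sizes.
  @spec from_corners({number(), number()}, {number(), number()}) ::
          {:box, number(), number(), number(), number()}
  def from_corners({x1, y1}, {x2, y2}) do
    x = min(x1, x2)
    y = min(y1, y2)
    {:box, x, y, abs(x2 - x1), abs(y2 - y1)}
  end
end
```

The resulting tuple can then be passed as the `:prompt` option, for example `Image.Segmentation.segment(image, prompt: BoxPromptSketch.from_corners({100, 80}, {300, 230}))`.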

### Returns

* A `t:mask_result/0` map when `:multimask` is `false`.

* A list of `t:mask_result/0` maps sorted by descending `:score`
  when `:multimask` is `true`.

### Examples

    iex> image = Image.open!("./test/support/images/puppy.webp")
    iex> %{score: score, mask: mask} = Image.Segmentation.segment(image)
    iex> score > 0.0
    true
    iex> match?(%Vix.Vips.Image{}, mask)
    true
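
Because `segment/2` returns a single map or a list depending on `:multimask`, calling code may want to treat both shapes the same way. The following is a minimal sketch of one approach. `MaskPickSketch` is a hypothetical helper, not part of this library, and it relies only on the `:score` key documented in `t:mask_result/0`.

```elixir
defmodule MaskPickSketch do
  @moduledoc false

  # Hypothetical helper: accepts either return shape of segment/2
  # (one map, or a list of maps) and returns the highest-scoring
  # result at or above `min_score`, or :none if nothing qualifies.
  @spec best(map() | [map()], float()) :: {:ok, map()} | :none
  def best(result, min_score \\ 0.0) do
    result
    # A single map becomes a one-element list; a list is left as is
    |> List.wrap()
    |> Enum.filter(&(&1.score >= min_score))
    # The zero-arity fallback avoids raising on an empty list
    |> Enum.max_by(& &1.score, fn -> nil end)
    |> case do
      nil -> :none
      best -> {:ok, best}
    end
  end
end
```

For instance, `image |> Image.Segmentation.segment(multimask: true) |> MaskPickSketch.best(0.8)` would yield `{:ok, mask_result}` or `:none`, without the caller needing to know which return shape was produced.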

# `segment_panoptic`

```elixir
@spec segment_panoptic(Vix.Vips.Image.t(), Keyword.t()) :: [segment()]
```

Segments and labels every region in an image using DETR-panoptic.

Returns one segment per detected object or region, each with a
class label, confidence score, and a binary mask. Covers 250 COCO
panoptic categories including everyday objects (`person`, `car`,
`dog`) and background regions (`sky`, `road`, `grass`).

### Arguments

* `image` is any `t:Vix.Vips.Image.t/0`.

* `options` is a keyword list of options.

### Options

* `:min_score` — minimum confidence score to include a segment.
  The default is `0.5`.

* `:repo` — HuggingFace repo for the ONNX model. Default is
  `"Xenova/detr-resnet-50-panoptic"`.

* `:model_file` — ONNX filename within the repo. Default is
  `"onnx/model.onnx"`. Use `"onnx/model_quantized.onnx"` (~44 MB)
  for a much smaller model with some accuracy loss.

### Returns

* A list of `t:segment/0` maps sorted by descending `:score`.
  May be empty if no segment meets `:min_score`.

### Examples

    iex> image = Image.open!("./test/support/images/puppy.webp")
    iex> segments = Image.Segmentation.segment_panoptic(image)
    iex> is_list(segments)
    true
    iex> Enum.all?(segments, &match?(%{label: _, score: _, mask: _}, &1))
    true
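
The returned list is plain Elixir data, so it can be post-processed with `Enum` before it is drawn or reported. The sketch below shows two such steps. `SegmentSummarySketch` is a hypothetical helper, not part of this library, and it uses only the `:label` and `:score` keys documented in `t:segment/0`.

```elixir
defmodule SegmentSummarySketch do
  @moduledoc false

  # Hypothetical helper: keeps only the highest-scoring segment for
  # each label, sorted by descending score.
  @spec best_per_label([map()]) :: [map()]
  def best_per_label(segments) do
    segments
    |> Enum.group_by(& &1.label)
    |> Enum.map(fn {_label, group} -> Enum.max_by(group, & &1.score) end)
    |> Enum.sort_by(& &1.score, :desc)
  end

  # Hypothetical helper: keeps only segments whose label is in `labels`.
  @spec with_labels([map()], [String.t()]) :: [map()]
  def with_labels(segments, labels) do
    Enum.filter(segments, &(&1.label in labels))
  end
end
```

A filtered list can be handed straight to `compose_overlay/3`, for example `Image.Segmentation.compose_overlay(image, SegmentSummarySketch.with_labels(segments, ["person", "car"]), alpha: 0.3)`, to highlight only the classes of interest.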

---

*Consult [api-reference.md](api-reference.md) for the complete listing.*