Image.Segmentation produces pixel-level masks that identify which pixels belong to a given object or region.

Two functions cover different use cases:

  • segment/2 (promptable): click a point or draw a box to cut out a specific object.
  • segment_panoptic/2 (class-labeled): every region in the image gets a label (person, car, sky…).

Promptable segmentation (SAM 2)

Segment the centre object

With no prompt, segment/2 segments whatever is at the centre of the image:

iex> image = Image.open!("product_photo.jpg")
iex> %{mask: mask, score: score} = Image.Segmentation.segment(image)
iex> score
0.94

Segment by point

iex> %{mask: mask} = Image.Segmentation.segment(image, prompt: {:point, 320, 240})

Segment by bounding box

iex> %{mask: mask} = Image.Segmentation.segment(image, prompt: {:box, 100, 50, 200, 300})

The box is {x, y, width, height} in pixel coordinates of the original image.
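If your coordinates arrive as two opposite corners rather than an origin plus size, a small conversion helper is handy. This helper is hypothetical (not part of the library); it only rearranges numbers into the {:box, x, y, width, height} shape described above:

```elixir
# Hypothetical helper: convert two opposite corners {x1, y1, x2, y2}
# into the {:box, x, y, width, height} prompt format shown above.
# min/abs make it robust to corners given in either order.
corners_to_box = fn {x1, y1, x2, y2} ->
  {:box, min(x1, x2), min(y1, y2), abs(x2 - x1), abs(y2 - y1)}
end

corners_to_box.({100, 50, 300, 350})
# => {:box, 100, 50, 200, 300}
```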

Multiple prompts

Pass a list of {:point, x, y} tuples to guide the model toward a specific object when a single point is ambiguous:

iex> %{mask: mask} = Image.Segmentation.segment(image,
...>   prompt: [{:point, 320, 240}, {:point, 340, 260}])

Getting all candidate masks

SAM 2 produces three mask candidates for every prompt. Retrieve them all with multimask: true:

iex> masks = Image.Segmentation.segment(image, multimask: true)
iex> length(masks)
3
iex> hd(masks).score
0.97
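To pick the best candidate from the list, select by score with Enum.max_by/2. A sketch with placeholder data shaped like the multimask result above (the :mask_* atoms stand in for real mask images):

```elixir
# Placeholder candidates shaped like the multimask: true result
masks = [
  %{mask: :mask_a, score: 0.81},
  %{mask: :mask_b, score: 0.97},
  %{mask: :mask_c, score: 0.64}
]

# Keep the highest-scoring candidate
best = Enum.max_by(masks, & &1.score)
best.score
# => 0.97
```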

Class-labeled segmentation (DETR-panoptic)

segment_panoptic/2 returns one segment per detected region, each with a class label and a binary mask:

iex> street = Image.open!("street.jpg")
iex> segments = Image.Segmentation.segment_panoptic(street)
iex> Enum.map(segments, & {&1.label, Float.round(&1.score, 2)})
[{"person", 0.97}, {"car", 0.93}, {"road", 0.88}, {"sky", 0.85}]

The model uses 250 COCO panoptic categories covering everyday objects and background regions.
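Since results arrive as a plain list of maps, downstream filtering is ordinary Enum work. A sketch with placeholder data shaped like the street example above, keeping only confident foreground objects:

```elixir
# Placeholder result shaped like segment_panoptic/2's output
segments = [
  %{label: "person", score: 0.97},
  %{label: "car", score: 0.93},
  %{label: "road", score: 0.88},
  %{label: "sky", score: 0.85}
]

# Keep only confident "thing" detections, dropping background regions
Enum.filter(segments, fn s -> s.label in ["person", "car"] and s.score >= 0.9 end)
# => [%{label: "person", score: 0.97}, %{label: "car", score: 0.93}]
```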

Composing results with the original image

Cut out a segmented object

apply_mask/2 makes the mask the alpha channel — white pixels become opaque, black pixels transparent:

iex> %{mask: mask} = Image.Segmentation.segment(image)
iex> {:ok, cutout} = Image.Segmentation.apply_mask(image, mask)
iex> Image.save!(cutout, "cutout.png")

Colour-coded overlay

compose_overlay/3 draws a colour-coded overlay of all segments:

iex> overlay = Image.Segmentation.compose_overlay(street, segments)
iex> Image.save!(overlay, "segmented.jpg")

Adjust transparency with :alpha (default 0.5):

iex> overlay = Image.Segmentation.compose_overlay(street, segments, alpha: 0.3)

Using a different model

Both segment/2 and segment_panoptic/2 accept options to swap models. They are passed per call rather than via app config — neither function uses a long-running serving, so there is no autostart cost to overriding on a single call.

Promptable (SAM 2)

# Use a larger SAM 2 variant for better quality on small or thin objects
iex> Image.Segmentation.segment(image,
...>   prompt: {:point, 320, 240},
...>   repo: "SharpAI/sam2-hiera-small-onnx")

segment/2 accepts:

  • :repo — any HuggingFace repo containing a SAM 2 ONNX export with separate encoder and decoder files
  • :encoder_file — encoder filename within the repo (default "encoder.onnx")
  • :decoder_file — decoder filename within the repo (default "decoder.onnx")

The protocol matches SharpAI/sam2-hiera-tiny-onnx (separate encoder/decoder, the standard SAM 2 ONNX export shape). Repos that bundle both into a single file or use a different I/O layout will not work without changes to the wrapper.

Class-labeled (DETR-panoptic)

# Quantized variant — much smaller, some accuracy cost
iex> Image.Segmentation.segment_panoptic(image, model_file: "onnx/model_quantized.onnx")

# A different ONNX-exported DETR-panoptic repo
iex> Image.Segmentation.segment_panoptic(image, repo: "your-org/detr-panoptic-onnx")

segment_panoptic/2 accepts:

  • :repo — any HuggingFace repo with a DETR-panoptic ONNX export and a config.json providing id2label
  • :model_file — ONNX filename within the repo (default "onnx/model.onnx")

Labels are read from the repo's config.json. Where that config has placeholder LABEL_n entries, the wrapper falls back to the canonical COCO panoptic taxonomy, so common stuff classes (sky-other-merged, mountain-merged, grass-merged, …) resolve correctly even on repos with incomplete configs.

Pre-downloading

To populate the cache before first use:

mix image_vision.download_models --segment

This fetches the configured defaults. For non-default repos, the cache populates on first call to segment/2 or segment_panoptic/2.

Dependencies

Segmentation requires :ortex. Add to mix.exs:

{:ortex, "~> 0.1"}

Model weights (~150 MB for SAM 2, ~175 MB for DETR) are downloaded on first call and cached. Configure the cache directory with:

config :image_vision, :cache_dir, "/path/to/cache"