Image.FaceDetection (image_vision v0.2.0)

Face detection — where are the faces in this image?

Returns one entry per detected face, each with a bounding box, a confidence score, and five facial landmarks (right eye, left eye, nose tip, right mouth corner, left mouth corner).

Quick start

iex> image = Image.open!("./test/support/images/group.jpg")
iex> [%{box: _, score: _, landmarks: _} | _] = Image.FaceDetection.detect(image)

Default model

YuNet is the OpenCV team's production face detector: roughly 340 KB on disk, MIT licensed, and real-time on CPU. The March 2023 export hosted at opencv/face_detection_yunet outputs decoded boxes, keypoints, and scores directly.

Override the default via :repo and :model_file. This module decodes only the YuNet March 2023 output layout, so a replacement model must follow that convention; SCRFD and BlazeFace exports differ enough that they need a different post-processor.

Drawing detections

Use draw_boxes/3 to overlay rectangles and landmarks on the original image:

image
|> Image.FaceDetection.detect()
|> Image.FaceDetection.draw_boxes(image)

Face-aware crop

crop_largest/2 covers the common "crop to the most prominent face" case, the behaviour behind face-aware CDN parameters such as ImageKit z-, Cloudflare face-zoom, and gravity: :face:

{:ok, portrait} = Image.FaceDetection.crop_largest(image, padding: 0.2)

Optional dependency

This module is only available when the optional Ortex dependency is included in your application's mix.exs.

Summary

Types

face()

A single detected face.

Functions

boxes(image, options \\ [])

Returns just the bounding boxes of detected faces, sorted by descending confidence. A convenience wrapper around detect/2 for when landmarks aren't needed.

crop_largest(image, options \\ [])

Crops the image to the largest detected face.

detect(image, options \\ [])

Detects faces in an image and returns a list of detections sorted by descending confidence.

draw_boxes(detections, image, options \\ [])

Draws bounding boxes and the five facial landmarks for each detection onto an image.

Types

face()

@type face() :: %{
  box: {non_neg_integer(), non_neg_integer(), pos_integer(), pos_integer()},
  score: float(),
  landmarks: [{number(), number()}]
}

A single detected face.

:box is {x, y, width, height} in pixel coordinates of the original image. (x, y) is the top-left corner.
:score is the confidence score, a float in [0.0, 1.0].
:landmarks is a list of five {x, y} tuples in pixel coordinates: right eye, left eye, nose tip, right mouth corner, left mouth corner — in that order.
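
To make the shape above concrete, the following sketch builds a hand-written face map and derives a few values from it. The numbers are made up rather than detector output, and the FaceSketch module and its function names are hypothetical, not part of image_vision.

```elixir
defmodule FaceSketch do
  # Hypothetical helpers over the documented face() map shape.

  # Area of the bounding box in square pixels.
  def area(%{box: {_x, _y, w, h}}), do: w * h

  # Centre of the bounding box as {cx, cy}.
  def center(%{box: {x, y, w, h}}), do: {x + w / 2, y + h / 2}

  # Label the five landmarks using the documented order.
  def named_landmarks(%{landmarks: [right_eye, left_eye, nose, right_mouth, left_mouth]}) do
    %{
      right_eye: right_eye,
      left_eye: left_eye,
      nose_tip: nose,
      right_mouth: right_mouth,
      left_mouth: left_mouth
    }
  end
end

# Made-up values for illustration only.
face = %{
  box: {40, 30, 100, 120},
  score: 0.93,
  landmarks: [{70.0, 75.0}, {110.0, 75.0}, {90.0, 95.0}, {75.0, 120.0}, {105.0, 120.0}]
}

FaceSketch.area(face)
FaceSketch.center(face)
FaceSketch.named_landmarks(face).nose_tip
```

Pattern matching on the five-element list doubles as a check that a detection carries exactly five landmarks.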

Functions

boxes(image, options \\ [])

@spec boxes(image :: Vix.Vips.Image.t(), options :: Keyword.t()) :: [
  {non_neg_integer(), non_neg_integer(), pos_integer(), pos_integer()}
]

Returns just the bounding boxes of detected faces, sorted by descending confidence. A convenience wrapper around detect/2 for when landmarks aren't needed.

Arguments

image is any Vix.Vips.Image.t/0.
options is forwarded to detect/2.

Returns

A list of {x, y, width, height} tuples in pixel coordinates of the original image.
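
Conceptually, the result is the detect/2 output reduced to its :box values. The sketch below performs that projection on hand-written detections; the BoxesSketch module is hypothetical, and whether boxes/2 is implemented this way is an assumption.

```elixir
defmodule BoxesSketch do
  # Hypothetical: sort detections by descending score, then keep only the boxes.
  def from_detections(detections) do
    detections
    |> Enum.sort_by(& &1.score, :desc)
    |> Enum.map(& &1.box)
  end
end

# Made-up detections; landmarks are left empty for brevity.
detections = [
  %{box: {200, 40, 30, 35}, score: 0.71, landmarks: []},
  %{box: {10, 20, 50, 60}, score: 0.95, landmarks: []}
]

BoxesSketch.from_detections(detections)
```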

crop_largest(image, options \\ [])

@spec crop_largest(image :: Vix.Vips.Image.t(), options :: Keyword.t()) ::
  {:ok, Vix.Vips.Image.t()} | {:error, :no_face_detected}

Crops the image to the largest detected face.

The largest face is chosen by bounding-box area. The crop is expanded by :padding (a fraction of each face dimension) to leave breathing room around the face, then clipped to the image bounds. If no face is detected, returns {:error, :no_face_detected}.

image_plug uses this function as its integration point for face-aware cropping (gravity: :face, ImageKit z-, Cloudflare face-zoom).

Arguments

image is any Vix.Vips.Image.t/0.
options is a keyword list. Detection options (:min_score, :nms_iou, :input_size, etc.) are forwarded to detect/2.

Options

:padding is a float in [0.0, 5.0] controlling how much room is kept around the face. 0.0 is a tight crop to the bounding box; 0.5 adds 50% on each side; 1.0 doubles the bounding box. Default 0.2.

Returns

{:ok, cropped_image} or
{:error, :no_face_detected} if no detection met :min_score.
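
The selection and padding steps described above can be sketched in plain Elixir. This is an illustration, not the library's code: the CropSketch module is hypothetical, and it assumes :padding is added on each side as a fraction of the face width and height. If the library instead splits the padding across both sides, pad_x and pad_y would be halved.

```elixir
defmodule CropSketch do
  # Hypothetical sketch of the crop arithmetic; not the library's code.

  # Pick the box with the largest area from a non-empty list.
  def largest(boxes), do: Enum.max_by(boxes, fn {_x, _y, w, h} -> w * h end)

  # Expand a box by `padding` (assumed: a per-side fraction of width/height)
  # and clip the result to an image of size img_w x img_h.
  def padded({x, y, w, h}, padding, img_w, img_h) do
    pad_x = round(w * padding)
    pad_y = round(h * padding)

    left = max(x - pad_x, 0)
    top = max(y - pad_y, 0)
    right = min(x + w + pad_x, img_w)
    bottom = min(y + h + pad_y, img_h)

    {left, top, right - left, bottom - top}
  end
end

boxes = [{10, 20, 50, 60}, {300, 100, 100, 100}]
largest = CropSketch.largest(boxes)

# On a 410x400 image the padded box is clipped at the right edge.
CropSketch.padded(largest, 0.2, 410, 400)
```

The result uses the same {x, y, width, height} layout as :box, so it can be compared directly with the detection it came from.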

detect(image, options \\ [])

@spec detect(image :: Vix.Vips.Image.t(), options :: Keyword.t()) :: [face()]

Detects faces in an image and returns a list of detections sorted by descending confidence.

Arguments

image is any Vix.Vips.Image.t/0.
options is a keyword list of options.

Options

:min_score is the minimum confidence score, a float in [0.0, 1.0], that a detection must meet to be returned. The default is 0.6.
:nms_iou is the IoU (intersection over union) threshold for non-maximum suppression. A detection whose overlap with a higher-scoring detection exceeds this threshold is suppressed. Lower values suppress more aggressively and keep fewer overlapping faces. The default is 0.3.
:repo is the Hugging Face repository for the YuNet ONNX export. Default "opencv/face_detection_yunet".
:model_file is the ONNX filename within the repository. Default "face_detection_yunet_2023mar.onnx".

Returns

A list of face/0 maps, sorted by descending :score. Returns an empty list when no face meets :min_score.

Examples

iex> image = Image.open!("./test/support/images/group.jpg")
iex> faces = Image.FaceDetection.detect(image, min_score: 0.7)
iex> is_list(faces) and Enum.all?(faces, &match?(%{box: _, score: _, landmarks: _}, &1))
true
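
To illustrate how :min_score and :nms_iou interact, here is a minimal sketch of score filtering followed by greedy non-maximum suppression over {x, y, width, height} boxes. The NmsSketch module is hypothetical, and the library's actual post-processing may differ in its details.

```elixir
defmodule NmsSketch do
  # Hypothetical sketch of score filtering plus greedy NMS; not the library's code.

  # Intersection over union of two {x, y, w, h} boxes.
  def iou({x1, y1, w1, h1}, {x2, y2, w2, h2}) do
    iw = max(min(x1 + w1, x2 + w2) - max(x1, x2), 0)
    ih = max(min(y1 + h1, y2 + h2) - max(y1, y2), 0)
    inter = iw * ih
    union = w1 * h1 + w2 * h2 - inter
    if union == 0, do: 0.0, else: inter / union
  end

  # Drop low scores, then keep a face only if it does not overlap an
  # already-kept, higher-scoring face by more than `nms_iou`.
  def filter(faces, min_score, nms_iou) do
    faces
    |> Enum.filter(&(&1.score >= min_score))
    |> Enum.sort_by(& &1.score, :desc)
    |> Enum.reduce([], fn face, kept ->
      if Enum.any?(kept, &(iou(&1.box, face.box) > nms_iou)), do: kept, else: [face | kept]
    end)
    |> Enum.reverse()
  end
end

# Made-up detections; landmarks are omitted for brevity.
faces = [
  %{box: {10, 10, 100, 100}, score: 0.9},
  # Overlaps the first box heavily, so it is suppressed.
  %{box: {20, 20, 100, 100}, score: 0.8},
  %{box: {300, 300, 50, 50}, score: 0.7},
  # Below min_score, so it is dropped before NMS.
  %{box: {500, 10, 40, 40}, score: 0.4}
]

NmsSketch.filter(faces, 0.6, 0.3)
```

With these inputs only the 0.9 and 0.7 detections survive: the 0.8 box has an IoU of about 0.68 with the 0.9 box, well above the 0.3 threshold.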

draw_boxes(detections, image, options \\ [])

@spec draw_boxes([face()], Vix.Vips.Image.t(), Keyword.t()) :: Vix.Vips.Image.t()

Draws bounding boxes and the five facial landmarks for each detection onto an image.

Builds an SVG overlay (one box + five dots per face) and composites it onto the image. The score is rendered as a percentage label above each box.

Arguments

detections is the list returned from detect/2.
image is the image upon which detection was run.
options is a keyword list of options.

Options

:color is the CSS colour used for boxes and landmarks. Default "#3cb44b" (a high-contrast green).
:stroke_width is the bounding-box stroke width in pixels. Default 2.
:landmark_radius is the radius of each landmark dot in pixels. Default 3.
:font_size is the score-label text size in pixels. Default 13.
:show_landmarks? when false skips drawing the five landmark dots. Default true.

Returns

The annotated Vix.Vips.Image.t/0.
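
As a rough picture of the SVG overlay described above, the sketch below builds the markup for one box, a percentage label, and the landmark dots per face. It is an assumption about the general shape of such an overlay, not the library's implementation: the OverlaySketch module and the label placement are hypothetical, and compositing the SVG onto the image is left out.

```elixir
defmodule OverlaySketch do
  # Hypothetical sketch of the overlay markup; not the library's implementation.
  def svg(faces, width, height, opts \\ []) do
    color = Keyword.get(opts, :color, "#3cb44b")
    stroke = Keyword.get(opts, :stroke_width, 2)
    radius = Keyword.get(opts, :landmark_radius, 3)
    font = Keyword.get(opts, :font_size, 13)
    show_landmarks? = Keyword.get(opts, :show_landmarks?, true)

    shapes =
      Enum.map(faces, fn %{box: {x, y, w, h}, score: score, landmarks: points} ->
        # Score as a whole-number percentage, placed just above the box.
        percent = round(score * 100)
        label_y = max(y - 4, font)

        dots =
          if show_landmarks? do
            Enum.map(points, fn {px, py} ->
              ~s(<circle cx="#{px}" cy="#{py}" r="#{radius}" fill="#{color}"/>)
            end)
          else
            []
          end

        [
          ~s(<rect x="#{x}" y="#{y}" width="#{w}" height="#{h}" fill="none" ),
          ~s(stroke="#{color}" stroke-width="#{stroke}"/>),
          ~s(<text x="#{x}" y="#{label_y}" font-size="#{font}" fill="#{color}">#{percent}%</text>),
          dots
        ]
      end)

    IO.iodata_to_binary([
      ~s(<svg xmlns="http://www.w3.org/2000/svg" width="#{width}" height="#{height}">),
      shapes,
      "</svg>"
    ])
  end
end

# Made-up detection for illustration only.
face = %{
  box: {40, 30, 100, 120},
  score: 0.93,
  landmarks: [{70, 75}, {110, 75}, {90, 95}, {75, 120}, {105, 120}]
}

OverlaySketch.svg([face], 640, 480, color: "red")
```

Sizing the SVG canvas to the image dimensions keeps the pixel coordinates in the overlay aligned with those reported by detect/2.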