YOLO.Models (YOLO v0.2.0)

This module handles loading YOLO models and running object detection on images. The YOLO.Model behaviour can be implemented for different YOLO variants; the implementations referenced on this page are YOLO.Models.Ultralytics and YOLO.Models.YOLOX.

Main Functions

The key functions you'll use are:

  • YOLO.Models.load/1: Loads a YOLO model (model_path is required; the other options are optional)

    YOLO.Models.load(model_path: "path/to/model.onnx",
                    classes_path: "path/to/classes.json",
                    model_impl: YOLO.Models.Ultralytics)
  • YOLO.Models.detect/3: Runs object detection on an image

    YOLO.Models.detect(model, image, prob_threshold: 0.5)

Summary

Functions

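  • detect(model, image, opts \\ []) - Performs object detection on an image using a loaded YOLO model.
  • load(options) - Loads a YOLO model from an ONNX file.
  • load_classes(model, classes_path, options \\ []) - Loads class labels from a JSON file and adds them to an existing model.
  • run(model, image_tensor) - Runs inference on a preprocessed input tensor.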

Functions

detect(model, image, opts \\ [])

@spec detect(model :: YOLO.Model.t(), image :: term(), opts :: Keyword.t()) :: [
  [float()]
]

Performs object detection on an image using a loaded YOLO model.

Arguments

  • model - A loaded YOLO.Model.t() struct
  • image - Input image in the format expected by the frame scaler (e.g. Evision.Mat)
  • opts - Detection options

Options

  • prob_threshold - Minimum probability threshold for detections (default: 0.25)
  • iou_threshold - IoU threshold for non-maximum suppression (default: 0.45)
  • frame_scaler - Module implementing YOLO.FrameScaler behaviour (default: YOLO.FrameScalers.EvisionScaler)
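To make the iou_threshold option more concrete, here is a minimal sketch of how intersection-over-union can be computed for two boxes in the [cx, cy, w, h] layout used by this module. IoUSketch is a hypothetical helper written for illustration; it is not part of YOLO.Models and is not necessarily how the library computes IoU internally.

```elixir
defmodule IoUSketch do
  # Hypothetical helper for illustration only, not part of YOLO.Models.
  # Accepts [cx, cy, w, h | rest] lists, so full detection rows also work.
  def iou([cx1, cy1, w1, h1 | _], [cx2, cy2, w2, h2 | _]) do
    # Convert centre/size form to corner coordinates.
    {ax1, ay1, ax2, ay2} = corners(cx1, cy1, w1, h1)
    {bx1, by1, bx2, by2} = corners(cx2, cy2, w2, h2)

    # Overlap along each axis, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = w1 * h1 + w2 * h2 - inter

    if union > 0, do: inter / union, else: 0.0
  end

  defp corners(cx, cy, w, h) do
    {cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2}
  end
end

# Two 100x100 boxes shifted by half a width overlap by one third.
IoUSketch.iou([50, 50, 100, 100], [100, 50, 100, 100])
```

In typical non-maximum suppression, when two boxes of the same class have an IoU above the threshold, the one with the lower probability is dropped.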

Returns

A list of detections, where each detection is a list [cx, cy, w, h, prob, class_idx]:

  • cx, cy - Center coordinates of bounding box
  • w, h - Width and height of bounding box
  • prob - Detection probability
  • class_idx - Class index

The output can be converted to structured maps using to_detected_objects/1.
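As an illustration of the row layout described above, the following sketch unpacks a single [cx, cy, w, h, prob, class_idx] row into a map with corner coordinates. DetectionRowSketch and its field names are hypothetical and chosen for this example; the sketch also assumes the classes map is keyed by integer index. The library's own conversion may return a different shape.

```elixir
defmodule DetectionRowSketch do
  # Hypothetical helper for illustration only; field names are assumptions.
  def to_map([cx, cy, w, h, prob, class_idx], classes \\ %{}) do
    # class_idx arrives as a float in the row, so truncate it to an integer.
    idx = trunc(class_idx)

    %{
      class_idx: idx,
      # nil when no label is known for this index.
      class: Map.get(classes, idx),
      prob: prob,
      # Top-left and bottom-right corners derived from centre and size.
      x1: cx - w / 2,
      y1: cy - h / 2,
      x2: cx + w / 2,
      y2: cy + h / 2
    }
  end
end

row = [320.0, 240.0, 100.0, 50.0, 0.9, 0.0]
DetectionRowSketch.to_map(row, %{0 => "person"})
```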

Example

model
|> YOLO.Models.detect(image, prob_threshold: 0.5)
|> YOLO.Model.to_detected_objects()

load(options)

@spec load(Keyword.t()) :: YOLO.Model.t()

Loads a YOLO model from an ONNX file.

Required Options

  • model_path - Path to the .onnx model file

Optional Options

  • model_impl - Module implementing the YOLO.Model behaviour (default: YOLO.Models.Ultralytics)

  • classes_path - Path to the .json file containing class labels. If not provided, classes will not be loaded.

  • eps - List of execution providers to pass to Ortex, e.g. [:coreml], [:cuda], [:tensorrt], [:directml] (default: [:cpu])

  • json_decoder - Function to decode JSON strings (default: &:json.decode/1)
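The split between required and optional keys can be pictured with a small sketch. LoadOptsSketch is a hypothetical module written for this example, not the library's implementation; it only shows one common Elixir way to enforce a required key and fill in the documented defaults for model_impl and eps.

```elixir
defmodule LoadOptsSketch do
  # Hypothetical illustration only; defaults are copied from the option list above.
  @defaults [model_impl: YOLO.Models.Ultralytics, eps: [:cpu], classes_path: nil]

  def normalize(opts) do
    # Raises KeyError when the required :model_path key is missing.
    _ = Keyword.fetch!(opts, :model_path)

    # Caller-supplied values override the defaults.
    Keyword.merge(@defaults, opts)
  end
end

LoadOptsSketch.normalize(model_path: "models/yolo11n.onnx", eps: [:cuda])
```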

Returns

A YOLO.Model.t() struct containing:

  • ref - Reference to the loaded ONNX model
  • model_impl - The module implementing the model version
  • classes - Map of class indices to labels
  • shapes - Input/output tensor shapes
  • model_data - Model-specific data

Examples

  yolox_model = YOLO.Models.load(
    model_path: "models/yolox_s.onnx",
    classes_path: "models/coco_classes.json",
    model_impl: YOLO.Models.YOLOX
  )
  ultralytics_model = YOLO.Models.load(
    model_path: "models/yolo11n.onnx",
    classes_path: "models/coco_classes.json",
    model_impl: YOLO.Models.Ultralytics
  )

load_classes(model, classes_path, options \\ [])

@spec load_classes(YOLO.Model.t(), String.t(), Keyword.t()) :: YOLO.Model.t()

Loads class labels from a JSON file and adds them to an existing model.

Arguments

  • model - A YOLO.Model.t() struct to update with class labels
  • classes_path - Path to the JSON file containing class labels
  • options - Keyword list of options

Options

  • json_decoder - Function to decode JSON strings (default: &:json.decode/1)

Returns

An updated YOLO.Model.t() struct with the classes field populated.
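This page does not specify the layout of the classes JSON file, so the following is only a sketch of how decoded JSON could be turned into a map of class indices to labels. It assumes the decoded value is either a list of labels or an object keyed by stringified indices; ClassesSketch is a hypothetical helper and may not match what load_classes/3 does.

```elixir
defmodule ClassesSketch do
  # Hypothetical helper for illustration only; the accepted input shapes are assumptions.

  # A plain list of labels: the position in the list becomes the class index.
  def to_index_map(labels) when is_list(labels) do
    labels
    |> Enum.with_index()
    |> Map.new(fn {label, idx} -> {idx, label} end)
  end

  # An object such as %{"0" => "person"}: string keys are parsed into integers.
  def to_index_map(%{} = by_string_index) do
    Map.new(by_string_index, fn {key, label} -> {String.to_integer(key), label} end)
  end
end

ClassesSketch.to_index_map(["person", "bicycle"])
```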

run(model, image_tensor)

@spec run(YOLO.Model.t(), Nx.Tensor.t()) :: Nx.Tensor.t()

Runs inference on a preprocessed input tensor.

Arguments

  • model - A loaded YOLO.Model.t() struct
  • image_tensor - Preprocessed input tensor matching model's expected shape

Returns

The raw output tensor from the model. For YOLOv8n:

  • Input shape: {1, 3, 640, 640} (batch, channels, height, width)
  • Output shape: {1, 84, 8400} (batch, values per candidate box, candidate boxes)

This function is typically called internally by detect/3, so you should rarely need to call it directly.
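The YOLOv8n shapes quoted above can be checked by hand. The sketch below assumes the standard YOLOv8 setup, which this page does not state: three detection strides (8, 16, 32) and a COCO-trained model with 80 classes. ShapeSketch is a hypothetical helper for the arithmetic only.

```elixir
defmodule ShapeSketch do
  # Hypothetical helper for illustration only; strides and class count are assumptions.
  def output_shape(input_size, num_classes, strides \\ [8, 16, 32]) do
    # Each stride yields a square grid with (input_size / stride)^2 candidate boxes.
    candidates =
      strides
      |> Enum.map(fn stride -> div(input_size, stride) end)
      |> Enum.map(fn grid -> grid * grid end)
      |> Enum.sum()

    # Each candidate carries 4 box values (cx, cy, w, h) plus one score per class.
    {1, 4 + num_classes, candidates}
  end
end

ShapeSketch.output_shape(640, 80)
# => {1, 84, 8400}
```

Here 8400 is 80 * 80 + 40 * 40 + 20 * 20, and 84 is 4 box values plus 80 class scores.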