YOLO.Models.Ultralytics (YOLO v0.1.2)
Ultralytics model implementation for preprocessing input images and postprocessing detections using non-maximum suppression (NMS).
Supports YOLOv8 and YOLOv11 models trained on the COCO dataset (80 classes).
Summary
Functions
postprocess/4 - Post-processes the model's raw output to produce a filtered list of detected objects.
preprocess/3 - Preprocesses an input image to match the model's required format.
Functions
@spec postprocess( model :: YOLO.Model.t(), model_output :: Nx.Tensor.t(), scaling_config :: YOLO.FrameScalers.ScalingConfig.t(), opts :: Keyword.t() ) :: [[float()]]
Post-processes the model's raw output to produce a filtered list of detected objects.
The raw output tensor has shape {1, 84, 8400} where:
- First dimension (1) is the batch size
- Second dimension (84) contains 4 bbox coordinates + 80 class probabilities per detection
- Third dimension (8400) is the number of candidate detections
The processing steps are:
- Reshapes and transposes output to {8400, 84} format
- Applies non-maximum suppression (NMS) to filter overlapping detections
- Scales bounding boxes back to original image dimensions
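The transpose and thresholding steps can be illustrated without Nx. The following is a minimal sketch on plain nested lists, not the library's implementation: the `OutputSketch` module and its functions are hypothetical, the toy tensor has 4 bbox rows plus 2 class rows instead of 84, and treating the threshold as inclusive (`>=`) is an assumption.

```elixir
defmodule OutputSketch do
  # Hypothetical helper: turns a channels-first output (one row per attribute,
  # one column per candidate) into one row per candidate. This mirrors the
  # {84, 8400} -> {8400, 84} step on a much smaller scale.
  def transpose(rows), do: rows |> Enum.zip() |> Enum.map(&Tuple.to_list/1)

  # Picks the most probable class for one candidate row
  # [cx, cy, w, h | class_probs] and returns [cx, cy, w, h, prob, class_idx].
  def best_class([cx, cy, w, h | class_probs]) do
    {prob, class_idx} =
      class_probs
      |> Enum.with_index()
      |> Enum.max_by(fn {p, _idx} -> p end)

    [cx, cy, w, h, prob, class_idx]
  end

  # Keeps only candidates whose best class probability reaches the threshold
  # (inclusive comparison is an assumption of this sketch).
  def candidates(rows, prob_threshold) do
    rows
    |> transpose()
    |> Enum.map(&best_class/1)
    |> Enum.filter(fn [_, _, _, _, prob, _] -> prob >= prob_threshold end)
  end
end

# Toy output: 4 bbox rows + 2 class rows, 3 candidate columns.
output = [
  [100.0, 200.0, 300.0],
  [50.0, 60.0, 70.0],
  [20.0, 30.0, 40.0],
  [10.0, 15.0, 20.0],
  [0.9, 0.1, 0.2],
  [0.05, 0.7, 0.1]
]

IO.inspect(OutputSketch.candidates(output, 0.5))
```

The third candidate is dropped because its best class probability (0.2) is below the threshold; the survivors are what NMS would then filter.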
Arguments
- model - YOLO.Model struct containing model metadata
- model_output - Raw output tensor from model inference, with shape {1, 84, 8400}
- scaling_config - Scaling configuration from the preprocessing step
- opts - Keyword list of options:
  - prob_threshold - Minimum probability threshold for detections
  - iou_threshold - IoU threshold for non-maximum suppression
  - nms_fun - Optional custom NMS function (defaults to YOLO.NMS.run/3)
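To make the role of iou_threshold concrete, here is a minimal sketch of greedy non-maximum suppression in plain Elixir. It is an illustration only and not the code behind YOLO.NMS.run/3: the `NmsSketch` module is hypothetical, it takes two arguments rather than three, and suppressing only boxes of the same class is an assumption of this sketch.

```elixir
defmodule NmsSketch do
  # Intersection over union of two detections given as [cx, cy, w, h | _].
  def iou(a, b) do
    {ax1, ay1, ax2, ay2} = corners(a)
    {bx1, by1, bx2, by2} = corners(b)

    # Overlap along each axis, clamped at zero when the boxes are disjoint.
    inter_w = max(min(ax2, bx2) - max(ax1, bx1), 0)
    inter_h = max(min(ay2, by2) - max(ay1, by1), 0)
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter

    if union > 0, do: inter / union, else: 0.0
  end

  # Greedy NMS: visit detections from highest to lowest probability and keep
  # one only if no already-kept box of the same class overlaps it by more
  # than iou_threshold.
  def run(detections, iou_threshold) do
    detections
    |> Enum.sort_by(fn [_, _, _, _, prob, _] -> prob end, :desc)
    |> Enum.reduce([], fn det, kept ->
      suppressed? =
        Enum.any?(kept, fn k -> same_class?(k, det) and iou(k, det) > iou_threshold end)

      if suppressed?, do: kept, else: [det | kept]
    end)
    |> Enum.reverse()
  end

  # Converts center/size form to corner form {x1, y1, x2, y2}.
  defp corners([cx, cy, w, h | _]) do
    {cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2}
  end

  defp same_class?([_, _, _, _, _, c1], [_, _, _, _, _, c2]), do: c1 == c2
end

detections = [
  [100.0, 100.0, 50.0, 50.0, 0.9, 0],
  [105.0, 100.0, 50.0, 50.0, 0.8, 0],
  [300.0, 300.0, 40.0, 40.0, 0.7, 1]
]

IO.inspect(NmsSketch.run(detections, 0.45))
```

A lower iou_threshold suppresses more aggressively, while a higher one tolerates more overlap between kept boxes.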
Returns
List of detections where each detection is a list [cx, cy, w, h, prob, class_idx]:
- cx, cy - Center coordinates of bounding box
- w, h - Width and height of bounding box
- prob - Detection probability
- class_idx - Class index (0-79 corresponding to COCO classes)
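Downstream code often wants corner coordinates and a class name rather than the raw list. The sketch below shows one way to convert a single detection; the `DetectionSketch` module and the shortened class list are hypothetical and only for illustration. A real list would hold all 80 COCO names in index order.

```elixir
defmodule DetectionSketch do
  # Converts [cx, cy, w, h, prob, class_idx] into a map with corner
  # coordinates. class_idx may arrive as a float, so it is truncated
  # before being used as a list index.
  def to_map([cx, cy, w, h, prob, class_idx], classes) do
    %{
      class: Enum.at(classes, trunc(class_idx)),
      prob: prob,
      x1: cx - w / 2,
      y1: cy - h / 2,
      x2: cx + w / 2,
      y2: cy + h / 2
    }
  end
end

# Only the first three COCO class names, to keep the example short.
classes = ["person", "bicycle", "car"]

IO.inspect(DetectionSketch.to_map([100.0, 50.0, 20.0, 10.0, 0.9, 2.0], classes))
```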
@spec preprocess(YOLO.Model.t(), term(), Keyword.t()) :: {Nx.Tensor.t(), YOLO.FrameScalers.ScalingConfig}
Preprocesses an input image to match the model's required format.
The preprocessing steps are:
- Scales and pads the image to match model input dimensions while preserving aspect ratio
- Converts RGB to BGR color format (Ultralytics models expect BGR input)
- Normalizes pixel values to [0,1] range by dividing by 255
- Transposes dimensions to match model's expected format
- Adds batch dimension
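The arithmetic behind the first three steps can be sketched in plain Elixir. This is an illustration under stated assumptions, not the library's frame scaler: the `LetterboxSketch` module is hypothetical, the padding is reported as a total per axis because the source does not say where it is placed, and the channel reversal simply follows the RGB to BGR step listed above.

```elixir
defmodule LetterboxSketch do
  # Computes the scale factor that fits {width, height} inside
  # {target_w, target_h} while preserving aspect ratio, plus the total
  # padding needed on each axis to reach the target size.
  def fit({width, height}, {target_w, target_h}) do
    scale = min(target_w / width, target_h / height)
    new_w = round(width * scale)
    new_h = round(height * scale)

    %{
      scale: scale,
      scaled_size: {new_w, new_h},
      padding: {target_w - new_w, target_h - new_h}
    }
  end

  # Reverses the channel order of one [r, g, b] pixel and normalizes each
  # 0..255 value to the [0, 1] range.
  def convert_pixel([r, g, b]), do: Enum.map([b, g, r], &(&1 / 255))
end

IO.inspect(LetterboxSketch.fit({1280, 720}, {640, 640}))
IO.inspect(LetterboxSketch.convert_pixel([255, 0, 51]))
```

For a 1280x720 frame and a 640x640 model input, the limiting axis is the width, so the frame is halved to 640x360 and the remaining 280 rows are padding.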
Arguments
- model - YOLO.Model struct containing model metadata
- image - Input image in the implementation's native format
- options - Keyword list of options, including the frame_scaler module
Returns
{input_tensor, scaling_config} tuple where:
- input_tensor has shape {1, 3, height, width}
- scaling_config contains scaling/padding info for postprocessing
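To show why postprocessing needs the scaling/padding info, the sketch below maps a detection from model input coordinates back to the original image. The fields of YOLO.FrameScalers.ScalingConfig are not listed in this page, so the map keys used here (`scale`, `pad_left`, `pad_top`) and the `RescaleSketch` module are hypothetical stand-ins.

```elixir
defmodule RescaleSketch do
  # Undoes letterboxing for one detection: subtract the padding offset from
  # the center, then divide positions and sizes by the scale factor.
  # The config keys are assumptions made for this sketch.
  def to_original([cx, cy, w, h, prob, class_idx], %{scale: scale, pad_left: pl, pad_top: pt}) do
    [(cx - pl) / scale, (cy - pt) / scale, w / scale, h / scale, prob, class_idx]
  end
end

# A 1280x720 frame scaled by 0.5 and vertically centered in a 640x640 input
# would have 140 rows of padding above the image content.
config = %{scale: 0.5, pad_left: 0, pad_top: 140}

IO.inspect(RescaleSketch.to_original([320.0, 320.0, 100.0, 50.0, 0.9, 0], config))
```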