Converting ONNX models to Axon

Mix.install(
  [
    {:axon, ">= 0.5.0"},
    {:exla, ">= 0.5.0"},
    {:axon_onnx, ">= 0.4.0"},
    {:stb_image, ">= 0.6.0"},
    {:kino, ">= 0.9.0"},
    {:req, ">= 0.3.8"}
  ]
  # for Nvidia GPU change to "cuda111" for CUDA 11.1+ or "cuda118" for CUDA 11.8
  # CUDA 12.x not supported by XLA
  # or you can put this value in ENV variables in Livebook settings
  # XLA_TARGET=cuda111
  # system_env: %{"XLA_TARGET" => xla_target}
)

Converting an ONNX model into Axon

Axon is a new machine learning library, specific to Elixir. We would like to take advantage of the large number of models that have been written in other languages and machine learning frameworks. Let's take a look at how we could use a model developed in another language.

Converting models developed by data scientists into a production-capable implementation is a challenge for all languages and frameworks. ONNX is an interchange format that allows models written in one language or framework to be converted into another language and framework.

The source model must use constructs that map into ONNX, and the destination framework must support the model's ONNX constructs. From an Elixir perspective, we are interested in ONNX models that axon_onnx can convert into Axon models.

Why is ONNX important to Axon?

Elixir can get access to thousands of public models, and your organization may have private models written in other languages and frameworks. Axon will be hard pressed to quickly repeat the countless person-hours spent on developing models in other frameworks like TensorFlow and PyTorch. However, if the model can be converted into ONNX and then into Axon, we can directly run the model in Elixir.

Setting up our environment

Axon runs on top of Nx (Numerical Elixir). Nx has backends for both Google's XLA (via EXLA) and PyTorch (via Torchx). In this guide, we will use EXLA. We'll also convert from an ONNX model into an Axon model using axon_onnx.

You can find all dependencies in the installation cell at the top of the notebook. In there, you will also find the XLA_TARGET environment variable which you can set to "cuda111" or "rocm" if you have any of those GPUs available. Let's also configure Nx to store tensors in EXLA by default:

#  Nx.default_backend(EXLA.Backend)

We'll also need local access to ONNX files. For this notebook, the models/onnx folder contains the ONNX model file. This notebook assumes the output files will be written to the models/axon folder. Copy your ONNX model files into the models/onnx folder.

This opinionated module presents a simple API for loading in an ONNX file and saving the converted Axon model in the provided directory. This API will allow us to save multiple models pretty quickly.

defmodule OnnxToAxon do
  @moduledoc """
  Helper module from ONNX to Axon.
  """

  @doc """
  Loads an ONNX model into Axon and saves the model

  ## Examples

      OnnxToAxon.onnx_axon(path_to_onnx_file, path_to_axon_dir)

  """
  def onnx_axon(path_to_onnx_file, path_to_axon_dir) do
    axon_name = axon_name_from_onnx_path(path_to_onnx_file)
    path_to_axon = Path.join(path_to_axon_dir, axon_name)

    # Import the ONNX graph, then serialize the model and its parameters to disk
    {model, parameters} = AxonOnnx.import(path_to_onnx_file)
    model_bytes = Axon.serialize(model, parameters)
    File.write!(path_to_axon, model_bytes)
  end

  # "some/dir/cats_v_dogs.onnx" becomes "cats_v_dogs.axon"
  defp axon_name_from_onnx_path(onnx_path) do
    model_root = onnx_path |> Path.basename() |> Path.rootname()
    "#{model_root}.axon"
  end
end
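The output file name is derived from the ONNX file name. As a minimal sketch of that naming step, assuming the converted file keeps the ONNX base name and takes an `.axon` extension (which matches the `cats_v_dogs.axon` file read later in this notebook), the following uses only the standard `Path` functions. The `AxonNameSketch` module is hypothetical and exists only for illustration:

```elixir
# Hypothetical stand-alone copy of the naming step, for illustration only.
defmodule AxonNameSketch do
  # Drops the directory and the extension, then appends ".axon" (assumed extension).
  def axon_name(onnx_path) do
    model_root = onnx_path |> Path.basename() |> Path.rootname()
    model_root <> ".axon"
  end
end

axon_name = AxonNameSketch.axon_name("models/onnx/cats_v_dogs.onnx")
# axon_name is "cats_v_dogs.axon"

output_path = Path.join("models/axon", axon_name)
# output_path is "models/axon/cats_v_dogs.axon"
```

Deriving the name this way means two ONNX files with the same base name in different folders would overwrite each other in the output directory.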

ONNX model

For this example, we'll use a couple of ONNX models that have been saved on the Hugging Face Hub.

The ONNX models were trained in (PyTorch) using the following notebooks:

To reproduce this notebook, you will need the ONNX files, which can be found on the Hugging Face Hub. Download the ONNX models from:

Download the files and place them in a directory of your choice. By default, we will assume you downloaded them to the same directory as the notebook:

File.cd!(__DIR__)

Now let's convert an ONNX model into Axon

path_to_onnx_file = "cats_v_dogs.onnx"
path_to_axon_dir = "."
OnnxToAxon.onnx_axon(path_to_onnx_file, path_to_axon_dir)
path_to_onnx_file = "cat_dog_breeds.onnx"
path_to_axon_dir = "."
OnnxToAxon.onnx_axon(path_to_onnx_file, path_to_axon_dir)

Inference on ONNX derived models

To run inference on the model, you'll need 10 images focused on cats or dogs. You can download the images used in training the model at:


Or you can find or use your own images. In this notebook, we are going to use the local copies of the Oxford Pets dataset that was used in training the model.

Let's load the Axon model.

cats_v_dogs = File.read!("cats_v_dogs.axon")
{cats_v_dogs_model, cats_v_dogs_params} = Axon.deserialize(cats_v_dogs)

We need a tensor representation of an image. Let's start by looking at samples of our data.

File.read!("oxford-iiit-pet/images/havanese_71.jpg")

To manipulate the images, we will use the StbImage library:

{:ok, img} = StbImage.read_file("oxford-iiit-pet/images/havanese_71.jpg")
%StbImage{data: binary, shape: shape, type: type} = StbImage.resize(img, 224, 224)

Now let's work on a batch of images and convert them to tensors. Here are the images we will work with:

file_names = [

Next we resize the images:

resized_images =
  Enum.map(file_names, fn file_name ->
    ("oxford-iiit-pet/images/" <> file_name)
    |> IO.inspect(label: file_name)
    |> StbImage.read_file!()
    |> StbImage.resize(224, 224)
  end)

And finally convert them into tensors by using StbImage.to_nx/1. The created tensor will have three axes, named :height, :width, and :channels respectively. Our goal is to stack the tensors, then normalize and transpose their axes to the order expected by the neural network:

img_tensors =
  resized_images
  |> Enum.map(&StbImage.to_nx/1)
  |> Nx.stack(name: :index)
  |> Nx.divide(255.0)
  |> Nx.transpose(axes: [:index, :channels, :height, :width])
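To see what the stacking and transposing steps do to the tensor shape, here is a small sketch that runs the same pipeline on synthetic data instead of real photos. It assumes Nx is already available (it is pulled in by the dependencies in the setup cell); the two constant-valued "images" are made up purely for illustration:

```elixir
# Assumes Nx is already installed by the notebook's setup cell.
# Two hypothetical 224x224 RGB images, filled with a constant pixel value,
# using the same axis names that StbImage.to_nx/1 produces.
fake_images =
  for value <- [0, 255] do
    Nx.broadcast(
      Nx.tensor(value, type: {:u, 8}),
      {224, 224, 3},
      names: [:height, :width, :channels]
    )
  end

batch =
  fake_images
  # Adds a leading :index axis, giving {2, 224, 224, 3}
  |> Nx.stack(name: :index)
  # Scales pixel values from 0..255 to 0.0..1.0
  |> Nx.divide(255.0)
  # Moves channels ahead of height and width
  |> Nx.transpose(axes: [:index, :channels, :height, :width])

batch_shape = Nx.shape(batch)
# batch_shape is {2, 3, 224, 224}

batch_names = Nx.names(batch)
# batch_names is [:index, :channels, :height, :width]
```

Naming the axes lets the transpose be written in terms of what each axis means rather than its position, which makes the channels-first layout explicit.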

With our input data, it is finally time to work on predictions. First let's define a helper module:

defmodule Predictions do
  @doc """
  When provided a Tensor of single label predictions, returns the best vocabulary match for
  each row in the prediction tensor.

  ## Examples

     # iex> Predictions.single_label_classification(tensor_of_predictions, dog_cat_vocabulary)
     # ["dog", "cat", "dog"]

  """
  def single_label_classification(predictions_batch, vocabulary) do
    IO.inspect(Nx.shape(predictions_batch), label: "predictions batch shape")

    for prediction_tensor <- Nx.to_batched(predictions_batch, 1) do
      # Pair each score with its label and keep the pair with the highest score
      {_prediction_value, prediction_label} =
        prediction_tensor
        |> Nx.to_flat_list()
        |> Enum.zip(vocabulary)
        |> Enum.max()

      prediction_label
    end
  end
end
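The label selection in this helper can be illustrated without any tensors. The sketch below uses a plain list in place of one row of predictions; the scores and the two-entry vocabulary are hypothetical values chosen for illustration, and the vocabulary order is assumed to match the order of the model's outputs:

```elixir
# Hypothetical scores for one image; a plain list stands in for a tensor row.
scores = [0.12, 0.88]

# Assumed label order; it must match the order of the model's output units.
vocabulary = ["cat", "dog"]

{_score, label} =
  scores
  # Produces [{0.12, "cat"}, {0.88, "dog"}]
  |> Enum.zip(vocabulary)
  # Tuples compare element by element, so the highest score wins
  |> Enum.max()

# label is "dog"
```

Because tuples are compared element by element, a tie between two scores would be broken by comparing the labels themselves, which is arbitrary from the model's point of view.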


Now we deserialize the model:

{cats_v_dogs_model, cats_v_dogs_params} = Axon.deserialize(cats_v_dogs)

Run a prediction using the EXLA compiler for performance:

tensor_of_predictions =
  Axon.predict(cats_v_dogs_model, cats_v_dogs_params, img_tensors, compiler: EXLA)

And finally, retrieve the predicted label:

dog_cat_vocabulary = [

Predictions.single_label_classification(tensor_of_predictions, dog_cat_vocabulary)

Let's repeat the above process for the dog and cat breed model.

cat_dog_vocabulary = [
cat_dog_breeds = File.read!("cat_dog_breeds.axon")
{cat_dog_breeds_model, cat_dog_breeds_params} = Axon.deserialize(cat_dog_breeds)
Axon.predict(cat_dog_breeds_model, cat_dog_breeds_params, img_tensors)
|> Predictions.single_label_classification(cat_dog_vocabulary)

For cat and dog breeds, the model performed pretty well, but it was not perfect.
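One way to put a number on "pretty well" is to compare the predicted labels against the labels you expect for each image. The following is only a sketch: both lists are hypothetical values made up for illustration, not results from the models above:

```elixir
# Hypothetical predicted and expected labels, for illustration only.
predicted = ["havanese", "sphynx", "beagle", "siamese"]
expected = ["havanese", "sphynx", "boxer", "siamese"]

correct =
  predicted
  |> Enum.zip(expected)
  |> Enum.count(fn {p, e} -> p == e end)

accuracy = correct / length(expected)
# accuracy is 0.75
```

In practice, `predicted` would be the list returned by `Predictions.single_label_classification/2`, and `expected` would come from whatever ground-truth labels you have for your images.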