Ortex.Serving (Ortex v0.1.10)
This is a lightweight wrapper for using the Nx.Serving behaviour with Ortex. jit and
defn functions are not supported here; it is strictly for serving batches to
an Ortex.Model for inference.
Examples
Inline/serverless workflow
To quickly create an Ortex.Serving and run it:
iex> model = Ortex.load("./models/resnet50.onnx")
iex> serving = Nx.Serving.new(Ortex.Serving, model)
iex> batch = Nx.Batch.stack([{Nx.broadcast(0.0, {3, 224, 224})}])
iex> {result} = Nx.Serving.run(serving, batch)
iex> result |> Nx.backend_transfer() |> Nx.argmax(axis: 1)
#Nx.Tensor<
  s64[1]
  [499]
>
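For repeated predictions, the steps above can be wrapped in a small helper. A minimal sketch using only the calls shown in this example (the module and function names are illustrative):

defmodule MyApp.Classifier do
  # Runs a single {3, 224, 224} image tensor through an already-built serving
  # and returns the index of the highest-scoring class.
  def predict(serving, tensor) do
    batch = Nx.Batch.stack([{tensor}])
    {result} = Nx.Serving.run(serving, batch)

    result
    |> Nx.backend_transfer()
    |> Nx.argmax(axis: 1)
  end
end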
Stateful/process workflow
An Ortex.Serving can also be started in your Application's supervision tree:
model = Ortex.load("./models/resnet50.onnx")
children = [
  {Nx.Serving,
   serving: Nx.Serving.new(Ortex.Serving, model),
   name: MyServing,
   batch_size: 10,
   batch_timeout: 100}
]
opts = [strategy: :one_for_one, name: OrtexServing.Supervisor]
Supervisor.start_link(children, opts)

With the application started, batches can now be sent to the Ortex.Serving process:
iex> Nx.Serving.batched_run(MyServing, Nx.Batch.stack([{Nx.broadcast(0.0, {3, 224, 224})}]))
{#Nx.Tensor<
  f32[1][1000]
  Ortex.Backend
  [
    [...]
  ]
>}
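Callers elsewhere in the application can reach the named process the same way. A minimal sketch, assuming the supervision tree above (the module and function names are illustrative):

defmodule MyApp.Inference do
  # Sends one image tensor to the MyServing process, which batches it with
  # other concurrent requests, and returns the predicted class index.
  def classify(tensor) do
    {result} = Nx.Serving.batched_run(MyServing, Nx.Batch.stack([{tensor}]))

    result
    |> Nx.backend_transfer()
    |> Nx.argmax(axis: 1)
  end
end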