Nx.Vulkan.Node (nx_vulkan v0.1.0)


Long-lived per-machine GPU node. A named GenServer that owns the spirit VkPipelineCache, the persistent buffer registry, and the watchdog/timeout layer. Clients submit work via with_node/2 (or the lower-level exec/2); the node serializes execution and reports failures such as a watchdog timeout or a dead node as error tuples.

This is the GPU-only generic core. MCMC / NUTS / sampler-specific dispatch logic lives in Exmc.NUTS.Vulkan.Dispatch (or any other client) and calls into this node via with_node/2.

Lifecycle

Start under your application's supervisor:

children = [
  {Nx.Vulkan.Node, []}
]

Or for ad-hoc use (REPL, benchmarks):

{:ok, _pid} = Nx.Vulkan.Node.start_link()

Generic dispatch

with_node/2 runs an arbitrary 0-arity function inside the GenServer process. The function has access to the spirit pipeline cache (loaded once at init) and may stash per-shader buffer state in the process dictionary.

result = Nx.Vulkan.Node.with_node(fn ->
  # any GPU work — uses Nx.Vulkan.Native NIFs directly,
  # serialized through this GenServer process.
  Nx.Vulkan.Native.leapfrog_chain_synth(q, p, m, push, k, spv)
end)

Returns the function's return value, or {:error, reason} on watchdog timeout / dead node.
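As a rough illustration of the dispatch pattern described above, the following sketch shows one way a named GenServer can run a caller-supplied 0-arity function inside its own process and turn a call timeout or a missing process into an error tuple. SketchNode, its explicit timeout argument, and its state shape are hypothetical names introduced here for illustration; this is not Nx.Vulkan.Node's actual implementation, which may differ in every detail.

```elixir
defmodule SketchNode do
  # Hypothetical stand-in for illustration only, NOT Nx.Vulkan.Node itself.
  use GenServer

  def start_link(opts \\ []) do
    name = Keyword.get(opts, :name, __MODULE__)
    GenServer.start_link(__MODULE__, :ok, name: name)
  end

  # Run a 0-arity function inside the server process.
  # The explicit `timeout` argument is an assumption of this sketch.
  def with_node(fun, name \\ __MODULE__, timeout \\ :infinity) when is_function(fun, 0) do
    case GenServer.whereis(name) do
      nil ->
        # Nothing is registered under `name`.
        {:error, :node_dead}

      pid ->
        try do
          GenServer.call(pid, {:exec, fun}, timeout)
        catch
          # The call timed out; the server keeps running `fun` regardless.
          :exit, {:timeout, _} -> {:error, :node_timeout}
          # The server exited before or during the call.
          :exit, _ -> {:error, :node_dead}
        end
    end
  end

  @impl true
  def init(:ok), do: {:ok, %{execs: 0}}

  @impl true
  def handle_call({:exec, fun}, _from, state) do
    # One call is handled at a time, so submitted functions never overlap.
    {:reply, fun.(), %{state | execs: state.execs + 1}}
  end
end
```

Because a GenServer handles one message at a time, work submitted from many client processes is executed strictly in sequence, which is the serialization property the node relies on.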

Watchdog

The timeout is read from Application.get_env(:nx_vulkan, :node_timeout_ms, :infinity), so the watchdog is disabled unless a value is configured. On timeout the calling process gets {:error, :node_timeout} and is free to fall back to a CPU/EXLA path. The GenServer process itself remains blocked on the in-flight NIF call until that call returns. This is by design: cancelling Vulkan dispatches mid-flight is unsafe.
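A minimal configuration sketch for enabling the watchdog, assuming a standard Mix project with a config/config.exs file. The 5_000 value is an arbitrary illustration, not a recommended setting, and whether the value is read once at startup or on every call is not stated here.

```elixir
# config/config.exs
import Config

# Illustrative value only: give up waiting on the node after 5 seconds.
# Leaving the key unset keeps the default of :infinity (no timeout).
config :nx_vulkan, node_timeout_ms: 5_000
```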

Summary

Functions

alive?(name \\ Nx.Vulkan.Node)

Whether the named node is alive.

child_spec(init_arg)

Returns a specification to start this module under a supervisor.

start_link(opts \\ [])

status(name \\ Nx.Vulkan.Node)

Quick status read: uptime and total exec count.

with_node(fun, name \\ Nx.Vulkan.Node)

Run a 0-arity function inside the node's GenServer process. Used for any GPU work that needs to share the pipeline cache and buffer state with other callers.

Functions

alive?(name \\ Nx.Vulkan.Node)

Whether the named node is alive.

child_spec(init_arg)

Returns a specification to start this module under a supervisor.

See Supervisor.

start_link(opts \\ [])

status(name \\ Nx.Vulkan.Node)

Quick status read: uptime and total exec count.

with_node(fun, name \\ Nx.Vulkan.Node)

Run a 0-arity function inside the node's GenServer process. Used for any GPU work that needs to share the pipeline cache and buffer state with other callers.

Returns the function's return value, or:

  • {:error, :node_timeout} if the call exceeded the :node_timeout_ms value configured for :nx_vulkan
  • {:error, :node_dead} if no node is registered under name
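The two error tuples above lend themselves to a caller-side fallback. The sketch below is one possible shape for that, not part of nx_vulkan: FallbackSketch and its run/3 function are hypothetical names, and the submit argument is assumed to be any 1-arity function that accepts a 0-arity function, for example &Nx.Vulkan.Node.with_node/1.

```elixir
defmodule FallbackSketch do
  # Hypothetical helper for illustration; not part of nx_vulkan.
  #
  # `submit`  : 1-arity function that runs a 0-arity function on the node,
  #             e.g. &Nx.Vulkan.Node.with_node/1 (an assumption of this sketch)
  # `gpu_fun` : 0-arity function holding the GPU work
  # `cpu_fun` : 0-arity function holding the fallback path
  def run(submit, gpu_fun, cpu_fun)
      when is_function(submit, 1) and is_function(gpu_fun, 0) and is_function(cpu_fun, 0) do
    case submit.(gpu_fun) do
      # Only the two documented node failures trigger the fallback.
      {:error, reason} when reason in [:node_timeout, :node_dead] ->
        cpu_fun.()

      # Anything else, including other error tuples, is returned unchanged.
      result ->
        result
    end
  end
end
```

Passing the submit function in as an argument keeps the helper independent of the node, so the fallback logic can be exercised in tests without a GPU.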