Snakepit.Telemetry.GPUProfiler (Snakepit v0.13.0)


GPU memory and utilization profiler.

Legacy Optional Module

Snakepit does not call this module internally. It remains available for compatibility and may be removed in v0.16.0 or later.

Prefer host-managed GPU sampling and direct :telemetry emission for new integrations.

Periodically samples GPU metrics and emits telemetry events. Supports NVIDIA CUDA GPUs via nvidia-smi.
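As an illustration of the host-managed approach recommended above, here is a minimal sketch that polls nvidia-smi once and emits one telemetry event per GPU. The module name `MyApp.GPUSampler`, the event name `[:my_app, :gpu, :sample]`, and the measurement keys are hypothetical and not part of Snakepit. The code assumes `nvidia-smi` is on the PATH and that the `:telemetry` library is a dependency of the host application.

```elixir
defmodule MyApp.GPUSampler do
  @moduledoc "Hypothetical host-side GPU sampler (not part of Snakepit)."

  # Standard nvidia-smi query flags: CSV output without header or units.
  @args ~w(--query-gpu=index,memory.used,memory.total,utilization.gpu
           --format=csv,noheader,nounits)

  # Hypothetical event name; pick one that fits your application.
  @event [:my_app, :gpu, :sample]

  @doc """
  Runs nvidia-smi once and emits one event per GPU.

  `emit` defaults to `:telemetry.execute/3`; pass a stub to test without it.
  """
  def sample(emit \\ &:telemetry.execute/3) do
    case System.find_executable("nvidia-smi") do
      nil ->
        {:error, :no_gpu}

      exe ->
        case System.cmd(exe, @args, stderr_to_stdout: true) do
          {out, 0} ->
            out
            |> parse()
            |> Enum.each(fn {measurements, metadata} ->
              emit.(@event, measurements, metadata)
            end)

            :ok

          {out, status} ->
            {:error, {:nvidia_smi_failed, status, out}}
        end
    end
  end

  @doc """
  Parses CSV lines such as "0, 1024, 8192, 37" into
  `{measurements, metadata}` tuples. Lines with non-numeric fields
  (for example "[N/A]") are skipped.
  """
  def parse(csv) do
    for line <- String.split(csv, "\n", trim: true),
        [{idx, ""}, {used, ""}, {total, ""}, {util, ""}] <- [parse_ints(line)] do
      {%{memory_used_mib: used, memory_total_mib: total, utilization_pct: util},
       %{device_index: idx}}
    end
  end

  # Splits one CSV line and attempts an integer parse on every field.
  defp parse_ints(line) do
    line
    |> String.split(",")
    |> Enum.map(&(&1 |> String.trim() |> Integer.parse()))
  end
end
```

Calling `MyApp.GPUSampler.sample()` from a timer the host already owns (for example `Process.send_after/3` in an existing GenServer) keeps the sampling schedule under the application's control. Keeping `parse/1` pure lets it be tested on machines without a GPU.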

Summary

Functions

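  • child_spec(init_arg) - Returns a specification to start this module under a supervisor.
  • disable(server \\ __MODULE__) - Disables GPU sampling.
  • enable(server \\ __MODULE__) - Enables GPU sampling.
  • get_stats(server \\ __MODULE__) - Returns profiler statistics.
  • sample_now(server \\ __MODULE__) - Triggers an immediate GPU sample.
  • set_interval(server \\ __MODULE__, interval_ms) - Updates the sampling interval.
  • start_link(opts \\ []) - Starts the GPU profiler.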

Types

state()

@type state() :: %{
  interval_ms: pos_integer(),
  enabled: boolean(),
  sample_count: non_neg_integer(),
  last_sample_time: integer() | nil,
  timer_ref: reference() | nil,
  devices: [Snakepit.Hardware.Selector.device()],
  sampler_fun: (Snakepit.Hardware.Selector.device() ->
                  {:ok, map()} | {:error, term()}),
  sample_task_ref: reference() | nil,
  sample_task_pid: pid() | nil,
  sample_now_waiters: [GenServer.from()]
}

Functions

child_spec(init_arg)

Returns a specification to start this module under a supervisor.

See Supervisor.

disable(server \\ __MODULE__)

@spec disable(GenServer.server()) :: :ok

Disables GPU sampling.

enable(server \\ __MODULE__)

@spec enable(GenServer.server()) :: :ok

Enables GPU sampling.

get_stats(server \\ __MODULE__)

@spec get_stats(GenServer.server()) :: map()

Returns profiler statistics.

sample_now(server \\ __MODULE__)

@spec sample_now(GenServer.server()) :: :ok | {:error, :no_gpu}

Triggers an immediate GPU sample.

set_interval(server \\ __MODULE__, interval_ms)

@spec set_interval(GenServer.server(), pos_integer()) ::
  :ok | {:error, :invalid_interval}

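Updates the sampling interval. Per the spec, interval_ms must be a positive integer; an invalid value yields {:error, :invalid_interval}.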

start_link(opts \\ [])

@spec start_link(keyword()) :: GenServer.on_start()

Starts the GPU profiler.

Options

  • :interval_ms - Sampling interval in milliseconds (default: 5000)
  • :enabled - Whether to start sampling immediately (default: true)
  • :name - GenServer name (default: __MODULE__)
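
The options above can also be passed through a supervisor child spec. The fragment below is a sketch that assumes child_spec/1 forwards its argument to start_link/1, which is the default behaviour generated by `use GenServer`; the interval and the paused start are illustrative values only.

```elixir
# In the host application's supervision tree.
children = [
  # Assumption: the keyword list reaches start_link/1 unchanged.
  # Starts the profiler paused, with a 10-second sampling interval.
  {Snakepit.Telemetry.GPUProfiler, interval_ms: 10_000, enabled: false}
]

Supervisor.start_link(children, strategy: :one_for_one)
```

Because :name defaults to the module itself, functions such as enable/1 and get_stats/1 can then be called without an explicit server argument.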