Vllm.ModelInspection (VLLM v0.3.0)

Model inspection utilities for vLLM.

Version

  • Requested: 0.14.0
  • Observed at generation: 0.14.0

Runtime Options

All functions accept a __runtime__ option for controlling execution behavior:

Vllm.ModelInspection.some_function(args, __runtime__: [timeout: 120_000])

Supported runtime options

  • :timeout - Call timeout in milliseconds (default: 120,000ms / 2 minutes)
  • :timeout_profile - Use a named profile (:default, :ml_inference, :batch_job, :streaming)
  • :stream_timeout - Timeout for streaming operations (default: 1,800,000ms / 30 minutes)
  • :session_id - Override the session ID for this call
  • :pool_name - Target a specific Snakepit pool (multi-pool setups)
  • :affinity - Override session affinity (:hint, :strict_queue, :strict_fail_fast)

Timeout Profiles

  • :default - 2 minute timeout for regular calls
  • :ml_inference - 10 minute timeout for ML/LLM workloads
  • :batch_job - Unlimited timeout for long-running jobs
  • :streaming - 2 minute timeout, 30 minute stream_timeout
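
As a rough illustration, the profile values listed above can be written out as a lookup table. This is only a sketch of the documented numbers, not how SnakeBridge or Snakepit resolve profiles internally; the module name TimeoutProfileSketch and the use of :infinity for the "unlimited" batch profile are assumptions made for the example.

```elixir
defmodule TimeoutProfileSketch do
  # Hypothetical helper: mirrors the documented profile values only.
  # Representing "unlimited" as :infinity is an assumption of this sketch.
  @profiles %{
    default: [timeout: 120_000],
    ml_inference: [timeout: 600_000],
    batch_job: [timeout: :infinity],
    streaming: [timeout: 120_000, stream_timeout: 1_800_000]
  }

  @doc "Returns the documented settings (in milliseconds) for a named profile."
  def settings(profile) do
    # Raises KeyError for a profile name that is not documented above.
    Map.fetch!(@profiles, profile)
  end
end
```

Calling TimeoutProfileSketch.settings(:ml_inference) yields the same 600_000 ms value that the explicit timeout override uses.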

Example with timeout override

# Use the ML inference profile for a slow call
Vllm.ModelInspection.format_model_inspection(model, __runtime__: [timeout_profile: :ml_inference])

# Or set an explicit timeout
Vllm.ModelInspection.format_model_inspection(model, __runtime__: [timeout: 600_000])

# Route to a pool and enforce strict affinity
Vllm.ModelInspection.format_model_inspection(model, __runtime__: [pool_name: :strict_pool, affinity: :strict_queue])

See SnakeBridge.Defaults for global timeout configuration.

Summary

Functions

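_format_index_ranges(indices, opts \\ [])

Format indices into range notation (e.g., [0,1,2,4,5,6] -> '0-2, 4-6').

_format_module_tree(module)

Format a module tree with indentation, grouping identical layers.

_get_child_signature(child, opts \\ [])

Get a signature for a child module to detect duplicates.

_get_module_info(module, opts \\ [])

Get info string for a module.

format_model_inspection(model, opts \\ [])

Format a model into a transformers-style hierarchical string.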

Functions

_format_index_ranges(indices, opts \\ [])

@spec _format_index_ranges(
  [integer()],
  keyword()
) :: {:ok, String.t()} | {:error, Snakepit.Error.t()}

Format indices into range notation (e.g., [0,1,2,4,5,6] -> '0-2, 4-6').

Parameters

  • indices (list(integer()))

Returns

  • String.t()
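
The range notation can be illustrated with a small pure-Elixir sketch. It does not call the Python implementation and may differ from it in edge cases; the module name IndexRangeSketch, the sorting and de-duplication of the input, and rendering a two-element run as "4-5" are assumptions of this example.

```elixir
defmodule IndexRangeSketch do
  @doc "Formats integers as range notation, e.g. [0, 1, 2, 4, 5, 6] -> \"0-2, 4-6\"."
  def format(indices) do
    indices
    |> Enum.sort()
    |> Enum.dedup()
    |> Enum.chunk_while([], &chunk/2, &flush/1)
    |> Enum.map_join(", ", &render/1)
  end

  # Extend the current run while the next index is consecutive;
  # otherwise emit the finished run and start a new one.
  defp chunk(i, []), do: {:cont, [i]}
  defp chunk(i, [prev | _] = run) when i == prev + 1, do: {:cont, [i | run]}
  defp chunk(i, run), do: {:cont, run, [i]}

  # Emit the trailing run, if any, once the input is exhausted.
  defp flush([]), do: {:cont, []}
  defp flush(run), do: {:cont, run, []}

  # Runs are accumulated in reverse, so the head is the last index of the run.
  defp render([single]), do: Integer.to_string(single)
  defp render([last | _] = run), do: "#{List.last(run)}-#{last}"
end
```

Isolated indices are printed on their own, so [28, 48] becomes "28, 48", matching the grouped-layer labels shown for _format_module_tree.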

_format_module_tree(module)

@spec _format_module_tree(term()) ::
  {:ok, [String.t()]} | {:error, Snakepit.Error.t()}

Format a module tree with indentation, grouping identical layers.

Produces output like:

  (layers): ModuleList(
    (0-27, 29-47): 47 x LlamaDecoderLayer(
      ...
    )
    (28, 48): 2 x DifferentDecoderLayer(
      ...
    )
  )

Parameters

  • module (term())
  • name (String.t() default: '')
  • indent (integer() default: 0)

Returns

  • list(String.t())
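
The grouping step can be sketched in plain Elixir, under the assumption that each child has already been reduced to a signature string (the role _get_child_signature plays). The module LayerGroupingSketch is hypothetical, it only produces the group header lines rather than the nested bodies, and it always prints the count, which may not match the real output for single layers.

```elixir
defmodule LayerGroupingSketch do
  @doc """
  Groups the positions of identical signatures and renders one header line
  per group. `signatures` is a list of strings, one per child, in layer order.
  """
  def group_lines(signatures) do
    signatures
    |> Enum.with_index()
    |> Enum.group_by(fn {sig, _i} -> sig end, fn {_sig, i} -> i end)
    # Keep groups in order of first appearance.
    |> Enum.sort_by(fn {_sig, indices} -> hd(indices) end)
    |> Enum.map(fn {sig, indices} ->
      "(#{ranges(indices)}): #{length(indices)} x #{sig}"
    end)
  end

  # Collapse ascending indices into {first, last} runs, then print them.
  defp ranges(indices) do
    indices
    |> Enum.reduce([], fn
      i, [{first, last} | rest] when i == last + 1 -> [{first, i} | rest]
      i, acc -> [{i, i} | acc]
    end)
    |> Enum.reverse()
    |> Enum.map_join(", ", fn
      {i, i} -> "#{i}"
      {first, last} -> "#{first}-#{last}"
    end)
  end
end
```

Grouping by signature rather than by class name alone means that two layers of the same class with different configurations stay in separate groups.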

_get_child_signature(child, opts \\ [])

@spec _get_child_signature(
  term(),
  keyword()
) :: {:ok, String.t()} | {:error, Snakepit.Error.t()}

Get a signature for a child module to detect duplicates.

Parameters

  • child (term())

Returns

  • String.t()

_get_module_info(module, opts \\ [])

@spec _get_module_info(
  term(),
  keyword()
) :: {:ok, String.t()} | {:error, Snakepit.Error.t()}

Get info string for a module.

Parameters

  • module (term())

Returns

  • String.t()

format_model_inspection(model, opts \\ [])

@spec format_model_inspection(
  term(),
  keyword()
) :: {:ok, String.t()} | {:error, Snakepit.Error.t()}

Format a model into a transformers-style hierarchical string.

Parameters

  • model (term())

Returns

  • String.t()
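
Per the spec, the call returns a tagged tuple rather than a bare string, so callers usually need to match on it. The helper below is a minimal sketch of that handling; InspectionResultSketch is a hypothetical module, and the wording of the failure message is an assumption.

```elixir
defmodule InspectionResultSketch do
  @doc "Turns the documented {:ok, text} | {:error, error} result into printable text."
  def to_text({:ok, text}) when is_binary(text), do: text
  def to_text({:error, error}), do: "model inspection failed: " <> inspect(error)
end

# With a live model reference (assumed here to be bound to `model`),
# usage might look like:
#
#   model
#   |> Vllm.ModelInspection.format_model_inspection()
#   |> InspectionResultSketch.to_text()
#   |> IO.puts()
```

Using inspect/1 on the error avoids assuming anything about the fields of Snakepit.Error.t().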