Model inspection utilities for vLLM.
Version
- Requested: 0.14.0
- Observed at generation: 0.14.0
Runtime Options
All functions accept a __runtime__ option for controlling execution behavior:
Vllm.ModelInspection.some_function(args, __runtime__: [timeout: 120_000])
Supported runtime options
- :timeout - Call timeout in milliseconds (default: 120,000 ms / 2 minutes)
- :timeout_profile - Use a named profile (:default, :ml_inference, :batch_job, :streaming)
- :stream_timeout - Timeout for streaming operations (default: 1,800,000 ms / 30 minutes)
- :session_id - Override the session ID for this call
- :pool_name - Target a specific Snakepit pool (multi-pool setups)
- :affinity - Override session affinity (:hint, :strict_queue, :strict_fail_fast)
Timeout Profiles
- :default - 2-minute timeout for regular calls
- :ml_inference - 10-minute timeout for ML/LLM workloads
- :batch_job - Unlimited timeout for long-running jobs
- :streaming - 2-minute timeout, 30-minute stream_timeout
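The profile table above can be read as a lookup from profile name to timeout values. The following is a minimal sketch of that reading, not SnakeBridge's actual implementation: the module name `TimeoutProfileSketch` is hypothetical, `:infinity` is assumed to stand for "unlimited", and the rule that an explicit `:timeout` overrides the profile value is also an assumption.

```elixir
defmodule TimeoutProfileSketch do
  # Hypothetical module that mirrors the profile table in this document.
  # It is NOT the SnakeBridge implementation.
  @profiles %{
    default: [timeout: 120_000],
    ml_inference: [timeout: 600_000],
    # Assumption: "unlimited" is represented as :infinity.
    batch_job: [timeout: :infinity],
    streaming: [timeout: 120_000, stream_timeout: 1_800_000]
  }

  # Resolve a __runtime__ keyword list into concrete timeout values.
  # Assumption: explicitly passed keys take precedence over the profile.
  def resolve(runtime_opts) do
    profile = Keyword.get(runtime_opts, :timeout_profile, :default)
    base = Map.fetch!(@profiles, profile)
    explicit = Keyword.delete(runtime_opts, :timeout_profile)
    Keyword.merge(base, explicit)
  end
end
```

For example, `TimeoutProfileSketch.resolve(timeout_profile: :ml_inference)` yields a keyword list whose `:timeout` is `600_000`, matching the 10-minute profile.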
Example with timeout override
# For a long-running ML inference call
Vllm.ModelInspection.predict(data, __runtime__: [timeout_profile: :ml_inference])
# Or explicit timeout
Vllm.ModelInspection.predict(data, __runtime__: [timeout: 600_000])
# Route to a pool and enforce strict affinity
Vllm.ModelInspection.predict(data, __runtime__: [pool_name: :strict_pool, affinity: :strict_queue])
See SnakeBridge.Defaults for global timeout configuration.
Summary
Functions
- _format_index_ranges: Format indices into range notation (e.g., [0,1,2,4,5,6] -> '0-2, 4-6').
- _format_module_tree: Format a module tree with indentation, grouping identical layers.
- _get_child_signature: Get a signature for a child module to detect duplicates.
- _get_module_info: Get info string for a module.
- format_model_inspection: Format a model into a transformers-style hierarchical string.
Functions
@spec _format_index_ranges( [integer()], keyword() ) :: {:ok, String.t()} | {:error, Snakepit.Error.t()}
Format indices into range notation (e.g., [0,1,2,4,5,6] -> '0-2, 4-6').
Parameters
- indices (list(integer()))
Returns
String.t()
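The range notation can be illustrated with a small pure-Elixir sketch. This is not the wrapped vLLM function: the module name `IndexRangesSketch` is hypothetical, and sorting and de-duplicating the input first, as well as rendering a two-element run as `a-b`, are assumptions made for the illustration.

```elixir
defmodule IndexRangesSketch do
  # Hypothetical pure-Elixir sketch, not the wrapped vLLM function.
  # Assumption: input is sorted and de-duplicated before grouping.
  def format(indices) do
    indices
    |> Enum.sort()
    |> Enum.dedup()
    |> Enum.chunk_while([], &chunk/2, &flush/1)
    |> Enum.map_join(", ", &render/1)
  end

  # Extend the current run while the next index is consecutive;
  # otherwise emit the finished run and start a new one.
  defp chunk(i, [prev | _] = run) when i == prev + 1, do: {:cont, [i | run]}
  defp chunk(i, []), do: {:cont, [i]}
  defp chunk(i, run), do: {:cont, run, [i]}

  # Emit the last run, if there is one.
  defp flush([]), do: {:cont, []}
  defp flush(run), do: {:cont, run, []}

  # Runs are accumulated in reverse, so the head is the last index.
  defp render([single]), do: Integer.to_string(single)
  defp render([last | rest]), do: "#{List.last(rest)}-#{last}"
end
```

With the input from the description, `IndexRangesSketch.format([0, 1, 2, 4, 5, 6])` returns `"0-2, 4-6"`.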
@spec _format_module_tree(term()) :: {:ok, [String.t()]} | {:error, Snakepit.Error.t()}
Format a module tree with indentation, grouping identical layers.
Produces output like:
  (layers): ModuleList(
    (0-27, 29-47): 47 x LlamaDecoderLayer(
      ...
    )
    (28, 48): 2 x DifferentDecoderLayer(
      ...
    )
  )
Parameters
- module (term())
- name (String.t() default: '')
- indent (integer() default: 0)
Returns
list(String.t())
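The "grouping identical layers" step can be sketched as bucketing child positions by a signature string. This is only an illustration under stated assumptions: the module name `LayerGroupingSketch` is hypothetical, signatures are taken to be plain strings (as `_get_child_signature` returns), and groups are assumed to appear in order of first occurrence.

```elixir
defmodule LayerGroupingSketch do
  # Hypothetical sketch of grouping children by signature.
  # Input: a list of signature strings, one per child, in child order.
  # Output: {indices, count, signature} tuples in first-seen order.
  def group(signatures) do
    signatures
    |> Enum.with_index()
    |> Enum.reduce({[], %{}}, fn {sig, idx}, {order, groups} ->
      # Remember the order in which distinct signatures first appear.
      order = if Map.has_key?(groups, sig), do: order, else: [sig | order]
      {order, Map.update(groups, sig, [idx], &[idx | &1])}
    end)
    |> then(fn {order, groups} ->
      for sig <- Enum.reverse(order) do
        # Indices were prepended, so reverse them back to ascending order.
        indices = groups |> Map.fetch!(sig) |> Enum.reverse()
        {indices, length(indices), sig}
      end
    end)
  end
end
```

Each resulting tuple carries what one line of the tree needs: the indices (to be rendered in range notation), the repeat count, and the shared layer description.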
@spec _get_child_signature( term(), keyword() ) :: {:ok, String.t()} | {:error, Snakepit.Error.t()}
Get a signature for a child module to detect duplicates.
Parameters
- child (term())
Returns
String.t()
@spec _get_module_info( term(), keyword() ) :: {:ok, String.t()} | {:error, Snakepit.Error.t()}
Get info string for a module.
Parameters
- module (term())
Returns
String.t()
@spec format_model_inspection( term(), keyword() ) :: {:ok, String.t()} | {:error, Snakepit.Error.t()}
Format a model into a transformers-style hierarchical string.
Parameters
- model (term())
Returns
String.t()
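Per the @spec, the call returns a tagged tuple rather than a bare string, so callers will usually pattern match on the result. The sketch below shows one way to do that; the module name `InspectionResultSketch` is hypothetical, and because the fields of `Snakepit.Error` are not described here, the error is only passed through `inspect/1`.

```elixir
defmodule InspectionResultSketch do
  # Hypothetical helper: turns the documented result tuple into a string.
  # It takes the tuple itself, so it runs without vLLM or Snakepit.
  def describe({:ok, text}) when is_binary(text), do: text

  # Assumption: nothing is known about the error's fields, so it is
  # rendered with inspect/1 instead of reading specific keys.
  def describe({:error, error}), do: "model inspection failed: " <> inspect(error)
end
```

In application code this might be used as `model |> Vllm.ModelInspection.format_model_inspection() |> InspectionResultSketch.describe()`, where `model` is assumed to be a reference to an already loaded model.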