Vllm (VLLM v0.1.1)

SnakeBridge bindings for vllm.

Runtime Options

All functions accept a __runtime__ option for controlling execution behavior:

Vllm.some_function(args, __runtime__: [timeout: 120_000])

Supported runtime options

  • :timeout - Call timeout in milliseconds (default: 120,000 ms / 2 minutes)
  • :timeout_profile - Use a named timeout profile (:default, :ml_inference, :batch_job, :streaming)
  • :stream_timeout - Timeout for streaming operations in milliseconds (default: 1,800,000 ms / 30 minutes)
  • :session_id - Override the session ID for this call

Timeout Profiles

  • :default - 2-minute timeout for regular calls
  • :ml_inference - 10-minute timeout for ML/LLM workloads
  • :batch_job - No timeout limit, for long-running jobs
  • :streaming - 2-minute timeout with a 30-minute :stream_timeout
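
The profile values listed above can be pictured as a lookup table that explicit options override. The following is only a minimal sketch, not SnakeBridge's implementation: the TimeoutSketch module and its resolve/1 function are hypothetical names introduced for illustration. It also assumes that an explicit :timeout takes precedence over a profile, that "no timeout limit" is represented by :infinity (the usual Elixir convention), and that profiles other than :streaming fall back to the documented 30-minute default for :stream_timeout.

```elixir
defmodule TimeoutSketch do
  @moduledoc false
  # Hypothetical helper for illustration only; not part of SnakeBridge or Vllm.

  # Documented default for :stream_timeout (30 minutes).
  @default_stream_timeout 1_800_000

  # Profile values taken from the list above. Representing the unlimited
  # :batch_job timeout as :infinity is an assumption.
  @profiles %{
    default: %{timeout: 120_000, stream_timeout: @default_stream_timeout},
    ml_inference: %{timeout: 600_000, stream_timeout: @default_stream_timeout},
    batch_job: %{timeout: :infinity, stream_timeout: @default_stream_timeout},
    streaming: %{timeout: 120_000, stream_timeout: @default_stream_timeout}
  }

  # Resolves a __runtime__-style keyword list to concrete timeouts.
  # Assumption: explicit :timeout / :stream_timeout values win over the profile.
  def resolve(runtime_opts \\ []) do
    profile = Keyword.get(runtime_opts, :timeout_profile, :default)
    # Raises KeyError for an unknown profile name.
    base = Map.fetch!(@profiles, profile)

    %{
      timeout: Keyword.get(runtime_opts, :timeout, base.timeout),
      stream_timeout: Keyword.get(runtime_opts, :stream_timeout, base.stream_timeout)
    }
  end
end
```

Under these assumptions, TimeoutSketch.resolve(timeout_profile: :ml_inference) yields a 600,000 ms call timeout, and adding timeout: 5_000 to the same keyword list would replace it while leaving the stream timeout unchanged.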

Example with timeout override

# Use the :ml_inference profile for a long-running ML inference call
Vllm.predict(data, __runtime__: [timeout_profile: :ml_inference])

# Or set an explicit timeout of 10 minutes (in milliseconds)
Vllm.predict(data, __runtime__: [timeout: 600_000])

See SnakeBridge.Defaults for global timeout configuration.