Vllm.Config.PassConfig (VLLM v0.3.0)


Configuration for custom Inductor passes.

This is kept separate from the general CompilationConfig so that Inductor passes do not all have access to the full configuration. Giving them that access would create a cycle, because the PassManager is set as a property of the config.

You must pass PassConfig to the VLLMConfig constructor via the CompilationConfig constructor. VLLMConfig's post_init performs further initialization. If PassConfig is used outside of a VLLMConfig, some fields may be left in an improper state.

Summary

Types

t()

@opaque t()

Functions

_skip_none_validation(ref, value, handler, opts \\ [])

@spec _skip_none_validation(SnakeBridge.Ref.t(), term(), term(), keyword()) ::
  {:ok, term()} | {:error, Snakepit.Error.t()}

Skips validation when the value is None, which can be the case when initialization is delayed.

Parameters

  • value (term())
  • handler (term())

Returns

  • term()
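
The behaviour described above can be sketched in plain Elixir. This is a minimal illustration under stated assumptions, not the wrapped Python implementation: the module `SkipNoneSketch` and the boolean-only handler are hypothetical, and Python's None is assumed to arrive on the Elixir side as `nil`.

```elixir
defmodule SkipNoneSketch do
  # Hypothetical stand-in; not part of Vllm.Config.PassConfig.

  # A nil value (assumed to represent Python None) is returned untouched,
  # so a field that has not been initialised yet does not fail validation.
  def skip_none_validation(nil, _handler), do: nil

  # Any other value is handed to the normal validation handler.
  def skip_none_validation(value, handler) when is_function(handler, 1) do
    handler.(value)
  end
end

# Example handler (an assumption for illustration): accept only booleans.
handler = fn
  v when is_boolean(v) -> v
  v -> raise ArgumentError, "expected a boolean, got: #{inspect(v)}"
end

# nil bypasses the handler; a real value goes through it.
nil = SkipNoneSketch.skip_none_validation(nil, handler)
true = SkipNoneSketch.skip_none_validation(true, handler)
```

Matching on `nil` in a separate function clause keeps the bypass explicit and leaves the handler free of nil checks.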

compute_hash(ref, opts \\ [])

@spec compute_hash(
  SnakeBridge.Ref.t(),
  keyword()
) :: {:ok, String.t()} | {:error, Snakepit.Error.t()}

Produces a hash unique to the pass configuration.

Any new fields that affect compilation should be added to the hash. Any future fields that don't affect compilation should be excluded.

Returns

  • String.t()
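
The inclusion rule above can be illustrated with a small, self-contained sketch. It is not how vLLM computes the hash: the module `PassConfigHashSketch`, the excluded field `:log_label`, and the choice of SHA-256 over a sorted field list are all assumptions made for illustration. The field names `fuse_norm_quant` and `enable_sp` are taken from this page.

```elixir
defmodule PassConfigHashSketch do
  # Fields assumed NOT to affect compilation (hypothetical name).
  @excluded [:log_label]

  # Hash only the fields that affect compilation. Sorting the key/value
  # pairs makes the result independent of map ordering.
  def compute_hash(fields) when is_map(fields) do
    fields
    |> Map.drop(@excluded)
    |> Enum.sort()
    |> :erlang.term_to_binary()
    |> then(&:crypto.hash(:sha256, &1))
    |> Base.encode16(case: :lower)
  end
end

a = PassConfigHashSketch.compute_hash(%{fuse_norm_quant: true, enable_sp: false, log_label: "run-a"})
b = PassConfigHashSketch.compute_hash(%{fuse_norm_quant: true, enable_sp: false, log_label: "run-b"})
c = PassConfigHashSketch.compute_hash(%{fuse_norm_quant: true, enable_sp: true, log_label: "run-a"})

# Changing an excluded field leaves the hash unchanged;
# changing a compilation-relevant field produces a different hash.
true = a == b
true = a != c
```

In this sketch a newly added field is hashed unless it is listed in `@excluded`, so forgetting to classify a field changes the hash rather than silently reusing a stale one.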

default_fi_allreduce_fusion_max_size_mb(ref, opts \\ [])

@spec default_fi_allreduce_fusion_max_size_mb(
  SnakeBridge.Ref.t(),
  keyword()
) :: {:ok, %{optional(integer()) => float()}} | {:error, Snakepit.Error.t()}

Python method PassConfig.default_fi_allreduce_fusion_max_size_mb.

Returns

  • %{optional(integer()) => float()}

eliminate_noops(ref)

@spec eliminate_noops(SnakeBridge.Ref.t()) ::
  {:ok, term()} | {:error, Snakepit.Error.t()}

enable_qk_norm_rope_fusion(ref)

@spec enable_qk_norm_rope_fusion(SnakeBridge.Ref.t()) ::
  {:ok, term()} | {:error, Snakepit.Error.t()}

enable_sp(ref)

@spec enable_sp(SnakeBridge.Ref.t()) :: {:ok, term()} | {:error, Snakepit.Error.t()}

fi_allreduce_fusion_max_size_mb(ref)

@spec fi_allreduce_fusion_max_size_mb(SnakeBridge.Ref.t()) ::
  {:ok, term()} | {:error, Snakepit.Error.t()}

flashinfer_max_size(ref, world_size, opts \\ [])

@spec flashinfer_max_size(SnakeBridge.Ref.t(), integer(), keyword()) ::
  {:ok, term()} | {:error, Snakepit.Error.t()}

Returns the max communication size in bytes for flashinfer allreduce fusion for the given world size. Returns None if world size is not supported by configs as it's not supported by flashinfer.

Parameters

  • world_size (integer())

Returns

  • term()
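
The lookup described above can be sketched as follows. This is a minimal illustration, not the wrapped implementation: the module `FlashinferMaxSizeSketch` is hypothetical, the size table is made-up data shaped like the `%{optional(integer()) => float()}` map returned by default_fi_allreduce_fusion_max_size_mb, 1 MB is assumed to mean 1024 * 1024 bytes, and Python's None is assumed to map to `nil`.

```elixir
defmodule FlashinferMaxSizeSketch do
  # Assumption: 1 MB is treated as 1024 * 1024 bytes.
  @bytes_per_mb 1024 * 1024

  # Look up the per-world-size limit (in MB) and convert it to bytes.
  # Returns nil when the world size has no entry in the table.
  def max_size_bytes(sizes_mb, world_size) when is_integer(world_size) do
    case Map.fetch(sizes_mb, world_size) do
      {:ok, mb} -> trunc(mb * @bytes_per_mb)
      :error -> nil
    end
  end
end

# Made-up table for illustration only; real defaults come from the library.
sizes_mb = %{2 => 64.0, 4 => 32.0}

67_108_864 = FlashinferMaxSizeSketch.max_size_bytes(sizes_mb, 2)
nil = FlashinferMaxSizeSketch.max_size_bytes(sizes_mb, 3)
```

Callers of flashinfer_max_size should therefore be prepared for `{:ok, nil}` as well as `{:ok, bytes}` if None is bridged as `nil`.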

fuse_act_quant(ref)

@spec fuse_act_quant(SnakeBridge.Ref.t()) ::
  {:ok, term()} | {:error, Snakepit.Error.t()}

fuse_allreduce_rms(ref)

@spec fuse_allreduce_rms(SnakeBridge.Ref.t()) ::
  {:ok, term()} | {:error, Snakepit.Error.t()}

fuse_attn_quant(ref)

@spec fuse_attn_quant(SnakeBridge.Ref.t()) ::
  {:ok, term()} | {:error, Snakepit.Error.t()}

fuse_gemm_comms(ref)

@spec fuse_gemm_comms(SnakeBridge.Ref.t()) ::
  {:ok, term()} | {:error, Snakepit.Error.t()}

fuse_norm_quant(ref)

@spec fuse_norm_quant(SnakeBridge.Ref.t()) ::
  {:ok, term()} | {:error, Snakepit.Error.t()}

new(dataclass_self__, args, kwargs, opts \\ [])

@spec new(term(), term(), term(), keyword()) ::
  {:ok, SnakeBridge.Ref.t()} | {:error, Snakepit.Error.t()}

Constructs PassConfig.

Parameters

  • dataclass_self__ (term())
  • args (term())
  • kwargs (term())