Bumblebee.Text.Generation behaviour (Bumblebee v0.4.2)

An interface for language models supporting sequence generation.

Summary

Callbacks

Returns a configuration module for extra model-specific generation attributes to extend the base Bumblebee.Text.GenerationConfig.

Initializes an opaque cache input for iterative inference.

Traverses all batched tensors in the cache.

Functions

Builds a numerical definition that generates sequences of tokens using the given language model.

Returns a configuration module for extra model-specific generation attributes to extend the base Bumblebee.Text.GenerationConfig.

Initializes an opaque cache input for iterative inference.

Calls fun for every batched tensor in the cache.

Types

cache()

An opaque cache structure used during iterative inference (see init_cache/4).

Callbacks

extra_config_module(spec)

(optional)
@callback extra_config_module(spec :: Bumblebee.ModelSpec.t()) :: module()

Returns a configuration module for extra model-specific generation attributes to extend the base Bumblebee.Text.GenerationConfig.

init_cache(spec, batch_size, max_length, inputs)

@callback init_cache(
  spec :: Bumblebee.ModelSpec.t(),
  batch_size :: pos_integer(),
  max_length :: pos_integer(),
  inputs :: map()
) :: cache()

Initializes an opaque cache input for iterative inference.

traverse_cache(spec, cache, function)

@callback traverse_cache(
  spec :: Bumblebee.ModelSpec.t(),
  cache(),
  (Nx.Tensor.t() -> Nx.Tensor.t())
) :: cache()

Traverses all batched tensors in the cache.

This function is used when the cache needs to be inflated or deflated for a different batch size.
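As a hedged sketch (assuming the batch dimension is the first axis of every cached tensor, which the callback itself does not guarantee), deflating a cache to a smaller batch size could look like this. The module and function names here are hypothetical:

```elixir
defmodule CacheResize do
  # Hypothetical helper: shrink a cache built for a larger (padded) batch
  # down to `batch_size` by slicing every batched tensor along axis 0.
  def deflate(spec, cache, batch_size) do
    Bumblebee.Text.Generation.traverse_cache(spec, cache, fn tensor ->
      # Assumption: the batch dimension is axis 0 of each cached tensor.
      Nx.slice_along_axis(tensor, 0, batch_size, axis: 0)
    end)
  end
end
```

Inflating for a larger batch would use the same traversal with a padding function instead of a slice.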

Functions

build_generate(model, spec, config, opts \\ [])

@spec build_generate(
  Axon.t(),
  Bumblebee.ModelSpec.t(),
  Bumblebee.Text.GenerationConfig.t(),
  keyword()
) :: (params :: map(), inputs :: map() -> Nx.t())

Builds a numerical definition that generates sequences of tokens using the given language model.

The model should be either a decoder or an encoder-decoder. The tokens are generated by iterative inference using the decoder (autoregression), until the termination criteria are met.

In case of encoder-decoder models, the corresponding encoder is run only once and the intermediate state is reused during all iterations.

The generation is controlled by a number of options given as %Bumblebee.Text.GenerationConfig{}; see the corresponding docs for more details.

Options

  • :seed - random seed to use when sampling. By default the current timestamp is used

  • :logits_processors - a list of numerical functions to modify predicted scores at each generation step. The functions are applied in order, after all default processors
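A minimal usage sketch, assuming a GPT-2 checkpoint from the Hugging Face Hub (the repository name, prompt, and seed are illustrative assumptions, not part of this module's API):

```elixir
# Load the model, tokenizer, and generation config (hypothetical repo name).
{:ok, model_info} = Bumblebee.load_model({:hf, "gpt2"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "gpt2"})
{:ok, generation_config} = Bumblebee.load_generation_config({:hf, "gpt2"})

# Build the numerical generation definition.
generate_fun =
  Bumblebee.Text.Generation.build_generate(
    model_info.model,
    model_info.spec,
    generation_config,
    seed: 0
  )

# The returned function takes params and tokenized inputs and
# returns a tensor of generated token ids.
inputs = Bumblebee.apply_tokenizer(tokenizer, "Hello")
token_ids = generate_fun.(model_info.params, inputs)
```

For most applications the higher-level serving in Bumblebee.Text (text generation) wraps this function; building it directly is useful when you need custom batching or logits processors.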

extra_config_module(spec)

@spec extra_config_module(Bumblebee.ModelSpec.t()) :: module() | nil

Returns a configuration module for extra model-specific generation attributes to extend the base Bumblebee.Text.GenerationConfig.

init_cache(spec, batch_size, max_length, inputs)

@spec init_cache(Bumblebee.ModelSpec.t(), pos_integer(), pos_integer(), map()) ::
  cache()

Initializes an opaque cache input for iterative inference.
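As a hedged sketch of how the cache feeds into iterative decoding (the "cache" input key and the sizes below are assumptions for illustration):

```elixir
# Hypothetical sketch: create a cache for batch size 1 and a maximum
# sequence length of 128, given already-tokenized model inputs.
cache =
  Bumblebee.Text.Generation.init_cache(model_info.spec, 1, 128, inputs)

# Assumption: the model accepts the cache under the "cache" input key,
# so each decoding step can reuse state from previous steps.
inputs = Map.put(inputs, "cache", cache)
```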

traverse_cache(spec, cache, fun)

@spec traverse_cache(
  Bumblebee.ModelSpec.t(),
  cache(),
  (Nx.Tensor.t() -> Nx.Tensor.t())
) :: cache()

Calls fun for every batched tensor in the cache.