Bumblebee (Bumblebee v0.4.0)

Pre-trained Axon models for easy inference and boosted training.

Bumblebee provides state-of-the-art, configurable Axon models. On top of that, it streamlines the process of loading pre-trained models by integrating with Hugging Face Hub and 🤗 Transformers.

Usage

You can load one of the supported models by specifying the model repository:

{:ok, model_info} = Bumblebee.load_model({:hf, "bert-base-uncased"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "bert-base-uncased"})

Then you are ready to make predictions:

inputs = Bumblebee.apply_tokenizer(tokenizer, "Hello Bumblebee!")
outputs = Axon.predict(model_info.model, model_info.params, inputs)

Tasks

On top of bare models, Bumblebee provides a number of "servings" that act as end-to-end pipelines for specific tasks.

serving = Bumblebee.Text.fill_mask(model_info, tokenizer)
Nx.Serving.run(serving, "The capital of [MASK] is Paris.")
#=> %{
#=>   predictions: [
#=>     %{score: 0.9279842972755432, token: "france"},
#=>     %{score: 0.008412551134824753, token: "brittany"},
#=>     %{score: 0.007433671969920397, token: "algeria"},
#=>     %{score: 0.004957548808306456, token: "department"},
#=>     %{score: 0.004369721747934818, token: "reunion"}
#=>   ]
#=> }

As you can see, the serving takes care of pre-processing the text input, runs the model, and post-processes its output into more structured data. In the example above we run the serving on the fly; for production usage, you can start the serving as a process and it will automatically batch requests from multiple clients. Processing inputs in batches is usually much more efficient, since it can take advantage of the parallel capabilities of the target device, which is particularly relevant for GPUs. For more details, read the Nx.Serving docs.
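
For instance, a serving can be started under your application's supervision tree and invoked with Nx.Serving.batched_run/2. A minimal sketch, where the process name and batching options are illustrative:

children = [
  {Nx.Serving,
   serving: Bumblebee.Text.fill_mask(model_info, tokenizer),
   name: MyApp.FillMask,
   batch_size: 8,
   batch_timeout: 100}
]

# Requests from concurrent processes are batched together automatically
Nx.Serving.batched_run(MyApp.FillMask, "The capital of [MASK] is Paris.")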

For more examples, see the Examples notebook.

Note

The models are generally large, so make sure to configure an efficient Nx backend, such as EXLA or Torchx.
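
For example, assuming EXLA is added as a dependency, you can set it as the default backend in your config:

# config/config.exs
config :nx, default_backend: EXLA.Backend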

Summary

Types

  • model_info() - A model together with its state and metadata.

  • repository() - A location to fetch model files from.

Models

  • build_model(spec) - Builds an Axon model according to the given specification.

  • load_model(repository, opts \\ []) - Loads a pre-trained model from a model repository.

  • load_spec(repository, opts \\ []) - Loads model specification from a model repository.

Featurizers

  • apply_featurizer(featurizer, input, opts \\ []) - Featurizes input with the given featurizer.

  • load_featurizer(repository, opts \\ []) - Loads featurizer from a model repository.

Tokenizers

  • apply_tokenizer(tokenizer, input, opts \\ []) - Tokenizes and encodes input with the given tokenizer.

  • load_tokenizer(repository, opts \\ []) - Loads tokenizer from a model repository.

Schedulers

  • load_scheduler(repository, opts \\ []) - Loads scheduler from a model repository.

  • scheduler_init(scheduler, num_steps, sample_shape) - Initializes state for a new scheduler loop.

  • scheduler_step(scheduler, state, sample, prediction) - Predicts sample at the previous timestep using the given scheduler.

Functions

  • cache_dir() - Returns the directory where downloaded files are stored.

  • configure(config, options \\ []) - Builds or updates a configuration object with the given options.

  • load_generation_config(repository, opts \\ []) - Loads generation config from a model repository.

Types

@type model_info() :: %{model: Axon.t(), params: map(), spec: Bumblebee.ModelSpec.t()}

A model together with its state and metadata.

@type repository() ::
  {:hf, String.t()} | {:hf, String.t(), keyword()} | {:local, Path.t()}

A location to fetch model files from.

Can be either:

  • {:hf, repository_id} - the repository on Hugging Face. Options may be passed as the third element:

    • :revision - the specific model version to use; it can be any valid git identifier, such as a branch name, tag name, or commit hash

    • :cache_dir - the directory to store the downloaded files in. Defaults to the standard cache location for the given operating system. You can also configure it globally by setting the BUMBLEBEE_CACHE_DIR environment variable

    • :offline - if true, only cached files are accessed and missing files result in an error. You can also configure it globally by setting the BUMBLEBEE_OFFLINE environment variable to true

    • :auth_token - the token to use as HTTP bearer authorization for remote files

    • :subdir - the directory within the repository where the files are located

  • {:local, directory} - the directory containing model files
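
For example, where the local path is illustrative:

# Public repository on Hugging Face Hub
{:hf, "microsoft/resnet-50"}

# The same repository, pinned to a specific revision
{:hf, "microsoft/resnet-50", revision: "main"}

# Files in a local directory
{:local, "/tmp/resnet"}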

Models

build_model(spec)

@spec build_model(Bumblebee.ModelSpec.t()) :: Axon.t()

Builds an Axon model according to the given specification.

Example

spec = Bumblebee.configure(Bumblebee.Vision.ResNet, architecture: :base, embedding_size: 128)
model = Bumblebee.build_model(spec)
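
The result is a regular Axon model, so you can build it as usual. A minimal sketch, where the input name and shape are assumptions for this ResNet configuration:

{init_fn, predict_fn} = Axon.build(model)
params = init_fn.(%{"pixel_values" => Nx.template({1, 224, 224, 3}, :f32)}, %{})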
load_model(repository, opts \\ [])
@spec load_model(
  repository(),
  keyword()
) :: {:ok, model_info()} | {:error, String.t()}

Loads a pre-trained model from a model repository.

Options

  • :spec - the model specification to use when building the model. By default the specification is loaded using load_spec/2

  • :module - the model specification module. By default it is inferred from the configuration file; if that is not possible, it must be specified explicitly

  • :architecture - the model architecture, must be supported by :module. By default it is inferred from the configuration file

  • :params_filename - the file with the model parameters to be loaded

  • :log_params_diff - whether to log missing, mismatched, and unused parameters. By default, the diff is logged only if some parameters cannot be loaded

  • :backend - the backend to allocate the tensors on. It is either an atom or a tuple in the shape {backend, options}

Examples

By default the model type is inferred from configuration, so loading is as simple as:

{:ok, resnet} = Bumblebee.load_model({:hf, "microsoft/resnet-50"})
%{model: model, params: params, spec: spec} = resnet

You can explicitly specify a different architecture, in which case matching parameters are still loaded:

{:ok, resnet} = Bumblebee.load_model({:hf, "microsoft/resnet-50"}, architecture: :base)

To further customize the model, you can also pass the specification:

{:ok, spec} = Bumblebee.load_spec({:hf, "microsoft/resnet-50"})
spec = Bumblebee.configure(spec, num_labels: 10)
{:ok, resnet} = Bumblebee.load_model({:hf, "microsoft/resnet-50"}, spec: spec)
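
You can also specify the backend to allocate the parameters on, assuming EXLA is available:

{:ok, resnet} = Bumblebee.load_model({:hf, "microsoft/resnet-50"}, backend: EXLA.Backend)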
load_spec(repository, opts \\ [])
@spec load_spec(
  repository(),
  keyword()
) :: {:ok, Bumblebee.ModelSpec.t()} | {:error, String.t()}

Loads model specification from a model repository.

Options

  • :module - the model specification module. By default it is inferred from the configuration file; if that is not possible, it must be specified explicitly

  • :architecture - the model architecture, must be supported by :module. By default it is inferred from the configuration file

Examples

{:ok, spec} = Bumblebee.load_spec({:hf, "microsoft/resnet-50"})

You can explicitly specify a different architecture:

{:ok, spec} = Bumblebee.load_spec({:hf, "microsoft/resnet-50"}, architecture: :base)

Featurizers

apply_featurizer(featurizer, input, opts \\ [])
@spec apply_featurizer(Bumblebee.Featurizer.t(), any(), keyword()) :: any()

Featurizes input with the given featurizer.

Options

  • :defn_options - the options for JIT compilation. Note that this is only relevant for featurizers implemented with Nx. Defaults to []

Examples

featurizer = Bumblebee.configure(Bumblebee.Vision.ConvNextFeaturizer)
{:ok, img} = StbImage.read_file(path)
inputs = Bumblebee.apply_featurizer(featurizer, [img])
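
If the featurizer is implemented with Nx, you can also pass compilation options, for example with EXLA:

inputs = Bumblebee.apply_featurizer(featurizer, [img], defn_options: [compiler: EXLA])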
load_featurizer(repository, opts \\ [])
@spec load_featurizer(
  repository(),
  keyword()
) :: {:ok, Bumblebee.Featurizer.t()} | {:error, String.t()}

Loads featurizer from a model repository.

Options

  • :module - the featurizer module. By default it is inferred from the preprocessor configuration file; if that is not possible, it must be specified explicitly

Examples

{:ok, featurizer} = Bumblebee.load_featurizer({:hf, "microsoft/resnet-50"})

Tokenizers

apply_tokenizer(tokenizer, input, opts \\ [])

Tokenizes and encodes input with the given tokenizer.

Options

  • :add_special_tokens - whether to add special tokens. Defaults to true

  • :pad_direction - the padding direction, either :right or :left. Defaults to :right

  • :return_attention_mask - whether to return attention mask for encoded sequence. Defaults to true

  • :return_token_type_ids - whether to return token type ids for encoded sequence. Defaults to true

  • :return_special_tokens_mask - whether to return special tokens mask for encoded sequence. Defaults to false

  • :return_offsets - whether to return token offsets for encoded sequence. Defaults to false

  • :length - applies fixed-length padding or truncation to the given input if set. Can be either a specific number or a list of numbers. When a list is given, the smallest number that exceeds all input lengths is used as the padding length

Examples

{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "bert-base-uncased"})
inputs = Bumblebee.apply_tokenizer(tokenizer, ["The capital of France is [MASK]."])
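
You can also pad or truncate to one of several bucket lengths, where the lengths below are illustrative:

inputs = Bumblebee.apply_tokenizer(tokenizer, ["The capital of France is [MASK]."], length: [16, 32, 64])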
load_tokenizer(repository, opts \\ [])
@spec load_tokenizer(
  repository(),
  keyword()
) :: {:ok, Bumblebee.Tokenizer.t()} | {:error, String.t()}

Loads tokenizer from a model repository.

Options

  • :module - the tokenizer module. By default it is inferred from the configuration files; if that is not possible, it must be specified explicitly

Examples

{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "bert-base-uncased"})

Schedulers

load_scheduler(repository, opts \\ [])
@spec load_scheduler(
  repository(),
  keyword()
) :: {:ok, Bumblebee.Scheduler.t()} | {:error, String.t()}

Loads scheduler from a model repository.

Options

  • :module - the scheduler module. By default it is inferred from the scheduler configuration file; if that is not possible, it must be specified explicitly

Examples

{:ok, scheduler} =
  Bumblebee.load_scheduler({:hf, "CompVis/stable-diffusion-v1-4", subdir: "scheduler"})
scheduler_init(scheduler, num_steps, sample_shape)

Initializes state for a new scheduler loop.

Returns a pair of {state, timesteps}, where state is an opaque container expected by scheduler_step/4 and timesteps is a sequence of subsequent timesteps for the model forward pass.

Note that the number of timesteps may not match num_steps exactly. num_steps parameterizes the number of sampling points; however, depending on the method, sampling certain points may require multiple forward passes of the model, and each element in timesteps corresponds to a single forward pass.

scheduler_step(scheduler, state, sample, prediction)

Predicts sample at the previous timestep using the given scheduler.

Takes the current sample and the prediction (usually noise) returned by the model at the current timestep. Returns {state, prev_sample}, where state is the updated scheduler loop state and prev_sample is the predicted sample at the previous timestep.

Note that some schedulers require several forward passes of the model (and a couple of calls to this function) to make an actual prediction for the previous sample.
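
Together, scheduler_init/3 and scheduler_step/4 form a sampling loop. A minimal sketch, where model_predict/2 stands in for a hypothetical model forward pass and the sample shape is illustrative:

shape = {1, 64, 64, 4}
key = Nx.Random.key(0)
{sample, _key} = Nx.Random.normal(key, shape: shape)

{state, timesteps} = Bumblebee.scheduler_init(scheduler, 20, shape)

{_state, sample} =
  for timestep <- Nx.to_flat_list(timesteps), reduce: {state, sample} do
    {state, sample} ->
      # The model predicts noise for the current sample and timestep
      prediction = model_predict(sample, timestep)
      Bumblebee.scheduler_step(scheduler, state, sample, prediction)
  end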

Functions

cache_dir()

@spec cache_dir() :: String.t()

Returns the directory where downloaded files are stored.

configure(config, options \\ [])

Builds or updates a configuration object with the given options.

Expects a configuration struct or a module supporting configuration; model specifications, featurizers, schedulers, and generation configs are all usually configurable.

Examples

To build a new configuration, pass a module:

featurizer = Bumblebee.configure(Bumblebee.Vision.ConvNextFeaturizer)
spec = Bumblebee.configure(Bumblebee.Vision.ResNet, architecture: :for_image_classification)

Similarly, you can update an existing configuration:

featurizer = Bumblebee.configure(featurizer, resize_method: :bilinear)
spec = Bumblebee.configure(spec, embedding_size: 128)
load_generation_config(repository, opts \\ [])

Loads generation config from a model repository.

Generation config includes a number of model-specific properties, so it is usually best to load the config and further configure it, rather than build one from scratch.

See Bumblebee.Text.GenerationConfig for all the available options.

Options

  • :spec_module - the model specification module. By default it is inferred from the configuration file; if that is not possible, it must be specified explicitly. Some models have extra options related to generation, and those are loaded into a separate struct, stored under the :extra_config attribute

Examples

{:ok, generation_config} = Bumblebee.load_generation_config({:hf, "gpt2"})

generation_config = Bumblebee.configure(generation_config, max_new_tokens: 10)