View Source Bumblebee.Text.GenerationConfig (Bumblebee v0.5.3)
A set of configuration options controlling text generation.
This struct is expected by Bumblebee.Text.Generation.build_generate/3
.
Configuration
Options controlling length
:max_new_tokens
- the maximum number of tokens to be generated, ignoring the number of tokens in the prompt. Defaults to20
:min_new_tokens
- the minimum number of tokens to be generated, ignoring the number of tokens in the prompt:max_length
- the maximum length of the sequence to be generated. Note that this length includes the length of the input prompt (including padding). In general, prefer:max_new_tokens
, which ignores the number of tokens in the prompt:min_length
- the minimum length of the sequence to be generated. Note that this length includes the length of the input prompt (including padding). In general, prefer:min_new_tokens
, which ignores the number of tokens in the prompt
Options controlling strategy
:strategy
- the method deciding how tokens are selected, it has a significant impact on the quality of the generated sequence. Should be a map with:type
and strategy-specific options.:greedy_search
- the most straightforward approach, where in every iteration the most probable token (as given by the model) is taken.Example:
%{type: :greedy_search}
.:contrastive_search
- state-of-the-art decoding method, capable of producing high quality, coherent sequences. The results are deterministic. See this article for more details.:top_k
(required) - the number of highest probability vocabulary tokens considered as a continuation:alpha
(required) - the weight of degeneration penalty. It balances the model confidence and the penalty
Example:
%{type: :contrastive_search, top_k: 4, alpha: 0.6}
.:multinomial_sampling
- this method samples tokens according to the probability distribution given by the model. The results are nondeterministic, unless a seed is specified.:top_k
(optional) - when specified, restricts sampling to top-k most probable candidates:top_p
(optional) - when specified, restricts sampling to tokens which probabilities add up to top-p
. Defaults to
%{type: :greedy_search}
Options controlling generated tokens
:decoder_start_token_id
- the id of the initial token when generating from scratch, in case of encoder-decoder models:forced_bos_token_id
- the id of the token to force as the first generated token:forced_eos_token_id
- the id of the token to force as the last generated token when:max_length
is reached:forced_token_ids
- a list of{index, token_id}
pairs forcingtoken_id
to appear atindex
in the generated sequence. Defaults to[]
:suppressed_token_ids
- a list of token ids to suppress during generation. Defaults to[]
:no_repeat_ngram_length
- when set, n-grams of the given length can occur only once in the generated sequence:temperature
- enables exponential scaling of the output probability distribution. The temperature value effectively determines the randomness of the predicted tokens. Values smaller than 1.0 decrease the randomness, while bigger values increase it. Note that this is only relevant for generation:strategy
that does sampling based on the output probability distribution
Special tokens used during generation
:bos_token_id
- the id of the beginning-of-sequence token:eos_token_id
- the id of the end-of-sequence token:pad_token_id
- the id of the padding token
Summary
Types
@type t() :: %Bumblebee.Text.GenerationConfig{ bos_token_id: term(), decoder_start_token_id: term(), eos_token_id: term(), extra_config: term(), forced_bos_token_id: term(), forced_eos_token_id: term(), forced_token_ids: term(), max_length: term(), max_new_tokens: term(), min_length: term(), min_new_tokens: term(), no_repeat_ngram_length: term(), pad_token_id: term(), strategy: term(), suppressed_token_ids: term(), temperature: term() }