Estimates token counts using a chars/4 heuristic (roughly four characters per token).
The estimate is good enough for budget decisions, such as when to compact or whether a conversation is approaching its limit. It is not meant for billing accuracy.
Accounts for all content block types including text, tool use, tool results, and extended thinking/reasoning blocks. Thinking block tokens count toward the context budget even though they are not part of the final response text.
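As a rough illustration of how a chars/4 estimate could be summed across block types, here is a minimal sketch. The module name `BlockEstimateSketch`, the plain-map block shapes, and the keys `:type`, `:text`, `:thinking`, `:input`, and `:content` are all assumptions made for this example; the actual Alloy block structs and rounding behavior may differ.

```elixir
defmodule BlockEstimateSketch do
  # Hypothetical block shapes; the real Alloy block structs may differ.
  @chars_per_token 4

  # Plain string content.
  def estimate_content(content) when is_binary(content), do: chars_to_tokens(content)

  # A list of content blocks: sum the per-block estimates.
  def estimate_content(blocks) when is_list(blocks) do
    blocks |> Enum.map(&estimate_block/1) |> Enum.sum()
  end

  defp estimate_block(%{type: "text", text: text}), do: chars_to_tokens(text)

  # Thinking blocks are counted too, even though they are not response text.
  defp estimate_block(%{type: "thinking", thinking: text}), do: chars_to_tokens(text)

  # Tool input is a map here; inspect/1 gives a rough character count for it.
  defp estimate_block(%{type: "tool_use", input: input}), do: chars_to_tokens(inspect(input))

  defp estimate_block(%{type: "tool_result", content: content}) when is_binary(content),
    do: chars_to_tokens(content)

  # Unknown block types contribute nothing in this sketch.
  defp estimate_block(_other), do: 0

  # Assumption: integer division, so the estimate rounds down.
  defp chars_to_tokens(text), do: div(String.length(text), @chars_per_token)
end
```

Under these assumptions, an 8-character text block plus a 4-character thinking block estimates to 3 tokens, which shows the thinking text contributing to the budget.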
Functions
@spec estimate_message_tokens(Alloy.Message.t()) :: non_neg_integer()
Estimates tokens for a single message, handling both string and block content.
@spec estimate_tokens(String.t() | [Alloy.Message.t()]) :: non_neg_integer()
Estimates tokens for a string or a list of messages.
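Accepting either a string or a list of messages can be handled with guard clauses. The sketch below is illustrative only: `EstimateSketch` is a hypothetical module, messages are plain maps with string `:content` rather than `Alloy.Message` structs, and integer division by 4 is an assumed rounding rule.

```elixir
defmodule EstimateSketch do
  # Assumption: integer division by 4, so strings under 4 characters estimate to 0.
  def estimate_tokens(text) when is_binary(text), do: div(String.length(text), 4)

  # A list of messages: add up the estimate for each message's content.
  # Messages are plain maps with string content here, not Alloy.Message structs.
  def estimate_tokens(messages) when is_list(messages) do
    Enum.reduce(messages, 0, fn %{content: content}, acc ->
      acc + estimate_tokens(content)
    end)
  end
end

# Usage: both input shapes go through the same function name.
EstimateSketch.estimate_tokens(String.duplicate("a", 20))
EstimateSketch.estimate_tokens([%{content: "aaaaaaaa"}, %{content: "bbbb"}])
```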
@spec model_limit(String.t()) :: pos_integer()
Returns the context window limit for a given model name. Falls back to 200000 for unknown models.
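A lookup with a fallback can be as small as a map plus `Map.get/3`. In the sketch below, the module name `ModelLimitSketch`, the model names, and their limits are hypothetical placeholders; only the 200000 fallback comes from the description above. The real function may match names differently, for example by prefix.

```elixir
defmodule ModelLimitSketch do
  # Hypothetical model names and limits, for illustration only.
  @limits %{
    "example-small" => 32_000,
    "example-large" => 1_000_000
  }

  # Fallback for unknown models, as stated in the function's documentation.
  @default_limit 200_000

  def model_limit(model) when is_binary(model) do
    Map.get(@limits, model, @default_limit)
  end
end

# Usage: an unrecognized name falls back to the default.
ModelLimitSketch.model_limit("some-unlisted-model")
```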
@spec within_budget?([Alloy.Message.t()], pos_integer(), float()) :: boolean()
Returns true if the estimated token count of messages is within the given ratio of the max_tokens budget (default 0.9).
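One plausible reading of this check is that the estimate must not exceed `max_tokens * ratio`. The sketch below assumes that reading and, to stay self-contained, takes an already-estimated token count instead of a list of messages; `BudgetSketch` is a hypothetical module and not part of Alloy.

```elixir
defmodule BudgetSketch do
  # Assumption: "within" means estimated tokens <= max_tokens * ratio.
  def within_budget?(estimated_tokens, max_tokens, ratio \\ 0.9)
      when is_integer(estimated_tokens) and is_integer(max_tokens) and is_float(ratio) do
    estimated_tokens <= max_tokens * ratio
  end
end

# Usage: decide whether to compact before sending the next request.
if BudgetSketch.within_budget?(150_000, 200_000) do
  IO.puts("still within budget")
else
  IO.puts("time to compact")
end
```

Keeping the ratio below 1.0 leaves headroom for the estimate's error, which matters because a chars/4 count is only approximate.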