Gemini.RateLimiter.State (GeminiEx v0.9.0)

ETS-based state management for rate limiting.

Tracks per-model/location/metric state including:

  • retry_until timestamps derived from 429 RetryInfo
  • Token usage sliding windows for budget estimation
  • Concurrency permits for gating

State is keyed by {model, location, metric} tuples for fine-grained tracking.

Summary

Functions

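  • build_key(model, location, metric) - Build a state key from model, location, and metric.
  • clear_retry_state(key) - Clear the retry state for a key (called after successful request).
  • get_current_usage(key) - Get current usage within the sliding window.
  • get_retry_state(key) - Get the current retry state details for a key.
  • get_retry_until(key) - Get the current retry_until timestamp for a given key.
  • init() - Initialize the ETS table for state storage.
  • reconcile_reservation(key, reservation_ctx, usage_map, opts \\ []) - Reconcile a reservation with actual usage, returning surplus or charging shortfall.
  • record_usage(key, input_tokens, output_tokens, opts \\ []) - Record token usage in the sliding window.
  • release_reservation(key, reservation_ctx, opts \\ []) - Remove a reservation without adding usage (e.g., when the request never executed).
  • reset_all() - Reset all state (useful for testing).
  • set_retry_state(key, retry_info) - Update the retry_until state from a 429 response with RetryInfo.
  • try_reserve_budget(key, estimated_total_tokens, budget, opts \\ []) - Atomically reserve tokens in the current window.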

Types

reservation_ctx()

@type reservation_ctx() :: %{
  reserved_tokens: non_neg_integer(),
  estimated_tokens: non_neg_integer(),
  window_start: DateTime.t() | nil,
  window_end: DateTime.t() | nil,
  budget: non_neg_integer() | nil
}

retry_state()

@type retry_state() :: %{
  retry_until: DateTime.t() | nil,
  quota_metric: String.t() | nil,
  quota_id: String.t() | nil,
  quota_dimensions: map() | nil,
  quota_value: term() | nil,
  last_429_at: DateTime.t() | nil
}

state_key()

@type state_key() :: {model :: String.t(), location :: String.t(), metric :: atom()}

usage_window()

@type usage_window() :: %{
  input_tokens: non_neg_integer(),
  output_tokens: non_neg_integer(),
  reserved_tokens: non_neg_integer(),
  window_start: DateTime.t(),
  window_duration_ms: pos_integer()
}

Functions

build_key(model, location, metric)

@spec build_key(String.t(), String.t() | nil, atom()) :: state_key()

Build a state key from model, location, and metric.

clear_retry_state(key)

@spec clear_retry_state(state_key()) :: :ok

Clear the retry state for a key (called after successful request).

get_current_usage(key)

@spec get_current_usage(state_key()) :: usage_window() | nil

Get current usage within the sliding window.

get_retry_state(key)

@spec get_retry_state(state_key()) :: retry_state() | nil

Get the current retry state details for a key.

get_retry_until(key)

@spec get_retry_until(state_key()) :: DateTime.t() | nil

Get the current retry_until timestamp for a given key.

Returns nil if no retry is needed or the timestamp has passed.
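
As an illustration of how a caller might act on this return value, the following minimal sketch converts a retry_until timestamp (or nil) into a wait time in milliseconds. The module RetryGateSketch and the function wait_ms/2 are hypothetical names introduced for this example and are not part of GeminiEx.

```elixir
defmodule RetryGateSketch do
  # Hypothetical helper, not part of GeminiEx.
  # Returns how many milliseconds to wait before the next attempt,
  # or 0 when there is nothing to wait for.

  # nil means no retry is pending.
  def wait_ms(nil, _now), do: 0

  def wait_ms(%DateTime{} = retry_until, %DateTime{} = now) do
    # A timestamp in the past yields a negative diff, so clamp to 0.
    max(DateTime.diff(retry_until, now, :millisecond), 0)
  end
end
```

With the library loaded, a call could look like RetryGateSketch.wait_ms(Gemini.RateLimiter.State.get_retry_until(key), DateTime.utc_now()). Passing the clock in as an argument keeps the helper easy to test.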

init()

@spec init() :: :ok

Initialize the ETS table for state storage.

Called automatically when the RateLimitManager starts. The table is also created lazily on first access, so these functions can be called directly without the supervisor running.

reconcile_reservation(key, reservation_ctx, usage_map, opts \\ [])

@spec reconcile_reservation(state_key(), reservation_ctx(), map() | nil, keyword()) ::
  usage_window()

Reconcile a reservation with actual usage: surplus reserved tokens are returned to the window, and any shortfall is charged against it. Returns the resulting usage window.
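
The arithmetic behind reconciliation can be sketched on a plain map shaped like usage_window(). This is an assumption about the bookkeeping, not the library's implementation: ReconcileSketch, reconcile/4, and committed/1 are hypothetical names, and the real function accepts a reservation_ctx() and a usage map whose exact shape is not documented here.

```elixir
defmodule ReconcileSketch do
  # Hypothetical illustration, not part of GeminiEx.
  # Swaps a reservation for the tokens that were actually used.
  def reconcile(window, reserved, actual_input, actual_output) do
    %{
      window
      | # Drop the reservation; never go below zero.
        reserved_tokens: max(window.reserved_tokens - reserved, 0),
        # Charge what the request really consumed.
        input_tokens: window.input_tokens + actual_input,
        output_tokens: window.output_tokens + actual_output
    }
  end

  # Total tokens counted against the budget in this window.
  def committed(window) do
    window.input_tokens + window.output_tokens + window.reserved_tokens
  end
end
```

Under these assumptions, a reservation of 1000 tokens followed by 700 actual tokens lowers the committed total by 300 (surplus returned), while 1200 actual tokens raises it by 200 (shortfall charged).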

record_usage(key, input_tokens, output_tokens, opts \\ [])

@spec record_usage(state_key(), non_neg_integer(), non_neg_integer(), keyword()) ::
  :ok

Record token usage in the sliding window.

Parameters

  • key - State key tuple
  • input_tokens - Number of input tokens used
  • output_tokens - Number of output tokens used
  • opts - Options including:
    • :window_duration_ms - Custom window duration (default: 60_000)
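
To make the windowing idea concrete, here is a minimal pure-function sketch that accumulates tokens in a map shaped like usage_window() and starts a fresh window once the current one has expired. The module UsageWindowSketch is hypothetical, and the reset-on-expiry behavior is an assumption; the library's ETS-backed implementation may slide or expire windows differently.

```elixir
defmodule UsageWindowSketch do
  # Hypothetical illustration, not part of GeminiEx.
  @default_window_ms 60_000

  # Adds usage to the current window, or to a fresh one if the
  # current window is missing or has expired at `now`.
  def record(window, input_tokens, output_tokens, now, opts \\ []) do
    duration = Keyword.get(opts, :window_duration_ms, @default_window_ms)
    window = current_or_new(window, now, duration)

    %{
      window
      | input_tokens: window.input_tokens + input_tokens,
        output_tokens: window.output_tokens + output_tokens
    }
  end

  defp current_or_new(nil, now, duration), do: fresh(now, duration)

  defp current_or_new(window, now, duration) do
    elapsed = DateTime.diff(now, window.window_start, :millisecond)
    if elapsed >= window.window_duration_ms, do: fresh(now, duration), else: window
  end

  defp fresh(now, duration) do
    %{
      input_tokens: 0,
      output_tokens: 0,
      reserved_tokens: 0,
      window_start: now,
      window_duration_ms: duration
    }
  end
end
```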

release_reservation(key, reservation_ctx, opts \\ [])

@spec release_reservation(state_key(), reservation_ctx(), keyword()) :: usage_window()

Remove a reservation without adding usage (e.g., when the request never executed).

reset_all()

@spec reset_all() :: :ok

Reset all state (useful for testing).

set_retry_state(key, retry_info)

@spec set_retry_state(state_key(), map()) :: :ok

Update the retry_until state from a 429 response with RetryInfo.

Parameters

  • key - State key tuple
  • retry_info - Map containing retry delay and quota information

RetryInfo format from Gemini API

%{
  "retryDelay" => "60s",
  "quotaMetric" => "...",
  "quotaId" => "...",
  "quotaDimensions" => %{...}
}
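
As a rough illustration of how a "retryDelay" value such as "60s" can become a retry_until timestamp, the sketch below parses the seconds-with-"s"-suffix form and adds it to a supplied clock value. RetryDelaySketch and its functions are hypothetical names; how the library itself parses this field, and which other delay formats it accepts, is not stated in this documentation.

```elixir
defmodule RetryDelaySketch do
  # Hypothetical illustration, not part of GeminiEx.

  # Parses strings like "60s" or "1.5s" into whole milliseconds.
  def parse_delay_ms(delay) when is_binary(delay) do
    case Float.parse(delay) do
      {seconds, "s"} when seconds >= 0 -> {:ok, round(seconds * 1000)}
      _ -> :error
    end
  end

  def parse_delay_ms(_other), do: :error

  # Computes the timestamp before which no retry should be attempted.
  def retry_until(%{"retryDelay" => delay}, %DateTime{} = now) do
    with {:ok, ms} <- parse_delay_ms(delay) do
      {:ok, DateTime.add(now, ms, :millisecond)}
    end
  end

  def retry_until(_retry_info, _now), do: :error
end
```

Returning :error for anything unrecognized leaves the caller free to fall back to a default backoff rather than guessing at a delay.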

try_reserve_budget(key, estimated_total_tokens, budget, opts \\ [])

@spec try_reserve_budget(
  state_key(),
  non_neg_integer(),
  non_neg_integer() | nil,
  keyword()
) ::
  {:ok, reservation_ctx()} | {:error, {:over_budget, map()}}

Atomically reserve tokens in the current window.

Returns {:ok, reservation_ctx} when the reservation fits, or {:error, {:over_budget, details}} when it would exceed the configured budget.
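
The reservation functions are meant to be used together: reserve before the request, then either reconcile with actual usage or release the reservation. The wrapper below is a minimal sketch of that flow under stated assumptions. BudgetedCallSketch, run/5, and request_fun are hypothetical; request_fun is assumed to return {:ok, usage_map} on success or {:error, reason} when the request never executed, and the shape of usage_map is whatever reconcile_reservation expects, which this page does not specify.

```elixir
defmodule BudgetedCallSketch do
  # Hypothetical wrapper, not part of GeminiEx. The `state` argument
  # defaults to the documented module and can be swapped for a stub in tests.
  def run(key, estimated_tokens, budget, request_fun, state \\ Gemini.RateLimiter.State) do
    case state.try_reserve_budget(key, estimated_tokens, budget) do
      {:ok, ctx} ->
        case request_fun.() do
          {:ok, usage_map} ->
            # Swap the estimate for what was actually consumed.
            state.reconcile_reservation(key, ctx, usage_map)
            {:ok, usage_map}

          {:error, reason} ->
            # Assumed to mean the request never ran, so nothing is charged.
            state.release_reservation(key, ctx)
            {:error, reason}
        end

      {:error, {:over_budget, _details}} = error ->
        # The reservation did not fit; the caller decides whether to wait.
        error
    end
  end
end
```

This sketch does not handle exceptions raised by request_fun; a production caller would likely release the reservation in that case as well.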