Per-model concurrency gating using semaphore-like permits.
Throttles request bursts by limiting concurrent requests per model. Supports adaptive mode that adjusts concurrency based on 429 responses.
Features
- Configurable per-model concurrency limits
- Adaptive mode: starts low, raises until 429, then backs off
- Non-blocking mode support for immediate returns
- ETS-based permit tracking for cross-process visibility
Summary
Functions
Acquire a permit for the given model.
Get the number of available permits for a model.
Get current permit state for a model.
Initialize the ETS table for permit tracking.
Release a permit for the given model.
Reset all state (useful for testing).
Signal that a 429 was received for adaptive backoff.
Signal that a request succeeded for adaptive raise.
Types
@type model_key() :: String.t()
@type permit_state() :: %{ current: non_neg_integer(), max: pos_integer(), adaptive_max: pos_integer() | nil, waiting: [pid()], holders: %{required(pid()) => {non_neg_integer(), pid()}} }
Functions
@spec acquire(model_key(), Gemini.RateLimiter.Config.t()) :: :ok | {:error, atom()}
Acquire a permit for the given model.
Returns immediately if a permit is available. If no permit is available:
- In blocking mode: waits until a permit becomes available
- In non-blocking mode: returns
{:error, :no_permit_available}
Parameters
model- Model nameconfig- Rate limiter configuration
Returns
:ok- Permit acquired{:error, :no_permit_available}- No permit available (non-blocking mode){:error, :concurrency_disabled}- Concurrency gating is disabled
@spec available_permits(model_key(), Gemini.RateLimiter.Config.t()) :: non_neg_integer()
Get the number of available permits for a model.
@spec get_state(model_key()) :: permit_state() | nil
Get current permit state for a model.
@spec init() :: :ok
Initialize the ETS table for permit tracking.
Called automatically when the RateLimitManager starts, but also lazily initialized on first access to support direct calls without the supervisor running.
@spec release(model_key()) :: :ok
Release a permit for the given model.
Called after a request completes (success or failure).
@spec reset_all() :: :ok
Reset all state (useful for testing).
@spec signal_429(model_key(), Gemini.RateLimiter.Config.t()) :: :ok
Signal that a 429 was received for adaptive backoff.
In adaptive mode, reduces the effective max concurrency.
@spec signal_success(model_key(), Gemini.RateLimiter.Config.t()) :: :ok
Signal that a request succeeded for adaptive raise.
In adaptive mode, gradually increases concurrency up to the ceiling.