# `Resiliency.Hedged.Tracker`
[🔗](https://github.com/yoavgeva/resiliency/blob/v0.6.0/lib/resiliency/hedged/tracker.ex#L1)

Adaptive delay tracker with token-bucket hedge throttling.

Maintains a rolling window of latency samples and computes a target
percentile to use as the hedge delay. A token bucket limits the overall
hedge rate: each request credits a small amount, each hedge costs more,
so hedging naturally throttles under load.

## How it works

The tracker is a GenServer that holds two pieces of mutable state: a
`Resiliency.Hedged.Percentile` circular buffer of recent latency samples,
and a floating-point token bucket.

**Adaptive delay** — After every completed request, the caller records the
observed latency via `record/2`. The sample is added to the circular buffer
(see `Resiliency.Hedged.Percentile`). When `get_config/1` is called, the
tracker computes the configured percentile (e.g., p95) of the buffered
samples and clamps the result to `[min_delay, max_delay]`. Until at least
`:min_samples` observations have been recorded, the tracker returns
`:initial_delay` instead — a sensible default while the system warms up.
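
The delay selection described above can be sketched roughly as follows. This is an illustrative model, not the library's code: `AdaptiveDelaySketch` is a hypothetical module, it works on a plain list rather than the `Resiliency.Hedged.Percentile` buffer, and it assumes a nearest-rank percentile, which may differ from the method the tracker actually uses.

```elixir
defmodule AdaptiveDelaySketch do
  # Hypothetical helper, not part of Resiliency. Option names and defaults
  # mirror the tracker options documented on this page.
  def delay(samples, opts \\ []) do
    percentile = Keyword.get(opts, :percentile, 95)
    min_delay = Keyword.get(opts, :min_delay, 1)
    max_delay = Keyword.get(opts, :max_delay, 5_000)
    initial_delay = Keyword.get(opts, :initial_delay, 100)
    min_samples = Keyword.get(opts, :min_samples, 10)

    if length(samples) < min_samples do
      # Warm-up: too few samples to trust a percentile yet.
      initial_delay
    else
      samples
      |> nearest_rank(percentile)
      |> max(min_delay)
      |> min(max_delay)
    end
  end

  # Nearest-rank percentile (an assumption): the value at 1-based rank
  # ceil(percentile * n / 100) of the sorted samples, in integer arithmetic.
  defp nearest_rank(samples, percentile) do
    sorted = Enum.sort(samples)
    rank = div(percentile * length(sorted) + 99, 100)
    Enum.at(sorted, max(rank - 1, 0))
  end
end
```

Under these assumptions, three samples yield the initial delay of `100`, while the samples `1..100` yield a p95 of `95`, or `50` when `max_delay: 50` is passed.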

**Token bucket** — Each completed request credits `:token_success_credit`
tokens (default 0.1). Each hedge that fires costs `:token_hedge_cost`
tokens (default 1.0). Hedging is only allowed when the bucket contains at
least `:token_threshold` tokens. With the defaults, a hedge costs 10x what a
completed request earns, so under sustained load hedging throttles to
roughly one hedge per ten requests (about 10% of traffic). If hedges
consistently win (indicating a real latency
problem rather than a transient spike), the bucket refills quickly and
hedging continues. If hedges rarely help, the bucket drains and hedging
pauses — protecting the downstream service from unnecessary duplicate load.
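
To make the token arithmetic concrete, here is a minimal model of the bucket. It is a sketch under stated assumptions, not the tracker's implementation: `TokenBucketSketch` is hypothetical, the bucket is assumed to start full, every completed request (hedged or not) is assumed to earn the credit, and the level is assumed to be clamped to `[0, token_max]`.

```elixir
defmodule TokenBucketSketch do
  # Hypothetical model of the accounting described above. Field defaults
  # mirror the documented option defaults; starting full is an assumption.
  defstruct tokens: 10.0, max: 10.0, credit: 0.1, hedge_cost: 1.0, threshold: 1.0

  # Hedging is allowed only while the bucket holds at least `threshold` tokens.
  def allow_hedge?(%__MODULE__{tokens: tokens, threshold: threshold}) do
    tokens >= threshold
  end

  # Credits one completed request and charges the hedge cost if a hedge fired.
  def record(%__MODULE__{} = bucket, hedged?) do
    cost = if hedged?, do: bucket.hedge_cost, else: 0.0
    tokens = bucket.tokens + bucket.credit - cost
    %{bucket | tokens: tokens |> max(0.0) |> min(bucket.max)}
  end

  # Fires a hedge on every request the bucket allows and returns the count.
  def simulate(%__MODULE__{} = bucket, requests) do
    {_bucket, hedges} =
      Enum.reduce(1..requests, {bucket, 0}, fn _i, {b, hedges} ->
        hedged? = allow_hedge?(b)
        {record(b, hedged?), if(hedged?, do: hedges + 1, else: hedges)}
      end)

    hedges
  end
end
```

Calling `TokenBucketSketch.simulate(%TokenBucketSketch{}, 1000)` models the worst case where every request wants to hedge. After the initial burst drains the bucket, the count settles near one hedge per ten requests, matching the roughly 10% figure above.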

**Statistics** — `stats/1` returns a snapshot of counters (total requests,
hedged requests, hedge wins), percentiles (p50, p95, p99), the current
adaptive delay, and the token bucket level. This is useful for dashboards
and alerting.

## Algorithm Complexity

| Function | Time | Space |
|---|---|---|
| `start_link/1` | O(1) | O(1) — empty buffer and initial token bucket |
| `get_config/1` | O(1) — percentile lookup is O(1) via tuple indexing | O(1) |
| `record/2` | O(n) where n = `buffer_size` — sorted insert/delete on the internal sorted list | O(n) — the circular buffer holds at most n samples |
| `stats/1` | O(1) — percentile lookups are O(1) | O(1) |

## Usage

    {:ok, _} = Resiliency.Hedged.Tracker.start_link(name: MyTracker)

    # Query the current adaptive delay and whether hedging is allowed
    {delay, allow?} = Resiliency.Hedged.Tracker.get_config(MyTracker)

    # Record an observation after a request completes
    Resiliency.Hedged.Tracker.record(MyTracker, %{latency_ms: 42, hedged?: false, hedge_won?: false})

    # Inspect tracker state
    Resiliency.Hedged.Tracker.stats(MyTracker)

In most cases you won't call these functions directly — `Resiliency.Hedged.run/3`
does it automatically when you pass a tracker name.

## Options

  * `:name` — required, the registered name for the tracker process
  * `:percentile` — target percentile for adaptive delay (default: `95`)
  * `:buffer_size` — max latency samples to keep (default: `1000`)
  * `:min_delay` — floor for adaptive delay in ms (default: `1`)
  * `:max_delay` — ceiling for adaptive delay in ms (default: `5_000`)
  * `:initial_delay` — delay used before enough samples are collected (default: `100`)
  * `:min_samples` — samples needed before switching from `:initial_delay` to adaptive (default: `10`)
  * `:token_max` — token bucket capacity (default: `10`)
  * `:token_success_credit` — tokens earned per completed request (default: `0.1`)
  * `:token_hedge_cost` — tokens spent when a hedge fires (default: `1.0`)
  * `:token_threshold` — minimum tokens required to allow hedging (default: `1.0`)

# `t`

```elixir
@type t() :: %Resiliency.Hedged.Tracker{
  buffer: term(),
  initial_delay: term(),
  max_delay: term(),
  min_delay: term(),
  min_samples: term(),
  percentile_target: term(),
  stats: term(),
  token_hedge_cost: term(),
  token_max: term(),
  token_success_credit: term(),
  token_threshold: term(),
  tokens: term()
}
```

Internal state of the tracker GenServer.

# `child_spec`

Returns a specification to start this module under a supervisor.

See `Supervisor`.
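
As a hedged illustration, a tracker might be placed under a supervisor as shown below. This fragment assumes the generated `child_spec/1` forwards its argument to `start_link/1` (the usual behaviour for `use GenServer` modules) and that the `resiliency` dependency is available; `MyApp.HedgeTracker` is a hypothetical name.

```elixir
# Typically placed in an application's start/2 callback.
children = [
  # The keyword list is assumed to be passed through to start_link/1.
  {Resiliency.Hedged.Tracker, name: MyApp.HedgeTracker, percentile: 99}
]

Supervisor.start_link(children, strategy: :one_for_one)
```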

# `get_config`

```elixir
@spec get_config(GenServer.server()) :: {non_neg_integer(), boolean()}
```

Returns `{delay_ms, allow_hedge?}` based on the tracker's current adaptive state.

The delay is the configured percentile of recent latency samples, clamped
to `[min_delay, max_delay]`. Before `:min_samples` observations are
recorded, `:initial_delay` is returned instead.

Hedging is allowed when the token bucket has at least `:token_threshold`
tokens remaining.

## Parameters

* `server` -- the name or PID of a running `Resiliency.Hedged.Tracker` process.

## Returns

A tuple `{delay_ms, allow_hedge?}` where `delay_ms` is a non-negative integer representing the adaptive delay in milliseconds, and `allow_hedge?` is a boolean indicating whether the token bucket permits hedging.

# `record`

```elixir
@spec record(GenServer.server(), map()) :: :ok
```

Records an observation after a request completes.

Expects a map with the following keys:

  * `:latency_ms` — end-to-end latency of the winning response in milliseconds
  * `:hedged?` — whether a hedge request was actually dispatched
  * `:hedge_won?` — whether the hedge (not the original) produced the winning response

The latency sample feeds the percentile buffer, while `:hedged?` and
`:hedge_won?` update the token bucket and counters.

## Parameters

* `server` -- the name or PID of a running `Resiliency.Hedged.Tracker` process.
* `observation` -- a map containing `:latency_ms` (number), `:hedged?` (boolean), and `:hedge_won?` (boolean).

## Returns

`:ok`. The observation is processed asynchronously via `GenServer.cast/2`.
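
When recording manually, the caller has to measure the latency itself. One possible way to build the observation map is sketched below; `ObservationSketch` is a hypothetical helper, and the flags default to `false` on the assumption that no hedge was dispatched.

```elixir
defmodule ObservationSketch do
  # Hypothetical helper, not part of Resiliency: runs `fun`, measures elapsed
  # time with the monotonic clock, and returns the result together with a map
  # in the shape that record/2 is documented to expect.
  def timed(fun, hedged? \\ false, hedge_won? \\ false) when is_function(fun, 0) do
    started = System.monotonic_time(:millisecond)
    result = fun.()
    latency_ms = System.monotonic_time(:millisecond) - started

    {result, %{latency_ms: latency_ms, hedged?: hedged?, hedge_won?: hedge_won?}}
  end
end

# Possible usage with a running tracker named MyTracker:
#
#     {result, observation} = ObservationSketch.timed(fn -> do_request() end)
#     Resiliency.Hedged.Tracker.record(MyTracker, observation)
#
# where do_request/0 stands in for the caller's own request function.
```

The monotonic clock is used so that wall-clock adjustments cannot produce negative latencies.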

# `start_link`

```elixir
@spec start_link(keyword()) :: GenServer.on_start()
```

Starts a tracker process linked to the caller.

Requires a `:name` option. See the module documentation for all options.

## Parameters

* `opts` -- keyword list of options. See the module documentation for the full list. The `:name` option is required.

## Returns

`{:ok, pid}` on success, or `{:error, reason}` if the process cannot be started.

## Raises

Raises `KeyError` if the required `:name` option is not provided.

## Examples

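    # Start a tracker with the default options
    {:ok, _pid} = Resiliency.Hedged.Tracker.start_link(name: MyTracker)

    # Override selected options; a second tracker needs its own name
    {:ok, _pid} = Resiliency.Hedged.Tracker.start_link(name: MyP99Tracker, percentile: 99, min_delay: 5)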

# `stats`

```elixir
@spec stats(GenServer.server()) :: map()
```

Returns the current stats, including counters, percentiles, delay, and tokens.

The returned map contains:

  * `:total_requests` — number of observations recorded
  * `:hedged_requests` — number of observations where a hedge fired
  * `:hedge_won` — number of times the hedge beat the original
  * `:p50`, `:p95`, `:p99` — latency percentiles from the sample buffer
  * `:current_delay` — adaptive delay that would be returned by `get_config/1`
  * `:tokens` — current token bucket level

## Parameters

* `server` -- the name or PID of a running `Resiliency.Hedged.Tracker` process.

## Returns

A map with keys `:total_requests`, `:hedged_requests`, `:hedge_won`, `:p50`, `:p95`, `:p99`, `:current_delay`, and `:tokens`.
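
For dashboards, the raw counters are often more useful as ratios. The sketch below derives two of them from the documented keys; `StatsSketch` and the names `:hedge_rate` and `:hedge_win_rate` are hypothetical and not part of the library.

```elixir
defmodule StatsSketch do
  # Hypothetical helper: derives ratios from the counters in the stats map.
  # Other keys in the map (percentiles, delay, tokens) are ignored.
  def rates(%{total_requests: total, hedged_requests: hedged, hedge_won: won}) do
    %{
      # Share of recorded requests for which a hedge fired.
      hedge_rate: ratio(hedged, total),
      # Share of fired hedges that beat the original request.
      hedge_win_rate: ratio(won, hedged)
    }
  end

  # Avoids division by zero before any requests or hedges are recorded.
  defp ratio(_numerator, 0), do: 0.0
  defp ratio(numerator, denominator), do: numerator / denominator
end
```

It could be applied as `MyTracker |> Resiliency.Hedged.Tracker.stats() |> StatsSketch.rates()`. A persistently low win rate would suggest that hedges are adding load without improving latency.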

