Tinkex.QueueStateLogger (Tinkex v0.3.4)
Shared logging utilities for queue state changes.
Provides human-readable messages matching Python SDK behavior,
with debouncing to avoid log spam. Used by SamplingClient and
TrainingClient to automatically log when queue state transitions
indicate rate limiting or capacity issues. Server-supplied reasons
take precedence when available.
Debouncing
Logs are rate-limited to once per 60 seconds (by default) per client
to prevent spam during sustained rate limiting. The maybe_log/5
function handles this automatically.
Message Format
Messages follow the Python SDK format:
[warning] Sampling is paused for sampler abc-123. Reason: concurrent sampler weights limit hit
[warning] Training is paused for model-xyz. Reason: Tinker backend is running short on capacity, please wait
Client-Specific Reasons
- SamplingClient: "concurrent sampler weights limit hit" for rate limits
- TrainingClient: "concurrent training clients rate limit hit" for rate limits
- Both use "Tinker backend is running short on capacity, please wait" for capacity limits
Summary
Functions
log_state_change/4
Log a queue state change with appropriate human-readable reason.
maybe_log/5
Combined debouncing and logging in a single call.
reason_for_state/2
Get human-readable reason for queue state.
resolve_reason/3
Resolve reason string, preferring a non-empty server-supplied value.
should_log?/2
Check if enough time has passed since last log.
Types
Functions
@spec log_state_change(queue_state(), client_type(), String.t(), String.t() | nil) :: :ok
Log a queue state change with appropriate human-readable reason.
Does not log for :active state. For non-active states, logs a warning
with a human-readable message including the identifier and reason. When
provided, server_reason takes precedence over client defaults.
Parameters
- queue_state - One of :active, :paused_rate_limit, :paused_capacity, :unknown
- client_type - Either :sampling or :training
- identifier - Session ID for sampling, model ID for training
- server_reason - Optional server-supplied reason string
Examples
iex> Tinkex.QueueStateLogger.log_state_change(:paused_rate_limit, :sampling, "session-123")
:ok
# Logs: [warning] Sampling is paused for session-123. Reason: concurrent sampler weights limit hit
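The logging rule described above (skip :active, otherwise emit one warning built from the identifier and reason) can be sketched as follows. This is a minimal illustration, not the library's source: the module name MessageFormatSketch and its functions are hypothetical, and the message shape is taken from the documented example output.

```elixir
defmodule MessageFormatSketch do
  # Hypothetical module for illustration only; not part of Tinkex.
  require Logger

  # Builds text in the shape "<Client> is paused for <id>. Reason: <reason>".
  def format_message(client_type, identifier, reason) do
    "#{label(client_type)} is paused for #{identifier}. Reason: #{reason}"
  end

  # :active is never logged.
  def log(:active, _client_type, _identifier, _reason), do: :ok

  # Any other state emits a single warning.
  def log(_queue_state, client_type, identifier, reason) do
    Logger.warning(format_message(client_type, identifier, reason))
    :ok
  end

  defp label(:sampling), do: "Sampling"
  defp label(:training), do: "Training"
end
```

Keeping the formatting in a pure function separate from the Logger call makes the message text easy to check without capturing log output.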
@spec maybe_log(queue_state(), client_type(), String.t(), integer() | nil, String.t() | nil) :: integer() | nil
Combined debouncing and logging in a single call.
Checks if enough time has passed since last_logged_at, and if so,
logs the queue state change and returns the new timestamp. Otherwise,
returns the original timestamp unchanged.
Does not log for :active state regardless of timestamp.
Parameters
- queue_state - The current queue state
- client_type - Either :sampling or :training
- identifier - Session ID or model ID
- last_logged_at - Timestamp of last log, or nil
- server_reason - Optional server-supplied reason to log
Returns
The timestamp to store for next comparison:
- If logged: new current timestamp
- If not logged: same last_logged_at value
Examples
iex> old_time = System.monotonic_time(:millisecond) - 61_000
iex> new_time = Tinkex.QueueStateLogger.maybe_log(:paused_rate_limit, :sampling, "sess-1", old_time)
iex> new_time > old_time
true
# Also logs the warning
iex> recent = System.monotonic_time(:millisecond) - 30_000
iex> same = Tinkex.QueueStateLogger.maybe_log(:paused_rate_limit, :sampling, "sess-1", recent)
iex> same == recent
true
# No log output
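Because maybe_log/5 returns the timestamp to store, a caller typically threads that value through its own state. The sketch below shows one way this might look. The module QueueObserverSketch, the map-shaped state, and the injectable log_fun argument are all assumptions made for illustration; only the maybe_log/5 call itself comes from the documentation above.

```elixir
defmodule QueueObserverSketch do
  # Hypothetical helper; `state` is assumed to be a map with
  # :identifier and :last_logged_at keys.
  #
  # `log_fun` defaults to the documented maybe_log/5 and is injectable
  # so the sketch can be exercised without the Tinkex dependency.
  def on_queue_state(
        state,
        queue_state,
        server_reason \\ nil,
        log_fun \\ &Tinkex.QueueStateLogger.maybe_log/5
      ) do
    new_ts =
      log_fun.(queue_state, :sampling, state.identifier, state.last_logged_at, server_reason)

    # Store whatever came back: a fresh timestamp if a log was emitted,
    # otherwise the previous value unchanged.
    %{state | last_logged_at: new_ts}
  end
end
```

Storing the return value unconditionally keeps the caller free of any debounce logic of its own.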
@spec reason_for_state(queue_state(), client_type()) :: String.t()
Get human-readable reason for queue state.
Returns different messages for sampling vs training rate limits to match Python SDK behavior.
Examples
iex> Tinkex.QueueStateLogger.reason_for_state(:paused_rate_limit, :sampling)
"concurrent sampler weights limit hit"
iex> Tinkex.QueueStateLogger.reason_for_state(:paused_rate_limit, :training)
"concurrent training clients rate limit hit"
iex> Tinkex.QueueStateLogger.reason_for_state(:paused_capacity, :sampling)
"Tinker backend is running short on capacity, please wait"
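A mapping like the one in the examples above is naturally expressed with pattern-matched function clauses. The following is a sketch under stated assumptions, not the library's implementation: the module name ReasonSketch is hypothetical, and the fallback text for states other than the three documented cases is a placeholder, since the documentation does not state it.

```elixir
defmodule ReasonSketch do
  # Hypothetical module for illustration only; not part of Tinkex.
  @capacity "Tinker backend is running short on capacity, please wait"

  # Rate-limit reasons differ by client type.
  def reason(:paused_rate_limit, :sampling), do: "concurrent sampler weights limit hit"
  def reason(:paused_rate_limit, :training), do: "concurrent training clients rate limit hit"

  # Capacity limits share one message across both client types.
  def reason(:paused_capacity, _client_type), do: @capacity

  # Assumption: the reason text for other states is not documented;
  # "unknown" is a placeholder.
  def reason(_other_state, _client_type), do: "unknown"
end
```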
@spec resolve_reason(queue_state(), client_type(), String.t() | nil) :: String.t()
Resolve reason string, preferring a non-empty server-supplied value.
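The "prefer a non-empty server-supplied value" rule might be sketched as below. The module ResolveReasonSketch and its pick/2 function are hypothetical, and treating only nil and the empty string as "empty" is an assumption; the real function may handle whitespace-only strings differently.

```elixir
defmodule ResolveReasonSketch do
  # Hypothetical module for illustration only; not part of Tinkex.

  # A non-empty binary from the server wins.
  def pick(server_reason, _default) when is_binary(server_reason) and server_reason != "" do
    server_reason
  end

  # nil, "", or any non-binary value falls back to the client default.
  def pick(_server_reason, default), do: default
end
```

Here the caller would pass the client-specific default reason as the second argument.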
should_log?(last_logged, interval \\ 60_000)
Check if enough time has passed since last log.
Returns true if logging should occur, false if still within debounce interval.
Parameters
- last_logged - Timestamp (monotonic milliseconds) of last log, or nil if never logged
- interval - Minimum milliseconds between logs (default: 60,000)
Examples
iex> Tinkex.QueueStateLogger.should_log?(nil)
true
iex> old_time = System.monotonic_time(:millisecond) - 61_000
iex> Tinkex.QueueStateLogger.should_log?(old_time)
true
iex> recent_time = System.monotonic_time(:millisecond) - 30_000
iex> Tinkex.QueueStateLogger.should_log?(recent_time)
false
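The debounce check illustrated by these examples amounts to comparing the elapsed monotonic time against the interval. A minimal sketch, assuming a hypothetical module DebounceSketch and an inclusive boundary (whether the real check uses >= or > at exactly the interval is not stated above); the extra now argument is added only to make the logic deterministic to test:

```elixir
defmodule DebounceSketch do
  # Hypothetical module for illustration only; not part of Tinkex.
  @default_interval_ms 60_000

  # `now` is an extra parameter (an assumption of this sketch) so the
  # comparison can be tested with fixed timestamps.
  def due?(
        last_logged,
        interval \\ @default_interval_ms,
        now \\ System.monotonic_time(:millisecond)
      )

  # Never logged before: always due.
  def due?(nil, _interval, _now), do: true

  # Otherwise due once at least `interval` milliseconds have elapsed.
  def due?(last_logged, interval, now), do: now - last_logged >= interval
end
```

Monotonic time is used rather than wall-clock time so that system clock adjustments cannot shorten or lengthen the debounce window.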