This guide is the RUN-01 procedural companion to `guides/telemetry.md`. That file owns the ops event catalog for `[:accrue, :ops, :*]`; this guide does not duplicate that table. Use this document for ordered triage, Oban queue defaults, expanded Stripe verification, and the four mini-playbooks where sequence matters.
Library vs host: Accrue ships workers and suggested queue names; your host application configures and starts Oban (queues, concurrency, pruning). Queue names below are defaults Accrue documents in code — you may remap them in host config; treat symptoms and checks as patterns, not hard-coded production names.
## Oban queue topology
Queue names are host-configurable; the table lists Accrue’s documented defaults from `use Oban.Worker` declarations in `accrue/lib` today.
| Queue (default name) | Worker module | Role / when to look | Typical symptoms | Safe first checks |
|---|---|---|---|---|
| `:accrue_webhooks` | `Accrue.Webhook.DispatchWorker` | Async webhook handler dispatch after ingest | Webhooks stuck `:processing`, DLQ growth, dead-letter ops | Inspect `accrue_webhook_events`, Oban retries for this queue, handler logs (no raw bodies) |
| `:accrue_mailers` | `Accrue.Workers.Mailer` | Transactional email delivery | Mail backlog, PDF/email failures surfacing as ops | Oban job args shape, mailer adapter, ChromicPDF availability |
| `:accrue_meters` | `Accrue.Jobs.MeterEventsReconciler` | Meter usage reconciliation | `meter_reporting_failed` ops, stale meter rows | Reconciler jobs, Stripe meter API health, `accrue_meter_events` |
| `:accrue_dunning` | `Accrue.Jobs.DunningSweeper` | Subscription dunning sweeps | Unexpected dunning transitions | Scheduled runs, subscription state vs Stripe |
| `:accrue_reconcilers` | `Accrue.Jobs.ReconcileChargeFees` | Fee reconciliation for charges | Fee drift vs Stripe balance | Reconciler errors, Stripe charge/balance transaction lookups |
| `:accrue_reconcilers` | `Accrue.Jobs.ReconcileRefundFees` | Fee reconciliation for refunds | Refund fee mismatches | Same as above for refund path |
| `:accrue_scheduled` | `Accrue.Jobs.DetectExpiringCards` | Card expiry notices / hygiene | Missing expiry emails, card warnings | Job schedule, customer PM metadata (PII-safe) |
| `:accrue_maintenance` | `Accrue.Webhook.Pruner` | Webhook event retention pruning | Prune telemetry anomalies | Retention config, maintenance window, dry-run if offered |
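
Because the host owns Oban configuration, it can help to see the default queue names in one place. The fragment below is a minimal sketch, assuming a host OTP app named `:my_app` with a repo `MyApp.Repo` (both placeholders); the concurrency limits and the pruning window are illustrative values, not Accrue recommendations.

```elixir
# config/config.exs in the host application.
# :my_app and MyApp.Repo are placeholder names for this sketch.
import Config

config :my_app, Oban,
  repo: MyApp.Repo,
  # Queue names are Accrue's documented defaults; the numbers are
  # per-queue concurrency limits chosen only for illustration.
  queues: [
    accrue_webhooks: 10,
    accrue_mailers: 5,
    accrue_meters: 2,
    accrue_dunning: 1,
    accrue_reconcilers: 2,
    accrue_scheduled: 1,
    accrue_maintenance: 1
  ],
  plugins: [
    # Prunes completed, cancelled, and discarded jobs older than 7 days
    # (max_age is expressed in seconds).
    {Oban.Plugins.Pruner, max_age: 60 * 60 * 24 * 7}
  ]
```

If you remap queue names in host config, the symptoms and checks in the table still apply to whichever queue runs the listed worker.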
## Stripe verification pattern
Use a two-layer mental model whenever Stripe is involved:
- Accrue layer (operational): local rows (`accrue_*` tables), telemetry and `operation_id`, foreign keys and Stripe ids stored by Accrue (`cus_*`, `sub_*`, `pi_*`, Connect account ids, etc.). This is application state for billing workflows: useful for triage, not a substitute for Stripe’s financial records. For customer billing portal failures, correlate `[:accrue, :billing, :billing_portal, :create]` `:stop` / `:exception` latency with `accrue.customer.id` and `operation_id` per `telemetry.md`; do not paste `%Accrue.BillingPortal.Session{}` inspect output into tickets. For Stripe Checkout sessions created via `Accrue.Billing.create_checkout_session/2`, use the same pattern on `[:accrue, :billing, :checkout_session, :create]`, confirm whether the host runs `Accrue.Processor.Fake` vs live Stripe, and read the PII-safe metadata contract at `telemetry.md#billing-checkout-session-create`; do not paste session URLs or `client_secret` values into tickets.
- Stripe layer (verification): confirm each issue against the Stripe resource type + id using canonical documentation (e.g. Webhooks, Testing webhooks, Billing meter events) and functional Dashboard paths (e.g. Developers → Webhooks → event deliveries) rather than brittle deep links.
For finance and tax reporting, use Stripe Dashboard / reporting products as your source of truth; Accrue focuses on state, webhooks, and replay in your app.
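
When moving from the Accrue layer to the Stripe layer, the id prefix usually tells you which Stripe resource type to look up. The sketch below illustrates that step with a hypothetical helper module (`StripeIdSketch` is not part of Accrue); the prefix list is deliberately partial and covers only well-known Stripe prefixes.

```elixir
defmodule StripeIdSketch do
  @moduledoc "Hypothetical helper, not part of Accrue: maps a Stripe id prefix to a resource type."

  # Partial list of Stripe id prefixes; extend as needed for your triage.
  @prefixes [
    {"cus_", :customer},
    {"sub_", :subscription},
    {"pi_", :payment_intent},
    {"acct_", :connect_account},
    {"evt_", :event}
  ]

  @doc "Returns {:ok, type} for a recognised prefix, or :unknown otherwise."
  def resource_type(id) when is_binary(id) do
    Enum.find_value(@prefixes, :unknown, fn {prefix, type} ->
      if String.starts_with?(id, prefix), do: {:ok, type}
    end)
  end
end
```

An `:unknown` result is a prompt to check the stored row rather than to guess; the id itself remains safe to quote in a ticket, unlike session URLs or secrets.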
## Mini-playbook: `[:accrue, :ops, :webhook_dlq, :dead_lettered]`
- Confirm scope: identify `event_id` / `processor_event_id` from telemetry or admin (do not paste full webhook payloads or secrets into tickets).
- Inspect the `accrue_webhook_events` row and last error; decide fix vs replay before mutating data.
- Check Oban for `Accrue.Webhook.DispatchWorker` on `:accrue_webhooks` (see Oban queue topology); ensure the host queue is running and not wedged.
- If replay is required, prefer admin-gated or documented replay flows; use dry-run when available, and avoid destructive deletes from this path.
- Cross-check the same event type in Stripe via Developers → Webhooks → recent deliveries (Webhook docs).
- After fix, enqueue or allow retry; watch `[:accrue, :ops, :webhook_dlq, :replay]` and related metrics for confirmation.
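
The "running and not wedged" check in the steps above can be approximated by counting jobs per state. The following is a rough sketch with a hypothetical module (`QueueHealthSketch`); it works on plain maps carrying the `queue` and `state` fields found on `Oban.Job` rows, and the wedged heuristic is an assumption for illustration, not an Oban or Accrue rule.

```elixir
defmodule QueueHealthSketch do
  @moduledoc "Hypothetical triage helper; operates on plain maps shaped like Oban.Job rows."

  @doc "Counts jobs per state for one queue."
  def state_counts(jobs, queue) do
    jobs
    |> Enum.filter(&(&1.queue == queue))
    |> Enum.frequencies_by(& &1.state)
  end

  @doc """
  Flags a queue as possibly wedged when jobs are waiting but none are executing.
  This heuristic is an assumption for illustration only.
  """
  def possibly_wedged?(counts) do
    waiting = Map.get(counts, "available", 0) + Map.get(counts, "retryable", 0)
    waiting > 0 and Map.get(counts, "executing", 0) == 0
  end
end
```

In a host with Ecto, the input could come from a query over `Oban.Job` that selects only `queue` and `state`, which keeps job args (and any payload data in them) out of the triage output.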
## Mini-playbook: `[:accrue, :ops, :events_upcast_failed]`
- Record `event_id`, `type`, and `schema_version` from the ops metadata (identifiers only).
- Determine whether a deployed upcaster is missing vs bad persisted data; do not replay until the schema path is understood.
- Inspect `Accrue.Events` / event storage per your host (see catalog row in `telemetry.md`); align with code version in the running release.
- Verify Oban or inline retry behavior will not amplify a bad version skew; pause automated replay if unsure.
- Queue topology for indirect jobs: see Oban queue topology if downstream dispatch is involved.
- Validate against Stripe only if the failing payload is a Stripe-sourced event; use Event object docs for shape, not as ledger truth.
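
The distinction between a missing upcaster and bad persisted data is easier to reason about with a concrete chain in mind. The module below is a hypothetical sketch of a version-by-version upcaster walk (`UpcastSketch` and its return shapes are assumptions; Accrue's real implementation may differ). It shows why the two failure modes need different responses: one is fixed by deploying code, the other by repairing or quarantining a row.

```elixir
defmodule UpcastSketch do
  @moduledoc "Hypothetical upcaster chain; Accrue's real implementation may differ."

  @doc """
  Walks `payload` from `version` up to `target`, applying one upcaster per step.
  `upcasters` maps a source version to a function returning
  {:ok, map} | {:error, reason}.
  """
  def run(payload, version, target, _upcasters) when version >= target do
    {:ok, payload}
  end

  def run(payload, version, target, upcasters) do
    case Map.fetch(upcasters, version) do
      :error ->
        # No code path for this version: a deploy or version-skew problem.
        {:error, {:missing_upcaster, version}}

      {:ok, fun} ->
        case fun.(payload) do
          {:ok, next} ->
            run(next, version + 1, target, upcasters)

          {:error, reason} ->
            # The upcaster exists but rejected the row: a persisted-data problem.
            {:error, {:bad_data, version, reason}}
        end
    end
  end
end
```

Replaying before you know which branch you are in only repeats the same error, which is why the steps above hold replay until the schema path is understood.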
## Mini-playbook: `[:accrue, :ops, :meter_reporting_failed]`
Always read the contract (when the tuple fires and what each source means) at telemetry.md#meter-reporting-semantics before changing alert thresholds—this runbook is procedure only.
- Read `source` (`:sync`, `:webhook`, `:reconciler`) plus `meter_event_id` / `event_name` from metadata (identifiers only, no raw payloads).
- Load the matching `accrue_meter_events` row and note `stripe_status`, `stripe_error`, and timestamps so you know whether the failure epoch is already terminal.
### `:sync` (host request path)
- Correlate with the host request or job that called `Accrue.Billing.report_usage/3` in the same transaction window; inspect logs around `Accrue.Billing.MeterEventActions` for processor errors surfaced synchronously.
- Fix configuration or upstream Stripe errors, then retry the host operation with a fresh `operation_id` only when the business case requires a new attempt; idempotent replays should converge on the stored terminal row.
### `:reconciler` (Oban `:accrue_meters`)
- Inspect Oban jobs for `Accrue.Jobs.MeterEventsReconciler` on `:accrue_meters` (Oban queue topology); confirm the queue is running and not wedged behind retries.
- After correcting Stripe meter setup or credentials, allow the reconciler to dequeue; watch `[:accrue, :ops, :meter_reporting_failed]` and default metrics for confirmation.
### `:webhook` (meter error report path)
- Trace the event through `accrue_webhook_events` into `Accrue.Webhook.DefaultHandler` and the async `Accrue.Webhook.DispatchWorker` path; verify signature + dispatch health before mutating rows (Oban queue topology).
- Resolve the upstream Stripe meter error, then replay or wait for the next reconciler pass; confirm the row leaves terminal `failed` only when business logic intentionally clears it.
Shared verification (all sources):
- Confirm API keys and Stripe meter configuration for the environment (no key material in logs).
- Cross-check Stripe usage reporting with Metered billing — operational alignment, not accounting close.
- After code or config fix, allow reconciler retry where applicable; watch ops counters and host metrics.
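
The per-source branching in this playbook can be summarised as a small routing function. This is a sketch only: `MeterTriageSketch` is a hypothetical module, and the returned tuples are illustrative labels for where to look first, not values emitted by Accrue.

```elixir
defmodule MeterTriageSketch do
  @moduledoc "Hypothetical routing helper that mirrors the per-source steps in this playbook."

  @doc "Returns the first place to look for a given `source` value from the ops metadata."
  def first_check(%{source: :sync}) do
    {:host_logs, "Accrue.Billing.report_usage/3 call site"}
  end

  def first_check(%{source: :reconciler}) do
    {:oban_queue, "accrue_meters"}
  end

  def first_check(%{source: :webhook}) do
    {:table, "accrue_webhook_events"}
  end

  # Any other shape means the metadata contract should be re-read first.
  def first_check(_other) do
    {:unknown_source, "re-read the metadata contract in telemetry.md"}
  end
end
```

Pattern matching on `source` keeps the fallback explicit: an unexpected value is treated as a contract question rather than silently routed to one of the three paths.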
## Mini-playbook: `[:accrue, :ops, :revenue_loss]`
- Capture `reason`, `subject_type`, `subject_id`, and currency amounts from telemetry (aggregates / IDs only; no customer narrative in shared logs).
- Triage Accrue rows (invoice, credit note, adjustment) that triggered the signal; avoid manual balance edits without a controlled procedure.
- Check related async work on `:accrue_reconcilers` and `:accrue_webhooks` if the loss correlates with webhook or fee reconciliation (Oban queue topology).
- In Stripe, locate the same business object (charge, refund, dispute) via Dashboard search or list filters; use Balance transactions categories as reference for classification, not as instructions to reproduce Sigma in-app.
- Document outcome in your ticketing system; escalate finance questions on Stripe’s side, not via Accrue as a ledger substitute.
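
To keep shared logs limited to identifiers, a host can attach a handler that allow-lists fields before anything is written. The sketch below assumes that `reason`, `subject_type`, and `subject_id` arrive in the event metadata (the catalog in `telemetry.md` is authoritative on that); `RevenueLossHandlerSketch` is a hypothetical module, and amount fields are left out because their key names are not given in this guide.

```elixir
defmodule RevenueLossHandlerSketch do
  @moduledoc "Hypothetical :telemetry handler that keeps only identifier fields."

  # Metadata keys named in this playbook; anything else is dropped before logging.
  @safe_keys [:reason, :subject_type, :subject_id]

  @doc "Returns only the allow-listed identifier fields from event metadata."
  def safe_fields(metadata) when is_map(metadata) do
    Map.take(metadata, @safe_keys)
  end

  @doc "Standard :telemetry handler signature: event, measurements, metadata, config."
  def handle_event([:accrue, :ops, :revenue_loss], _measurements, metadata, _config) do
    # A host app would typically use Logger here; IO.puts keeps the sketch dependency-free.
    IO.puts("revenue_loss " <> inspect(safe_fields(metadata)))
  end
end

# Attaching in a host that depends on :telemetry (the handler id is arbitrary):
# :telemetry.attach(
#   "revenue-loss-ids",
#   [:accrue, :ops, :revenue_loss],
#   &RevenueLossHandlerSketch.handle_event/4,
#   nil
# )
```

An allow-list fails safe: a metadata key added in a later release stays out of logs until someone deliberately adds it to `@safe_keys`.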
## RUN-01 coverage
- Full ops tuple list and one-line first actions live under `## Operator runbooks (first actions)` in `telemetry.md`. Bookmark that table for every RUN-01 class, including `:connect_account_deauthorized`, `:connect_payout_failed`, `:dunning_exhaustion`, `:charge_failed`, `:incomplete_expired`, `:pdf_adapter_unavailable`, replay (`:webhook_dlq, :replay`), and prune (`:webhook_dlq, :prune`).
- This file adds depth only for the four mini-playbooks above (`:webhook_dlq, :dead_lettered`, `:events_upcast_failed`, `:meter_reporting_failed`, `:revenue_loss`).
## See also
- `guides/telemetry.md`: ops catalog SSOT and Operator runbooks (first actions) table
- `Accrue.Telemetry.Ops`: `emit/3` contract (`lib/accrue/telemetry/ops.ex` in the repo; published API on Hexdocs)
- Hexdocs path pattern: `https://hexdocs.pm/accrue/` (pin the version to your `mix.lock`)