All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

0.11.0 - 2026-05-07

Documentation

  • AGENTS.md — usage guide for AI coding assistants integrating PushX into projects: mental model (function-call API, no supervision-tree setup), decision tree (push / push_batch / push_data / instances), idiomatic patterns (token cleanup via :on_invalid_token, multi-tenant via PushX.Instance, web push topic IDs), and a curated list of mistakes commonly made (forgetting APNS topic:, push_data on APNS, mode mismatch, fcm_credentials as raw string, multiline apns_private_key via env). Shipped in the hex package and rendered on hexdocs.
  • CONTRIBUTING.md — repo orientation for contributors: layout, test commands, conventions for error semantics and telemetry. CLAUDE.md is a symlink to AGENTS.md for tool compatibility.
  • README banner pointing AI assistants at AGENTS.md.

Fixed

  • APNS/FCM crash on transient Finch pool errors — Finch's outer case in lib/finch.ex:516 only matches {:ok, …} or the 3-tuple {:error, err, _acc} shape. When NimblePool returns a 2-tuple error, Finch itself raises CaseClauseError. The one observed in production is {:error, :connection_process_went_down} (HTTP/2 connection process death under concurrent-request-limit pressure), but the same pattern can produce other atom reasons. The exception escaped past PushX.Retry, killed the sending Task, and (in batch sends with caller-side Enum.each) silently skipped every recipient after the failing one. Now rescued in both PushX.APNS and PushX.FCM: any %CaseClauseError{term: {:error, reason}} where reason is an atom is converted to a retryable Response.error(_, :connection_error, _), so PushX.Retry handles reconnection normally. The previous narrow rescue only matched the literal :connection_process_went_down term and reraised any other 2-tuple shape.
  • APNS payload corruption when custom data uses an atom :aps key — Message.to_apns_payload/1, APNS.notification_with_data/4, APNS.silent_notification/1, and APNS.web_notification_with_data/5 previously stripped only the string "aps" key from caller data. A map containing both atom :aps and the constructed string "aps" was JSON-encoded with two aps keys, which APNS could reject or interpret unpredictably. All four functions now drop both "aps" and :aps from custom data.
  • APNS URL injection via unvalidated device tokens — Device tokens were interpolated directly into the request URL (/3/device/<token>). A token containing /, ?, #, or whitespace could redirect the request to an unintended path. APNS.send/3, APNS.send_once/3, and the named-instance APNS path now reject tokens that contain anything other than alphanumerics, underscore, or hyphen with {:error, %Response{status: :invalid_token}}.
  • PushX.push_data/4 silently produced an invalid APNS payload for APNS named instances — Calling push_data(:my_apns_instance, …) previously routed through push/4 with a %{"data" => …} map, which APNS doesn't understand. Now rejected with {:error, %Response{status: :invalid_request, provider: :apns}} and a message pointing at push/4 with push_type: "background" for APNS silent push.
  • JWT refresh could deadlock if the lock holder was killed — The previous APNS JWT cache used :atomics as a mutex with try/after to release. If the holder was killed forcibly (e.g. Process.exit(pid, :kill)), the after clause did not run and every subsequent JWT request failed indefinitely with "JWT refresh timeout after 10 attempts". The cache is now a supervised GenServer (PushX.JWTCache) with lock-free ETS reads and serialized refresh through GenServer.call/3. A killed refresher only delays callers until the supervisor restarts the process.
  • APNS empty-string :topic was forwarded to Apple — Treating "" as a valid topic produced a remote MissingTopic error. Both static and named-instance APNS paths now treat nil and "" as missing and return :invalid_request locally.
  • Invalid APNS :mode raised FunctionClauseError — A typo'd apns_mode (e.g. :production) crashed the sending Task past the try/rescue (which only catches CaseClauseError). Mode is now validated upfront and returns {:error, %Response{status: :invalid_request}} cleanly.
  • push_batch/4 with :validate_tokens silently dropped invalid tokens — Callers got a result list shorter than their input list with no signal of which tokens were skipped, so iterating over tokens and results in lockstep (e.g. to mark tokens) went out of alignment. Invalid tokens now get {:error, %Response{status: :invalid_token, reason: "Invalid token format"}} instead, so the result list always matches the input length. The same option is now honored by APNS.send_batch/3 and FCM.send_batch/3.
  • HTTP.stringify_map/1 raised Protocol.UndefinedError on nested maps/lists — The previous to_string(v) worked only for binaries, atoms, and numbers. A nested map or list as an FCM data value crashed the calling process past the try/rescue. Nested maps and lists are now JSON-encoded so they survive transport as strings; PIDs and other non-stringable terms fall back to inspect/1.
  • JSON.encode! crashed the calling process on un-encodable terms — A payload containing a PID, ref, function, or tuple raised past the rescue block (which only catches CaseClauseError). Encoding now goes through PushX.HTTP.safe_encode/1; failures return {:error, %Response{status: :invalid_request, reason: "Failed to encode payload: ..."}} cleanly. Encoding also happens before JWT/OAuth acquisition so an oversized or un-encodable payload doesn't waste a credential round-trip.
  • push_batch/4 and push_batch!/4 type specs missed instance names — Both functions accept instance atoms but the spec was provider() :: :apns | :fcm. Dialyzer flagged legitimate calls. Specs now include instance_name().
  • Response.error(provider, …) could embed an instance atom in the response struct — push_batch/4's :exit, :timeout branch used the caller-supplied provider atom directly, violating the Response.provider :: :apns | :fcm | :unknown typespec. Now mapped through response_provider/1 so instance atoms collapse to :unknown.
  • CircuitBreaker.record_failure/1 lost updates under concurrency — :ets.lookup followed by :ets.insert is non-atomic, so concurrent failures were undercounted and the real threshold was fuzzy. All circuit-breaker writes now route through the GenServer via GenServer.call/2, serializing them while reads stay lock-free.
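
The CaseClauseError rescue described in the first item of this list can be sketched roughly as follows. This is a minimal illustration, not the code shipped in PushX.APNS or PushX.FCM: RescueSketch, safe_request/1, and the {:error, {:connection_error, reason}} return shape are hypothetical stand-ins, and strict_case only imitates a case expression that lacks a 2-tuple error clause.

```elixir
defmodule RescueSketch do
  # Hypothetical helper, not PushX's actual code. `request_fun` stands in for
  # the Finch request made by the provider module.
  def safe_request(request_fun) when is_function(request_fun, 0) do
    request_fun.()
  rescue
    e in CaseClauseError ->
      case e.term do
        # A 2-tuple error with an atom reason becomes a retryable error tuple.
        {:error, reason} when is_atom(reason) ->
          {:error, {:connection_error, reason}}

        # Any other unmatched term is re-raised unchanged.
        _other ->
          reraise e, __STACKTRACE__
      end
  end
end

# Imitates a `case` that only matches {:ok, _} and the 3-tuple error shape.
strict_case = fn result ->
  case result do
    {:ok, value} -> {:ok, value}
    {:error, err, _acc} -> {:error, err}
  end
end

RescueSketch.safe_request(fn ->
  strict_case.({:error, :connection_process_went_down})
end)
# returns {:error, {:connection_error, :connection_process_went_down}}
```

Matching on the shape of the unmatched term rather than on one literal atom is what lets other atom reasons take the same retry path.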

Added

  • Pre-flight payload size check — Payloads over 4 KB for APNS (over 5 KB for push_type: "voip") and over 4 KB for FCM are now rejected locally with {:error, %Response{status: :payload_too_large}} instead of round-tripping a guaranteed-fail request.
  • HTTP-date Retry-After parsing — HTTP.parse_retry_after/1 now handles RFC 1123 HTTP-date format (e.g. "Wed, 21 Oct 2015 07:28:00 GMT") in addition to delta-seconds, per RFC 7231 §7.1.3. Falls back to nil (default backoff) for malformed or past dates.
  • 25 new tests covering atom-:aps (3), URL-special characters (3), APNS-instance push_data guard (1), JWTCache GenServer (6), :validate_tokens error responses (3), empty :topic and unknown :mode (2), payload size and encode failures (2), and the PushX.HTTP module (5+ groups, 21 tests).
  • Total test count: 340 tests, 25 doctests.
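
The Retry-After handling added in this release can be sketched as follows. This is an illustration under stated assumptions, not the body of HTTP.parse_retry_after/1: RetryAfterSketch and its parse/2 function are hypothetical, the optional now argument exists only to make the example deterministic, and only the preferred fixed-length date form is recognized.

```elixir
defmodule RetryAfterSketch do
  # Hypothetical module; returns a delay in whole seconds, or nil when the
  # header is malformed or names a time that is not in the future.
  @months %{
    "Jan" => 1, "Feb" => 2, "Mar" => 3, "Apr" => 4, "May" => 5, "Jun" => 6,
    "Jul" => 7, "Aug" => 8, "Sep" => 9, "Oct" => 10, "Nov" => 11, "Dec" => 12
  }

  def parse(value, now \\ DateTime.utc_now()) when is_binary(value) do
    case Integer.parse(value) do
      # delta-seconds form, e.g. "120"
      {seconds, ""} when seconds >= 0 -> seconds
      # anything else is tried as an HTTP-date
      _ -> parse_http_date(value, now)
    end
  end

  defp parse_http_date(value, now) do
    pattern = ~r/\A[A-Z][a-z]{2}, (\d{2}) ([A-Z][a-z]{2}) (\d{4}) (\d{2}):(\d{2}):(\d{2}) GMT\z/

    with [_, d, mon, y, h, m, s] <- Regex.run(pattern, value),
         {:ok, month} <- Map.fetch(@months, mon),
         {:ok, date} <- Date.new(String.to_integer(y), month, String.to_integer(d)),
         {:ok, time} <-
           Time.new(String.to_integer(h), String.to_integer(m), String.to_integer(s)),
         {:ok, retry_at} <- DateTime.new(date, time, "Etc/UTC"),
         diff when diff > 0 <- DateTime.diff(retry_at, now) do
      diff
    else
      # malformed input, impossible date, or a date in the past
      _ -> nil
    end
  end
end

RetryAfterSketch.parse("120")
# returns 120
RetryAfterSketch.parse("Wed, 21 Oct 2015 07:28:00 GMT", ~U[2015-10-21 07:27:00Z])
# returns 60
```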

Changed

  • Hot-path Logger.debug calls deferred — APNS and FCM debug log lines now use the function form, so PushX.Telemetry.truncate_token/1 no longer runs when debug logging is disabled. Measurable on high-volume batch sends.
  • Payload validation moved before credential acquisition — APNS and FCM now encode + size-check the payload before requesting a JWT or OAuth token. Saves one ES256 signing or OAuth round-trip per rejected request and gives faster local error feedback.
  • Internal: shared HTTP helpers extracted — PushX.URLs centralizes APNS/FCM endpoint constants and PushX.HTTP consolidates header parsing, Retry-After parsing, FCM data stringification, and JSON encoding. Eliminates ~100 lines of duplication between PushX.APNS, PushX.FCM, PushX.Instance, and PushX.Application.

0.10.0 - 2026-02-19

Added

  • PushX.push_data/3,4 — Send data-only (silent) push notifications via both :fcm and named instances. Returns a clear error for :apns with guidance to use push/4 with push_type: "background".
  • PushX.Response.extract_fcm_error_code/1 — Public function to extract FCM-specific error codes from the details array in FCM v1 API responses. Eliminates duplicated parsing logic across modules.
  • 16 new tests (8 for extract_fcm_error_code, 4 for FCM data-only/structured payloads, 3 for push_data, 1 for NOT_FOUND mapping)
  • Total test count: 302 tests, 25 doctests

Fixed

  • FCM UNREGISTERED errors parsed as unknown_error — FCM v1 API wraps the real error code (e.g., UNREGISTERED) in a details array with NOT_FOUND as the top-level gRPC status. The parser only read the top-level status, so on_invalid_token callbacks never fired for unregistered tokens. Now extracts the FCM-specific errorCode from the details array. (Fixes #3)
  • FCM build_message always added notification key — build_message hardcoded a "notification" key in the base map, making data-only messages impossible and sending "notification": null for empty Message structs. Now uses conditional logic to only include notification when content exists. (Fixes #2)
  • FCM structured payloads treated as notifications — Raw maps with "notification" and/or "data" keys were wrapped in another "notification" key instead of being passed through. Now detects structured payloads and preserves their structure.
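
The UNREGISTERED fix above comes down to reading the errorCode from the details array instead of the top-level status. A minimal sketch, assuming an already-decoded FCM v1 error body of the shape shown below; FcmErrorSketch and error_code/1 are hypothetical names, not the implementation of PushX.Response.extract_fcm_error_code/1.

```elixir
defmodule FcmErrorSketch do
  # Hypothetical helper; takes the decoded JSON body of an FCM v1 error
  # response and returns the first errorCode found in "details", or nil.
  def error_code(%{"error" => %{"details" => details}}) when is_list(details) do
    Enum.find_value(details, fn
      %{"errorCode" => code} when is_binary(code) -> code
      _other -> nil
    end)
  end

  def error_code(_body), do: nil
end

# Assumed response shape: the gRPC status is NOT_FOUND, while the
# FCM-specific code sits inside the details array.
body = %{
  "error" => %{
    "code" => 404,
    "status" => "NOT_FOUND",
    "details" => [
      %{
        "@type" => "type.googleapis.com/google.firebase.fcm.v1.FcmError",
        "errorCode" => "UNREGISTERED"
      }
    ]
  }
}

FcmErrorSketch.error_code(body)
# returns "UNREGISTERED"
```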

0.9.0 - 2026-02-16

Added

  • Dynamic instances (runtime config) — Start, stop, reconfigure, enable/disable APNS and FCM instances at runtime without application restart. Each instance gets its own HTTP/2 pool, JWT cache, and OAuth process. Enables database-backed admin panels for multi-provider setups. See Dynamic Instances in the README.
  • New response statuses — :invalid_request (missing required options like :topic) and :auth_error (JWT/credential failure). Both are non-retryable and don't trip the circuit breaker.
  • Credential rotation docs — README now documents how to hot-swap APNS/FCM credentials without restart for both static config and dynamic instances
  • HexDocs module groups — Modules are now organized into Core API, Providers, Runtime Instances, Infrastructure, and Observability groups
  • 45 new tests (Instance lifecycle, pool management, concurrent instances, error paths)
  • Total test count: 286 tests, 23 doctests

Fixed

  • APNS missing :topic no longer raises — Returns {:error, %Response{status: :invalid_request}} instead of raising ArgumentError, consistent with the error-tuple API contract
  • JWT generation failure no longer crashes — Returns {:error, %Response{status: :auth_error}} instead of raising, preventing process crashes from invalid private keys
  • JWT refresh no longer recurses infinitely — Added depth limit (10 retries, 500ms max wait) to prevent stack overflow if the atomic lock holder crashes

Changed

  • PushX.Response provider type now includes :unknown for instance-not-found/disabled errors

0.8.0 - 2026-02-13

Added

  • Circuit breaker — Opt-in circuit breaker tracks consecutive failures per provider and temporarily blocks requests when a provider is consistently failing. Configurable threshold and cooldown. See Circuit Breaker in the README.
  • PushX.health_check/0 — Returns configuration status and circuit breaker state for each provider
  • Per-request timeout overrides — Pass :receive_timeout and :pool_timeout as opts to individual send calls to override global config
  • Token cleanup callback — Configure on_invalid_token: {Mod, :fun, args} to automatically clean up invalid tokens from your database
  • PushX.Telemetry.truncate_token/1 is now a public function for use in custom logging
  • 23 doctests across 7 modules (Token, Telemetry, APNS, FCM, Message, Response, PushX)
  • Circuit breaker test suite (13 tests)
  • Integration tests for batch sending with mixed success/failure responses
  • Total test count: 241 tests, 23 doctests

Fixed

  • APNS payload injection — Custom data containing an "aps" key can no longer overwrite the notification payload in Message.to_apns_payload/1, notification_with_data/4, silent_notification/1, and web_notification_with_data/5
  • FCM send_data parity — send_data/3 and send_data_once/3 now have circuit breaker, telemetry, per-request timeouts, debug logging, and exception handling matching the regular send/3 path
  • Reconnect error logging — Retry logic now logs a warning if PushX.reconnect/0 fails instead of silently ignoring the error
  • Device tokens redacted in debug logs — APNS and FCM debug log messages now truncate tokens (first 8 + last 4 chars) matching the telemetry module's privacy behavior
  • Fixed incorrect doctest for Token.validate/2 (was :invalid_format, actually :invalid_length)

0.7.1 - 2026-02-11

Added

  • Automatic pool reconnect on connection errors — When the first retry attempt fails with a connection error (stale HTTP/2 connections), PushX now restarts the Finch pool to force fresh connections before retrying. This fixes the issue where retries on stale connections always fail with too_many_concurrent_requests.
  • PushX.reconnect/0 — Public function to manually restart the HTTP connection pool. Useful for recovering from persistent connection issues without restarting the app.
  • TCP keepalive on all connections — Enables OS-level dead connection detection on APNS and FCM pools, helping prevent zombie HTTP/2 connections on cloud infrastructure.
  • 4 new tests (reconnect, concurrent reconnect, retry-triggered reconnect, no reconnect on non-connection errors)
  • Total test count: 219 tests

Fixed

  • Retries on stale HTTP/2 connections no longer fail repeatedly with too_many_concurrent_requests — the pool is recycled on first connection error

0.7.0 - 2026-02-09

Fixed

  • FCM OAuth error handling — get_access_token/0 no longer raises on Goth failure and returns {:ok, token} | {:error, reason} instead
  • FCM data-only messages missing timeouts — send_data now uses configured receive_timeout and pool_timeout
  • JWT cache thundering herd — Added atomic compare-and-swap lock to prevent concurrent JWT refresh
  • Rate limiter O(n) scaling — Replaced timestamp list with O(1) fixed-window counter in ETS
  • Batch timeout loses token identity — Timed-out tokens now correctly reported via Enum.zip
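
The fixed-window counter mentioned in the rate limiter item can be sketched with :ets.update_counter/4. This is an illustration of the general technique only: FixedWindowSketch, the table name, and the {key, window} key layout are assumptions, not the layout used by PushX.RateLimiter.

```elixir
defmodule FixedWindowSketch do
  # Hypothetical module. Each {key, window} pair owns one counter row, so a
  # check is a single atomic increment instead of a scan over timestamps.
  def new_table, do: :ets.new(:fixed_window_sketch, [:set, :public])

  def allow?(table, key, limit, window_ms, now_ms \\ System.system_time(:millisecond)) do
    # All timestamps inside the same window map to the same integer.
    window = div(now_ms, window_ms)

    # Inserts {{key, window}, 0} if the row is missing, then increments it.
    count = :ets.update_counter(table, {key, window}, {2, 1}, {{key, window}, 0})
    count <= limit
  end
end

table = FixedWindowSketch.new_table()

# Limit of 2 requests per 1000 ms window; timestamps passed explicitly.
FixedWindowSketch.allow?(table, :apns, 2, 1_000, 5_000)
# returns true
FixedWindowSketch.allow?(table, :apns, 2, 1_000, 5_100)
# returns true
FixedWindowSketch.allow?(table, :apns, 2, 1_000, 5_200)
# returns false
FixedWindowSketch.allow?(table, :apns, 2, 1_000, 6_000)
# returns true (new window)
```

The sketch never deletes rows for expired windows; a real limiter would need some form of cleanup.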

Changed

  • Rewritten README — New structure with Quick Start, complete Usage Guide, and consolidated Configuration section
  • Deprecated request_timeout/0 (was never passed to Finch; use receive_timeout and pool_timeout)
  • Fixed CHANGELOG FCM token validation range (was 100-500, actually 20-500)

0.6.2 - 2026-02-04

Fixed

  • Logo now has solid white background (fixes transparency grid on GitHub)
  • Fixed HexDocs logo path configuration
  • README now uses GitHub raw URL for logo (works on both GitHub and HexDocs)

0.6.1 - 2026-02-04

Added

  • Configurable request timeouts — New configuration options to handle slow connections:
    • :request_timeout — Overall request timeout (default: 30s)
    • :receive_timeout — Timeout for receiving response data (default: 15s)
    • :pool_timeout — Timeout for acquiring connection from pool (default: 5s)
    • :connect_timeout — TCP connection timeout (default: 10s)
  • Timeouts are now passed to Finch for both APNS and FCM requests
  • Connection timeout configured at Finch pool level for better TCP handling
  • New logo — Modern purple bell/arrow logo added to README and HexDocs
  • 10 new config tests for timeout options
  • Total test count: 215 tests

Fixed

  • request_timeout errors when connecting to APNS from distant regions (e.g., EU to Apple's US servers)

0.6.0 - 2026-02-04

Changed

  • Increased default pool size from 10 to 25 connections per pool
  • Increased default pool count from 1 to 2 pools
  • Faster retry for connection errors — connection errors now use 1s base delay (was 10s) since these are typically transient network issues, not provider throttling
  • Added explicit FCM HTTP/2 pool — FCM endpoint now has dedicated HTTP/2 pool configuration (was using default pool)

Added

  • Troubleshooting section in README with solutions for common errors:
    • too_many_concurrent_requests — HTTP/2 stream limit exceeded
    • request_timeout — connection timeout issues
  • Pool sizing guide in README with recommendations by traffic level
  • Updated documentation for pool configuration options

Fixed

  • Connection errors (request_timeout, too_many_concurrent_requests) now retry faster with 1s/2s/4s delays instead of 10s/20s/40s
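
The 1s/2s/4s and 10s/20s/40s sequences above are consistent with a delay that doubles on each attempt. A minimal sketch of that arithmetic, assuming plain doubling with a 60_000 ms cap (the retry_max_delay_ms default listed under 0.2.0); BackoffSketch is a hypothetical module, and whether PushX.Retry adds jitter is not stated here.

```elixir
defmodule BackoffSketch do
  # Hypothetical helper: delay before retry number `attempt` (1-based),
  # doubling from `base_ms` and never exceeding `max_ms`.
  def delay_ms(attempt, base_ms, max_ms \\ 60_000) when attempt >= 1 do
    min(base_ms * Integer.pow(2, attempt - 1), max_ms)
  end
end

# Connection errors: 1s base delay
Enum.map(1..3, &BackoffSketch.delay_ms(&1, 1_000))
# returns [1000, 2000, 4000]

# Previous behavior: 10s base delay
Enum.map(1..3, &BackoffSketch.delay_ms(&1, 10_000))
# returns [10000, 20000, 40000]
```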

0.5.0 - 2026-01-22

Added

Changed

  • FCM token validation now accepts shorter web tokens (min 20 chars, was 100)
  • Updated Finch dependency to ~> 0.21
  • Updated documentation with Web Push examples

0.4.1 - 2026-01-22

Added

  • Expanded Config module test coverage to 100% (24 new tests)
  • Total test count: 185 tests

0.4.0 - 2026-01-22

Added

  • Batch sending — send to multiple tokens concurrently with configurable parallelism
  • Token validation — validate token format before sending
  • Rate limiting — optional client-side rate limiting
    • PushX.check_rate_limit/1 - Check if under rate limit
    • PushX.RateLimiter module with sliding window algorithm
    • Configurable per-provider limits via config
    • Automatic rate limit check before each request (when enabled)

Changed

  • Updated README with batch sending, token validation, and rate limiting documentation
  • Removed completed items from roadmap

0.3.3 - 2026-01-22

Fixed

  • Fixed release workflow cache conflict with ex_doc

0.3.2 - 2026-01-22 [YANKED]

Fixed

  • Fixed code formatting in retry tests

0.3.1 - 2026-01-22 [YANKED]

Fixed

  • Fixed release workflow to use MIX_ENV=dev for ex_doc availability

0.3.0 - 2026-01-22 [YANKED]

Added

  • Telemetry integration with events for monitoring push notification delivery:
    • [:pushx, :push, :start] - Request started
    • [:pushx, :push, :stop] - Request succeeded
    • [:pushx, :push, :error] - Request failed
    • [:pushx, :push, :exception] - Exception raised
    • [:pushx, :retry, :attempt] - Retry attempted
  • PushX.Telemetry module with documentation and examples
  • telemetry ~> 1.3 dependency
  • Comprehensive retry and telemetry test suites (116 total tests)
  • Credential rotation documentation in README
  • Retry configuration documentation in README

Changed

  • Made all examples generic (removed domain-specific references)
  • Updated README with telemetry usage examples and Telemetry.Metrics integration

0.2.4 - 2026-01-22

Added

  • Comprehensive API reference documentation with all functions, options, and types
  • Credential storage options guide (filesystem, env vars, Fly.io, AWS Secrets Manager)

0.2.3 - 2026-01-22

Added

  • GitHub Actions CI workflow (tests on Elixir 1.18/1.19 with OTP 26-28)
  • APNS and FCM credential setup guides
  • Roadmap and contributing sections

Changed

  • Updated Finch dependency to ~> 0.20
  • Improved CI with code quality checks, security audit, and unused deps check
  • Clarified test key comment to avoid false positive security alerts

0.2.2 - 2026-01-12

Added

  • Added CHANGELOG.md with full version history
  • Added Changelog link to hex.pm package

0.2.1 - 2026-01-12

Fixed

  • Fixed CI workflow for documentation generation
  • Fixed code formatting issues

Changed

  • Updated documentation examples to use generic messaging

0.2.0 - 2026-01-12

Added

  • Automatic retry with exponential backoff following Apple/Google best practices
  • PushX.Retry module for retry logic
  • send_once/3 functions for APNS and FCM (single attempt without retry)
  • retry_after field in PushX.Response struct
  • retryable?/1 helper function in PushX.Response
  • Configuration options for retry behavior:
    • retry_enabled - Enable/disable retry (default: true)
    • retry_max_attempts - Maximum retry attempts (default: 3)
    • retry_base_delay_ms - Base delay in milliseconds (default: 10_000)
    • retry_max_delay_ms - Maximum delay in milliseconds (default: 60_000)

Fixed

  • Fixed APNS sandbox URL (api.sandbox.push.apple.com)

0.1.1 - 2026-01-09

Fixed

  • Initial bug fixes and improvements

0.1.0 - 2026-01-09

Added

  • Initial release
  • APNS (Apple Push Notification Service) support with JWT authentication
  • FCM (Firebase Cloud Messaging) support with OAuth2 via Goth
  • Unified API for both providers (PushX.push/4)
  • Message builder API (PushX.Message)
  • Structured response handling (PushX.Response)
  • HTTP/2 connections via Finch
  • Zero external JSON dependency (uses Elixir 1.18+ built-in JSON)