Supervisor wrapper for individual workers that provides automatic restart capability.
This module implements the "Permanent Wrapper" pattern for managing workers that control external OS processes (Python gRPC servers).
Architecture Decision
See: docs/architecture/adr-001-worker-starter-supervision-pattern.md for
detailed rationale, alternatives considered, and trade-offs.
Why This Pattern?
TL;DR: Workers manage external Python processes, not just Elixir state. This pattern provides:
- Automatic restart without Pool intervention
- Atomic resource cleanup (worker + Python process)
- Future extensibility for per-worker resources
Trade-off: Extra process (~1KB) per worker for better encapsulation.
Architecture
DynamicSupervisor (WorkerSupervisor)
└── Worker.Starter (Supervisor, :permanent)
    └── GRPCWorker (GenServer, :transient)
        └── Port → Python grpc_server.py

Lifecycle
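The wrapper in the tree above can be sketched as a small Supervisor. This is a minimal illustration, not the project's actual implementation: `MyApp.Registry` and `MyApp.GRPCWorker` are hypothetical stand-ins for the real registry and worker modules.

```elixir
defmodule MyApp.Worker.Starter do
  use Supervisor

  # Hypothetical names: MyApp.Registry / MyApp.GRPCWorker stand in for
  # the project's real registry and worker modules.
  def start_link(worker_id) do
    Supervisor.start_link(__MODULE__, worker_id, name: via(worker_id))
  end

  # Via tuple so the starter can be addressed by worker_id.
  def via(worker_id), do: {:via, Registry, {MyApp.Registry, {:starter, worker_id}}}

  @impl true
  def init(worker_id) do
    children = [
      %{
        id: {:worker, worker_id},
        start: {MyApp.GRPCWorker, :start_link, [worker_id]},
        # :transient — this supervisor restarts the worker on abnormal exit
        restart: :transient
      }
    ]

    Supervisor.init(children, strategy: :one_for_one)
  end
end
```

The starter itself would be launched as a `:permanent` child of the DynamicSupervisor, e.g. `DynamicSupervisor.start_child(WorkerSupervisor, {MyApp.Worker.Starter, worker_id})`, so the wrapper outlives individual worker crashes.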
When GRPCWorker crashes:
- Worker.Starter detects the crash via its :one_for_one strategy
- Worker.Starter automatically restarts GRPCWorker
- The Pool is notified via :DOWN but does not manage the restart
- The new GRPCWorker spawns a new Python process and re-registers
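The Pool's side of this sequence amounts to bookkeeping, not restarting. A sketch, assuming a hypothetical state shape with an `:available` MapSet of worker pids:

```elixir
defmodule MyApp.PoolSketch do
  # Pure helper: drop a crashed worker pid from the available set.
  def remove_from_available(%{available: available} = state, pid) do
    %{state | available: MapSet.delete(available, pid)}
  end

  # The Pool's GenServer clause for a monitored worker going down.
  # Note it does NOT restart anything: Worker.Starter owns the restart;
  # the Pool only tracks availability and waits for re-registration.
  def handle_info({:DOWN, _ref, :process, pid, _reason}, state) do
    {:noreply, remove_from_available(state, pid)}
  end
end
```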
When Worker.Starter terminates:
- GRPCWorker receives shutdown signal
- GRPCWorker.terminate sends SIGTERM to Python
- Python process exits gracefully
- Worker.Starter confirms all children have stopped
- Clean atomic shutdown
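The SIGTERM step of this shutdown can be sketched as a `terminate/2` callback. This assumes the worker state holds the OS pid of the external Python process in an `:os_pid` field; the real field name and cleanup details may differ.

```elixir
defmodule MyApp.GRPCWorkerShutdownSketch do
  # Sketch of GRPCWorker.terminate/2: on shutdown, signal the external
  # Python gRPC server with SIGTERM so it can exit gracefully.
  def terminate(_reason, %{os_pid: os_pid}) when is_integer(os_pid) do
    System.cmd("kill", ["-TERM", Integer.to_string(os_pid)])
    :ok
  end

  # No tracked external process: nothing to signal.
  def terminate(_reason, _state), do: :ok
end
```

For `terminate/2` to run reliably, the GenServer must trap exits (`Process.flag(:trap_exit, true)` in `init/1`) so the supervisor's shutdown signal is delivered as a message rather than killing the process outright.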
This decouples Pool (availability management) from Worker lifecycle (crash/restart).
Related
- Issue #2: Community feedback questioning this complexity
- ADR-001: Full architecture decision record with alternatives
- External Process Design:
docs/20251007_external_process_supervision_design.md
Functions
Returns a specification to start this module under a supervisor.
See Supervisor.
Starts a worker starter supervisor.
Parameters
worker_id - Unique identifier for the worker
Returns a via tuple for this starter supervisor.