Rulestead explainability is for support, operators, and incident response. The goal is not raw internal trace dumping. The goal is a bounded answer to "why did this subject get this result in this environment?"
Two Explain Paths
Use the path that matches where you are standing:
- Rulestead.explain(flag_payload, context) when you already have the authored flag payload and want a pure, human-readable explanation
- Rulestead.explain_flag(flag_key, environment_key, context, opts \\ []) when an operator needs the mounted admin-safe runtime seam for one live flag
The root explain/2 call stays payload-first. The admin-safe explain_flag/4
adds environment lookup, authorization, and redaction at the package boundary.
What A Good Explanation Tells You
A useful explanation answers:
- which flag and environment were evaluated
- whether a rule matched or the default applied
- which rule matched
- whether deterministic bucketing affected the outcome
- what the final value or variant decision was
That is enough for support and operator workflows without exposing raw actor payloads.
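As a rough illustration of how those five answers could be condensed into one line for a ticket, here is a minimal sketch. The map shape and its field names (flag, environment, matched_rule, bucketed?, value) are assumptions made for this example; they are not the documented return shape of Rulestead.explain/2 or explain_flag/4.

```elixir
defmodule ExplanationSummarySketch do
  @moduledoc """
  Hypothetical helper: turns an explanation-like map into a one-line summary.
  The field names below are illustrative assumptions, not Rulestead's API.
  """

  def summarize(%{
        flag: flag,
        environment: env,
        matched_rule: rule,
        bucketed?: bucketed,
        value: value
      }) do
    # A nil rule means no rule matched and the default applied.
    source = if rule, do: "rule #{rule} matched", else: "default applied"
    bucketing = if bucketed, do: "bucketing affected the outcome", else: "no bucketing"

    "#{flag} in #{env}: #{source}, #{bucketing}, value=#{inspect(value)}"
  end
end

summary =
  ExplanationSummarySketch.summarize(%{
    flag: "checkout_v2",
    environment: "prod",
    matched_rule: "pro_users",
    bucketed?: false,
    value: true
  })

IO.puts(summary)
```

The summary deliberately carries no actor attributes, so it can be pasted into a ticket without leaking input data.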
Keep Context Bounded
Explain requests should carry only the bounded context fields the runtime uses:

    context =
      Rulestead.Context.new(
        targeting_key: "user_123",
        environment: "prod",
        attributes: %{country: "US", plan: "pro"}
      )

Avoid passing whole application structs or raw user payloads. The explain path is designed around explicit context and redacted metadata.
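One way a host application might enforce that boundary is to derive the context fields from a larger user map through an explicit allowlist. The sketch below is an assumption-laden illustration: BoundedContextSketch, its allowlist, and the shape of the user map are all hypothetical, and the code only builds the keyword list rather than calling Rulestead itself.

```elixir
defmodule BoundedContextSketch do
  @moduledoc """
  Hypothetical host-side helper. It is not part of Rulestead.
  """

  # Assumed allowlist: the only attribute keys this host forwards.
  @allowed_attributes [:country, :plan]

  @doc """
  Builds bounded context fields from a larger user map.
  Keys outside the allowlist are dropped rather than forwarded.
  """
  def context_fields(user, environment) when is_map(user) and is_binary(environment) do
    [
      targeting_key: to_string(Map.fetch!(user, :id)),
      environment: environment,
      attributes: Map.take(user, @allowed_attributes)
    ]
  end
end

user = %{
  id: "user_123",
  email: "a@example.com",
  country: "US",
  plan: "pro",
  internal_notes: "vip"
}

fields = BoundedContextSketch.context_fields(user, "prod")
IO.inspect(fields)
```

The resulting keyword list has the same three keys as the Rulestead.Context.new call shown earlier, so it could presumably be handed to that constructor; email and internal_notes never leave the host.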
Operator Workflow
From the mounted admin package, the stable explain route is:
/admin/flags/:key/simulate?env=:environment
Use it like this:
- choose the flag
- choose the environment through ?env=
- enter the bounded targeting context
- read the explanation and matched-rule outcome
- share the operator-facing URL or summarize the explanation in a ticket
The URL and environment convention are stable. Internal LiveView implementation details are not.
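Because the route shape is the stable part, a support tool can build the link from a flag key and an environment without touching any LiveView internals. The following is a minimal sketch using only Elixir's standard URI module; SimulatePathSketch and the example flag key are hypothetical, and the /admin prefix assumes the admin package is mounted at that path as in the route above.

```elixir
defmodule SimulatePathSketch do
  @moduledoc """
  Hypothetical helper that builds the operator-facing simulate path.
  """

  @doc """
  Returns /admin/flags/:key/simulate?env=:environment with both parts encoded.
  """
  def simulate_path(flag_key, environment)
      when is_binary(flag_key) and is_binary(environment) do
    # Percent-encode the path segment; encode the query with the stdlib helper.
    encoded_key = URI.encode(flag_key, &URI.char_unreserved?/1)
    query = URI.encode_query(%{"env" => environment})

    "/admin/flags/" <> encoded_key <> "/simulate?" <> query
  end
end

IO.puts(SimulatePathSketch.simulate_path("checkout_v2", "prod"))
# prints /admin/flags/checkout_v2/simulate?env=prod
```

Encoding the key and the environment separately keeps an unusual flag key from changing the route structure.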
Lifecycle Evidence For Support And SRE
Support and SRE should not use explainability in isolation when lifecycle questions appear. Use three bounded surfaces together:
- explain output for one decision path
- lifecycle evidence from mounted review or mix rulestead.lifecycle
- audit history for who changed what and why
That combination answers the real operator questions:
- is the flag still expected to be active?
- was it an archive candidate or blocked by missing evidence?
- did a recent cleanup or owner handoff happen?
- who changed the lifecycle posture?
This keeps lifecycle evidence, explain traces, and audit history aligned for support handoff without turning explainability into a second lifecycle system.
Redaction Rules
Explain and simulation workflows should stay redacted by default:
- do not surface raw traits or PII unless the host explicitly allowlists a bounded key
- prefer targeting_key and a small set of business-safe attributes
- keep screenshots and support notes focused on the explanation, not the full input payload
The admin-safe explain seam returns redacted context metadata alongside the explanation so operators can confirm what was actually used without dumping the full trait bag.
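To make the allowlist idea concrete, the sketch below masks every attribute value that is not explicitly allowlisted while keeping the key visible. This is an illustration of the general technique only: RedactionSketch, the placeholder string, and the choice to mask rather than drop are assumptions, and Rulestead's own redaction behavior may differ.

```elixir
defmodule RedactionSketch do
  @moduledoc """
  Hypothetical redaction helper for support notes. It is not Rulestead's implementation.
  """

  @redacted "[REDACTED]"

  @doc """
  Keeps allowlisted attribute values and masks the rest, so a note can show
  which keys were present without exposing their values.
  """
  def redact(attributes, allowlist) when is_map(attributes) and is_list(allowlist) do
    Map.new(attributes, fn {key, value} ->
      if key in allowlist, do: {key, value}, else: {key, @redacted}
    end)
  end
end

redacted =
  RedactionSketch.redact(
    %{country: "US", plan: "pro", email: "a@example.com"},
    [:country, :plan]
  )

IO.inspect(redacted)
```

Masking instead of dropping lets an operator confirm that a key was supplied, which is close in spirit to confirming what was actually used without seeing the full trait bag.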
Audience Trace In Explain Output
Explain and simulate output includes Audience trace steps for reusable
targeting: matched, missed, missing from snapshot, and archived.
Resolution is snapshot-local — no live database reads, mounted-admin
lookups, host identity resolution, or observability queries during audience
evaluation.
Support-safe explain permalinks include flag, environment, tenant, and targeting key only — never raw traits.
When audience questions exceed one explain call, escalate through explain + dependency inventory + audit history. Rulestead does not provide built-in observability dashboards or package-owned metrics for this path.
Simulation And Explain Belong Together
Simulation is the operator workflow for asking "what would happen for this context right now?" Explainability is the readable trace that answers it.
Use that pair when:
- support needs to answer a customer report
- an operator wants to verify a rollout step before publishing
- on-call needs to understand whether a flag or rule caused an incident
Escalation Boundary
If an explanation is not enough, escalate to:
- the timeline route for change history
- lifecycle evidence from mix rulestead.lifecycle or the mounted queue
- telemetry for aggregate runtime signals
- the authored ruleset itself for exact rule order and conditions
Do not escalate by depending on RulesteadAdmin.Live.* internals. That would
couple your workflow to implementation details the package does not stabilize.