Explainability

Rulestead explainability is for support, operators, and incident response. The goal is not raw internal trace dumping. The goal is a bounded answer to "why did this subject get this result in this environment?"

Two Explain Paths

Use the path that matches where you are standing:

Rulestead.explain(flag_payload, context) when you already have the authored flag payload and want a pure, human-readable explanation
Rulestead.explain_flag(flag_key, environment_key, context, opts \\ []) when an operator needs the mounted admin-safe runtime seam for one live flag

The root explain/2 call stays payload-first. The admin-safe explain_flag/4 adds environment lookup, authorization, and redaction at the package boundary.

What A Good Explanation Tells You

A useful explanation answers:

which flag and environment were evaluated
whether a rule matched or the default applied
which rule matched
whether deterministic bucketing affected the outcome
what the final value or variant decision was

That is enough for support and operator workflows without exposing raw actor payloads.

Keep Context Bounded

Explain requests should carry only the bounded context fields the runtime uses:

context =
  Rulestead.Context.new(
    targeting_key: "user_123",
    environment: "prod",
    attributes: %{country: "US", plan: "pro"}
  )

Avoid passing whole application structs or raw user payloads. The explain path is designed around explicit context and redacted metadata.

Operator Workflow

From the mounted admin package, the stable explain route is:

/admin/flags/:key/simulate?env=:environment

Use it like this:

choose the flag
choose the environment through ?env=
enter the bounded targeting context
read the explanation and matched-rule outcome
share the operator-facing URL or summarize the explanation in a ticket

The URL and environment convention are stable. Internal LiveView implementation details are not.

Lifecycle Evidence For Support And SRE

Support and SRE should not use explainability in isolation when lifecycle questions appear. Use three bounded surfaces together:

explain output for one decision path
lifecycle evidence from mounted review or mix rulestead.lifecycle
audit history for who changed what and why

That combination answers the real operator questions:

is the flag still expected to be active?
was it an archive candidate or blocked by missing evidence?
did a recent cleanup or owner handoff happen?
who changed the lifecycle posture?

This keeps lifecycle evidence, explain traces, and audit history aligned for support handoff without turning explainability into a second lifecycle system.

Redaction Rules

Explain and simulation workflows should stay redacted by default:

do not surface raw traits or PII unless the host explicitly allowlists a bounded key
prefer targeting_key and a small set of business-safe attributes
keep screenshots and support notes focused on the explanation, not the full input payload

The admin-safe explain seam returns redacted context metadata alongside the explanation so operators can confirm what was actually used without dumping the full trait bag.

Audience Trace In Explain Output

Explain and simulate output includes Audience trace steps for reusable targeting: matched, missed, missing from snapshot, and archived. Resolution is snapshot-local — no live database reads, mounted-admin lookups, host identity resolution, or observability queries during audience evaluation.

Support-safe explain permalinks include flag, environment, tenant, and targeting key only — never raw traits.

When audience questions exceed one explain call, escalate through explain + dependency inventory + audit history. Rulestead does not provide built-in observability dashboards or package-owned metrics for this path.

Simulation And Explain Belong Together

Simulation is the operator workflow for asking "what would happen for this context right now?" Explainability is the readable trace that answers it.

Use that pair when:

support needs to answer a customer report
an operator wants to verify a rollout step before publishing
on-call needs to understand whether a flag or rule caused an incident

Escalation Boundary

If an explanation is not enough, escalate to:

the timeline route for change history
lifecycle evidence from mix rulestead.lifecycle or the mounted queue
telemetry for aggregate runtime signals
the authored ruleset itself for exact rule order and conditions

Do not escalate by depending on RulesteadAdmin.Live.* internals. That would couple your workflow to implementation details the package does not stabilize.

← Previous Page Admin UI

Next Page → Multi-environment Operation