# `Slither.Examples.MlScoring.ScoringPipe`
[🔗](https://github.com/nshkrdotcom/slither/blob/v0.1.0/lib/slither/examples/ml_scoring/scoring_pipe.ex#L1)

ML scoring pipeline: enrich -> featurize -> predict -> route by confidence.

Demonstrates session-scoped Python object references and the full
Slither pipeline lifecycle:

  1. **enrich** (beam) -- checks the ETS feature cache for pre-computed
     features; on a miss, packages raw data for Python featurization.
     Attaches the `model_id` from context metadata to every item.
  2. **featurize** (python) -- extracts numeric features from raw data
     dicts, or passes through items that already have cached features.
  3. **predict** (python) -- runs batch prediction using a scikit-learn
     model stored in the Python session. Session affinity ensures the
     model trained by `train_model` is accessible.
  4. **route_by_confidence** (router) -- splits results into
     `:high_confidence` (>= 0.9) and `:low_confidence` (< 0.6) buckets;
     mid-range items go to `:default`.

## Concurrent Session Isolation

The `run_demo/0` function trains TWO models on TWO separate sessions
simultaneously, then scores test records through each session's
pipeline independently. This proves that:

  - Session A's model is isolated to its worker process
  - Session B's model is isolated to its worker process
  - Predictions through session A use model A (not B)
  - Predictions through session B use model B (not A)

Under free-threaded Python, a shared `_models` dict mutated from
multiple threads would lead to corruption -- one session could
silently overwrite another's model. Slither's process-per-session
design eliminates this by construction.
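The isolation property can be illustrated with plain OTP primitives (this uses `Agent`, not Slither's actual session API, purely to show the principle): each "session" owns its model state in a separate process, so one session's writes cannot reach the other's state.

```elixir
# Illustrative sketch only: two processes, each holding its own "model".
# Because state lives inside a process and is reached only via messages,
# there is no shared dict for concurrent writers to corrupt.
{:ok, session_a} = Agent.start_link(fn -> %{model: :model_a} end)
{:ok, session_b} = Agent.start_link(fn -> %{model: :model_b} end)

model_for = fn session -> Agent.get(session, & &1.model) end

# Updating session A's model leaves session B untouched.
Agent.update(session_a, &Map.put(&1, :model, :model_a_v2))
```

The same reasoning applies one level up: a Python worker process per session gives each model its own interpreter state by construction, rather than relying on locking discipline around a shared `_models` dict.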

Requires scikit-learn and numpy. Run with:

    Slither.Examples.MlScoring.ScoringPipe.run_demo()

# `enrich`

Enrich a record with cached features or prepare it for featurization.

Checks the ETS feature cache for the record's ID. On a cache hit the
pre-computed feature vector is used directly; on a miss the raw data
map is passed through so the Python featurize stage can extract
features. The model ID from context metadata is attached in both cases
so the predict stage knows which model to use.
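The cache-hit/cache-miss branching described above can be sketched with a bare ETS table (table name, key shape, and map fields here are hypothetical, chosen only to mirror the prose):

```elixir
# Hedged sketch of the enrich step: look up pre-computed features by
# record ID; on a hit attach them, on a miss pass the record through
# untouched so a later featurize stage can extract features. The
# model_id is attached in both branches.
:ets.new(:feature_cache, [:named_table, :public, :set])
:ets.insert(:feature_cache, {"rec-1", [0.1, 0.7]})

enrich = fn %{id: id} = record, model_id ->
  case :ets.lookup(:feature_cache, id) do
    [{^id, features}] ->
      # Cache hit: use the pre-computed feature vector directly.
      Map.merge(record, %{features: features, model_id: model_id})

    [] ->
      # Cache miss: raw data flows on to the Python featurize stage.
      Map.put(record, :model_id, model_id)
  end
end
```

A `:set` table with the record ID as key keeps the lookup O(1); making it `:public` and `:named_table` lets any pipeline process consult the cache without holding the owner's pid.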

# `run_demo`

Run the ML scoring demo with concurrent session isolation.

Trains two logistic regression models on separate sessions with
different data distributions, then scores test records through each
session independently. Demonstrates that Slither's session affinity
prevents cross-contamination of model state.

---

*Consult [api-reference.md](api-reference.md) for the complete listing.*
