Penelope v0.5.0 Penelope.ML.Text.CountVectorizer

The CountVectorizer counts the number of tokens in each incoming document. It assumes that every sample has already been tokenized into a list of tokens. This vectorizer is useful when document length carries signal, since it lets a model bias its predictions toward longer or shorter documents.

Summary

Functions

transform(model, context, x)

Transforms a list of samples (each a list of tokens) into a list of vectors.

Functions

transform(model, context, x)

transform(model :: map(), context :: map(), x :: [[String.t()]]) :: [Penelope.ML.Vector.t()]

Transforms a list of samples (each a list of tokens) into a list of vectors, one per sample.
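
As a rough illustration of the token counting described in the module overview, the sketch below reproduces the idea without depending on Penelope. `CountSketch` is a hypothetical module, not part of the library. It assumes each output row holds a single element (the sample's token count) and uses plain lists in place of `Penelope.ML.Vector.t()`. Ignoring the `model` and `context` arguments is also an assumption made for this sketch.

```elixir
defmodule CountSketch do
  # Hypothetical stand-in, not part of Penelope. It mirrors the argument
  # order of transform/3 but returns plain lists rather than
  # Penelope.ML.Vector values.
  def transform(_model, _context, x) do
    # Emit one single-element row per sample, holding its token count.
    Enum.map(x, fn tokens -> [length(tokens)] end)
  end
end

# Samples must already be tokenized: one list of tokens per document.
samples = [["the", "quick", "brown", "fox"], ["hello"], []]

counts = CountSketch.transform(%{}, %{}, samples)
IO.inspect(counts)
# prints [[4], [1], [0]]
```

An empty sample yields a count of 0 rather than an error, so downstream steps can treat every sample the same way.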