gleam_synapses/codec

This namespace contains functions for encoding and decoding data points.

One-hot encoding turns a discrete attribute into a list of 0.0 values with a single 1.0 that marks the attribute's value. Min-max normalization scales a continuous attribute to a value between 0.0 and 1.0.
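The two techniques can be sketched in a few lines of plain Gleam. This is only an illustration of the general idea, not the library's implementation: `one_hot` and `min_max` are hypothetical helpers, and the fixed category list and the explicit `min`/`max` arguments are assumptions made for the example.

```gleam
import gleam/list

// Hypothetical helper: one-hot encodes `value` against an ordered list of
// categories. The position of the matching category becomes 1.0, all other
// positions become 0.0.
pub fn one_hot(value: String, categories: List(String)) -> List(Float) {
  list.map(categories, fn(category) {
    case category == value {
      True -> 1.0
      False -> 0.0
    }
  })
}

// Hypothetical helper: min-max normalization of `value`, given the smallest
// and largest values observed for the attribute.
pub fn min_max(value: Float, min: Float, max: Float) -> Float {
  case max == min {
    // A constant attribute has no range to scale over.
    True -> 0.0
    False -> { value -. min } /. { max -. min }
  }
}
```

Under these assumptions, `min_max(1.5, 1.5, 3.8)` evaluates to `0.0` and `one_hot("setosa", ["setosa", "versicolor"])` evaluates to `[1.0, 0.0]`, which is consistent with the `[0.0, 1.0, 0.0]` shown in the `encode` example in this page.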

Types

A codec can encode and decode every data point.

pub type Codec =
  preprocessor.Preprocessor

Functions

pub fn decode(codec: ZList(Attribute), encoded_values: List(Float)) -> Map(
  String,
  String,
)

Accepts the encoded_values as a list of numbers between 0.0 and 1.0 and returns the decoded data point as a map of strings.

cdc
|> codec.decode([0.0, 1.0, 0.0])
|> map.to_list
// [#("petal_length", "1.5"), #("species", "setosa")]

pub fn encode(codec: ZList(Attribute), data_point: Map(
    String,
    String,
  )) -> List(Float)

Accepts the data_point as a map of strings and returns the encoded data point as a list of float numbers between 0.0 and 1.0.

codec.encode(cdc, setosa)
// [0.0, 1.0, 0.0]

pub fn from_json(json: String) -> ZList(Attribute)

Parses the JSON representation of a codec and returns the codec.

pub fn new(attributes: List(#(String, Bool)), data_points: Iterator(
    Map(String, String),
  )) -> ZList(Attribute)

Creates a codec that can encode and decode every data point. attributes is a list of pairs, one per attribute, that hold the attribute's name and a flag that is True when the attribute is discrete.

let attributes = [#("petal_length", False), #("species", True)]
let setosa = map.from_list([#("petal_length", "1.5"), #("species", "setosa")])
let versicolor = map.from_list([#("petal_length", "3.8"), #("species", "versicolor")])
let data_points = iterator.from_list([setosa, versicolor])
let cdc = codec.new(attributes, data_points)

pub fn to_json(codec: ZList(Attribute)) -> String

Returns the JSON representation of the codec.
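As a rough sketch of how to_json and from_json might be combined, the function below serializes a codec, parses it back, and compares what both codecs produce for the same data point. `survives_round_trip` is a hypothetical name. The parameter types are left to inference, and the page does not state that the comparison holds, so treat that as an assumption to verify.

```gleam
import gleam_synapses/codec

// Hypothetical check: does a codec rebuilt from its own JSON encode
// `data_point` the same way as the original codec?
pub fn survives_round_trip(cdc, data_point) -> Bool {
  let restored =
    cdc
    |> codec.to_json
    |> codec.from_json

  codec.encode(restored, data_point) == codec.encode(cdc, data_point)
}
```

With the values from the `new` example, this could be called as `survives_round_trip(cdc, setosa)`.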