gelman
Welcome to Gelman!
Gelman is a statistical library written entirely in Gleam. It is named after the noted Bayesian statistician Andrew Gelman, whose name happens to be a near-anagram of Gleam.
Wherever possible, Gelman is:
- Simple to use, with a consistent interface. A dataset is a list of floats, and summary and frequency statistics are returned as floats. If your data contains integers, convert them to floats first.
- Pure Gleam, and purely functional. Wherever possible, the `fold` combinator is used. `map` followed by `fold` is avoided, to reduce memory overhead.
- Efficient. Gelman endeavors to make a single pass over the data, even for higher-order moments like skewness and kurtosis. If sorting is required, Gelman sorts only once unless more is absolutely necessary.
- Extensively tested. Tests are borrowed from `scipy/stats`, so results are checked against the same expected values that SciPy is held to.
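The single-pass, `fold`-only style described above can be illustrated with Welford's algorithm for the mean and sample variance. This is a minimal sketch under stated assumptions, not Gelman's actual implementation: the `Moments` type and the `mean_and_variance` function are hypothetical names introduced for illustration, and only `gleam/list` from the standard library is used.

```gleam
import gleam/list

/// Running state for Welford's algorithm (hypothetical helper type):
/// the number of values seen, the running mean, and the running sum of
/// squared deviations from the mean.
pub type Moments {
  Moments(n: Float, mean: Float, m2: Float)
}

/// Computes the mean and the sample variance in a single fold,
/// without building an intermediate list.
/// Returns Error(Nil) when fewer than two values are given.
pub fn mean_and_variance(dataset: List(Float)) -> Result(#(Float, Float), Nil) {
  let Moments(n, mean, m2) =
    list.fold(dataset, Moments(0.0, 0.0, 0.0), fn(acc, x) {
      let n = acc.n +. 1.0
      let delta = x -. acc.mean
      let mean = acc.mean +. delta /. n
      // Uses the deviation from both the old and the new mean.
      Moments(n, mean, acc.m2 +. delta *. { x -. mean })
    })

  case n >. 1.0 {
    True -> Ok(#(mean, m2 /. { n -. 1.0 }))
    False -> Error(Nil)
  }
}
```

A two-step `map` then `fold` version would first allocate a list of squared deviations. Folding once with a small accumulator avoids that allocation, and the same idea extends to higher-order moments by carrying more terms in the accumulator.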
Gelman functions are grouped according to their purpose.
| Module | Contains |
|---|---|
| `summary` | Summary statistics of a sample dataset, such as `mean`, `variance`, and `interquartile_range`. These typically take a list of values and return a single value. |
| `transform` | Applies a transformation to the entire dataset. Attention: a transformation can either preserve the size of the dataset or drop some elements. |
| `discretize` | Produces a range of values that describe the dataset's frequencies. |
```sh
gleam add gelman@1
```
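```gleam
import gelman/discretize
import gelman/summarize
import gelman/transform

pub fn main() -> Nil {
  let dataset = [0.0, 50.0, 100.0]

  // A summary statistic reduces the whole dataset to one value.
  let average = summarize.mean(dataset)

  // Discretizing describes how the values are distributed.
  let quantiles =
    dataset
    |> discretize.quantiles([0.25, 0.5, 0.75])

  // A transformation is applied to the entire dataset.
  let winsorized_values =
    dataset
    |> transform.winsorize(0.05, 0.95)

  Nil
}
```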
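Because a dataset is a list of floats, integer data needs to be converted before it is handed to any Gelman function. A minimal sketch of that conversion, using only the standard library's `gleam/int` and `gleam/list` modules (`to_dataset` is a hypothetical helper name, not part of Gelman):

```gleam
import gleam/int
import gleam/list

/// Converts a list of integers into a list of floats, the dataset
/// shape the README describes.
pub fn to_dataset(values: List(Int)) -> List(Float) {
  list.map(values, int.to_float)
}
```

For example, `to_dataset([1, 2, 3])` yields `[1.0, 2.0, 3.0]`.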
Further documentation can be found at https://hexdocs.pm/gelman.
Development
Currently, the library has most of the functions available in scipy/stats,
for summary, descriptive and frequency statistics. I will implement a few more:
- `discretize`: `histogram`, `cumulative_frequency`, `counts`
- `transform`: `trim`, `rank`
- `summary`: `standard_error_of_mean`
I am also planning to implement another module, `test`, which will contain statistical tests. Currently, there is no single mathematics library that offers all the functions required to perform parametric tests. Until I can figure out how to either implement these functions or augment the existing libraries, only nonparametric tests can be performed.
```sh
gleam run   # Run the project
gleam test  # Run the tests
```