Statistex v1.0.0

Calculate all the statistics for given samples.

Compute everything at once with statistics/1, or call the individual functions one at a time.

To avoid wasting computation, functions can be given the values they depend on as optional keyword arguments, so that these values are used instead of being recalculated. For an example see average/2.

Most statistics don't really make sense when there are no samples. For that reason all functions except sample_size/1 raise ArgumentError when handed an empty list. If your program might hand Statistex an empty list, it is suggested that you check for that before calling Statistex and handle the "no reasonable statistics" path entirely separately.
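One way to keep the empty case separate is a small wrapper that pattern matches on the empty list before any statistics are computed. The sketch below is only an illustration: SafeStats, summarize/2 and the :no_samples return value are hypothetical names, and stats_fun stands in for whatever you would call, for example &Statistex.statistics/1.

```elixir
defmodule SafeStats do
  # Hypothetical wrapper, not part of Statistex.
  # `stats_fun` is any one-argument function that computes statistics,
  # for example &Statistex.statistics/1.

  # Empty input: take the "no reasonable statistics" path without raising.
  def summarize([], _stats_fun), do: :no_samples

  # Non-empty input: delegate to the statistics function.
  def summarize([_ | _] = samples, stats_fun), do: {:ok, stats_fun.(samples)}
end
```

With Statistex available, this could be called as SafeStats.summarize(samples, &Statistex.statistics/1), and the caller then matches on :no_samples or {:ok, statistics}.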

Limitations of the Erlang standard library apply (particularly :math.pow/2 raises for VERY large numbers).

Summary

Types

configuration()

The optional configuration handed to a lot of functions.

mode()

Careful with the mode: it might be multiple values, one value, or nothing.😱 See mode/1.

percentiles()

The percentiles map returned by percentiles/2.

sample()

A single sample.

samples()

The samples to compute statistics from.

t()

All the statistics statistics/1 computes from the samples.

Functions

average(samples, options \\ [])

Calculate the average.

frequency_distribution(samples)

A map showing which sample occurs how often in the samples.

maximum(samples)

The biggest sample.

median(samples, options \\ [])

Calculates the median of the given samples.

minimum(samples)

The smallest sample.

mode(samples, opts \\ [])

Calculates the mode of the given samples.

percentiles(samples, percentiles)

Calculates the value at the percentile_rank-th percentile.

sample_size(samples)

Number of samples in the given list.

standard_deviation(samples, options \\ [])

Calculate the standard deviation.

standard_deviation_ratio(samples, options \\ [])

Calculate the standard deviation relative to the average.

statistics(samples, configuration \\ [])

Calculate all statistics Statistex offers for a given list of numbers.

total(samples)

The total of all samples added together.

variance(samples, options \\ [])

Calculate the variance.

Types

configuration()

configuration() :: keyword()

The optional configuration handed to a lot of functions.

Keys used are function dependent and are documented there.

mode()

mode() :: [sample()] | sample() | nil

Careful with the mode: it might be multiple values, one value, or nothing.😱 See mode/1.

percentiles()

percentiles() :: %{required(number()) => float()}

The percentiles map returned by percentiles/2.

sample()

sample() :: number()

A single sample.

samples()

samples() :: [sample(), ...]

The samples to compute statistics from.

Importantly, this list must not be empty: it has to include at least one sample, otherwise an ArgumentError is raised.

t()

t() :: %Statistex{
  average: float(),
  frequency_distribution: %{required(sample()) => pos_integer()},
  maximum: number(),
  median: number(),
  minimum: number(),
  mode: mode(),
  percentiles: percentiles(),
  sample_size: non_neg_integer(),
  standard_deviation: float(),
  standard_deviation_ratio: float(),
  total: number(),
  variance: float()
}

All the statistics statistics/1 computes from the samples.

For a description of what a given value means, see the function of the same name; it has an explanation.

Functions

average(samples, options \\ [])

average(samples(), keyword()) :: float()

Calculate the average.

It's... well, the average. When the given samples are empty there is no average.

ArgumentError is raised if the given list is empty.

Options

If you already have these values, you can provide :total and :sample_size (either one or both). Should you provide both, the provided samples are wholly ignored.
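The idea of reusing already computed values can be sketched in plain Elixir with Keyword.get_lazy/3, which only runs the fallback function when the key is missing. This is a minimal sketch under that assumption, not Statistex's actual implementation; AverageSketch is a hypothetical module and it omits the empty-list check.

```elixir
defmodule AverageSketch do
  # Hypothetical illustration of optional precomputed values.
  def average(samples, options \\ []) do
    # Each fallback runs only when the option was not supplied,
    # so `samples` is never touched when both options are given.
    total = Keyword.get_lazy(options, :total, fn -> Enum.sum(samples) end)
    sample_size = Keyword.get_lazy(options, :sample_size, fn -> length(samples) end)

    # `/` always returns a float in Elixir.
    total / sample_size
  end
end
```

Because both fallbacks are lazy, passing total: 100, sample_size: 5 works even when the first argument is not a list at all.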

Examples

iex> Statistex.average([5])
5.0

iex> Statistex.average([600, 470, 170, 430, 300])
394.0

iex> Statistex.average([-1, 1])
0.0

iex> Statistex.average([2, 3, 4], sample_size: 3)
3.0

iex> Statistex.average([20, 20, 20, 20, 20], total: 100, sample_size: 5)
20.0

iex> Statistex.average(:ignored, total: 100, sample_size: 5)
20.0

iex> Statistex.average([])
** (ArgumentError) Passed an empty list ([]) to calculate statistics from, please pass a list containing at least on number.

frequency_distribution(samples)

frequency_distribution(samples()) :: %{required(sample()) => pos_integer()}

A map showing which sample occurs how often in the samples.

Maps each distinct sample value to the number of times it was observed in the samples.

ArgumentError is raised if the given list is empty.

Examples

iex> Statistex.frequency_distribution([1, 2, 4.23, 7, 2, 99])
%{
  2 => 2,
  1 => 1,
  4.23 => 1,
  7 => 1,
  99 => 1
}

iex> Statistex.frequency_distribution([])
** (ArgumentError) Passed an empty list ([]) to calculate statistics from, please pass a list containing at least on number.

maximum(samples)

maximum(samples()) :: sample()

The biggest sample.

ArgumentError is raised if the given list is empty.

Examples

iex> Statistex.maximum([1, 100, 24])
100

iex> Statistex.maximum([])
** (ArgumentError) Passed an empty list ([]) to calculate statistics from, please pass a list containing at least on number.

median(samples, options \\ [])

median(samples(), keyword()) :: number()

Calculates the median of the given samples.

The median can be thought of as separating the higher half from the lower half of the samples. When all samples are sorted, this is the middle value (or the average of the two middle values when the number of samples is even). It is more stable than the average.

ArgumentError is raised if the given list is empty.
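The textbook definition above can be sketched as follows. MedianSketch is a hypothetical module written for illustration; Statistex may compute the median differently internally, and the sketch only accepts non-empty lists.

```elixir
defmodule MedianSketch do
  # Hypothetical illustration of the sorted-middle-value definition.
  def median([_ | _] = samples) do
    sorted = Enum.sort(samples)
    count = length(sorted)
    middle = div(count, 2)

    if rem(count, 2) == 1 do
      # Odd number of samples: the single middle value, as a float.
      Enum.at(sorted, middle) * 1.0
    else
      # Even number of samples: average of the two middle values.
      (Enum.at(sorted, middle - 1) + Enum.at(sorted, middle)) / 2
    end
  end
end
```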

Examples

iex> Statistex.median([1, 3, 4, 6, 7, 8, 9])
6.0

iex> Statistex.median([1, 2, 3, 4, 5, 6, 8, 9])
4.5

iex> Statistex.median([0])
0.0

iex> Statistex.median([])
** (ArgumentError) Passed an empty list ([]) to calculate statistics from, please pass a list containing at least on number.

minimum(samples)

minimum(samples()) :: sample()

The smallest sample.

ArgumentError is raised if the given list is empty.

Examples

iex> Statistex.minimum([1, 100, 24])
1

iex> Statistex.minimum([])
** (ArgumentError) Passed an empty list ([]) to calculate statistics from, please pass a list containing at least on number.

mode(samples, opts \\ [])

mode(samples(), keyword()) :: mode()

Calculates the mode of the given samples.

The mode is the sample (or samples) that occurs most often. Often it is one value, but it can be multiple values if they occur the same number of times. If no value occurs at least twice, there is no mode and nil is returned.

ArgumentError is raised if the given list is empty.
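The three possible outcomes (nil, one value, or a list of values) can be sketched like this. ModeSketch is a hypothetical module, not Statistex's implementation; it builds its own frequency map and leaves the order of multiple modes unspecified.

```elixir
defmodule ModeSketch do
  # Hypothetical illustration of the nil / single value / list outcomes.
  def mode([_ | _] = samples) do
    # Count how often each distinct sample occurs.
    frequencies =
      Enum.reduce(samples, %{}, fn sample, acc ->
        Map.update(acc, sample, 1, &(&1 + 1))
      end)

    max_count = frequencies |> Map.values() |> Enum.max()

    # Collect every sample that reaches the highest count.
    modes = for {sample, count} <- frequencies, count == max_count, do: sample

    case modes do
      # Nothing occurs at least twice: there is no mode.
      _ when max_count < 2 -> nil
      [single] -> single
      several -> several
    end
  end
end
```

Callers that need a uniform shape can normalize the result with List.wrap/1, which turns nil into [] and a single value into a one-element list.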

Options

If already calculated, the :frequency_distribution option can be provided to avoid recalculating it.

Examples

iex> Statistex.mode([5, 3, 4, 5, 1, 3, 1, 3])
3

iex> Statistex.mode([1, 2, 3, 4, 5])
nil

iex> Statistex.mode([])
** (ArgumentError) Passed an empty list ([]) to calculate statistics from, please pass a list containing at least on number.

iex> mode = Statistex.mode([5, 3, 4, 5, 1, 3, 1])
iex> Enum.sort(mode)
[1, 3, 5]

percentiles(samples, percentiles)

percentiles(samples(), number() | [number(), ...]) :: percentiles()

Calculates the value at the percentile_rank-th percentile.

Think of this as the value below which percentile_rank percent of the samples lie. For example, if Statistex.percentiles(samples, 99) returns %{99 => 123.45}, 99% of samples are less than 123.45.

Passing a single number calculates one percentile; passing a list of numbers calculates several. Either way the result is a map like %{90 => 45.6, 99 => 78.9}, where the keys are the requested percentiles and the values are the calculated percentile values.

Percentiles must be between 0 and 100 (excluding the boundaries).

The method used for interpolation is described here and recommended by NIST.
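As a rough illustration of rank-based interpolation, the sketch below assumes the method from the NIST handbook: compute the position rank / 100 * (n + 1) in the sorted samples, then interpolate linearly between the two neighbouring values, clamping to the smallest or largest sample at the ends. It reproduces the example values in this section but is not taken from Statistex's source; PercentileSketch and percentile/2 are hypothetical names.

```elixir
defmodule PercentileSketch do
  # Hypothetical illustration; returns one value, not a map.
  def percentile([_ | _] = samples, rank) when rank > 0 and rank < 100 do
    sorted = Enum.sort(samples)
    count = length(sorted)

    # 1-based position of the percentile within the sorted samples.
    position = rank / 100 * (count + 1)
    index = trunc(position)
    fraction = position - index

    cond do
      # Position falls before the first sample: use the minimum.
      index < 1 -> hd(sorted) * 1.0
      # Position falls at or after the last sample: use the maximum.
      index >= count -> List.last(sorted) * 1.0
      true ->
        lower = Enum.at(sorted, index - 1)
        upper = Enum.at(sorted, index)
        # Linear interpolation between the two neighbouring samples.
        lower + fraction * (upper - lower)
    end
  end
end
```

For the samples [5, 3, 4, 5, 1, 3, 1, 3] and rank 75, the position is 6.75, so the result lies three quarters of the way from the 6th sorted value (4) to the 7th (5).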

ArgumentError is raised if the given list is empty.

Examples

iex> Statistex.percentiles([5, 3, 4, 5, 1, 3, 1, 3], 12.5)
%{12.5 => 1.0}

iex> Statistex.percentiles([5, 3, 4, 5, 1, 3, 1, 3], [50])
%{50 => 3.0}

iex> Statistex.percentiles([5, 3, 4, 5, 1, 3, 1, 3], [75])
%{75 => 4.75}

iex> Statistex.percentiles([5, 3, 4, 5, 1, 3, 1, 3], 99)
%{99 => 5.0}

iex> Statistex.percentiles([5, 3, 4, 5, 1, 3, 1, 3], [50, 75, 99])
%{50 => 3.0, 75 => 4.75, 99 => 5.0}

iex> Statistex.percentiles([5, 3, 4, 5, 1, 3, 1, 3], 100)
** (ArgumentError) percentile must be between 0 and 100, got: 100

iex> Statistex.percentiles([5, 3, 4, 5, 1, 3, 1, 3], 0)
** (ArgumentError) percentile must be between 0 and 100, got: 0

iex> Statistex.percentiles([], [50])
** (ArgumentError) Passed an empty list ([]) to calculate statistics from, please pass a list containing at least on number.

sample_size(samples)

sample_size([sample()]) :: non_neg_integer()

Number of samples in the given list.

Nothing too fancy here: this just calls length(list) and is only provided for completeness' sake.

Examples

iex> Statistex.sample_size([])
0

iex> Statistex.sample_size([1, 1, 1, 1, 1])
5

standard_deviation(samples, options \\ [])

standard_deviation(samples(), keyword()) :: float()

Calculate the standard deviation.

A measure of how much the samples vary (the higher, the more the samples vary). It's the square root of the variance. Unlike the variance, its unit is the same as that of the samples (as calculating the variance includes squaring).

Options

If already calculated, the :variance option can be provided to avoid recalculating it.

ArgumentError is raised if the given list is empty.

Examples

iex> Statistex.standard_deviation([4, 9, 11, 12, 17, 5, 8, 12, 12])
4.0

iex> Statistex.standard_deviation([4, 9, 11, 12, 17, 5, 8, 12, 12], variance: 16.0)
4.0

iex> Statistex.standard_deviation([42])
0.0

iex> Statistex.standard_deviation([1, 1, 1, 1, 1, 1, 1])
0.0

iex> Statistex.standard_deviation([])
** (ArgumentError) Passed an empty list ([]) to calculate statistics from, please pass a list containing at least on number.

standard_deviation_ratio(samples, options \\ [])

standard_deviation_ratio(samples(), keyword()) :: float()

Calculate the standard deviation relative to the average.

This helps put the absolute standard deviation value into perspective by expressing it relative to the average. It is the standard deviation divided by the absolute value of the average, so 0.4 means the standard deviation is 40% of the average.

ArgumentError is raised if the given list is empty.
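The ratio itself is a one-line calculation, sketched below from an already known average and standard deviation. RatioSketch is a hypothetical module; returning 0.0 for a zero average is an assumption chosen to agree with the all-zero example shown for statistics/2, not a confirmed detail of Statistex.

```elixir
defmodule RatioSketch do
  # Hypothetical illustration of standard deviation relative to the average.
  def standard_deviation_ratio(average, standard_deviation) do
    if average == 0 do
      # Assumed behaviour: avoid dividing by zero.
      0.0
    else
      # abs/1 keeps the ratio positive for negative averages.
      standard_deviation / abs(average)
    end
  end
end
```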

Options

If already calculated, the :average and :standard_deviation options can be provided to avoid recalculating those values.

If both values are provided, the provided samples will be ignored.

Examples

iex> Statistex.standard_deviation_ratio([4, 9, 11, 12, 17, 5, 8, 12, 12])
0.4

iex> Statistex.standard_deviation_ratio([-4, -9, -11, -12, -17, -5, -8, -12, -12])
0.4

iex> Statistex.standard_deviation_ratio([4, 9, 11, 12, 17, 5, 8, 12, 12], average: 10.0, standard_deviation: 4.0)
0.4

iex> Statistex.standard_deviation_ratio(:ignored, average: 10.0, standard_deviation: 4.0)
0.4

iex> Statistex.standard_deviation_ratio([])
** (ArgumentError) Passed an empty list ([]) to calculate statistics from, please pass a list containing at least on number.

statistics(samples, configuration \\ [])

statistics(samples(), configuration()) :: t()

Calculate all statistics Statistex offers for a given list of numbers.

The statistics themselves are described in the individual functions that can be used to calculate individual values.

ArgumentError is raised if the given list is empty.

Options

The :percentiles option takes the percentiles to calculate (see percentiles/2). The 50th percentile is always calculated, as it is the median.

Examples

iex> Statistex.statistics([200, 400, 400, 400, 500, 500, 500, 700, 900])
%Statistex{
  average:                  500.0,
  variance:                 40_000.0,
  standard_deviation:       200.0,
  standard_deviation_ratio: 0.4,
  median:                   500.0,
  percentiles:              %{50 => 500.0},
  frequency_distribution:   %{
    200 => 1,
    400 => 3,
    500 => 3,
    700 => 1,
    900 => 1
  },
  mode:                     [500, 400],
  minimum:                  200,
  maximum:                  900,
  sample_size:              9,
  total:                    4500
}

iex> Statistex.statistics([])
** (ArgumentError) Passed an empty list ([]) to calculate statistics from, please pass a list containing at least on number.

iex> Statistex.statistics([0, 0, 0, 0])
%Statistex{
  average:                  0.0,
  variance:                 0.0,
  standard_deviation:       0.0,
  standard_deviation_ratio: 0.0,
  median:                   0.0,
  percentiles:              %{50 => 0.0},
  frequency_distribution:   %{0 => 4},
  mode:                     0,
  minimum:                  0,
  maximum:                  0,
  sample_size:              4,
  total:                    0
}

total(samples)

total(samples()) :: number()

The total of all samples added together.

ArgumentError is raised if the given list is empty.

Examples

iex> Statistex.total([1, 2, 3, 4, 5])
15

iex> Statistex.total([10, 10.5, 5])
25.5

iex> Statistex.total([-10, 5, 3, 2])
0

iex> Statistex.total([])
** (ArgumentError) Passed an empty list ([]) to calculate statistics from, please pass a list containing at least on number.

variance(samples, options \\ [])

variance(samples(), keyword()) :: float()

Calculate the variance.

A measure of how much the samples vary (the higher, the more the samples vary). This is the variance of a sample, so the calculation divides by sample_size - 1 (Bessel's correction).

ArgumentError is raised if the given list is empty.
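The sample variance with Bessel's correction, and the standard deviation derived from it, can be sketched in plain Elixir as below. VarianceSketch is a hypothetical module and not Statistex's implementation; returning 0.0 for a single sample is an assumption that matches the [42] example in this section.

```elixir
defmodule VarianceSketch do
  # Hypothetical illustration of sample variance (divides by n - 1).

  # A single sample has no spread; this also avoids dividing by zero.
  def variance([_single]), do: 0.0

  def variance([_, _ | _] = samples) do
    sample_size = length(samples)
    average = Enum.sum(samples) / sample_size

    # Sum of squared differences from the average.
    sum_of_squares =
      samples
      |> Enum.map(fn sample -> (sample - average) * (sample - average) end)
      |> Enum.sum()

    # Bessel's correction: divide by sample_size - 1, not sample_size.
    sum_of_squares / (sample_size - 1)
  end

  # The standard deviation is the square root of the variance.
  def standard_deviation(samples), do: :math.sqrt(variance(samples))
end
```

For [4, 9, 11, 12, 17, 5, 8, 12, 12] the average is 10.0 and the squared differences sum to 128, so the variance is 128 / 8 = 16.0 and the standard deviation is 4.0.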

Options

If already calculated, the :average and :sample_size options can be provided to avoid recalculating those values.

Examples

iex> Statistex.variance([4, 9, 11, 12, 17, 5, 8, 12, 12])
16.0

iex> Statistex.variance([4, 9, 11, 12, 17, 5, 8, 12, 12], sample_size: 9, average: 10.0)
16.0

iex> Statistex.variance([42])
0.0

iex> Statistex.variance([1, 1, 1, 1, 1, 1, 1])
0.0

iex> Statistex.variance([])
** (ArgumentError) Passed an empty list ([]) to calculate statistics from, please pass a list containing at least on number.
