Chi2fit.Statistics (Chi-SquaredFit v2.0.2)

Summary

Types

algorithm()

Algorithm used to assign errors to frequency data: Wald score and Wilson score.

cdf()

Cumulative Distribution Function

cullenfrey()

ecdf()

Binned data with error bounds specified through low and high values

range()

Functions

auto(list, opts \\ [nproc: 1])

Calculates the autocorrelation coefficient of a list of observations.

binerror(data, noise_fun, options \\ [])

Calculates the systematic errors for bins due to uncertainty in assigning data to bins.

bootstrap(total, data, fun, options \\ [])

Implements bootstrapping procedure as resampling with replacement.

convert_cdf(arg)

Converts a CDF function to a list of data points.

cullen_frey(sample, n \\ 100)

Generates a Cullen & Frey plot for the sample data.

cullen_frey_point(data)

Extracts data point with standard deviation from Cullen & Frey plot data.

empirical_cdf(data, bin \\ {1.0, 0.5}, algorithm \\ :wilson, correction \\ 0)

Generates an empirical Cumulative Distribution Function from sample data.

error(nauto, atom)

Calculates and returns the error associated with a list of observables.

get_cdf(data, binsize \\ {1.0, 0.5}, algorithm \\ :wilson, correction \\ 0)

Calculates the empirical CDF from a sample.

make_histogram(list, binsize \\ 1.0, offset \\ 0.0)

Converts a list of numbers to frequency data.

moment(sample, n)

Calculates the nth moment of the sample.

momentc(sample, n)

Calculates the nth centralized moment of the sample.

momentc(sample, n, mu)

Calculates the nth centralized moment of the sample.

momentn(sample, n)

Calculates the nth normalized moment of the sample.

momentn(sample, n, mu)

Calculates the nth normalized moment of the sample.

momentn(sample, n, mu, sigma)

Calculates the nth normalized moment of the sample.

puiseaux(list, result \\ [], flag \\ false)

Converts the input so that the result is a Puiseaux diagram, that is, a strictly convex shape.

resample(data, options)

Resamples the subsequences of numbers contained in the list as determined by analyze/2.

subexponential_stat(data, test \\ :sum, n \\ 2, binsize \\ {1, 0})

Calculates the test statistic for subexponentiality of a sample.

to_bins(data, binsize \\ {1.0, 0.5})

Converts raw data to binned data with (asymmetrical) errors.

Types

algorithm()

@type algorithm() :: :wilson | :wald

Algorithm used to assign errors to frequency data: Wald score and Wilson score.

cdf()

@type cdf() :: (number() -> {number(), number(), number()})

Cumulative Distribution Function

cullenfrey()

@type cullenfrey() :: [{squared_skewness :: float(), kurtosis :: float()} | nil]

ecdf()

@type ecdf() :: [{float(), float(), float(), float()}]

Binned data with error bounds specified through low and high values

range()

@type range() :: {float(), float()} | [float(), ...]

Functions

auto(list, opts \\ [nproc: 1])

@spec auto([number()], Keyword.t()) :: [number()]

Calculates the autocorrelation coefficient of a list of observations.

The implementation uses the discrete Fast Fourier Transform to calculate the autocorrelation. For available options see Chi2fit.FFT.fft/2. Returns a list of the autocorrelation coefficients.

Example

iex> auto [1,2,3]
[14.0, 7.999999999999999, 2.999999999999997]
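
As a cross-check on the example above, the same coefficients can be obtained by direct summation instead of an FFT. The sketch below is not the library's implementation; `AutocorrSketch` is a hypothetical module, and it assumes that the coefficients are the unnormalized sums of products at each lag, which is consistent with the example output up to rounding.

```elixir
defmodule AutocorrSketch do
  # Hypothetical helper, not part of Chi2fit.
  # Unnormalized autocorrelation by direct summation:
  #   c(k) = sum over i of x[i] * x[i + k],  for k = 0 .. n - 1
  def auto(list) when list != [] do
    n = length(list)

    for lag <- 0..(n - 1) do
      list
      |> Enum.zip(Enum.drop(list, lag))
      |> Enum.reduce(0.0, fn {a, b}, acc -> acc + a * b end)
    end
  end
end

AutocorrSketch.auto([1, 2, 3])
# => [14.0, 8.0, 3.0]
```

Direct summation costs O(n²) operations, which is why an FFT-based approach is preferable for long chains.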

binerror(data, noise_fun, options \\ [])

@spec binerror(
  data :: [number()],
  noise_fun :: (Enumerable.t() -> Enumerable.t()),
  options :: Keyword.t()
) :: [{bin :: number(), avg :: number(), error :: number()}]

Calculates the systematic errors for bins due to uncertainty in assigning data to bins.

Options

`bin` - the size of bins to use (defaults to 1)
`iterations` - the number of iterations to use to estimate the error due to noise (defaults to 100)

bootstrap(total, data, fun, options \\ [])

@spec bootstrap(
  total :: integer(),
  data :: [number()],
  fun :: ([number()], integer() -> number()),
  options :: Keyword.t()
) :: [any()]

Implements bootstrapping procedure as resampling with replacement.

It supports saving intermediate results to a file using :dets. Use the options :safe and :filename (see below).

Arguments:

`total` - Total number of resamplings to perform
`data` - The sample data
`fun` - The function to evaluate
`options` - A keyword list of options, see below.

Options

`:safe` - Whether to save intermediate results to a file, so that the run can be continued after an interruption.
      Valid values are `:safe` and `:cont`.
`:filename` - The filename to use for storing intermediate results
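
The core idea of resampling with replacement can be sketched in a few lines. This is a minimal illustration, not the library's code: `BootstrapSketch` is a hypothetical module, the persistence options (`:safe`, `:filename`) are left out, and passing a 1-based iteration index as the second argument of `fun` is an assumption based only on the shape of the spec.

```elixir
defmodule BootstrapSketch do
  # Hypothetical helper, not the Chi2fit implementation.
  # Draws `total` resamples, each the same size as `data` and drawn with
  # replacement, and evaluates `fun` on every resample. `fun` receives the
  # resample and the iteration index (assumed here to start at 1).
  def bootstrap(total, data, fun) when total > 0 and data != [] do
    size = length(data)

    for k <- 1..total do
      resample = for _ <- 1..size, do: Enum.random(data)
      fun.(resample, k)
    end
  end
end

# Bootstrap distribution of the sample mean
mean = fn xs, _k -> Enum.sum(xs) / length(xs) end
means = BootstrapSketch.bootstrap(200, [1, 2, 3, 4, 5, 6], mean)
```

The spread of `means` then gives an estimate of the uncertainty of the statistic computed by `fun`.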

convert_cdf(arg)

@spec convert_cdf({cdf(), range()}) :: [{float(), float(), float(), float()}]

Converts a CDF function to a list of data points.

Example

iex> convert_cdf {fn x->{:math.exp(-x),:math.exp(-x)/16,:math.exp(-x)/4} end, {1,4}}
[{1, 0.36787944117144233, 0.022992465073215146, 0.09196986029286058},
 {2, 0.1353352832366127, 0.008458455202288294, 0.033833820809153176},
 {3, 0.049787068367863944, 0.0031116917729914965, 0.012446767091965986},
 {4, 0.01831563888873418, 0.0011447274305458862, 0.004578909722183545}]
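
The output above is what one gets from evaluating the function at the points 1, 2, 3, 4 and prepending each point to the returned 3-tuple. A minimal sketch of that reading follows; `ConvertCdfSketch` is a hypothetical module, and the unit step between integer bounds is an assumption taken from the example rather than documented behaviour.

```elixir
defmodule ConvertCdfSketch do
  # Hypothetical helper, not the Chi2fit implementation.
  # Evaluates a cdf() at the integer points lo..hi and flattens each
  # {y, ylow, yhigh} result into {x, y, ylow, yhigh}.
  def convert({cdf, {lo, hi}}) when is_integer(lo) and is_integer(hi) and lo <= hi do
    for x <- lo..hi do
      {y, ylow, yhigh} = cdf.(x)
      {x, y, ylow, yhigh}
    end
  end
end

ConvertCdfSketch.convert({fn x -> {x * 1.0, x * 0.5, x * 2.0} end, {1, 3}})
# => [{1, 1.0, 0.5, 2.0}, {2, 2.0, 1.0, 4.0}, {3, 3.0, 1.5, 6.0}]
```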

cullen_frey(sample, n \\ 100)

@spec cullen_frey(sample :: [number()], n :: integer()) :: cullenfrey()

Generates a Cullen & Frey plot for the sample data.

The kurtosis returned is the 'excess kurtosis'.
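
For orientation, a single point of a Cullen & Frey plot is the pair (squared skewness, kurtosis). The sketch below computes that pair for one sample with the excess kurtosis convention mentioned above (normalized fourth moment minus 3). `CullenFreySketch` is a hypothetical module; the use of population (divide-by-N) moments is an assumption, and the role of the `n` argument of `cullen_frey/2` is not modelled here.

```elixir
defmodule CullenFreySketch do
  # Hypothetical helper, not the Chi2fit implementation.
  # nth standardized moment with population (divide-by-N) statistics.
  defp standardized(sample, n) do
    count = length(sample)
    mu = Enum.sum(sample) / count
    var = Enum.reduce(sample, 0.0, fn x, acc -> acc + :math.pow(x - mu, 2) end) / count
    mn = Enum.reduce(sample, 0.0, fn x, acc -> acc + :math.pow(x - mu, n) end) / count
    mn / :math.pow(var, n / 2)
  end

  # Returns {squared_skewness, excess_kurtosis} for one sample.
  def point(sample) do
    skew = standardized(sample, 3)
    {skew * skew, standardized(sample, 4) - 3.0}
  end
end

{skew2, kurt} = CullenFreySketch.point([1, 2, 3, 4, 5, 6])
# skew2 is 0.0 for this symmetric sample; kurt is about -1.27
```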

cullen_frey_point(data)

@spec cullen_frey_point(data :: cullenfrey()) ::
  {{x :: float(), dx :: float()}, {y :: float(), dy :: float()}}

Extracts data point with standard deviation from Cullen & Frey plot data.

empirical_cdf(data, bin \\ {1.0, 0.5}, algorithm \\ :wilson, correction \\ 0)

@spec empirical_cdf(
  [{float(), number()}],
  {number(), number()},
  algorithm(),
  integer()
) ::
  {cdf(), bins :: [float()], numbins :: pos_integer(), sum :: float()}

Generates an empirical Cumulative Distribution Function from sample data.

Three parameters determine the resulting empirical distribution:

algorithm for assigning errors,
the size of the bins,
a correction for limiting the bounds on the 'y' values

When, for example, task effort or duration is modeled, some measured tasks have 0 time. In practice this means that the task effort is between 0 and 1 hour. This is where binning of the data happens. Specify a bin size to control how this is done. A bin size of 1 means that 0 effort will be mapped to 1/2 effort (the middle of the bin). This also prevents problems when the fitted distribution cannot cope with an effort of zero.

Supports two ways of assigning errors: Wald score or Wilson score. See [1]. Valid values for the algorithm argument are :wald or :wilson.

In the handbook of MCMC [1] a cumulative distribution is constructed. For the largest 'x' value in the sample, the 'y' value is exactly one (1). In combination with the Wald score this gives zero errors on the value '1'. If the resulting distribution is used to fit a curve, this may give an infinite contribution to the maximum likelihood function. Use the correction number to obtain a 'y' value slightly less than 1 and prevent this from happening. In particular, the combination of 0 correction, algorithm :wald, and the 'linear' model for handling asymmetric errors gives problems.

The algorithm parameter determines how the errors on the 'y' value are determined. Currently supported values include :wald and :wilson.
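
To make the difference between the two scores concrete, the sketch below uses the textbook Wald and Wilson interval formulas for an observed proportion k/n. It is illustrative only: `ScoreIntervalSketch` is a hypothetical module, the default `z = 1.0` is an assumption, and the library may apply these formulas differently (for instance together with the correction argument).

```elixir
defmodule ScoreIntervalSketch do
  # Hypothetical helpers, not the Chi2fit implementation.
  # Both return {low, high} for the observed proportion p = k / n,
  # where z is the normal quantile that sets the confidence level.

  # Wald: symmetric interval p +/- z * sqrt(p * (1 - p) / n)
  def wald(k, n, z \\ 1.0) do
    p = k / n
    half = z * :math.sqrt(p * (1 - p) / n)
    {p - half, p + half}
  end

  # Wilson: recentred, asymmetric interval that stays inside [0, 1]
  def wilson(k, n, z \\ 1.0) do
    p = k / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * :math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    {center - half, center + half}
  end
end

ScoreIntervalSketch.wald(10, 10)
# => {1.0, 1.0}, a zero-width interval at y = 1
ScoreIntervalSketch.wilson(10, 10)
# lower bound is about 0.909, so the error is non-zero
```

The zero-width Wald interval at y = 1 is the case described above that can lead to an infinite contribution in a fit.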

References

[1] "Handbook of Monte Carlo Methods" by Kroese, Taimre, and Botev, section 8.4
[2] See https://en.wikipedia.org/wiki/Cumulative_frequency_analysis
[3] https://arxiv.org/pdf/1112.2593v3.pdf
[4] See https://en.wikipedia.org/wiki/Student%27s_t-distribution:
    90% confidence ==> t = 1.645 for many data points (> 120)
    70% confidence ==> t = 1.000

error(nauto, atom)

@spec error([{gamma :: number(), k :: pos_integer()}], :initial_sequence_method) ::
  {var :: number(), lag :: number()}

Calculates and returns the error associated with a list of observables.

Usually these are the result of a Markov Chain Monte Carlo simulation run.

The only supported method is the so-called Initial Sequence Method. See section 1.10.2 (Initial sequence method) of [1].

Input is a list of autocorrelation coefficients. This may be the output of auto/2.

References

[1] 'Handbook of Markov Chain Monte Carlo'

get_cdf(data, binsize \\ {1.0, 0.5}, algorithm \\ :wilson, correction \\ 0)

@spec get_cdf([number()], number() | {number(), number()}, algorithm(), integer()) ::
  {cdf(), bins :: [float()], numbins :: pos_integer(), sum :: float()}

Calculates the empirical CDF from a sample.

Convenience function that chains make_histogram/2 and empirical_cdf/3.

make_histogram(list, binsize \\ 1.0, offset \\ 0.0)

@spec make_histogram([number()], number(), number()) :: [
  {non_neg_integer(), pos_integer()}
]

Converts a list of numbers to frequency data.

The data is divided into bins of size binsize and the number of data points inside each bin is counted.

The function returns a list of 2-tuples. Each tuple contains the index of the bin and the number of items in that bin. Bin indices start at 1 in the following way:

[0..1) has index 1 (includes 0 and excludes 1),
[1..2) has index 2,
etc.

When an offset is used, the bin starting from the offset, i.e. [offset..offset+binsize), gets index 1. Values less than the offset are gathered in a bin with index 0.

Examples

iex> make_histogram [1,2,3]
[{2, 1}, {3, 1}, {4, 1}]

iex> make_histogram [1,2,3], 1.0, 0
[{2, 1}, {3, 1}, {4, 1}]

iex> make_histogram [1,2,3,4,5,6,5,4,3,4,5,6,7,8,9]
[{2, 1}, {3, 1}, {4, 2}, {5, 3}, {6, 3}, {7, 2}, {8, 1}, {9, 1}, {10, 1}]

iex> make_histogram [1,2,3,4,5,6,5,4,3,4,5,6,7,8,9], 3, 1.5
[{0, 1}, {1, 6}, {2, 6}, {3, 2}]

iex> make_histogram [0,0,0,1,3,4,3,2,6,7],1
[{1,3},{2,1},{3,1},{4,2},{5,1},{7,1},{8,1}]

iex> make_histogram [0,0,0,1,3,4,3,2,6,7],1,0.5
[{0,3},{1,1},{2,1},{3,2},{4,1},{6,1},{7,1}]
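
The indexing rule behind these examples can be written down compactly. The following sketch is not the library's code; `HistogramSketch` is a hypothetical module whose bin formula was inferred from the description and checked against the examples above.

```elixir
defmodule HistogramSketch do
  # Hypothetical helper, not the Chi2fit implementation.
  # Values below `offset` go to bin 0; otherwise bin k covers
  # [offset + (k - 1) * binsize, offset + k * binsize).
  def make_histogram(list, binsize \\ 1.0, offset \\ 0.0) do
    list
    |> Enum.frequencies_by(fn x ->
      if x < offset, do: 0, else: floor((x - offset) / binsize) + 1
    end)
    |> Enum.sort()
  end
end

HistogramSketch.make_histogram([1, 2, 3, 4, 5, 6, 5, 4, 3, 4, 5, 6, 7, 8, 9], 3, 1.5)
# => [{0, 1}, {1, 6}, {2, 6}, {3, 2}]
```

Empty bins do not appear in the result, which matches the gaps in the last two examples above.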

moment(sample, n)

@spec moment(sample :: [number()], n :: pos_integer()) :: float()

Calculates the nth moment of the sample.

Example

iex> moment [1,2,3,4,5,6], 1
3.5

momentc(sample, n)

@spec momentc(sample :: [number()], n :: pos_integer()) :: float()

Calculates the nth centralized moment of the sample.

Example

iex> momentc [1,2,3,4,5,6], 1
0.0

iex> momentc [1,2,3,4,5,6], 2
2.9166666666666665

momentc(sample, n, mu)

@spec momentc(sample :: [number()], n :: pos_integer(), mu :: float()) :: float()

Calculates the nth centralized moment of the sample.

Example

iex> momentc [1,2,3,4,5,6], 2, 3.5
2.9166666666666665

momentn(sample, n)

@spec momentn(sample :: [number()], n :: pos_integer()) :: float()

Calculates the nth normalized moment of the sample.

Example

iex> momentn [1,2,3,4,5,6], 1
0.0

iex> momentn [1,2,3,4,5,6], 2
1.0

iex> momentn [1,2,3,4,5,6], 4
1.7314285714285718

momentn(sample, n, mu)

@spec momentn(sample :: [number()], n :: pos_integer(), mu :: float()) :: float()

Calculates the nth normalized moment of the sample.

Example

iex> momentn [1,2,3,4,5,6], 4, 3.5
1.7314285714285718

momentn(sample, n, mu, sigma)

@spec momentn(
  sample :: [number()],
  n :: pos_integer(),
  mu :: float(),
  sigma :: float()
) :: float()

Calculates the nth normalized moment of the sample.
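
The relationship between the three families of moments can be summarized in a short sketch. `MomentSketch` is a hypothetical module, not the library's code; it assumes population (divide-by-N) averages, which agrees with the example values shown for `moment/2`, `momentc/2` and `momentn/2` up to floating-point rounding.

```elixir
defmodule MomentSketch do
  # Hypothetical helpers, not the Chi2fit implementation.

  # Raw moment: mean of x^n
  def moment(sample, n), do: mean(Enum.map(sample, fn x -> :math.pow(x, n) end))

  # Central moment: mean of (x - mu)^n, with mu the sample mean
  def momentc(sample, n) do
    mu = moment(sample, 1)
    mean(Enum.map(sample, fn x -> :math.pow(x - mu, n) end))
  end

  # Normalized moment: central moment divided by sigma^n,
  # with sigma the square root of the second central moment
  def momentn(sample, n) do
    sigma = :math.sqrt(momentc(sample, 2))
    momentc(sample, n) / :math.pow(sigma, n)
  end

  defp mean(list), do: Enum.sum(list) / length(list)
end

MomentSketch.moment([1, 2, 3, 4, 5, 6], 1)
# => 3.5
```

Passing `mu` or `sigma` explicitly, as the three- and four-argument variants above allow, avoids recomputing them when they are already known.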

puiseaux(list, result \\ [], flag \\ false)

@spec puiseaux([number()], [number()], boolean()) :: [number()]

Converts the input so that the result is a Puiseaux diagram, that is, a strictly convex shape.

Examples

iex> puiseaux [1]
[1]

iex> puiseaux [5,3,3,2]
[5, 3, 2.5, 2]

resample(data, options)

@spec resample(data :: [number()], options :: Keyword.t()) :: [number()]

Resamples the subsequences of numbers contained in the list as determined by analyze/2.

subexponential_stat(data, test \\ :sum, n \\ 2, binsize \\ {1, 0})

Calculates the test statistic for subexponentiality of a sample.

A value close to 0 is a strong indication that the sample shows subexponential behaviour (extremistan), i.e. is fat-tailed.

to_bins(data, binsize \\ {1.0, 0.5})

@spec to_bins(data :: [number()], binsize :: {number(), number()}) :: ecdf()

Converts raw data to binned data with (asymmetrical) errors.
