Needlepoint.Probability.FreqDist (Needlepoint v0.1.0)

A frequency distribution for the outcomes of an experiment.

It mostly adds helper functions for working with the output of Enum.frequencies.

Based on the NLTK FreqDist class, which is a subclass of Counter from Python's collections module. This kind of structure is sometimes called a bag or multiset.
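
As a rough illustration of the underlying idea, the counts held in `samples` can be reproduced with the standard library alone. This is a sketch in plain Elixir, not Needlepoint's own code, and the variable name `samples` is chosen here only to mirror the struct field.

```elixir
# Plain Elixir, no Needlepoint required.
# Split the string into graphemes and count how often each one occurs.
samples =
  "abracadabra"
  |> String.graphemes()
  |> Enum.frequencies()

IO.inspect(samples)
# => %{"a" => 5, "b" => 2, "c" => 1, "d" => 1, "r" => 2}
```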

Examples

iex> alias Needlepoint.Probability.FreqDist

iex> FreqDist.new()
%Needlepoint.Probability.FreqDist{samples: %{}}

iex> FreqDist.new("abracadabra")
%Needlepoint.Probability.FreqDist{
    samples: %{"a" => 5, "b" => 2, "c" => 1, "d" => 1, "r" => 2}
}

iex> FreqDist.new("ABCABC") |> FreqDist.elements() |> Enum.sort() |> Enum.join()
"AABBCC"

iex> FreqDist.new("abracadabra") |> FreqDist.most_common()
[{"a", 5}, {"r", 2}, {"b", 2}, {"d", 1}, {"c", 1}]

iex> FreqDist.new("abracadabra") |> FreqDist.most_common(3)
[{"a", 5}, {"r", 2}, {"b", 2}]

iex> FreqDist.new("abracadabra") |> FreqDist.update("simsalabim") |> FreqDist.most_common()
[
  {"a", 7},
  {"b", 3},
  {"s", 2},
  {"r", 2},
  {"m", 2},
  {"i", 2},
  {"l", 1},
  {"d", 1},
  {"c", 1}
]

iex> FreqDist.new("abracadabra") |> FreqDist.subtract("aaaaa")
%Needlepoint.Probability.FreqDist{
    samples: %{"a" => 0, "b" => 2, "c" => 1, "d" => 1, "r" => 2}
}

Summary

Functions

bins/1

Return the total number of sample values ("bins") that have counts greater than zero. Called B in NLTK.

elements/1

Iterate over elements, repeating each as many times as its count.

freq/2

Return the frequency of a given sample.

hapaxes/1

Return a list of all samples that occur once (hapax legomena).

intersection/2

Intersection is the minimum of corresponding counts.

max/1

Return the sample with the greatest number of outcomes.

most_common/1

List all counts from the most common to the least.

most_common/2

List n counts from the most common to the least.

n/1

Return the total number of sample outcomes that have been recorded.

new/0

Make a new empty FreqDist.

new/1

Make a new FreqDist from a string, list, map or another FreqDist.

r_nr/1

Return a map from r to Nr, the number of samples that occur r times, where Nr > 0.

subtract/2

Update the FreqDist fd by subtracting the counts of the given samples.

union/2

Union is the maximum of the corresponding counts in either input.

update/2

Update the FreqDist fd with the new set of samples.

Types

@type t() :: %Needlepoint.Probability.FreqDist{samples: map()}

Functions

Return the total number of sample values ("bins") that have counts greater than zero. Called B in NLTK.

Examples

iex> FreqDist.bins(FreqDist.new(%{"a" => 0, "b" => 1}))
1
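
One way to picture this count, assuming the samples are held in an ordinary map, is the following standard-library sketch. It is an illustration only and makes no claim about how the library computes the value.

```elixir
samples = %{"a" => 0, "b" => 1, "c" => 3}

# Count only the samples whose count is strictly positive.
bins = Enum.count(samples, fn {_sample, count} -> count > 0 end)

IO.inspect(bins)
# => 2
```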

Iterate over elements, repeating each as many times as its count.

Examples

iex> FreqDist.new("ABCABC") |> FreqDist.elements() |> Enum.sort()
["A", "A", "B", "B", "C", "C"]

# Knuth's example for prime factors of 1836:  2**2 * 3**3 * 17**1
iex> FreqDist.new(%{2 => 2, 3 => 3, 17 => 1}) |> FreqDist.elements() |> Enum.reduce(1, fn x, acc -> x * acc end)
1836

Note: if an element's count is zero or negative, elements() ignores it.
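
The expansion, including the handling of zero and negative counts, might be sketched as follows. `ElementsSketch` is a hypothetical module written for this illustration and operates on a bare map rather than on the FreqDist struct.

```elixir
defmodule ElementsSketch do
  # Expand a map of sample => count into a flat list of samples.
  # Entries with a zero or negative count contribute nothing.
  def elements(samples) when is_map(samples) do
    Enum.flat_map(samples, fn
      {sample, count} when count > 0 -> List.duplicate(sample, count)
      {_sample, _count} -> []
    end)
  end
end

IO.inspect(ElementsSketch.elements(%{"a" => 2, "b" => 0, "c" => -1}))
# => ["a", "a"]
```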

Return the frequency of a given sample.

The frequency of a sample is defined as the count of that sample divided by the total number of sample outcomes that have been recorded by this FreqDist. The count of a sample is defined as the number of times that sample outcome was recorded by this FreqDist.

Frequencies are always real numbers in the range [0, 1].

Examples

iex> FreqDist.freq(FreqDist.new(%{"a" => 0, "b" => 1, "c" => 2, "d" => 2}), "z")
0.0

iex> FreqDist.freq(FreqDist.new(%{"a" => 0, "b" => 1, "c" => 2, "d" => 2}), "a")
0.0

iex> FreqDist.freq(FreqDist.new(%{"a" => 0, "b" => 1, "c" => 2, "d" => 2}), "b")
0.2
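
The definition above (count divided by total outcomes) can be sketched in plain Elixir. `FreqSketch` is a hypothetical name, and returning 0.0 for an empty distribution is an assumption of this sketch rather than documented library behaviour.

```elixir
defmodule FreqSketch do
  # Relative frequency: the count of `sample` divided by the sum of all counts.
  # Unknown samples are treated as having a count of 0.
  def freq(samples, sample) when is_map(samples) do
    total = samples |> Map.values() |> Enum.sum()

    if total == 0 do
      # Assumption: avoid dividing by zero when nothing has been recorded.
      0.0
    else
      Map.get(samples, sample, 0) / total
    end
  end
end

counts = %{"a" => 0, "b" => 1, "c" => 2, "d" => 2}
IO.inspect(FreqSketch.freq(counts, "b"))
# => 0.2
```

Here the total is 0 + 1 + 2 + 2 = 5, so "b" has frequency 1 / 5.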

Return a list of all samples that occur once (hapax legomena).

Examples

iex> FreqDist.hapaxes(FreqDist.new(%{"a" => 0, "b" => 1, "c" => 2}))
["b"]
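
Selecting the samples with a count of exactly one could look like the following comprehension over a bare map. This is an illustrative sketch, not the library's implementation.

```elixir
samples = %{"a" => 0, "b" => 1, "c" => 2}

# The pattern {sample, 1} keeps only entries whose count is exactly 1.
hapaxes = for {sample, 1} <- samples, do: sample

IO.inspect(hapaxes)
# => ["b"]
```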

intersection(fd, samples)

Intersection is the minimum of corresponding counts.

Samples that appear in only one of the two inputs are dropped.

Examples

iex> FreqDist.new("abbb") |> FreqDist.intersection(FreqDist.new("bcc"))
%Needlepoint.Probability.FreqDist{samples: %{"b" => 1}}
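
A minimal sketch of the same rule on two bare maps, assuming both hold sample => count pairs. `IntersectionSketch` is a hypothetical module used only for this illustration.

```elixir
defmodule IntersectionSketch do
  # Keep only samples present in both maps, taking the smaller count.
  def intersection(a, b) when is_map(a) and is_map(b) do
    for {sample, count} <- a, Map.has_key?(b, sample), into: %{} do
      {sample, min(count, Map.fetch!(b, sample))}
    end
  end
end

left = %{"a" => 1, "b" => 3}
right = %{"b" => 1, "c" => 2}

IO.inspect(IntersectionSketch.intersection(left, right))
# => %{"b" => 1}
```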

Return the sample with the greatest number of outcomes.

If two or more samples have the same number of outcomes, return one of them; which sample is returned is undefined.

If no outcomes have occurred in this frequency distribution, return nil.
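
The selection rule can be sketched on a bare map as follows. `MaxSketch.max_sample/1` is a hypothetical helper, and treating only the empty map as "no outcomes" is an assumption of this sketch; the library may handle all-zero counts differently.

```elixir
defmodule MaxSketch do
  # No samples at all: there is nothing to return.
  def max_sample(samples) when map_size(samples) == 0, do: nil

  # Otherwise pick an entry with the largest count and return its sample.
  # When several samples tie, which one comes back is not specified.
  def max_sample(samples) when is_map(samples) do
    {sample, _count} = Enum.max_by(samples, fn {_sample, count} -> count end)
    sample
  end
end

IO.inspect(MaxSketch.max_sample(%{"a" => 0, "b" => 1, "c" => 2}))
# => "c"
```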

Examples

iex> FreqDist.max(FreqDist.new())
nil

iex> FreqDist.max(FreqDist.new(%{"a" => 0, "b" => 1, "c" => 2}))
"c"

List all counts from the most common to the least.

Examples

iex> alias Needlepoint.Probability.FreqDist
Needlepoint.Probability.FreqDist
iex> FreqDist.new("aabbbcccddddd") |> FreqDist.most_common()
[{"d", 5}, {"c", 3}, {"b", 3}, {"a", 2}]
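
An ordering like this can be approximated with a descending sort on the count. The sketch below uses a hypothetical `MostCommonSketch` module on a bare map; the relative order of samples with equal counts is not guaranteed to match the library's output, so the example uses distinct counts.

```elixir
defmodule MostCommonSketch do
  # All {sample, count} pairs, largest count first.
  def most_common(samples) when is_map(samples) do
    Enum.sort_by(samples, fn {_sample, count} -> count end, :desc)
  end

  # Only the first n pairs of that ordering.
  def most_common(samples, n) when is_map(samples) and is_integer(n) do
    samples |> most_common() |> Enum.take(n)
  end
end

counts = %{"a" => 1, "b" => 3, "c" => 2}

IO.inspect(MostCommonSketch.most_common(counts))
# => [{"b", 3}, {"c", 2}, {"a", 1}]
```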

List n counts from the most common to the least.

Examples

iex> alias Needlepoint.Probability.FreqDist
Needlepoint.Probability.FreqDist
iex> FreqDist.new("aabbbcccddddd") |> FreqDist.most_common(1)
[{"d", 5}]

Return the total number of sample outcomes that have been recorded.

Examples

iex> FreqDist.n(FreqDist.new("aabbccdd"))
8

Make a new empty FreqDist.

Make a new FreqDist from a string, list, map or another FreqDist.

Examples

iex> FreqDist.new("gallahad")
%Needlepoint.Probability.FreqDist{
    samples: %{"a" => 3, "d" => 1, "g" => 1, "h" => 1, "l" => 2}
}

iex> FreqDist.new(%{"a" => 4, "b" => 2})
%Needlepoint.Probability.FreqDist{samples: %{"a" => 4, "b" => 2}}

iex> FreqDist.new(["a","a","a","a","b","b"])
%Needlepoint.Probability.FreqDist{samples: %{"a" => 4, "b" => 2}}

Return a map from r to Nr, the number of samples that occur r times, where Nr > 0.

Examples

iex> FreqDist.r_nr(FreqDist.new(%{"a" => 0, "b" => 1, "c" => 2, "d" => 2}))
%{0 => 1, 1 => 1, 2 => 2}
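
Put differently, this is a frequency count of the counts themselves. A standard-library sketch on a bare map, not taken from the library source:

```elixir
samples = %{"a" => 0, "b" => 1, "c" => 2, "d" => 2}

# Count how many samples share each count value r.
r_nr = samples |> Map.values() |> Enum.frequencies()

IO.inspect(r_nr)
# => %{0 => 1, 1 => 1, 2 => 2}
```

One sample has count 0, one has count 1, and two have count 2.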

Update the FreqDist fd by subtracting the counts of the given samples.

Examples

iex> FreqDist.new("aaabbb") |> FreqDist.subtract(FreqDist.new("aba"))
%Needlepoint.Probability.FreqDist{samples: %{"a" => 1, "b" => 2}}
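
Count-wise subtraction might be sketched on bare maps as below. `SubtractSketch` is hypothetical, and giving a negative count to samples that appear only in the second map follows Python's Counter.subtract; whether Needlepoint does the same is an assumption.

```elixir
defmodule SubtractSketch do
  # Subtract each count in `b` from the matching count in `a`.
  # Entries that reach zero are kept, as in the module-level example.
  def subtract(a, b) when is_map(a) and is_map(b) do
    Enum.reduce(b, a, fn {sample, count}, acc ->
      # Assumption: a sample missing from `a` starts from 0 and goes negative.
      Map.update(acc, sample, -count, fn existing -> existing - count end)
    end)
  end
end

IO.inspect(SubtractSketch.subtract(%{"a" => 3, "b" => 3}, %{"a" => 2, "b" => 1}))
# => %{"a" => 1, "b" => 2}
```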

Union is the maximum of the corresponding counts in either input.

Examples

iex> FreqDist.new("abbb") |> FreqDist.union(FreqDist.new("bcc"))
%Needlepoint.Probability.FreqDist{samples: %{"a" => 1, "b" => 3, "c" => 2}}
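
On bare maps the same rule reduces to a merge that resolves conflicts with `max/2`. This is a sketch of the idea using the counts for "abbb" and "bcc", not the library's code.

```elixir
a = %{"a" => 1, "b" => 3}
b = %{"b" => 1, "c" => 2}

# Keys found in only one map are kept as they are;
# keys found in both keep the larger count.
union = Map.merge(a, b, fn _sample, x, y -> max(x, y) end)

IO.inspect(union)
# => %{"a" => 1, "b" => 3, "c" => 2}
```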

Update the FreqDist fd with the new set of samples.

Examples

iex> alias Needlepoint.Probability.FreqDist
Needlepoint.Probability.FreqDist
iex> FreqDist.new("aaa") |> FreqDist.update("bbb")
%Needlepoint.Probability.FreqDist{samples: %{"a" => 3, "b" => 3}}
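
Adding new samples amounts to counting them and summing the counts key by key. A standard-library sketch on bare maps, with `existing` and `incoming` as illustrative names:

```elixir
existing = %{"a" => 3, "b" => 1}

# Count the new samples the same way the originals were counted.
incoming = "bbb" |> String.graphemes() |> Enum.frequencies()

# Sum counts for samples present in both maps; keep the rest unchanged.
updated = Map.merge(existing, incoming, fn _sample, x, y -> x + y end)

IO.inspect(updated)
# => %{"a" => 3, "b" => 4}
```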