tinypp

tinypp is a tiny package for probabilistic programming. Your probabilistic program has to consist of a single block in which all your sample, condition, and query statements sit at the top level. Beyond that, you’re free to structure it however you like.

Tiny example:

import gleam/float
import gleam/int
import gleam/io
import tinypp.{pmf, normalize, sample, condition, query}
import tinypp/distribution.{uniform}

pub fn main() -> Nil {
  // What is the probability that a die shows a value greater than three if we
  // know that the value is even?
  let distribution_greater_three = {
    let die = uniform([1, 2, 3, 4, 5, 6])
    use value <- sample(die)
    use <- condition(int.is_even(value))
    query(value > 3)
  }
  let p_greater_three = pmf(normalize(distribution_greater_three), True)
  io.println("P(value > 3 | value is even) = " <> float.to_string(p_greater_three))
}

Types

Distribution is the central type of tinypp. It implements a discrete probability distribution by storing an explicit probability for each value in its support.

pub type Distribution(a) {
  Distribution(table: dict.Dict(a, Float))
}

Constructors

Distribution(table: dict.Dict(a, Float))

Values

pub fn condition(
  on predicate: Bool,
  do f: fn() -> Distribution(a),
) -> Distribution(a)

Conceptually condition on a predicate. This is where you put your observation/data.

Intended to be used with the use-syntax.

use x <- sample(distribution)
use <- condition(x > 5)

pub fn fail() -> Distribution(a)

A “pseudo-distribution” with empty support.
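As a minimal sketch of what an empty support means in practice, the snippet below builds the pseudo-distribution and queries it. The helper name `impossible` is hypothetical, and the code assumes that `pmf` and `support` behave on `fail()` exactly as their own descriptions on this page suggest.

```gleam
import gleam/float
import gleam/io
import tinypp.{type Distribution, fail, pmf, support}

// Hypothetical helper: the return annotation pins the otherwise
// unconstrained type parameter of `fail` to Int.
pub fn impossible() -> Distribution(Int) {
  fail()
}

pub fn main() -> Nil {
  // No value is stored in the table, so every lookup should report zero mass.
  io.println("P(1) = " <> float.to_string(pmf(impossible(), 1)))

  // Assumption: an empty support is reported as an empty list.
  assert support(impossible()) == []
  Nil
}
```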

pub fn mixture(
  weights: List(Float),
  components: List(Distribution(a)),
) -> Distribution(a)

Create a mixture distribution where the components are weighted by weights. Note that the returned distribution will, in general, not be normalized.
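A short usage sketch may help here. It assumes that the weights are matched to the components by position, and it uses weights that deliberately do not sum to one, so the result is passed through normalize before being queried. The scenario and the name `mixed_dice` are made up for illustration.

```gleam
import gleam/float
import gleam/io
import tinypp.{type Distribution, mixture, normalize, pmf}
import tinypp/distribution.{uniform}

// Hypothetical scenario: a four-sided die gets weight 1.0 and a six-sided
// die gets weight 3.0; we then look at the face that shows.
pub fn mixed_dice() -> Distribution(Int) {
  let d4 = uniform([1, 2, 3, 4])
  let d6 = uniform([1, 2, 3, 4, 5, 6])
  // The weights do not sum to one, so normalize the mixture explicitly.
  mixture([1.0, 3.0], [d4, d6])
  |> normalize
}

pub fn main() -> Nil {
  // Only the six-sided die can show a six.
  io.println("P(face = 6) = " <> float.to_string(pmf(mixed_dice(), 6)))
}
```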

pub fn normalize(
  distribution: Distribution(a),
) -> Distribution(a)

Scale the probabilities in the given distribution by a constant factor such that they add up to one.
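The sketch below shows where normalize typically comes in: after conditioning, the remaining hypotheses may no longer add up to one. It reuses the die from the introductory example; the function name `even_roll` is hypothetical.

```gleam
import gleam/float
import gleam/int
import gleam/io
import tinypp.{type Distribution, condition, normalize, pmf, query, sample}
import tinypp/distribution.{uniform}

// Distribution of a die roll given that the roll is even, not yet normalized.
pub fn even_roll() -> Distribution(Int) {
  use value <- sample(uniform([1, 2, 3, 4, 5, 6]))
  use <- condition(int.is_even(value))
  query(value)
}

pub fn main() -> Nil {
  let posterior = normalize(even_roll())
  // Three equally likely faces remain, so each should get a mass close to 1/3.
  io.println("P(value = 2 | even) = " <> float.to_string(pmf(posterior, 2)))
}
```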

pub fn pmf(distribution: Distribution(a), value: a) -> Float

Query a distribution for the probability mass of a certain value. For values not explicitly stored in the distribution’s table, returns zero.

let coin = uniform([1, 2])
pmf(coin, 1) // -> 0.5
pmf(coin, 2) // -> 0.5
pmf(coin, 3) // -> 0.0

pub fn query(value: a) -> Distribution(a)

Conceptually query for the distribution of a certain value. Put this at the end of your probabilistic program (that is, the block in which you have all your sample and condition statements) to specify what you want to know the distribution of.

let my_posterior = {
  use x <- sample(distribution)
  use <- condition(x > 5)
  query(x == 3)
  // or: query(x)
  // or: query(#(x, 2 * x))
  // etc.
}

pub fn sample(
  from distribution: Distribution(a),
  do f: fn(a) -> Distribution(b),
) -> Distribution(b)

Conceptually sample from a distribution. This does not involve any random number generation; instead, it marginalizes over the variable. The distinction does not matter for how you use it, but for runtime considerations keep in mind that sample runs the given function once for every hypothesis in the given distribution.
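To make the runtime remark concrete, here is a small sketch with two nested sample statements. Each one enumerates six hypotheses, so the body is evaluated for 36 combinations in total. The name `two_dice_total` is hypothetical, and normalize is applied defensively in case the block's result is not already normalized.

```gleam
import gleam/float
import gleam/io
import tinypp.{type Distribution, normalize, pmf, query, sample}
import tinypp/distribution.{uniform}

// Distribution of the total shown by two fair six-sided dice.
pub fn two_dice_total() -> Distribution(Int) {
  let die = uniform([1, 2, 3, 4, 5, 6])
  // Every `sample` enumerates all six faces, so the code below runs for
  // each of the 6 * 6 = 36 combinations of `first` and `second`.
  use first <- sample(from: die)
  use second <- sample(from: die)
  query(first + second)
}

pub fn main() -> Nil {
  let total = normalize(two_dice_total())
  // 6 of the 36 equally likely combinations add up to seven (about 0.167).
  io.println("P(total = 7) = " <> float.to_string(pmf(total, 7)))
}
```

Adding a third die would multiply the work by six again, which is worth keeping in mind for larger models.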

Intended to be used with the use-syntax.

use x <- sample(from: distribution)
use y <- sample(from: other_distribution)

pub fn scale(
  distribution: Distribution(a),
  factor: Float,
) -> Distribution(a)

Scale all probabilities in the given distribution by a constant factor.
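As a small illustration, the sketch below halves the probabilities of a fair coin and then restores them with normalize. The function name `halved_coin` is hypothetical, and the values in the comments follow from the documented result of pmf on uniform([1, 2]).

```gleam
import gleam/float
import gleam/io
import tinypp.{type Distribution, normalize, pmf, scale}
import tinypp/distribution.{uniform}

// A fair coin whose probabilities have all been multiplied by 0.5.
pub fn halved_coin() -> Distribution(Int) {
  scale(uniform([1, 2]), 0.5)
}

pub fn main() -> Nil {
  // Each outcome started at 0.5, so the scaled mass should be 0.25.
  io.println(float.to_string(pmf(halved_coin(), 1)))

  // Normalizing rescales the two outcomes so that they add up to one again.
  io.println(float.to_string(pmf(normalize(halved_coin()), 1)))
}
```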

pub fn show_table(
  distribution: Distribution(a),
  top number: Int,
) -> Nil

Display the most probable hypotheses of the given distribution, up to the number given as top. Fewer are displayed if the support has fewer than top elements.

pub fn singleton(value: a) -> Distribution(a)

Create a distribution that has all probability mass concentrated on a single value.

let d = singleton(1)
pmf(d, 1) // -> 1.0
pmf(d, 2) // -> 0.0

pub fn support(distribution: Distribution(a)) -> List(a)

Find the support of the given distribution, i.e. the largest list s such that

let s = support(distribution)
assert list.all(s, fn(value) { pmf(distribution, value) >. 0.0 })

pub fn to_list(
  distribution: Distribution(a),
) -> List(#(a, Float))

Similarly to dict.to_list, return the distribution as a list of tuples where each element is a hypothesis and its probability.
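One possible use of the list form is to fold over the hypotheses with the standard library. The sketch below sums all probabilities, which is a quick way to check whether a distribution is normalized. The helper `total_mass` is hypothetical and not part of tinypp, and the code makes no assumption about the order in which the tuples are returned.

```gleam
import gleam/float
import gleam/io
import gleam/list
import tinypp.{type Distribution, to_list}
import tinypp/distribution.{uniform}

// Hypothetical helper: add up the probabilities of all hypotheses.
pub fn total_mass(distribution: Distribution(a)) -> Float {
  to_list(distribution)
  |> list.fold(0.0, fn(total, entry) {
    // Each entry pairs a hypothesis with its probability.
    let #(_hypothesis, probability) = entry
    total +. probability
  })
}

pub fn main() -> Nil {
  let die = uniform([1, 2, 3, 4, 5, 6])
  // For a normalized distribution the total should be close to 1.0.
  io.println("Total mass: " <> float.to_string(total_mass(die)))
}
```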
