Benchee.Statistics (Benchee v1.5.0)

Statistics-related functionality that takes the raw benchmark data and computes statistics such as the average and the standard deviation.

See statistics/1 for a breakdown of the included statistics.

Types

mode()

@type mode() :: [number()] | number() | nil

Careful with the mode: it might be multiple values (a list), one value, or nothing (nil). 😱
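Because the mode can take three shapes, code that consumes it has to handle each one. The following is a minimal sketch of one way to do that; `ModeExample` and `mode_to_list/1` are hypothetical names used for illustration and are not part of Benchee.

```elixir
defmodule ModeExample do
  # Hypothetical helper (not part of Benchee): normalizes any mode() value
  # into a list so callers only have to deal with one shape.
  @spec mode_to_list([number()] | number() | nil) :: [number()]
  def mode_to_list(nil), do: []
  def mode_to_list(modes) when is_list(modes), do: modes
  def mode_to_list(mode) when is_number(mode), do: [mode]
end

ModeExample.mode_to_list(nil)         # no mode at all
ModeExample.mode_to_list(500)         # a single most frequent value
ModeExample.mode_to_list([500, 400])  # several values tied for most frequent
```

Normalizing to a list is only one possible design; pattern matching on the three cases directly at the call site works just as well.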

samples()

@type samples() :: [number()]

The samples gathered by a Benchee.Collect collector, from which the statistics are computed.

t()

@type t() :: %Benchee.Statistics{
  absolute_difference: float() | nil,
  average: float(),
  ips: float() | nil,
  lower_outlier_bound: number(),
  maximum: number(),
  median: number(),
  minimum: number(),
  mode: mode(),
  outliers: [number()],
  percentiles: %{required(number()) => float()},
  relative_less: float() | nil | :infinity,
  relative_more: float() | nil | :infinity,
  sample_size: integer(),
  std_dev: float(),
  std_dev_ips: float() | nil,
  std_dev_ratio: float(),
  upper_outlier_bound: number()
}

All the statistics statistics/1 computes from the samples.

This is used for run times, memory and reductions. Generally with these, the lower the better (less run time, memory consumption or reductions).

These values mostly correspond to their cousins in Statistex.

Overview of all the statistics Benchee currently provides:

  • average - average of all the samples (the lower the better)
  • ips - iterations per second, how often can the given function be executed within one second, used only for run times (the higher the better)
  • std_dev - standard deviation, how much results vary among the samples (the higher the more the results vary)
  • std_dev_ratio - standard deviation expressed as a fraction of the average (e.g. a standard deviation of 200 with an average of 500 gives 0.4)
  • std_dev_ips - the absolute standard deviation of iterations per second
  • median - when all samples are sorted, this is the middle value (or the average of the two middle values when the number of samples is even). More stable than the average and somewhat more likely to be a typical value you see.
  • percentiles - a map of percentile ranks. These are the values below which x% of the samples lie. For example, the 99th percentile is the value below which 99% of the samples lie.
  • mode - the samples that occur the most. Often one value, but can be multiple values if they occur the same number of times. If no value occurs at least twice, this value will be nil.
  • minimum - the smallest sample measured for the scenario
  • maximum - the biggest sample measured for the scenario
  • relative_more - how much more the average of this scenario is relative to the reference (usually the fastest scenario). E.g. with the reference at 100 and this scenario at 200, it is 2.0.
  • relative_less - how much less the average of this scenario is relative to the reference (usually the fastest scenario). E.g. with the reference at 100 and this scenario at 200, it is 0.5.
  • absolute_difference - the difference between the average of this scenario and the average of the reference (usually the fastest scenario). E.g. with the reference at 100 and this scenario at 200, it is 100.
  • sample_size - the number of measurements/samples taken into account for calculating statistics
  • outliers - if outlier exclusion was enabled, the samples that were found to be outliers (possibly none); an empty list otherwise
  • lower_outlier_bound - value below which samples are considered outliers
  • upper_outlier_bound - value above which samples are considered outliers
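To make the basic definitions above concrete, here is a minimal sketch that recomputes a few of them in plain Elixir. It is not Benchee's implementation (the text above says the values mostly correspond to Statistex). `StatsSketch` is a hypothetical module, and two assumptions are made: the standard deviation is the sample standard deviation (dividing by n - 1), and run-time samples are in nanoseconds. Both assumptions are consistent with the values shown in the Examples section.

```elixir
defmodule StatsSketch do
  # Illustrative only; hypothetical module, not part of Benchee or Statistex.

  def average(samples), do: Enum.sum(samples) / length(samples)

  # Assumption: sample standard deviation (divides by n - 1).
  def std_dev(samples) do
    avg = average(samples)
    sum_of_squares = samples |> Enum.map(&:math.pow(&1 - avg, 2)) |> Enum.sum()
    :math.sqrt(sum_of_squares / (length(samples) - 1))
  end

  # Middle value, or the average of the two middle values for an even count.
  def median(samples) do
    sorted = Enum.sort(samples)
    n = length(sorted)
    mid = div(n, 2)

    if rem(n, 2) == 1 do
      Enum.at(sorted, mid)
    else
      (Enum.at(sorted, mid - 1) + Enum.at(sorted, mid)) / 2
    end
  end

  # Assumption: the average is a run time in nanoseconds.
  def ips(average_ns), do: 1_000_000_000 / average_ns
end

samples = [200, 400, 400, 400, 500, 500, 500, 700, 900]

average = StatsSketch.average(samples)
std_dev = StatsSketch.std_dev(samples)
std_dev_ratio = std_dev / average
ips = StatsSketch.ips(average)
# The absolute deviation of ips is taken here as ips scaled by the ratio.
std_dev_ips = ips * std_dev_ratio
median = StatsSketch.median(samples)
```

For these samples the average works out to 500, the standard deviation to 200, the ratio to 0.4, and the iterations per second to two million.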

Functions

statistics(suite, printer \\ ProgressPrinter)

Takes a suite with scenarios and their data samples, adds the statistics to the scenarios. For an overview of what the statistics mean see t/0.

Note that this also sorts the scenarios from fastest to slowest to ensure a consistent order of scenarios in all used formatters.
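The ordering and the comparison against a reference can be sketched as follows. This is an illustration under stated assumptions, not Benchee's code: the scenarios are plain maps with hypothetical names, "fastest" is taken to mean the lowest average, and the reference is simply compared with itself rather than being special-cased. A zero average is not handled here.

```elixir
# Hypothetical data: plain maps standing in for scenarios and their averages.
scenarios = [
  %{name: "slow", average: 200.0},
  %{name: "fast", average: 100.0}
]

# Sort fastest (lowest average) first; the first entry acts as the reference.
[reference | _] = sorted = Enum.sort_by(scenarios, & &1.average)

comparison =
  Enum.map(sorted, fn scenario ->
    %{
      name: scenario.name,
      # How many times more than the reference this scenario's average is.
      relative_more: scenario.average / reference.average,
      # The inverse view: the reference's average as a fraction of this one.
      relative_less: reference.average / scenario.average,
      absolute_difference: scenario.average - reference.average
    }
  end)
```

With the reference at 100 and the other scenario at 200, the second entry ends up with a relative_more of 2.0, a relative_less of 0.5 and an absolute_difference of 100.0, matching the descriptions in t/0.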

Examples

iex> scenarios = [
...>   %Benchee.Scenario{
...>     job_name: "My Job",
...>     run_time_data: %Benchee.CollectionData{
...>       samples: [200, 400, 400, 400, 500, 500, 500, 700, 900]
...>     },
...>     memory_usage_data: %Benchee.CollectionData{
...>       samples: [200, 400, 400, 400, 500, 500, 500, 700, 900]
...>     },
...>     input_name: "Input",
...>     input: "Input"
...>   }
...> ]
...>
...> suite = %Benchee.Suite{scenarios: scenarios}
...> statistics(suite, Benchee.Test.FakeProgressPrinter)
%Benchee.Suite{
  scenarios: [
    %Benchee.Scenario{
      job_name: "My Job",
      input_name: "Input",
      input: "Input",
      run_time_data: %Benchee.CollectionData{
        samples: [200, 400, 400, 400, 500, 500, 500, 700, 900],
        statistics: %Benchee.Statistics{
          average: 500.0,
          ips: 2_000_000.0,
          std_dev: 200.0,
          std_dev_ratio: 0.4,
          std_dev_ips: 800_000.0,
          median: 500.0,
          percentiles: %{25 => 400.0, 50 => 500.0, 75 => 600.0, 99 => 900.0},
          mode: [500, 400],
          minimum: 200,
          maximum: 900,
          sample_size: 9,
          outliers: [],
          lower_outlier_bound: 100.0,
          upper_outlier_bound: 900.0
        }
      },
      memory_usage_data: %Benchee.CollectionData{
        samples: [200, 400, 400, 400, 500, 500, 500, 700, 900],
        statistics: %Benchee.Statistics{
          average: 500.0,
          ips: nil,
          std_dev: 200.0,
          std_dev_ratio: 0.4,
          std_dev_ips: nil,
          median: 500.0,
          percentiles: %{25 => 400.0, 50 => 500.0, 75 => 600.0, 99 => 900.0},
          mode: [500, 400],
          minimum: 200,
          maximum: 900,
          sample_size: 9,
          outliers: [],
          lower_outlier_bound: 100.0,
          upper_outlier_bound: 900.0
        }
      }
    }
  ],
  system: nil
}
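The percentile and outlier-bound values in the output above can be reproduced by hand. The sketch below assumes linear interpolation at rank p / 100 * (n + 1) and outlier bounds at 1.5 times the interquartile range. Both choices are assumptions made because they match the numbers shown; the module name `PercentileSketch` is hypothetical and this is not the library's code.

```elixir
defmodule PercentileSketch do
  # Illustrative only; hypothetical module, not part of Benchee or Statistex.

  # Assumption: linear interpolation at rank p / 100 * (n + 1),
  # clamped to the smallest and largest sample.
  def percentile(samples, p) do
    sorted = Enum.sort(samples)
    n = length(sorted)
    rank = p / 100 * (n + 1)
    lower = trunc(rank)
    fraction = rank - lower

    cond do
      lower < 1 ->
        hd(sorted) * 1.0

      lower >= n ->
        List.last(sorted) * 1.0

      true ->
        below = Enum.at(sorted, lower - 1)
        above = Enum.at(sorted, lower)
        below + fraction * (above - below)
    end
  end

  # Assumption: bounds sit 1.5 interquartile ranges outside the quartiles.
  def outlier_bounds(samples) do
    q1 = percentile(samples, 25)
    q3 = percentile(samples, 75)
    iqr = q3 - q1
    {q1 - 1.5 * iqr, q3 + 1.5 * iqr}
  end
end

samples = [200, 400, 400, 400, 500, 500, 500, 700, 900]

percentiles = Map.new([25, 50, 75, 99], &{&1, PercentileSketch.percentile(samples, &1)})
{lower_bound, upper_bound} = PercentileSketch.outlier_bounds(samples)
```

With nine samples the 25th percentile falls at rank 2.5 (between two samples of 400) and the 75th at rank 7.5 (halfway between 500 and 700, so 600). The interquartile range is therefore 200, which puts the bounds at 100 and 900.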