PropCheck.StateM behaviour (PropCheck v1.5.0)

This module defines the :proper_statem behaviour, useful for testing stateful reactive systems whose internal state and side-effects are specified via an abstract state machine. Given a callback module implementing the :proper_statem behaviour (i.e. defining an abstract state machine of the system under test), PropEr can generate random symbolic sequences of calls to that system.

As a next step, generated symbolic calls are actually performed, while monitoring the system's responses to ensure it behaves as expected. Upon failure, the shrinking mechanism attempts to find a minimal sequence of calls provoking the same error.

The role of commands

Test cases generated for testing a stateful system are lists of symbolic API calls to that system. Symbolic representation has several benefits, which are listed here in increasing order of importance:

Generated test cases are easier to read and understand.
Failing test cases are easier to shrink.
The generation phase is side-effect free and this results in repeatable test cases, which is essential for correct shrinking.

Since the actual results of symbolic calls are not known at generation time, we use symbolic variables of type symbolic_var/0 to refer to them. A command of type command/0 is a symbolic term, used to bind a symbolic variable to the result of a symbolic call. For example:

[{:set, {:var, 1}, {:call, :erlang, :put, [:a, 42]}},
{:set, {:var, 2}, {:call, :erlang, :erase, [:a]}},
{:set, {:var, 3}, {:call, :erlang, :put, [:b, {:var, 2}]}}]

is a command sequence that could be used to test the process dictionary. In this example, the first call stores the pair {:a, 42} in the process dictionary, while the second one deletes it. Then, a new pair {:b, {:var, 2}} is stored. {:var, 2} is a symbolic variable bound to the result of :erlang.erase/1. This result is not known at generation time, since none of these operations is performed at that time. After evaluating the command sequence at runtime, the process dictionary will eventually contain the pair {:b, 42}.

The abstract model-state

In order to be able to test impure code, we need a way to track its internal state (at least the useful part of it). To this end, we use an abstract state machine representing the possible configurations of the system under test. When referring to the model state, we mean the state of the abstract state machine. The model state can be either symbolic or dynamic:

During command generation, we use symbolic variables to bind the results of symbolic calls. Therefore, the model state might (and usually does) contain symbolic variables and/or symbolic calls, which are necessary to operate on symbolic variables. Thus, we refer to it as symbolic state. For example, assuming that the internal state of the process dictionary is modeled as a proplist, the model state after generating the previous command sequence will be [b: {:var, 2}}].
During runtime, symbolic calls are evaluated and symbolic variables are replaced by their corresponding real values. Now we refer to the state as dynamic state. After running the previous command sequence, the model state will be [b: 42].

The callback functions

The following functions must be exported from the callback module implementing the abstract state machine:

initial_state/0
command/1
precondition/2
postcondition/3
next_state/3

The property used

Each test consists of two phases:

As a first step, PropEr generates random symbolic command sequences deriving information from the callback module implementing the abstract state machine. This is the role of commands/1 generator.
As a second step, command sequences are executed so as to check that the system behaves as expected. This is the role of run_commands/2, a function that evaluates a symbolic command sequence according to an abstract state machine specification.

These two phases are encapsulated in the following property, which can be used for testing the process dictionary:

def prop_pdict() do
  forall cmds <- commands(__MODULE__) do
    {_history, _state, result} = run_commands(__MODULE__, cmds)
    cleanup()
    result == ok
  end
end

When testing impure code, it is very important to keep each test self-contained. For this reason, almost every property for testing stateful systems contains some clean-up code. Such code is necessary to put the system in a known state, so that the next test can be executed independently from previous ones.

Parallel testing

After ensuring that a system's behaviour can be described via an abstract state machine when commands are executed sequentially, it is possible to move to parallel testing. The same state machine can be used to generate command sequences that will be executed in parallel to test for race conditions. A parallel test case (parallel_testcase/0 ) consists of a sequential and a parallel component. The sequential component is a command sequence that is run first to put the system in a random state. The parallel component is a list containing 2 command sequences to be executed in parallel, each of them in a separate newly-spawned process.

Generating parallel test cases involves the following actions. Initially, we generate a command sequence deriving information from the abstract state machine specification, as in the case of sequential statem testing. Then, we parallelize a random suffix (up to 12 commands) of the initial sequence by splitting it into 2 subsequences that will be executed concurrently. Limitations arise from the fact that each subsequence should be a valid command sequence (i.e. all commands should satisfy preconditions and use only symbolic variables bound to the results of preceding calls in the same sequence). Furthermore, we apply an additional check: we have to ensure that preconditions are satisfied in all possible interleavings of the concurrent tasks. Otherwise, an exception might be raised during parallel execution and lead to unexpected (and unwanted) test failure. In case these constraints cannot be satisfied for a specific test case, the test case will be executed sequentially. Then an f is printed on screen to inform the user. This usually means that preconditions need to become less strict for parallel testing to work.

After running a parallel test case, PropEr uses the state machine specification to check if the results observed could have been produced by a possible serialization of the parallel component. If no such serialization is possible, then an atomicity violation has been detected. In this case, the shrinking mechanism attempts to produce a counterexample that is minimal in terms of concurrent operations. Properties for parallel testing are very similar to those used for sequential testing.

def prop_parallel_testing() do
   forall testcase <- parallel_commands(__MODULE__) do
      {_sequential, _parallel, result} = run_parallel_commands(__MODULE__, testcase),
      cleanup(),
      result == :ok
   end
end

Please note that the actual interleaving of commands of the parallel component depends on the Erlang scheduler, which is too deterministic. For PropEr to be able to detect race conditions, the code of the system under test should be instrumented with erlang:yield/0 calls to the scheduler.

Acknowledgments

Very much of the documentation is immediately taken from the proper API documentation.

Summary

Types

command()

A value of type command denotes the execution of a symbolic command and storing its result in a symbolic variable.

command_list()

A sequence of commands.

dynamic_state()

A dynamic state can be anything and appears only during phase 2.

history()

History of command execution in phase 2. It contains current dynamic state and the result of the call.

parallel_history()

The history of concurrent execution of commands in phase 2.

parallel_testcase()

A parallel testcase consists of a sequential and a parallel component. The sequential component is a command sequence that is run first to put the system in a random state. The parallel component is a list containing 2 command sequences to be executed in parallel, each of them in a separate newly-spawned process.

result()

The outcome of the command sequence execution.

symbolic_call()

A symbolic call is the typical mfa-tuple plus the tag :call.

symbolic_state()

A symbolic state can be anything and appears only during phase 1.

symbolic_var()

Each result of a symbolic call is stored in a symbolic variable. Their values are opaque and can only used as whole.

Callbacks

command(s)

Generates a symbolic call to be included in the command sequence, given the current state s of the abstract state machine.

initial_state()

Specifies the symbolic initial state of the state machine.

next_state(s, res, call)

Specifies the next state of the abstract state machine, given the current state s, the symbolic call chosen and its result res. This function is called both at command generation and command execution time in order to update the model state, therefore the state s and the result Res can be either symbolic or dynamic.

postcondition(s, call, res)

Specifies the postcondition that should hold about the result res of performing call, given the dynamic state s of the abstract state machine prior to command execution.

precondition(s, call)

Specifies the precondition that should hold so that call can be included in the command sequence, given the current state s of the abstract state machine.

Functions

command_names(cmds)

Extracts the names of the commands from a given command sequence, in the form of MFAs.

commands(mod)

A special PropEr type which generates random command sequences, according to an abstract state machine specification.

commands(mod, initial_state)

Similar to commands/1, but generated command sequences always start at a given state.

more_commands(n, cmd_type)

Increases the expected length of command sequences generated from cmd_type by a factor n.

parallel_commands(mod)

A special PropEr type which generates parallel test cases, according to an abstract state machine specification.

parallel_commands(mod, initial_state)

Similar to parallel_commands/1, but generated command sequences always start at a given state.

print_report(run_result, cmds, opts \\ [])

Print pretty report of the failed command run.

run_commands(mod, cmds)

Evaluates a given symbolic command sequence cmds according to the state machine specified in mod.

run_commands(mod, cmds, env)

Similar to run_commands/2, but also accepts an environment, used for symbolic variable evaluation during command execution. The environment consists of {key::atom, value::any} pairs. Keys may be used in symbolic variables (i.e. {:var, key}) within the command sequence cmds. These symbolic variables will be replaced by their corresponding value during command execution.

run_parallel_commands(mod, testcase)

Runs a given parallel test case according to the state machine specified in mod.

run_parallel_commands(mod, testcase, env)

Similar to run_parallel_commands/2, but also accepts an environment used for symbolic variable evaluation, exactly as described in run_commands/3.

state_after(mod, cmds)

Returns the symbolic state after running a given command sequence, according to the state machine specification found in mod.

zip(l1, l2)

Behaves exactly like Enum.zip/2.

Types

command()

@type command() :: {:set, symbolic_var(), symbolic_call()} | {:init, symbolic_state()}

A value of type command denotes the execution of a symbolic command and storing its result in a symbolic variable.

command_list()

@type command_list() :: [command()]

A sequence of commands.

dynamic_state()

@type dynamic_state() :: any()

A dynamic state can be anything and appears only during phase 2.

history()

@type history() :: [{dynamic_state(), term()}]

History of command execution in phase 2. It contains current dynamic state and the result of the call.

parallel_history()

@type parallel_history() :: [{command(), term()}]

The history of concurrent execution of commands in phase 2.

parallel_testcase()

@type parallel_testcase() :: {command_list(), [command_list()]}

result()

@type result() :: :proper_statem.statem_result()

The outcome of the command sequence execution.

symbolic_call()

@type symbolic_call() :: :proper_statem.symbolic_call()

A symbolic call is the typical mfa-tuple plus the tag :call.

symbolic_state()

@type symbolic_state() :: any()

A symbolic state can be anything and appears only during phase 1.

symbolic_var()

@type symbolic_var() :: :proper_statem.symbolic_var()

Each result of a symbolic call is stored in a symbolic variable. Their values are opaque and can only used as whole.

Callbacks

command(s)

@callback command(s :: symbolic_state()) :: PropCheck.BasicTypes.type()

Generates a symbolic call to be included in the command sequence, given the current state s of the abstract state machine.

However, before the call is actually included, a precondition is checked. This function will be repeatedly called to produce the next call to be included in the test case.

initial_state()

@callback initial_state() :: symbolic_state()

Specifies the symbolic initial state of the state machine.

This state will be evaluated at command execution time to produce the actual initial state. The function is not only called at command generation time, but also in order to initialize the state every time the command sequence is run (i.e. during normal execution, while shrinking and when checking a counterexample). For this reason, it should be deterministic and self-contained.

next_state(s, res, call)

@callback next_state(
  s :: symbolic_state() | dynamic_state(),
  res :: term(),
  call :: symbolic_call()
) ::
  symbolic_state() | dynamic_state()

postcondition(s, call, res)

@callback postcondition(s :: dynamic_state(), call :: symbolic_call(), res :: term()) ::
  boolean()

Specifies the postcondition that should hold about the result res of performing call, given the dynamic state s of the abstract state machine prior to command execution.

This function is called during runtime, this is why the state is dynamic.

precondition(s, call)

@callback precondition(s :: symbolic_state(), call :: symbolic_call()) :: boolean()

Specifies the precondition that should hold so that call can be included in the command sequence, given the current state s of the abstract state machine.

In case precondition doesn't hold, a new call is chosen using the command/1 generator. If preconditions are very strict, it will take a lot of tries for PropEr to randomly choose a valid command. Testing will be stopped in case the constraint_tries limit is reached (see the Options section in the {@link proper} module documentation). Preconditions are also important for correct shrinking of failing test cases. When shrinking command sequences, we try to eliminate commands that do not contribute to failure, ensuring that all preconditions still hold. Validating preconditions is necessary because during shrinking we usually attempt to perform a call with the system being in a state different from the state it was when initially running the test.

Functions

command_names(cmds)

Extracts the names of the commands from a given command sequence, in the form of MFAs.

It is useful in combination with functions such as PropCheck.aggregate/2 in order to collect statistics about command execution.

commands(mod)

A special PropEr type which generates random command sequences, according to an abstract state machine specification.

The function takes as input the name of a callback module, which contains the state machine specification. The initial state is computed by mod:initial_state/0.

commands(mod, initial_state)

Similar to commands/1, but generated command sequences always start at a given state.

In this case, the first command is always {:init, initial_state} and is used to correctly initialize the state every time the command sequence is run (i.e. during normal execution, while shrinking and when checking a counterexample). In this case, mod:initial_state/0 is never called.

more_commands(n, cmd_type)

Increases the expected length of command sequences generated from cmd_type by a factor n.

CAVEAT This function does not work properly. My current guess is that this is a limitation of how PropEr works with sizing an din particular resizing. The commands list generator (cmd_type) is not a simple list which can be sized easily, but a complex construct where the rather simple approach of resizing does not work as expected.

parallel_commands(mod)

A special PropEr type which generates parallel test cases, according to an abstract state machine specification.

The function takes as input the name of a callback module, which contains the state machine specification. The initial state is computed by mod:initial_state/0.

parallel_commands(mod, initial_state)

Similar to parallel_commands/1, but generated command sequences always start at a given state.

print_report(run_result, cmds, opts \\ [])

@spec print_report({history(), dynamic_state(), result()}, command_list(), keyword()) ::
  :ok

Print pretty report of the failed command run.

Note that it by default aliases every commands module, so only the last part of the namespace will be used.

Accepts options:

return_values - whether to print return values after each command run (default true),
last_state - whether section with the last state should be present (default true),
pre_cmd_state - whether to print state prior to executed command (default false),
post_cmd_state - whether to print state post executed command (default true),
cmd_args - whether to print command arguments as literals (default true),
inspect_opts - options passed to inspect/2
alias - set a list of modules and aliases, to be used to pretty print commands. Accepts a module/0, a tuple of type {module(), alias :: module()} or a list of them (default is to alias every command's module - if you want to disable default aliasing set it to []).

run_commands(mod, cmds)

Evaluates a given symbolic command sequence cmds according to the state machine specified in mod.

The result is a triple of the form {history, dynamic_state, result}, where:

history contains the execution history of all commands that were executed without raising an exception. It contains tuples of the form {t:dynamic_state/0, t:term/0}, specifying the state prior to command execution and the actual result of the command.
dynamicState contains the state of the abstract state machine at the moment when execution stopped. In case execution has stopped due to a false postcondition, dynamic_state corresponds to the state prior to execution of the last command.
result specifies the outcome of command execution. It can be classified in one of the following categories: <ul> <li> ok All commands were successfully run and all postconditions were true. <li> initialization error There was an error while evaluating the initial state. <li> postcondition error A postcondition was false or raised an exception. <li> precondition error A precondition was false or raised an exception. <li> exception An exception was raised while running a command. </ul>

run_commands(mod, cmds, env)

run_parallel_commands(mod, testcase)

Runs a given parallel test case according to the state machine specified in mod.

The result is a triple of the form {sequential_history, parallel_history, result}, where:

sequential_history contains the execution history of the sequential component.
Parallel_history contains the execution history of each of the concurrent tasks.
Result specifies the outcome of the attempt to serialize command execution, based on the results observed. It can be one of the following: <ul><li> ok <li> no_possible_interleaving </ul>

Since the scheduler of the runtime system is quite predictable, you are advised to add more turbulences in the execution traces by instrumenting the code under test via PropCheck.Instrument.

The history shown when the execution fails, does not reveal in which exact interleaving of commands the failure happens but simply shows the executed commands in both processes running concurrently. This is a feature of PropEr, which cannot easily changed.

run_parallel_commands(mod, testcase, env)

Similar to run_parallel_commands/2, but also accepts an environment used for symbolic variable evaluation, exactly as described in run_commands/3.

state_after(mod, cmds)

Returns the symbolic state after running a given command sequence, according to the state machine specification found in mod.

The commands are not actually executed.

zip(l1, l2)

Behaves exactly like Enum.zip/2.

Zipping stops when the shortest list stops. This is useful for zipping a command sequence with its (failing) execution history.