PropCheck.StateM.ModelDSL behaviour (PropCheck v1.4.1) View Source

This module provides a shallow DSL (domain specific language) in Elixir for property based testing of stateful systems. It's built upon PropCheck.StateM and all it's the characteristics apply here as well. It's a replacement for PropCheck.StateM.DSL.

The basic approach

Property based testing of stateful systems is different from ordinary property based testing. Instead of testing operations and their effects on the data structure directly, we construct a model of the system and generate a sequence of commands operating on both, the model and the system. Then we check that after each command step, the system has evolved accordingly to the model. This is the same idea which is used in model checking and is sometimes called a bisimulation.

After defining a model, we have two phases during executing the property. In phase 1, the generators create a list of (symbolic) commands including their parameters to be run against the system under test (SUT). A state machine guides the generation of commands.

In phase 2, the commands are executed and the state machine checks that the SUT is in the same state as the state machine. If an invalid state is detected, then the command sequence is shrunk towards a shorter sequence serving then as counterexamples.

This approach works exactly the same as with PropCheck.StateM and PropCheck.FSM. The main difference is the API, grouping pre- and postconditions, state transitions around the commands of the SUT. This leads towards more logical locality compared to the former implementations. QuickCheck EQC has a similar approach for structuring their modern state machines.

The DSL

A state machine acting as a model of the SUT can be defined by focusing on states or on transitions. We focus here on the transitions. A transition is a command calling the SUT. Therefore the main phrase of the DSL is the defcommand macro.

defcommand :find do
  # define the rules for executing the find command here
end

Inside the defcommand macro, we define all the rules which the command must obey. As an example, we discuss here as an example the slightly simplified command :find from test/cache_dsl_test.exs. The SUT is a cache implementation based on an ETS and the model is is based on a list of (key/value)-pairs. This example is derived from Fred Hebert's PropEr Testing, Chapter 9

The find-command is a call to the find/1 API function. Its arguments are generated in command_gen/1 (described later) callback, which for this command is using just one argument, a key() generator. Next, we need to define the execution of the command by defining function impl/n. The impl-function allows to apply conversion of parameters and return values to ease the testing. A typical example is the conversion of an {:ok, value} tuple to only value which can simplify working with value.

defcommand :find do
  def impl(key), do: Cache.find(key)
end

After defining how a command is executed, we need to define in which state this is allowed. For this, we define function pre/2, taking the model state and the generated list of arguments to check whether this call is allowed in the current model state. In this particular example, find is always allowed, hence we return true without any further checking. This is also the default implementation and the reason why the precondition is missing in the test file.

defcommand :find do
  def impl(key), do: Cache.find(key)
  def pre(_state, [_key]), do: true
end

If the precondition is satisfied, the call can happen. After the call, the SUT can be in a different state and the model state must be updated according to the mapping of the SUT to the model. The function next/3 takes the state before the call, the list of arguments and the symbolic or dynamic result (depending on phase 1 or 2, respectively). next/3 returns the new model state. Since searching for a key in the cache does not modify the system nor the model state, nothing has to be done. This is again the default implementation and thus left out in the test file.

defcommand :find do
  def impl(key), do: Cache.find(key)
  def pre(_state, [_key]), do: true
  def next(old_state, _args, call_result), do: old_state
end

The missing part of the command definition is the post condition, checking that after calling the system in phase 2, the system is in the expected state compared the model. This check is implemented in function post/3, which again has a trivial default implementation for post conditions that always returns true. In this example, we check if we can find the key in our list of entries and if we do, we check if call_result resulted in {:ok, val}. Or if we don't found it, we check if the SUT also cannot find it by comparing if call_result returned {:error, :not_found}.

defcommand :find do
  def impl(key), do: Cache.find(key)
  def pre(_state, [_key]), do: true
  def next(old_state, _args, _call_result), do: old_state
  def post(entries, [key], call_result) do
    case List.keyfind(entries, key, 0, false) do
        false       -> call_result == {:error, :not_found}
        {^key, val} -> call_result == {:ok, val}
    end
  end
end

This completes the DSL for command definitions.

Additional model elements

In addition to commands, we need to define the model itself. This is the ingenious part of stateful property based testing! The initial state of the model must be implemented as the function initial_state/0. It doesn't accept any arguments, because this function has to be deterministic. From this function, all model evolutions start. In our simplified cache example the initial model is an empty list:

def initial_state(), do: []

The sequence of commands to be run is generated repeatedly by command_gen/1 callback. The generator has to return a tuple of with a command name and a list of it's arguments (a list of generators). This callback expects the current state as an argument, which often is used to determine the next one from a set of appropriate commands (e.g. there might not be much sense in calling the delete_user command, if there are no users in the system yet). Usually a PropCheck.BasicTypes.oneof/1 or PropCheck.BasicTypes.frequency/1 generators are used to pick one of possible commands. In our cache example we want the find command to appear three times more often than other commands:

def command_gen(_state) do
  frequency([
    {3, {:find, [key()]}},
    {1, {:cache, [key(), val()]}},
    {1, {:flush, []}}
  ])

The property to test

The property to test the stateful system is more or less the same for all systems. We generate all commands via generator commands/1, which takes a module with callbacks as parameter. Inside the test, we first start the SUT, execute the commands with run_commands/1, stopping the SUT and evaluating the result of the executions as a boolean expression. This boolean expression can be adorned with further functions and macros to analyze the generated commands (via PropCheck.aggregate/2) or to inspect the history if a failure occurs (via PropCheck.when_fail/2). In the test cases, you find more examples of such adornments.

property "run the sequential cache", [:verbose] do
  forall cmds <- commands(__MODULE__) do
    Cache.start_link(@cache_size)
    {_history, _state, result} = run_commands(cmds)
    Cache.stop()
    (result == :ok)
  end
end

Increasing the Number of Commands in a Sequence

Sometimes issues can hide when the command sequences are short. In order to tease out these hidden bugs we can increase the number of commands generated by using the max_size option in our property.

  property "run the sequential cache", [max_size: 250] do
  forall cmds <- commands(__MODULE__) do
    Cache.start_link(@cache_size)
    {_history, _state, result} = run_commands(cmds)
    Cache.stop()
    (result == :ok)
  end

Link to this section Summary

Types

command()

A value of type command denotes the execution of a symbolic command and storing its result in a symbolic variable.

command_list()

A sequence of commands.

dynamic_state()

A dynamic state can be anything and appears only during phase 2.

history()

History of command execution in phase 2. It contains current dynamic state and the result of the call.

parallel_history()

The history of concurrent execution of commands in phase 2.

parallel_testcase()

A parallel testcase consists of a sequential and a parallel component. The sequential component is a command sequence that is run first to put the system in a random state. The parallel component is a list containing 2 command sequences to be executed in parallel, each of them in a separate newly-spawned process.

result()

The outcome of the command sequence execution.

symbolic_call()

A symbolic call is the typical mfa-tuple plus the tag :call.

symbolic_state()

A symbolic state can be anything and appears only during phase 1.

symbolic_var()

Each result of a symbolic call is stored in a symbolic variable. Their values are opaque and can only used as whole.

Functions

command_names(cmds)

Extracts the names of the commands from a given command sequence, in the form of MFAs.

commands(mod)

A special PropEr type which generates random command sequences, according to an abstract state machine specification.

commands(mod, initial_state)

Similar to commands/1, but generated command sequences always start at a given state.

def_commands()

def_next_states(commands)

def_postconds(commands)

def_preconds(commands)

defcommand(name, list)

Defines a new command of the model.

more_commands(n, cmd_type)

Increases the expected length of command sequences generated from cmd_type by a factor n.

parallel_commands(mod)

A special PropEr type which generates parallel test cases, according to an abstract state machine specification.

parallel_commands(mod, initial_state)

Similar to parallel_commands/1, but generated command sequences always start at a given state.

print_report(run_result, cmds, opts \\ [])

Print pretty report of the failed command run.

run_commands(mod, cmds)

Evaluates a given symbolic command sequence cmds according to the state machine specified in mod.

run_commands(mod, cmds, env)

Similar to run_commands/2, but also accepts an environment, used for symbolic variable evaluation during command execution. The environment consists of {key::atom, value::any} pairs. Keys may be used in symbolic variables (i.e. {:var, key}) within the command sequence cmds. These symbolic variables will be replaced by their corresponding value during command execution.

run_parallel_commands(mod, testcase)

Runs a given parallel test case according to the state machine specified in mod.

run_parallel_commands(mod, testcase, env)

Similar to run_parallel_commands/2, but also accepts an environment used for symbolic variable evaluation, exactly as described in run_commands/3.

state_after(mod, cmds)

Returns the symbolic state after running a given command sequence, according to the state machine specification found in mod.

zip(l1, l2)

Behaves exactly like Enum.zip/2.

Callbacks

command_gen(s)

Generates a symbolic call to be included in the command sequence, given the current state s of the abstract state machine. Must return a type that generates tuples {command_name :: atom, args :: [PropCheck.BasicTypes.type]}.

initial_state()

Specifies the symbolic initial state of the state machine.

Link to this section Types

command()

Specs

command() :: {:set, symbolic_var(), symbolic_call()} | {:init, symbolic_state()}

A value of type command denotes the execution of a symbolic command and storing its result in a symbolic variable.

command_list()

Specs

command_list() :: [command()]

A sequence of commands.

dynamic_state()

Specs

dynamic_state() :: any()

A dynamic state can be anything and appears only during phase 2.

history()

Specs

history() :: [{dynamic_state(), term()}]

History of command execution in phase 2. It contains current dynamic state and the result of the call.

parallel_history()

Specs

parallel_history() :: [{command(), term()}]

The history of concurrent execution of commands in phase 2.

parallel_testcase()

Specs

parallel_testcase() :: {command_list(), [command_list()]}

result()

Specs

result() :: :proper_statem.statem_result()

The outcome of the command sequence execution.

symbolic_call()

Specs

symbolic_call() :: :proper_statem.symbolic_call()

A symbolic call is the typical mfa-tuple plus the tag :call.

symbolic_state()

Specs

symbolic_state() :: any()

A symbolic state can be anything and appears only during phase 1.

symbolic_var()

Specs

symbolic_var() :: :proper_statem.symbolic_var()

Each result of a symbolic call is stored in a symbolic variable. Their values are opaque and can only used as whole.

Link to this section Functions

command_names(cmds)

Extracts the names of the commands from a given command sequence, in the form of MFAs.

It is useful in combination with functions such as PropCheck.aggregate/2 in order to collect statistics about command execution.

commands(mod)

A special PropEr type which generates random command sequences, according to an abstract state machine specification.

The function takes as input the name of a callback module, which contains the state machine specification. The initial state is computed by mod:initial_state/0.

commands(mod, initial_state)

Similar to commands/1, but generated command sequences always start at a given state.

In this case, the first command is always {:init, initial_state} and is used to correctly initialize the state every time the command sequence is run (i.e. during normal execution, while shrinking and when checking a counterexample). In this case, mod:initial_state/0 is never called.

def_commands()

def_next_states(commands)

def_postconds(commands)

def_preconds(commands)

defcommand(name, list)

(macro)

Defines a new command of the model.

Inside the command, local functions define

how the command is executed: impl(...) - this is required,
if the command is allowed in the current model state: pre(state, arg_list) :: boolean - this is true per default,
what the next state of the model is after the call: next(old_state, arg_list, result) :: new_state - the default implementation does not change the model state, sufficient for queries,
if the system under test is in the correct state after the call: post(old_state, arg_list, result) :: boolean - this is true in the default implementation.

These local functions inside the macro are effectively callbacks to guide and evolve the model state.

more_commands(n, cmd_type)

Increases the expected length of command sequences generated from cmd_type by a factor n.

CAVEAT This function does not work properly. My current guess is that this is a limitation of how PropEr works with sizing an din particular resizing. The commands list generator (cmd_type) is not a simple list which can be sized easily, but a complex construct where the rather simple approach of resizing does not work as expected.

parallel_commands(mod)

A special PropEr type which generates parallel test cases, according to an abstract state machine specification.

The function takes as input the name of a callback module, which contains the state machine specification. The initial state is computed by mod:initial_state/0.

parallel_commands(mod, initial_state)

Similar to parallel_commands/1, but generated command sequences always start at a given state.

print_report(run_result, cmds, opts \\ [])

Print pretty report of the failed command run.

Note that it by default aliases every commands module, so only the last part of the namespace will be used.

Accepts options:

return_values - whether to print return values after each command run (default true),
last_state - whether section with the last state should be present (default true),
pre_cmd_state - whether to print state prior to executed command (default false),
post_cmd_state - whether to print state post executed command (default true),
cmd_args - whether to print command arguments as literals (default true),
inspect_opts - options passed to inspect/2
alias - set a list of modules and aliases, to be used to pretty print commands. Accepts a module/0, a tuple of type {module(), alias :: module()} or a list of them (default is to alias every command's module - if you want to disable default aliasing set it to []).

run_commands(mod, cmds)

Evaluates a given symbolic command sequence cmds according to the state machine specified in mod.

The result is a triple of the form {history, dynamic_state, result}, where:

history contains the execution history of all commands that were executed without raising an exception. It contains tuples of the form {t:dynamic_state, t:term}, specifying the state prior to command execution and the actual result of the command.
dynamicState contains the state of the abstract state machine at the moment when execution stopped. In case execution has stopped due to a false postcondition, dynamic_state corresponds to the state prior to execution of the last command.
result specifies the outcome of command execution. It can be classified in one of the following categories: <ul> <li> ok All commands were successfully run and all postconditions were true. <li> initialization error There was an error while evaluating the initial state. <li> postcondition error A postcondition was false or raised an exception. <li> precondition error A precondition was false or raised an exception. <li> exception An exception was raised while running a command. </ul>

run_commands(mod, cmds, env)

run_parallel_commands(mod, testcase)

Runs a given parallel test case according to the state machine specified in mod.

The result is a triple of the form {sequential_history, parallel_history, result}, where:

sequential_history contains the execution history of the sequential component.
Parallel_history contains the execution history of each of the concurrent tasks.
Result specifies the outcome of the attempt to serialize command execution, based on the results observed. It can be one of the following: <ul><li> ok <li> no_possible_interleaving </ul>

Since the scheduler of the runtime system is quite predictable, you are advised to add more turbulences in the execution traces by instrumenting the code under test via PropCheck.Instrument.

The history shown when the execution fails, does not reveal in which exact interleaving of commands the failure happens but simply shows the executed commands in both processes running concurrently. This is a feature of PropEr, which cannot easily changed.

run_parallel_commands(mod, testcase, env)

Similar to run_parallel_commands/2, but also accepts an environment used for symbolic variable evaluation, exactly as described in run_commands/3.

state_after(mod, cmds)

Returns the symbolic state after running a given command sequence, according to the state machine specification found in mod.

The commands are not actually executed.

zip(l1, l2)

Behaves exactly like Enum.zip/2.

Zipping stops when the shortest list stops. This is useful for zipping a command sequence with its (failing) execution history.

Link to this section Callbacks

command_gen(s)

Specs

command_gen(s :: symbolic_state()) :: PropCheck.BasicTypes.type()

However, before the call is actually included, a precondition is checked. This function will be repeatedly called to produce the next call to be included in the test case.

initial_state()

Specs

initial_state() :: symbolic_state()

Specifies the symbolic initial state of the state machine.

This state will be evaluated at command execution time to produce the actual initial state. The function is not only called at command generation time, but also in order to initialize the state every time the command sequence is run (i.e. during normal execution, while shrinking and when checking a counterexample). For this reason, it should be deterministic and self-contained.