Markov (markov v4.1.3)

Public API

Example workflow:

# The model will be stored under this path
{:ok, model} = Markov.load("./model_path", sanitize_tokens: true, store_log: [:train])

# train using four strings
:ok = Markov.train(model, "hello, world!")
:ok = Markov.train(model, "example string number two")
:ok = Markov.train(model, "hello, Elixir!")
:ok = Markov.train(model, "fourth string")

# generate text
{:ok, text} = Markov.generate_text(model)
IO.puts(text)

# commit all changes and unload
Markov.unload(model)

# these will return errors because the model is unloaded
# Markov.generate_text(model)
# Markov.train(model, "hello, world!")

# load the model again
{:ok, model} = Markov.load("./model_path")

# enable probability shifting and generate text
:ok = Markov.configure(model, shift_probabilities: true)
{:ok, text} = Markov.generate_text(model)
IO.puts(text)

# print uninteresting stats
model |> Markov.dump_partition(0) |> IO.inspect()
model |> Markov.read_log() |> IO.inspect()

# this will also write our new just-set option
Markov.unload(model)

Summary

Types

model_option()

Model options that can be set at creation in a call to load/2 or later with configure/2

tag_query()

If data was tagged when training, you can use tag queries to alter the probabilities of certain generation paths

Functions

configure(model, opts)

Reconfigures a loaded model. See model_option/0 for a thorough description of the options

generate_text(model, tag_query \\ %{})

Generates a string. Will raise an exception if the model was trained on non-textual tokens at least once

generate_tokens(model, tag_query \\ %{})

Generates a list of tokens

get_config(model)

Gets the configuration of a loaded model

load(path, create_options \\ [])

Loads the existing model stored at path. If none is found, a new model is created with the specified options and loaded; if that fails, an error is returned.

read_log(model)

Reads the log file and returns a list of entries in chronological order

train(model, text, tags \\ [:"$none"])

Trains model using text or a list of tokens.

unload(model)

Unloads a loaded model

Types

log_entry_type()

@type log_entry_type() :: :start | :end | :train | :gen

model_option()

@type model_option() ::
  {:store_log, [log_entry_type()]}
  | {:shift_probabilities, boolean()}
  | {:sanitize_tokens, boolean()}
  | {:order, integer()}

Model options that can be set at creation in a call to load/2 or later with configure/2:

  • store_log: determines which operations are written to the operation log; all of them by default:
    • :start: the model is loaded
    • :end: the model is unloaded
    • :train: training requests
    • :gen: generation results
  • shift_probabilities: gives less popular generation paths a higher chance of being used, which makes the output more original but may produce nonsense; false by default
  • sanitize_tokens: ignores letter case and punctuation when switching states, but still keeps the output as-is; false by default, can't be changed once the model is created
  • order: order of the chain, i.e. how many previous tokens the next one is based on; 2 by default, can't be changed once the model is created
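
To make the order option more concrete, here is a minimal sketch of how a chain of a given order could pair each state (the previous tokens) with the token that follows it. The module name OrderSketch, the :start and :end padding atoms, and the whitespace tokenizer are assumptions made for illustration; they are not taken from the library's internals.

```elixir
defmodule OrderSketch do
  # Hypothetical helper: lists {state, next_token} pairs for a chain of the
  # given order. Padding with :start and :end is an assumption of this sketch.
  def transitions(tokens, order \\ 2) do
    padded = List.duplicate(:start, order) ++ tokens ++ [:end]

    padded
    # Slide a window of `order + 1` tokens over the padded list, one step at a time.
    |> Enum.chunk_every(order + 1, 1, :discard)
    # The first `order` tokens form the state, the last one is the next token.
    |> Enum.map(fn window -> Enum.split(window, order) end)
    |> Enum.map(fn {state, [next]} -> {state, next} end)
  end
end

"hello brave new world"
|> String.split()
|> OrderSketch.transitions(2)
|> IO.inspect()
```

With order 2, every state holds two tokens, so the token after "brave new" depends on both of those words. A higher order tends to reproduce the training data more closely, while a lower order mixes it more freely.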

model_reference()

@opaque model_reference()

tag_query()

@type tag_query() :: %{required(term()) => non_neg_integer()}

If data was tagged when training, you can use tag queries to alter the probabilities of certain generation paths

Examples:

# training
iex> Markov.train(model, "hello earth", [
  {:action, :saying_hello}, # <- terms of any type can function as tags
  {:subject_type, :planet},
  {:subject, "earth"},
  :lowercase
])
:ok
iex> Markov.train(model, "Hello Elixir", [
  {:action, :saying_hello},
  {:subject_type, :programming_language},
  {:subject, "Elixir"},
  :uppercase
])
:ok


# simple generation - both paths have equal probabilities
iex> Markov.generate_text(model)
{:ok, "hello earth"}
iex> Markov.generate_text(model)
{:ok, "hello Elixir"}

# All generation paths have a score of 1 by default. Here we're telling
# Markov to add 1 point to paths tagged with `:uppercase`;
# "hello Elixir" now has a score of 2 and "hello earth" has a score of 1.
# Thus, "hello Elixir" has a probability of 2/3, and "hello earth" has
# that of 1/3
iex> Markov.generate_text(model, %{uppercase: 1})
{:ok, "hello Elixir"}
iex> Markov.generate_text(model, %{uppercase: 1})
{:ok, "hello Elixir"}
iex> Markov.generate_text(model, %{uppercase: 1})
{:ok, "hello earth"}
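
The arithmetic in the comments above can be reproduced with a small standalone sketch. It assumes that a path's score is 1 plus the points the query assigns to each tag the path carries, and that a path's probability is its score divided by the sum of all scores. The module TagScoreSketch and the per-path view of tags are simplifications for illustration, not the library's implementation.

```elixir
defmodule TagScoreSketch do
  # Score of one generation path: 1 by default, plus the points the query
  # gives to every tag the path carries (additivity is an assumption here).
  def score(path_tags, tag_query) do
    Enum.reduce(path_tags, 1, fn tag, acc -> acc + Map.get(tag_query, tag, 0) end)
  end

  # Converts scores into probabilities by dividing each score by the total.
  def probabilities(paths, tag_query) do
    scored = Map.new(paths, fn {text, tags} -> {text, score(tags, tag_query)} end)
    total = scored |> Map.values() |> Enum.sum()
    Map.new(scored, fn {text, s} -> {text, s / total} end)
  end
end

paths = %{
  "hello earth" => [{:action, :saying_hello}, :lowercase],
  "hello Elixir" => [{:action, :saying_hello}, :uppercase]
}

# "hello Elixir" gets 2 of the 3 total points, "hello earth" gets 1.
IO.inspect(TagScoreSketch.probabilities(paths, %{uppercase: 1}))
```

With an empty query every path keeps its default score of 1, so both paths come out equally likely, which matches the "simple generation" case.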

Functions

configure(model, opts)

@spec configure(model :: model_reference(), opts :: [model_option()]) ::
  :ok | {:error, term()}

Reconfigures a loaded model. See model_option/0 for a thorough description of the options

generate_text(model, tag_query \\ %{})

@spec generate_text(model_reference(), tag_query()) ::
  {:ok, binary()} | {:error, term()}

Generates a string. Will raise an exception if the model was trained on non-textual tokens at least once

iex> Markov.generate_text(model)
{:ok, "hello world"}

See type tag_query/0 for more info about tags

generate_tokens(model, tag_query \\ %{})

@spec generate_tokens(model_reference(), tag_query()) ::
  {:ok, [term()]} | {:error, term()}

Generates a list of tokens

iex> Markov.generate_tokens(model)
{:ok, ["hello", "world"]}

See type tag_query/0 for more info about tags

get_config(model)

@spec get_config(model :: model_reference()) ::
  {:ok, [model_option()]} | {:error, term()}

Gets the configuration of a loaded model

load(path, create_options \\ [])

@spec load(path :: String.t(), options :: [model_option()]) ::
  {:ok, model_reference()} | {:error, term()}

Loads the existing model stored at path. If none is found, a new model is created with the specified options and loaded; if that fails, an error is returned.
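
Because load/2 can return an error and a loaded model should eventually be unloaded, one possible pattern is to wrap both calls in a helper. The sketch below uses only Markov.load/2 and Markov.unload/1 as documented on this page; the module ModelSession and its function with_model/3 are hypothetical names, not part of the library.

```elixir
defmodule ModelSession do
  # Hypothetical wrapper, not part of Markov: loads a model, runs `fun`
  # with it, and unloads the model afterwards even if `fun` raises.
  def with_model(path, opts, fun) when is_function(fun, 1) do
    case Markov.load(path, opts) do
      {:ok, model} ->
        try do
          {:ok, fun.(model)}
        after
          # Runs on both success and failure, so the model is always unloaded.
          Markov.unload(model)
        end

      {:error, reason} ->
        # Loading or creating the model failed; pass the reason through.
        {:error, reason}
    end
  end
end

# Possible usage:
#   ModelSession.with_model("./model_path", [sanitize_tokens: true], fn model ->
#     Markov.generate_text(model)
#   end)
```

The helper returns {:ok, result} with whatever the callback returned, or the {:error, reason} tuple from load/2 unchanged.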

read_log(model)

@spec read_log(model_reference()) :: [
  %Markov.Operation{arg: term(), date_time: term(), type: term()}
]

Reads the log file and returns a list of entries in chronological order

iex> Markov.read_log(model)
{:ok,
 [
   %Markov.Operation{date_time: ~U[2022-10-02 16:59:51.844Z], type: :start, arg: nil},
   %Markov.Operation{date_time: ~U[2022-10-02 16:59:56.705Z], type: :train, arg: ["hello", "world"]}
 ]}
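
Once you have the list of entries, ordinary Enum functions are enough to summarize it. The sketch below counts entries per type; LogSketch is a hypothetical module, and plain maps with a :type key stand in for %Markov.Operation{} structs so that the snippet runs on its own.

```elixir
defmodule LogSketch do
  # Counts log entries per type. Works on any list of maps or structs
  # that expose a :type field.
  def count_by_type(entries) do
    Enum.frequencies_by(entries, & &1.type)
  end
end

# Stand-in entries; real ones would come from the log of a loaded model.
entries = [
  %{type: :start, arg: nil},
  %{type: :train, arg: ["hello", "world"]},
  %{type: :train, arg: ["fourth", "string"]}
]

IO.inspect(LogSketch.count_by_type(entries))
```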

train(model, text, tags \\ [:"$none"])

@spec train(model_reference(), String.t() | [term()], [term()]) ::
  :ok | {:error, term()}

Trains model using text or a list of tokens.

:ok = Markov.train(model, "Hello, world!")
:ok = Markov.train(model, "this is a string that's broken down into tokens behind the scenes")
:ok = Markov.train(model, [
  :this, "is", 'a token', :list, "where",
  {:each_element, :is, {:taken, :as_is}},
  :and, :can_be, :erlang.make_ref(), "<-- any term"
])

See tag_query/0 for more info about tags

unload(model)

@spec unload(model :: model_reference()) :: :ok

Unloads a loaded model