View Source Dmp.Diff (diff_match_patch v0.2.0)

Compare two blocks of plain text and efficiently return a list of differences.

Link to this section Summary

Types

A list of diff operations, representing the difference between two text versions.

The result of a successful Diff.half_match/3 call.

A diff's operation type. The operation :nil is used internally to indicate a nil value for the diff.

t()

The diff tuple, consisting of two elements: the operation and the associated text.

Functions

Find the "middle snake" of a diff, split the problem in two and return the recursively constructed diff.

Given the location of the "middle snake", split the diff in two parts and recurse.

Rehydrate the text in a diff from a string of line hashes to real lines of text.

Reduce the number of edits in a diff by eliminating operationally trivial equalities.

Reorder and merge like edit sections in a diff, merging equalities.

Reduce the number of edits in a diff by eliminating semantically trivial equalities.

Look for single edits in a diff that are surrounded on both sides by equalities which can be shifted sideways to align the edit to a word boundary.

Determine if the suffix of one string is the prefix of another.

Determine the common prefix of two strings.

Determine the common suffix of two strings.

Find the differences between two texts.

Given the original text1, and an encoded string which describes the operations required to transform text1 into text2, compute the full diff.

Do the two texts share a substring which is at least half the length of the longer text?

Compute the Levenshtein distance of a diff--the number of inserted, deleted or substituted characters.

Do a quick line-level diff on both strings, then rediff the parts for greater accuracy.

Split two texts into a list of strings.

Find the differences between two texts.

Skips validation of options. Used internally by Patch.apply.

Generate a pretty HTML report from a difflist.

Given two strings, compute a score representing whether the internal boundary falls on logical boundaries.

Compute and return the source text of a diff (all equalities and deletions).

Compute and return the destination text of a diff (all equalities and insertions).

Crush a diff into an encoded string which describes the operations required to transform text1 into text2.

Returns the diff tuple, or a "nil" pseudo-diff (with op :nil and empty text).

Given loc, a location in text1, compute and return the equivalent location in text2.

Link to this section Types

@type difflist() :: [t()]

A list of diff operations, representing the difference between two text versions.

A "difflist" is an Elixir list of "diff" tuples. Here is an example difflist:

[{:delete, "Hello"}, {:insert, "Goodbye"}, {:equal, " world."}]

which means: delete "Hello", add "Goodbye" and keep " world."

@type expiry() :: :never | non_neg_integer()
@type first_pass_acc() ::
  {non_neg_integer(), non_neg_integer(), String.t(), String.t()}
@type half_match_result() ::
  {String.t(), String.t(), String.t(), String.t(), String.t()}

The result of a successful Diff.half_match/3 call.

A tuple of five strings:

  1. the prefix of text1
  2. the suffix of text1
  3. the prefix of text2
  4. the suffix of text2
  5. the common middle
@type op() :: :delete | :insert | :equal | nil

A diff's operation type. The operation :nil is used internally to indicate a nil value for the diff.

@type options() :: Dmp.Options.t()
@type t() :: {op(), String.t()}

The diff tuple, consisting of two elements: the operation and the associated text.

Link to this section Functions

Link to this function

bisect(text1, text2, deadline)

View Source
@spec bisect(String.t(), String.t(), expiry()) :: difflist()

Find the "middle snake" of a diff, split the problem in two and return the recursively constructed diff.

See: An O(ND) Difference Algorithm and Its Variations (Meyers, 1986)

  • text1 - Old string to be diffed.
  • text2 - New string to be diffed.
  • deadline - Unix timestamp (in milliseconds) at which to bail if not yet complete.

Returns a difflist.

Link to this function

bisect_split(text1, text2, x, y, deadline)

View Source
@spec bisect_split(
  String.t(),
  String.t(),
  non_neg_integer(),
  non_neg_integer(),
  non_neg_integer()
) :: difflist()

Given the location of the "middle snake", split the diff in two parts and recurse.

  • text1 - Old string to be diffed.
  • text2 - New string to be diffed.
  • x - Index of split point in text1.
  • y - Index of split point in text2.
  • deadline - Unix timestamp (in milliseconds) at which to bail if not yet complete.

Returns a difflist.

Link to this function

chars_to_lines(diffs, line_array)

View Source
@spec chars_to_lines(difflist(), [String.t()]) :: difflist()

Rehydrate the text in a diff from a string of line hashes to real lines of text.

  • diffs - A difflist.
  • line_array - A list of unique strings.

Returns the rehydrated difflist.

Link to this function

cleanup_efficiency(diffs, diff_edit_cost)

View Source
@spec cleanup_efficiency(difflist(), non_neg_integer()) :: difflist()

Reduce the number of edits in a diff by eliminating operationally trivial equalities.

  • diff_edit_cost Cost of an empty edit operation in terms of edit characters.

Returns the updated difflist.

@spec cleanup_merge(difflist()) :: difflist()

Reorder and merge like edit sections in a diff, merging equalities.

Any edit section can move as long as it doesn't cross an equality.

Returns the updated difflist.

@spec cleanup_semantic(difflist()) :: difflist()

Reduce the number of edits in a diff by eliminating semantically trivial equalities.

Returns the updated difflist.

Link to this function

cleanup_semantic_lossless(diffs)

View Source
@spec cleanup_semantic_lossless(difflist()) :: difflist()

Look for single edits in a diff that are surrounded on both sides by equalities which can be shifted sideways to align the edit to a word boundary.

Example: The c<ins>at c</ins>ame. becomes The <ins>cat </ins>came.

Returns the updated difflist.

Link to this function

combine_previous_inequalities(diffs, text, count_delete, count_insert, text_delete, text_insert)

View Source
Link to this function

common_overlap(text1, text2)

View Source
@spec common_overlap(String.t(), String.t()) :: non_neg_integer()

Determine if the suffix of one string is the prefix of another.

  • text1 - First string.
  • text2 - Second string.

Returns the number of characters common to the end of the first string and the start of the second string.

Link to this function

common_prefix(text1, text2)

View Source
@spec common_prefix(String.t(), String.t()) :: {String.t(), String.t(), String.t()}

Determine the common prefix of two strings.

  • text1 - First string.
  • text2 - Second string.

Returns a tuple {prefix, rest1, rest2}, where

  • prefix - The common prefix.
  • rest1 - text1 with the prefix removed.
  • rest2 - text2 with the prefix removed.
Link to this function

common_suffix(text1, text2)

View Source
@spec common_suffix(String.t(), String.t()) :: {String.t(), String.t(), String.t()}

Determine the common suffix of two strings.

  • text1 - First string.
  • text2 - Second string.

Returns a tuple {suffix, rest1, rest2}, where

  • suffix - The common suffix.
  • rest1 - text1 with the suffix removed.
  • rest2 - text2 with the suffix removed.
Link to this function

compute(text1, text2, check_lines, deadline)

View Source
@spec compute(String.t(), String.t(), boolean(), expiry()) :: difflist()

Find the differences between two texts.

  • text1 - Old string to be diffed.
  • text2 - New string to be diffed.
  • check_lines - Speedup flag. If false, then don't run a line-level diff first to identify the changed areas. If true, then run a faster slightly less optimal diff.
  • deadline - Unix timestamp in milliseconds when the diff should be complete by.

Assumes that the texts do not have any common prefix or suffix.

Returns a difflist.

Link to this function

factor_out_prefixes(diffs, text_delete, text_insert)

View Source
Link to this function

factor_out_suffixes(diffs, text, text_delete, text_insert)

View Source
Link to this function

from_delta(text1, delta)

View Source
@spec from_delta(String.t(), String.t()) :: nil | difflist()

Given the original text1, and an encoded string which describes the operations required to transform text1 into text2, compute the full diff.

  • text1 - Source string for the diff. *delta - Encoded delta text.

Returns a difflist.

Raises an ArgumentError if the encoded delta has invalid contents for the given text.

Link to this function

half_match(text1, text2, deadline)

View Source
@spec half_match(String.t(), String.t(), non_neg_integer()) ::
  nil | half_match_result()

Do the two texts share a substring which is at least half the length of the longer text?

  • text1 - First string.
  • text2 - Second string.
  • deadline - Unix timestamp (in milliseconds) at which to bail if not yet complete.

This speedup can produce non-minimal diffs.

Returns a half_match_result 5-tuple, or nil if there was no match. Returns nil if deadline is zero (no time limit specified).

@spec levenshtein(difflist()) :: non_neg_integer()

Compute the Levenshtein distance of a diff--the number of inserted, deleted or substituted characters.

Link to this function

line_mode(text1, text2, deadline)

View Source
@spec line_mode(String.t(), String.t(), expiry()) :: difflist()

Do a quick line-level diff on both strings, then rediff the parts for greater accuracy.

  • text1 - Old string to be diffed.
  • text2 - New string to be diffed.
  • deadline - Unix timestamp (in milliseconds) when the diff should be complete by.

This speedup can produce non-minimal diffs.

Returns a difflist.

Link to this function

lines_to_chars(text1, text2)

View Source
@spec lines_to_chars(String.t(), String.t()) :: {String.t(), String.t(), [String.t()]}

Split two texts into a list of strings.

Reduce the texts to a string of hashes where each Unicode character represents one line.

  • text1 - First string.
  • text2 - Second string.

Returns a tuple containing the encoded text1, the encoded text2 and the list of unique strings. The zeroth element of the list of unique strings is intentionally blank.

Link to this function

main(text1, text2, check_lines \\ true, opts \\ [])

View Source
@spec main(String.t(), String.t(), boolean(), options()) :: difflist()

Find the differences between two texts.

  • text1 - Old string to be diffed.
  • text2 - New string to be diffed.
  • check_lines - Speedup flag. If false, then don't run a line-level diff first to identify the changed areas. If true, then run a faster slightly less optimal diff.
  • opts - A options keyword list, [] to use the default options.

Most of the time check_lines is wanted, so it defaults to true.

Returns a difflist.

Link to this function

main_(text1, text2, check_lines, opts)

View Source
@spec main_(String.t(), String.t(), boolean(), options()) :: difflist()

Skips validation of options. Used internally by Patch.apply.

@spec pretty_html(difflist()) :: String.t()

Generate a pretty HTML report from a difflist.

Link to this function

semantic_score(one, two)

View Source
@spec semantic_score(String.t(), String.t()) :: non_neg_integer()

Given two strings, compute a score representing whether the internal boundary falls on logical boundaries.

Scores range from 6 (best) to 0 (worst).

  • one - First string.
  • two - Second string.

Scores are:

  • 6 if one or two is an empty string.
  • 5 if a blank line ends in one or a blank line starts in two.
  • 4 if one ends, or two starts, with a newline.
  • 3 if one ends in a punctuation and two starts with white space.
  • 2 if one ends, or two starts, with white space.
  • 1 if one ends, or two starts, with a non-alphanumeric.
  • 0 otherwise

examples

Examples

iex> Diff.semantic_score("two is empty string", "")
6

iex> Diff.semantic_score("one ends in blank line\n\n", "two")
5

iex> Diff.semantic_score("one ends in new line\n", "two")
4

iex> Diff.semantic_score("one sentence.", " space before two")
3

iex> Diff.semantic_score("one sentence.", "no space before two")
1

iex> Diff.semantic_score("one ends with white space ", "two")
2

iex> Diff.semantic_score("one ends in 'punctuation'", "two")
1

iex> Diff.semantic_score("one ends in middle of word", "two")
0
Link to this function

sorted_half_match(hm, arg2)

View Source
@spec text1(difflist()) :: String.t()

Compute and return the source text of a diff (all equalities and deletions).

@spec text2(difflist()) :: String.t()

Compute and return the destination text of a diff (all equalities and insertions).

@spec to_delta(difflist()) :: String.t()

Crush a diff into an encoded string which describes the operations required to transform text1 into text2.

For example, "=3 -2 +ing" means keep 3 chars, delete 2 chars, insert "ing".

Operations are tab-separated. Inserted text is escaped using %xx notation.

examples

Examples

|> iex [{:equal, "abc"}, {:delete, "de"}, {:insert, "ing"}] |> to_delta() |> IO.inspect()
"=3\t-2\t+ing"
@spec undiff(nil | t()) :: t()

Returns the diff tuple, or a "nil" pseudo-diff (with op :nil and empty text).

@spec x_index(difflist(), non_neg_integer()) :: non_neg_integer()

Given loc, a location in text1, compute and return the equivalent location in text2.

  • diffs - a difflist.
  • loc - Location within text1.

Returns location within text2.

examples

Examples

iex> Diff.main("The cat", "The big cat") |> Diff.x_index(1)
1

iex> Diff.main("The cat", "The big cat") |> Diff.x_index(4)
8