Metastatic.Analysis.Duplication.Fingerprint (Metastatic v0.10.4)

Structural fingerprinting for ASTs.

Generates hash-based fingerprints that identify AST structures. Supports both exact fingerprints (sensitive to variable names and literal values) and normalized fingerprints (structure-only matching).

Fingerprint Types

  • Exact: Identical ASTs, including variable names and literal values, produce identical fingerprints
  • Normalized: ASTs with the same structure but different variable names or literal values produce identical fingerprints

Usage

alias Metastatic.Analysis.Duplication.Fingerprint

ast = {:binary_op, [category: :arithmetic, operator: :+], [{:variable, [], "x"}, {:literal, [subtype: :integer], 5}]}

# Exact fingerprint
Fingerprint.exact(ast)
# => "ABC123..."

# Normalized fingerprint (ignores variable names and literal values)
Fingerprint.normalized(ast)
# => "DEF456..."

Examples

# Exact fingerprints
iex> ast = {:literal, [subtype: :integer], 42}
iex> fp = Metastatic.Analysis.Duplication.Fingerprint.exact(ast)
iex> is_binary(fp) and String.length(fp) > 0
true

# Normalized fingerprints ignore values
iex> ast1 = {:literal, [subtype: :integer], 42}
iex> ast2 = {:literal, [subtype: :integer], 99}
iex> fp1 = Metastatic.Analysis.Duplication.Fingerprint.normalized(ast1)
iex> fp2 = Metastatic.Analysis.Duplication.Fingerprint.normalized(ast2)
iex> fp1 == fp2
true
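
Because equal fingerprints indicate matching ASTs, candidate duplicates can be found by grouping ASTs on their fingerprint. The following is a minimal sketch of that idea, not part of Metastatic: the module `DuplicateGroupsSketch` and its `groups/2` function are hypothetical, and the fingerprint function is passed in as an argument so the block runs on its own.

```elixir
defmodule DuplicateGroupsSketch do
  # Hypothetical helper, not part of Metastatic.
  # Groups ASTs by the value returned from `fingerprint_fun` and keeps
  # only the groups with two or more members (candidate duplicates).
  def groups(asts, fingerprint_fun) when is_function(fingerprint_fun, 1) do
    asts
    |> Enum.group_by(fingerprint_fun)
    |> Map.values()
    |> Enum.filter(&(length(&1) > 1))
  end
end
```

With Metastatic loaded, passing `&Metastatic.Analysis.Duplication.Fingerprint.normalized/1` as `fingerprint_fun` would group ASTs that share a structure, while `exact/1` would group only identical ASTs.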

Summary

Functions

exact(ast)

Generates an exact fingerprint for an AST.

match?(fp1, fp2)

Compares two fingerprints for equality.

normalized(ast)

Generates a normalized fingerprint for an AST.

tokens(ast)

Extracts a sequence of tokens from an AST.

Functions

exact(ast)

@spec exact(Metastatic.AST.meta_ast()) :: String.t()

Generates an exact fingerprint for an AST.

Identical ASTs (including variable names and literal values) produce identical fingerprints. Uses SHA-256 hashing.

Examples

iex> ast1 = {:variable, [], "x"}
iex> ast2 = {:variable, [], "x"}
iex> Metastatic.Analysis.Duplication.Fingerprint.exact(ast1) == Metastatic.Analysis.Duplication.Fingerprint.exact(ast2)
true

iex> ast1 = {:variable, [], "x"}
iex> ast2 = {:variable, [], "y"}
iex> Metastatic.Analysis.Duplication.Fingerprint.exact(ast1) == Metastatic.Analysis.Duplication.Fingerprint.exact(ast2)
false
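
The exact-fingerprint idea can be sketched as serializing the AST term and hashing the bytes with SHA-256. This is only an illustration and may differ from what Metastatic does internally: the module name `ExactSketch`, the use of `:erlang.term_to_binary/1` for serialization, and the lowercase hex encoding are all assumptions.

```elixir
defmodule ExactSketch do
  # Hypothetical helper, not part of Metastatic.
  # Serializes the whole AST term (names and values included),
  # hashes the bytes with SHA-256, and hex-encodes the digest.
  def fingerprint(ast) do
    bytes = :erlang.term_to_binary(ast)
    hash = :crypto.hash(:sha256, bytes)
    Base.encode16(hash, case: :lower)
  end
end
```

Since every part of the term is hashed, changing a variable name or a literal value changes the resulting string, which is the behavior the examples above show for `exact/1`.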

match?(fp1, fp2)

@spec match?(String.t(), String.t()) :: boolean()

Compares two fingerprints for equality.

Examples

iex> ast = {:literal, [subtype: :integer], 42}
iex> fp1 = Metastatic.Analysis.Duplication.Fingerprint.exact(ast)
iex> fp2 = Metastatic.Analysis.Duplication.Fingerprint.exact(ast)
iex> Metastatic.Analysis.Duplication.Fingerprint.match?(fp1, fp2)
true

iex> ast1 = {:literal, [subtype: :integer], 42}
iex> ast2 = {:literal, [subtype: :string], "hello"}
iex> fp1 = Metastatic.Analysis.Duplication.Fingerprint.exact(ast1)
iex> fp2 = Metastatic.Analysis.Duplication.Fingerprint.exact(ast2)
iex> Metastatic.Analysis.Duplication.Fingerprint.match?(fp1, fp2)
false

normalized(ast)

@spec normalized(Metastatic.AST.meta_ast()) :: String.t()

Generates a normalized fingerprint for an AST.

ASTs with the same structure but different variable names or literal values produce identical fingerprints. This is useful for Type II clone detection, where code fragments differ only in details such as identifier names or literal values.

Examples

iex> ast1 = {:variable, [], "x"}
iex> ast2 = {:variable, [], "y"}
iex> Metastatic.Analysis.Duplication.Fingerprint.normalized(ast1) == Metastatic.Analysis.Duplication.Fingerprint.normalized(ast2)
true

iex> ast1 = {:literal, [subtype: :integer], 42}
iex> ast2 = {:literal, [subtype: :integer], 100}
iex> Metastatic.Analysis.Duplication.Fingerprint.normalized(ast1) == Metastatic.Analysis.Duplication.Fingerprint.normalized(ast2)
true

iex> ast1 = {:binary_op, [category: :arithmetic, operator: :+], [{:variable, [], "a"}, {:literal, [subtype: :integer], 1}]}
iex> ast2 = {:binary_op, [category: :arithmetic, operator: :+], [{:variable, [], "b"}, {:literal, [subtype: :integer], 2}]}
iex> Metastatic.Analysis.Duplication.Fingerprint.normalized(ast1) == Metastatic.Analysis.Duplication.Fingerprint.normalized(ast2)
true
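
One way to picture normalization is as a rewrite pass that runs before hashing: variable names and literal values are replaced by a fixed placeholder, while node types and metadata such as `subtype` or `operator` are kept. The sketch below assumes the three-element `{type, meta, children_or_value}` node shape used in the examples above; the module `NormalizeSketch`, the `:_` placeholder, and the serialization via `:erlang.term_to_binary/1` are assumptions, not Metastatic's implementation.

```elixir
defmodule NormalizeSketch do
  # Hypothetical helper, not part of Metastatic.
  # Variable names and literal values are replaced by the placeholder :_.
  def normalize({:variable, meta, _name}), do: {:variable, meta, :_}
  def normalize({:literal, meta, _value}), do: {:literal, meta, :_}

  # Nodes with a list of children are normalized recursively.
  def normalize({type, meta, children}) when is_atom(type) and is_list(children) do
    {type, meta, Enum.map(children, &normalize/1)}
  end

  # Anything else is left unchanged.
  def normalize(other), do: other

  # Hashes the normalized term with SHA-256 and hex-encodes the digest.
  def fingerprint(ast) do
    bytes = ast |> normalize() |> :erlang.term_to_binary()
    Base.encode16(:crypto.hash(:sha256, bytes), case: :lower)
  end
end
```

In this sketch the operator stays in the metadata, so `a + 1` and `b + 2` share a fingerprint while `a + 1` and `a - 1` do not.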

tokens(ast)

@spec tokens(Metastatic.AST.meta_ast()) :: [atom()]

Extracts a sequence of tokens from an AST.

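Tokens include node types, operators, and structure markers; in the examples below, metadata atoms such as :integer and :arithmetic also appear as tokens. The sequence is useful for token-based similarity comparison.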

Examples

iex> ast = {:literal, [subtype: :integer], 42}
iex> Metastatic.Analysis.Duplication.Fingerprint.tokens(ast)
[:literal, :integer]

iex> ast = {:binary_op, [category: :arithmetic, operator: :+], [{:variable, [], "x"}, {:literal, [subtype: :integer], 5}]}
iex> tokens = Metastatic.Analysis.Duplication.Fingerprint.tokens(ast)
iex> :binary_op in tokens and :arithmetic in tokens and :+ in tokens
true
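
Token-based similarity comparison can take many forms. As one illustration, the sketch below computes Jaccard similarity over the distinct tokens of two sequences. The choice of Jaccard and the module `TokenSimilaritySketch` are assumptions made for this example; the source does not say which measure Metastatic uses.

```elixir
defmodule TokenSimilaritySketch do
  # Hypothetical helper, not part of Metastatic.
  # Jaccard similarity over distinct tokens: |A ∩ B| / |A ∪ B|,
  # a float between 0.0 (no shared tokens) and 1.0 (same token set).
  def jaccard(tokens_a, tokens_b) do
    a = MapSet.new(tokens_a)
    b = MapSet.new(tokens_b)
    union_size = MapSet.size(MapSet.union(a, b))

    if union_size == 0 do
      # Two empty token lists are treated as identical.
      1.0
    else
      MapSet.size(MapSet.intersection(a, b)) / union_size
    end
  end
end

TokenSimilaritySketch.jaccard([:literal, :integer], [:literal, :integer])
# => 1.0
```

A set-based measure ignores token order and repetition, so it is coarser than comparing fingerprints; a sequence-aware measure would be needed to distinguish reordered code.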