Yog.Approximate (YogEx v0.97.0)

Copy Markdown View Source

Fast approximation algorithms for expensive graph properties.

These algorithms trade exact precision for dramatically improved time complexity, making them suitable for large graphs where exact computation is prohibitive.

Algorithms

PropertyFunctionExact ComplexityApproximate Complexity
Diameterdiameter/2O(V × (V+E)) or O(V³)O(k × (V+E))
Betweennessbetweenness/2O(VE)O(k(V+E))
Avg Path Lengthaverage_path_length/2O(V²E) or O(V³)O(k(V+E))
Global Efficiencyglobal_efficiency/2O(V²E) or O(V³)O(k(V+E))
Transitivitytransitivity/2O(V³) or O(Δ²E)O(k)
Vertex Coververtex_cover/1NP-hardO(V+E)
Max Cliquemax_clique/1O(3^(V/3))O(V²)
Treewidthtreewidth_upper_bound/2NP-hardO(V²) or O(V³)
Tree Decompositiontree_decomposition/2NP-hardO(V²) or O(V³)

Approximation Quality

  • Diameter: Multi-sweep BFS/Dijkstra provides a lower bound. Empirically within 10–20% on real-world networks. Set samples: 4 for fast estimates, samples: 10 for tighter bounds.
  • Betweenness: Sampled Brandes algorithm. Scores are unbiased when scaled by the sampling ratio.
  • Path metrics: Pivot sampling averages shortest paths from a random subset of source nodes.
  • Transitivity: Wedge sampling estimates the global clustering coefficient by randomly sampling connected triples.
  • Treewidth: Heuristic elimination orderings (:min_degree, :min_fill) provide upper bounds. The resulting tree decomposition can be validated with Yog.Property.TreeDecomposition.valid?/2.

Summary

Functions

Approximates the average shortest path length using pivot sampling.

Approximates betweenness centrality using sampled Brandes sources.

Approximates the diameter using multi-sweep BFS/Dijkstra.

Approximates global efficiency using pivot sampling.

Returns a large clique using a greedy heuristic.

Approximates transitivity (global clustering coefficient) via wedge sampling.

Returns a tree decomposition of the graph using heuristic elimination ordering.

Returns an upper bound on the treewidth using heuristic elimination ordering.

Returns a 2-approximation of the minimum vertex cover.

Types

metric_opts()

@type metric_opts() :: [
  with_zero: any(),
  with_add: (any(), any() -> any()),
  with_compare: (any(), any() -> :lt | :eq | :gt),
  with: (any() -> any()),
  to_float: (any() -> float())
]

Functions

average_path_length(graph, opts \\ [])

@spec average_path_length(
  Yog.Graph.t(),
  keyword()
) :: float() | nil

Approximates the average shortest path length using pivot sampling.

Instead of computing shortest paths from all nodes, a random subset of source nodes is used. The mean over sampled sources is returned.

Returns nil if the graph is empty or disconnected.

Options

  • :samples - Number of pivot nodes (default: sqrt(V))
  • :seed - Random seed for reproducibility
  • :with_zero, :with_add, :with_compare, :with, :to_float - Weight options (same as Yog.Health.average_path_length/2)

Examples

iex> graph = Yog.from_edges(:undirected, [{1, 2, 1}, {2, 3, 1}, {3, 4, 1}])
iex> apl = Yog.Approximate.average_path_length(graph, samples: 2)
iex> apl > 0
true

Time Complexity

O(k × (E + V log V)) where k is the sample count.

betweenness(graph, opts \\ [])

@spec betweenness(
  Yog.Graph.t(),
  keyword()
) :: %{required(Yog.node_id()) => float()}

Approximates betweenness centrality using sampled Brandes sources.

Only a random subset of nodes is used as sources for the dependency accumulation. The resulting scores are scaled by n / k to remain unbiased estimators of the exact betweenness values.

Options

  • :samples - Number of source nodes to sample. Defaults to sqrt(V).
  • :seed - Random seed for reproducible sampling (optional)
  • :zero, :add, :compare - Weight options for weighted graphs

Examples

iex> graph = Yog.from_edges(:undirected, [{1, 2, 1}, {1, 3, 1}, {2, 3, 1}, {2, 4, 1}])
iex> scores = Yog.Approximate.betweenness(graph, samples: 2)
iex> is_map(scores) and map_size(scores) == 4
true

Time Complexity

O(k(V+E)) where k is the sample count.

diameter(graph, opts \\ [])

@spec diameter(
  Yog.Graph.t(),
  keyword()
) :: any()

Approximates the diameter using multi-sweep BFS/Dijkstra.

The algorithm picks a random start node, finds the farthest node via shortest-path search, and repeats from that peripheral candidate. Each sweep provides a lower bound on the diameter. The maximum over all sweeps is returned.

Options

  • :samples - Number of sweeps (default: 4)
  • :with_zero, :with_add, :with_compare, :with - Weight options for weighted graphs (same interface as Yog.Health.diameter/2)

Examples

iex> graph = Yog.from_edges(:undirected, [{1, 2, 1}, {2, 3, 1}, {3, 4, 1}])
iex> diam = Yog.Approximate.diameter(graph, samples: 2)
iex> diam >= 2 and diam <= 3
true

Time Complexity

O(k × (V+E)) for unweighted graphs, O(k × (E + V log V)) for weighted.

global_efficiency(graph, opts \\ [])

@spec global_efficiency(
  Yog.Graph.t(),
  keyword()
) :: float()

Approximates global efficiency using pivot sampling.

Options

  • :samples - Number of pivot nodes (default: sqrt(V))
  • :seed - Random seed for reproducibility
  • :with_zero, :with_add, :with_compare, :with, :to_float - Weight options (same as Yog.Health.global_efficiency/2)

Examples

iex> graph = Yog.from_edges(:undirected, [{1, 2, 1}, {2, 3, 1}])
iex> eff = Yog.Approximate.global_efficiency(graph, samples: 2)
iex> eff > 0.0 and eff <= 1.0
true

Time Complexity

O(k × (E + V log V)) where k is the sample count.

max_clique(graph)

@spec max_clique(Yog.Graph.t()) :: MapSet.t(Yog.node_id())

Returns a large clique using a greedy heuristic.

The algorithm sorts nodes by degree (descending) and greedily builds a clique by adding each node if it is adjacent to all nodes already in the clique. While not guaranteed to find the maximum clique, it runs in O(V²) time and typically finds a clique of reasonable size on dense graphs.

Returns a MapSet of node IDs.

Examples

iex> triangle = Yog.from_edges(:undirected, [{1, 2, 1}, {2, 3, 1}, {3, 1, 1}])
iex> clique = Yog.Approximate.max_clique(triangle)
iex> MapSet.size(clique)
3

iex> path = Yog.from_edges(:undirected, [{1, 2, 1}, {2, 3, 1}])
iex> clique = Yog.Approximate.max_clique(path)
iex> MapSet.size(clique) >= 2
true

Time Complexity

O(V²)

transitivity(graph, opts \\ [])

@spec transitivity(
  Yog.Graph.t(),
  keyword()
) :: float()

Approximates transitivity (global clustering coefficient) via wedge sampling.

Randomly samples "wedges" (connected triples u-v-w) and checks whether the closing edge v-w or u-w exists, forming a triangle. The fraction of closed wedges estimates the clustering coefficient.

For exact transitivity, use Yog.Community.Metrics.transitivity/1.

Options

  • :samples - Number of wedges to sample (default: min(10_000, V²))
  • :seed - Random seed for reproducibility

Examples

iex> triangle = Yog.from_edges(:undirected, [{1, 2, 1}, {2, 3, 1}, {3, 1, 1}])
iex> t = Yog.Approximate.transitivity(triangle, samples: 100)
iex> t > 0.9
true

Time Complexity

O(k) where k is the number of sampled wedges.

tree_decomposition(graph, opts \\ [])

@spec tree_decomposition(
  Yog.Graph.t(),
  keyword()
) :: {:ok, Yog.Property.TreeDecomposition.t()} | :error

Returns a tree decomposition of the graph using heuristic elimination ordering.

Returns {:ok, Yog.Property.TreeDecomposition.t()} on success.

Options

  • :heuristic - :min_degree (default) or :min_fill

Examples

iex> graph = Yog.Generator.Classic.path(4)
iex> {:ok, td} = Yog.Approximate.tree_decomposition(graph)
iex> td.width <= 1
true

treewidth_upper_bound(graph, opts \\ [])

@spec treewidth_upper_bound(
  Yog.Graph.t(),
  keyword()
) :: non_neg_integer()

Returns an upper bound on the treewidth using heuristic elimination ordering.

Options

  • :heuristic - :min_degree (default) or :min_fill

Examples

iex> graph = Yog.Generator.Classic.cycle(5)
iex> Yog.Approximate.treewidth_upper_bound(graph) <= 2
true

iex> graph = Yog.Generator.Classic.complete(4)
iex> Yog.Approximate.treewidth_upper_bound(graph) == 3
true

vertex_cover(graph)

@spec vertex_cover(Yog.Graph.t()) :: MapSet.t(Yog.node_id())

Returns a 2-approximation of the minimum vertex cover.

A vertex cover is a set of nodes such that every edge in the graph is incident to at least one node in the set. This implementation uses the classic greedy algorithm: repeatedly pick an uncovered edge and add both endpoints to the cover. The result is guaranteed to be at most twice the size of the optimal minimum vertex cover.

Works for both undirected and directed graphs. For directed graphs, an edge (u, v) is considered covered if u or v is in the set.

Examples

iex> graph = Yog.from_edges(:undirected, [{1, 2, 1}, {2, 3, 1}, {3, 4, 1}])
iex> cover = Yog.Approximate.vertex_cover(graph)
iex> MapSet.size(cover) <= 4
true
iex> # Every edge touches the cover
iex> Enum.all?(Yog.all_edges(graph), fn {u, v, _} ->
...>   MapSet.member?(cover, u) or MapSet.member?(cover, v)
...> end)
true

Time Complexity

O(V + E)