libgraph v0.1.0 Graph

This module defines a directed graph data structure, which supports both acyclic and cyclic forms. It also defines the API for creating, manipulating, and querying that structure.

This is intended as a replacement for :digraph, which requires the use of 3 ETS tables at a minimum, but up to 6 at a time during certain operations (such as get_short_path/3). In environments where many graphs are in memory at a time, this can be dangerous, as it is easy to hit the system limit for max ETS tables, which will bring your node down. This graph implementation does not use ETS, so it can be used freely without concern for hitting this limit.

With regards to space requirements, vertices require 2N space, where N is the size of each vertex, as vertices are mapped to indexes (integers), and we store both the vertex->id map and the inverted index (id->vertex) for efficient lookups when querying the graph. Edges are stored as a map of vertex_id -> MapSet(vertex_id), which is the most compact representation we’re allowed while still supporting efficient lookups on edges/neighbors. It is recommended that you avoid stuffing large objects in the graph as vertices, and instead use a key which you can then use to reify a list of objects you care about from ETS or some other source once you have results you care about.

There are benchmarks provided with this library which compare it directly to :digraph for some common operations, and thus far, libgraph outperforms :digraph in all of them.

The only bit of data I have not yet evaluated is how much garbage is generated when querying/manipulating the graph between libgraph and digraph, but I suspect the use of ETS means that digraph is able to keep that to a minimum. Until I verify if that’s the case, I would assume that libgraph has higher memory requirements, but better performance, and is able to side-step the ETS limit. If your requirements, like mine, mean that you are dynamically constructing and querying graphs concurrently, I think libgraph is the better choice - however if you either need the APIs of :digraph that I have not yet implemented, or do not have the same use case, I would stick to :digraph for now.

Link to this section Summary

Functions

Adds an edge connecting a to b. If either a or b do not exist in the graph, they are automatically added. Adding the same edge more than once does not create multiple edges, each edge is only ever stored once

Like add_edge/3, but takes a list of vertex pairs, and adds an edge to the graph for each pair

Adds a new vertex to the graph. If the vertex is already present in the graph, the add is a no-op

Like add_vertex/2, but takes a list of vertices to add to the graph

Returns a list of connected components, where each component is a list of vertices

Removes an edge connecting a to b. If no such vertex exits, or the edge does not exist, it is effectively a no-op

Like delete_edge/3, but takes a list of vertex pairs, and deletes the corresponding edge from the graph, if it exists

Removes a vertex from the graph, as well as any edges which refer to that vertex. If the vertex does not exist in the graph, it is a no-op

Like delete_vertex/2, but takes a list of vertices to delete from the graph

Return a list of all the edges, where each edge is expressed as a tuple of {A, B}, where the elements are the vertices involved, and implying the direction of the edge to be from A to B

Builds a list of paths between vertex a and vertex b

Gets the shortest path between a and b. If there are multiple paths of the same length, where all are “shortest”, this implementation simply takes the first one encountered

Returns the in-degree of vertex v of graph g

Returns a list of vertices which all have edges coming in to the given vertex v

Returns a map of summary information about this graph

Returns true if and only if the graph g is acyclic

Returns true if the graph is an aborescence, a directed acyclic graph, where the root, a vertex, of the arborescence has a unique path from itself to every other vertex in the graph

Returns true if the graph g is not acyclic

Returns true if and only if the graph g is a tree

Returns a list of vertices from graph g which are included in a loop

Creates a new graph

Returns the out-degree of vertex v of graph g

Returns a list of vertices which the given vertex v has edges going to

Returns all vertices of graph g. The order is given by a depth-first traversal of the graph, collecting visited vertices in postorder. More precisely, the vertices visited while searching from an arbitrarily chosen vertex are collected in postorder, and all those collected vertices are placed before the subsequently visited vertices

Returns all vertices of graph g. The order is given by a depth-first traversal of the graph, collecting visited vertices in preorder

Returns an unsorted list of vertices from the graph, such that for each vertex in the list (call it v), there is a path in the graph from some vertex of vs to v

Returns an unsorted list of vertices from the graph, such that for each vertex in the list (call it v), there is a path in the graph of length one or more from some vertex of vs to v

Returns an unsorted list of vertices from the graph, such that for each vertex in the list (call it v), there is a path from v to some vertex of vs

Returns an unsorted list of vertices from the graph, such that for each vertex in the list (call it v), there is a path of length one or more from v to some vertex of vs

Returns a list of strongly connected components, where each component is a list of vertices

Builds a maximal subgraph of g which includes all of the vertices in vs and the edges which connect them

Returns a topological ordering of the vertices of graph g, if such an ordering exists, otherwise it returns false. For each vertex in the returned list, no out-neighbors occur earlier in the list

Returns a list of all the vertices in the graph

Link to this section Types

Link to this type edge()
edge() :: {vertex, vertex}
Link to this type vertex()
vertex() :: term

Link to this section Functions

Link to this function add_edge(g, a, b)
add_edge(t, vertex, vertex) :: t

Adds an edge connecting a to b. If either a or b do not exist in the graph, they are automatically added. Adding the same edge more than once does not create multiple edges, each edge is only ever stored once.

Example

iex> g = Graph.new |> Graph.add_edge(:a, :b)
...> [:a, :b] = Graph.vertices(g)
...> Graph.edges(g)
[{:a, :b}]
Link to this function add_edges(g, es)
add_edges(t, [{vertex, vertex}]) ::
  t |
  {:error, {:invalid_vertex_pair, term}}

Like add_edge/3, but takes a list of vertex pairs, and adds an edge to the graph for each pair.

Examples

iex> g = Graph.new |> Graph.add_vertices([:a, :b, :c]) |> Graph.add_edges([{:a, :b}, {:b, :c}])
...> Graph.edges(g)
[{:a, :b}, {:b, :c}]

iex> Graph.new |> Graph.add_vertices([:a, :b, :c]) |> Graph.add_edges([:a, :b])
{:error, {:invalid_vertex_pair, :a}}
Link to this function add_vertex(g, vertex)
add_vertex(t, vertex) :: t

Adds a new vertex to the graph. If the vertex is already present in the graph, the add is a no-op.

Example

iex> g = Graph.new |> Graph.add_vertex(:a) |> Graph.add_vertex(:a)
...> Graph.vertices(g)
[:a]
Link to this function add_vertices(g, vs)
add_vertices(t, [vertex]) :: t

Like add_vertex/2, but takes a list of vertices to add to the graph.

Example

iex> g = Graph.new |> Graph.add_vertices([:a, :b, :a])
...> Graph.vertices(g)
[:a, :b]
Link to this function components(g)
components(t) :: [[vertex]]

Returns a list of connected components, where each component is a list of vertices.

A connected component is a maximal subgraph such that there is a path between each pair of vertices, considering all edges undirected.

A subgraph is a graph whose vertices and edges are a subset of the vertices and edges of the source graph.

A maximal subgraph is a subgraph with property P where all other subgraphs which contain the same vertices do not have that same property P.

See the test suite for an example of how this is used.

Link to this function delete_edge(g, a, b)

Removes an edge connecting a to b. If no such vertex exits, or the edge does not exist, it is effectively a no-op.

Example

iex> g = Graph.new |> Graph.add_edge(:a, :b) |> Graph.delete_edge(:a, :b)
...> [:a, :b] = Graph.vertices(g)
...> Graph.edges(g)
[]
Link to this function delete_edges(g, es)
delete_edges(t, [{vertex, vertex}]) ::
  t |
  {:error, {:invalid_vertex_pair, term}}

Like delete_edge/3, but takes a list of vertex pairs, and deletes the corresponding edge from the graph, if it exists.

Examples

iex> g = Graph.new |> Graph.add_vertices([:a, :b, :c]) |> Graph.add_edge(:a, :b)
...> g = Graph.delete_edges(g, [{:a, :b}])
...> Graph.edges(g)
[]

iex> g = Graph.new |> Graph.add_vertices([:a, :b, :c]) |> Graph.add_edge(:a, :b)
...> Graph.delete_edges(g, [:a])
{:error, {:invalid_vertex_pair, :a}}
Link to this function delete_vertex(g, vertex)
delete_vertex(t, vertex) :: t

Removes a vertex from the graph, as well as any edges which refer to that vertex. If the vertex does not exist in the graph, it is a no-op.

Example

iex> g = Graph.new |> Graph.add_vertex(:a) |> Graph.add_vertex(:b) |> Graph.add_edge(:a, :b)
...> [:a, :b] = Graph.vertices(g)
...> [{:a, :b}] = Graph.edges(g)
...> g = Graph.delete_vertex(g, :b)
...> [:a] = Graph.vertices(g)
...> Graph.edges(g)
[]
Link to this function delete_vertices(g, vs)
delete_vertices(t, [vertex]) :: t

Like delete_vertex/2, but takes a list of vertices to delete from the graph.

Example

iex> g = Graph.new |> Graph.add_vertices([:a, :b, :c]) |> Graph.delete_vertices([:a, :b])
...> Graph.vertices(g)
[:c]
Link to this function edges(graph)
edges(t) :: [edge]

Return a list of all the edges, where each edge is expressed as a tuple of {A, B}, where the elements are the vertices involved, and implying the direction of the edge to be from A to B.

NOTE: You should be careful when using this on dense graphs, as it produces lists with whatever you’ve provided as vertices, with likely many copies of each. I’m not sure if those copies are shared in-memory as they are unchanged, so it should be fairly compact in memory, but I have not verified that to be sure.

Example

iex> g = Graph.new |> Graph.add_vertex(:a) |> Graph.add_vertex(:b) |> Graph.add_vertex(:c)
...> g = g |> Graph.add_edge(:a, :c) |> Graph.add_edge(:b, :c)
...> Graph.edges(g)
[{:a, :c}, {:b, :c}]
Link to this function get_paths(g, a, b)
get_paths(t, vertex, vertex) :: [[vertex]]

Builds a list of paths between vertex a and vertex b.

The algorithm used here is a depth-first search, which evaluates the whole graph until all paths are found. Order is guaranteed to be deterministic, but not guaranteed to be in any meaningful order (i.e. shortest to longest).

Example usages can be found in the test suite.

Link to this function get_shortest_path(g, a, b)
get_shortest_path(t, vertex, vertex) :: [vertex]

Gets the shortest path between a and b. If there are multiple paths of the same length, where all are “shortest”, this implementation simply takes the first one encountered.

The algorithm used here is a breadth-first search, which stops evaluation as soon as the shortest path to b is found. It follows very closely the implementation of get_short_path from :digraph, the only difference being in how some lookups, etc. are performed.

Example usages can be found in the test suite.

Link to this function in_degree(graph, v)

Returns the in-degree of vertex v of graph g.

The in-degree of a vertex is the number of edges directed inbound towards that vertex.

Link to this function in_neighbors(graph, v)
in_neighbors(t, vertex) :: [vertex]

Returns a list of vertices which all have edges coming in to the given vertex v.

Link to this function info(g)
info(t) :: %{num_edges: non_neg_integer, num_vertices: non_neg_integer, memory: non_neg_integer}

Returns a map of summary information about this graph.

Link to this function is_acyclic?(g)
is_acyclic?(t) :: boolean

Returns true if and only if the graph g is acyclic.

Link to this function is_arborescence?(g)
is_arborescence?(t) :: boolean

Returns true if the graph is an aborescence, a directed acyclic graph, where the root, a vertex, of the arborescence has a unique path from itself to every other vertex in the graph.

Link to this function is_cyclic?(g)
is_cyclic?(t) :: boolean

Returns true if the graph g is not acyclic.

Link to this function is_tree?(g)
is_tree?(t) :: boolean

Returns true if and only if the graph g is a tree.

Link to this function loop_vertices(g)
loop_vertices(t) :: [vertex]

Returns a list of vertices from graph g which are included in a loop.

Link to this function new()
new() :: t

Creates a new graph.

Link to this function out_degree(graph, v)
out_degree(t, vertex) :: non_neg_integer

Returns the out-degree of vertex v of graph g.

The out-degree of a vertex is the number of edges directed outbound from that vertex.

Link to this function out_neighbors(graph, v)
out_neighbors(t, vertex) :: [vertex]

Returns a list of vertices which the given vertex v has edges going to.

Link to this function postorder(g)
postorder(t) :: [vertex]

Returns all vertices of graph g. The order is given by a depth-first traversal of the graph, collecting visited vertices in postorder. More precisely, the vertices visited while searching from an arbitrarily chosen vertex are collected in postorder, and all those collected vertices are placed before the subsequently visited vertices.

Link to this function preorder(g)
preorder(t) :: [vertex]

Returns all vertices of graph g. The order is given by a depth-first traversal of the graph, collecting visited vertices in preorder.

Link to this function reachable(g, vs)
reachable(t, [vertex]) :: [[vertex]]

Returns an unsorted list of vertices from the graph, such that for each vertex in the list (call it v), there is a path in the graph from some vertex of vs to v.

As paths of length zero are allowed, the vertices of vs are also included in the returned list.

Link to this function reachable_neighbors(g, vs)
reachable_neighbors(t, [vertex]) :: [[vertex]]

Returns an unsorted list of vertices from the graph, such that for each vertex in the list (call it v), there is a path in the graph of length one or more from some vertex of vs to v.

As a consequence, only those vertices of vs that are included in some cycle are returned.

Link to this function reaching(g, vs)
reaching(t, [vertex]) :: [[vertex]]

Returns an unsorted list of vertices from the graph, such that for each vertex in the list (call it v), there is a path from v to some vertex of vs.

As paths of length zero are allowed, the vertices of vs are also included in the returned list.

Link to this function reaching_neighbors(g, vs)
reaching_neighbors(t, [vertex]) :: [[vertex]]

Returns an unsorted list of vertices from the graph, such that for each vertex in the list (call it v), there is a path of length one or more from v to some vertex of vs.

As a consequence, only those vertices of vs that are included in some cycle are returned.

Link to this function strong_components(g)
strong_components(t) :: [[vertex]]

Returns a list of strongly connected components, where each component is a list of vertices.

A strongly connected component is a maximal subgraph such that there is a path between each pair of vertices.

See components/1 for the definitions of subgraph and maximal subgraph if you are unfamiliar with the terminology.

See the test suite for an example of how this is used.

Link to this function subgraph(g, vs)
subgraph(t, [vertex]) :: t

Builds a maximal subgraph of g which includes all of the vertices in vs and the edges which connect them.

See the test suite for example usage.

Link to this function topsort(g)
topsort(t) :: [vertex]

Returns a topological ordering of the vertices of graph g, if such an ordering exists, otherwise it returns false. For each vertex in the returned list, no out-neighbors occur earlier in the list.

Link to this function vertices(graph)
vertices(t) :: vertex

Returns a list of all the vertices in the graph.

NOTE: You should be careful when using this on large graphs, as the list it produces contains every vertex on the graph. I have not yet verified whether Erlang ensures that they are a shared reference with the original, or copies, but if the latter it could result in running out of memory if the graph is too large.

Example

iex> g = Graph.new |> Graph.add_vertex(:a) |> Graph.add_vertex(:b)
...> Graph.vertices(g)
[:a, :b]