libgraph v0.1.0 Graph

This module defines a directed graph data structure, which supports both acyclic and cyclic forms. It also defines the API for creating, manipulating, and querying that structure.

This is intended as a replacement for :digraph, which requires the use of 3 ETS tables at a minimum, but up to 6 at a time during certain operations (such as get_short_path/3). In environments where many graphs are in memory at a time, this can be dangerous, as it is easy to hit the system limit for max ETS tables, which will bring your node down. This graph implementation does not use ETS, so it can be used freely without concern for hitting this limit.

With regards to space requirements, vertices require 2N space, where N is the size of each vertex, as vertices are mapped to indexes (integers), and we store both the vertex->id map and the inverted index (id->vertex) for efficient lookups when querying the graph. Edges are stored as a map of vertex_id -> MapSet(vertex_id), which is the most compact representation we’re allowed while still supporting efficient lookups on edges/neighbors. It is recommended that you avoid stuffing large objects in the graph as vertices, and instead use a key which you can then use to reify a list of objects you care about from ETS or some other source once you have results you care about.

There are benchmarks provided with this library which compare it directly to :digraph for some common operations, and thus far, libgraph outperforms :digraph in all of them.

The only bit of data I have not yet evaluated is how much garbage is generated when querying/manipulating the graph between libgraph and digraph, but I suspect the use of ETS means that digraph is able to keep that to a minimum. Until I verify if that’s the case, I would assume that libgraph has higher memory requirements, but better performance, and is able to side-step the ETS limit. If your requirements, like mine, mean that you are dynamically constructing and querying graphs concurrently, I think libgraph is the better choice - however if you either need the APIs of :digraph that I have not yet implemented, or do not have the same use case, I would stick to :digraph for now.

Link to this section Summary

Types

edge()

t()

vertex()

Functions

add_edge(g, a, b)

Adds an edge connecting a to b. If either a or b do not exist in the graph, they are automatically added. Adding the same edge more than once does not create multiple edges, each edge is only ever stored once

add_edges(g, es)

Like add_edge/3, but takes a list of vertex pairs, and adds an edge to the graph for each pair

add_vertex(g, vertex)

Adds a new vertex to the graph. If the vertex is already present in the graph, the add is a no-op

add_vertices(g, vs)

Like add_vertex/2, but takes a list of vertices to add to the graph

components(g)

Returns a list of connected components, where each component is a list of vertices

delete_edge(g, a, b)

Removes an edge connecting a to b. If no such vertex exits, or the edge does not exist, it is effectively a no-op

delete_edges(g, es)

Like delete_edge/3, but takes a list of vertex pairs, and deletes the corresponding edge from the graph, if it exists

delete_vertex(g, vertex)

Removes a vertex from the graph, as well as any edges which refer to that vertex. If the vertex does not exist in the graph, it is a no-op

delete_vertices(g, vs)

Like delete_vertex/2, but takes a list of vertices to delete from the graph

edges(graph)

Return a list of all the edges, where each edge is expressed as a tuple of {A, B}, where the elements are the vertices involved, and implying the direction of the edge to be from A to B

get_paths(g, a, b)

Builds a list of paths between vertex a and vertex b

get_shortest_path(g, a, b)

Gets the shortest path between a and b. If there are multiple paths of the same length, where all are “shortest”, this implementation simply takes the first one encountered

in_degree(graph, v)

Returns the in-degree of vertex v of graph g

in_neighbors(graph, v)

Returns a list of vertices which all have edges coming in to the given vertex v

info(g)

Returns a map of summary information about this graph

is_acyclic?(g)

Returns true if and only if the graph g is acyclic

is_arborescence?(g)

Returns true if the graph is an aborescence, a directed acyclic graph, where the root, a vertex, of the arborescence has a unique path from itself to every other vertex in the graph

is_cyclic?(g)

Returns true if the graph g is not acyclic

is_tree?(g)

Returns true if and only if the graph g is a tree

loop_vertices(g)

Returns a list of vertices from graph g which are included in a loop

new()

Creates a new graph

out_degree(graph, v)

Returns the out-degree of vertex v of graph g

out_neighbors(graph, v)

Returns a list of vertices which the given vertex v has edges going to

postorder(g)

Returns all vertices of graph g. The order is given by a depth-first traversal of the graph, collecting visited vertices in postorder. More precisely, the vertices visited while searching from an arbitrarily chosen vertex are collected in postorder, and all those collected vertices are placed before the subsequently visited vertices

preorder(g)

Returns all vertices of graph g. The order is given by a depth-first traversal of the graph, collecting visited vertices in preorder

reachable(g, vs)

Returns an unsorted list of vertices from the graph, such that for each vertex in the list (call it v), there is a path in the graph from some vertex of vs to v

reachable_neighbors(g, vs)

Returns an unsorted list of vertices from the graph, such that for each vertex in the list (call it v), there is a path in the graph of length one or more from some vertex of vs to v

reaching(g, vs)

Returns an unsorted list of vertices from the graph, such that for each vertex in the list (call it v), there is a path from v to some vertex of vs

reaching_neighbors(g, vs)

Returns an unsorted list of vertices from the graph, such that for each vertex in the list (call it v), there is a path of length one or more from v to some vertex of vs

strong_components(g)

Returns a list of strongly connected components, where each component is a list of vertices

subgraph(g, vs)

Builds a maximal subgraph of g which includes all of the vertices in vs and the edges which connect them

topsort(g)

Returns a topological ordering of the vertices of graph g, if such an ordering exists, otherwise it returns false. For each vertex in the returned list, no out-neighbors occur earlier in the list

vertices(graph)

Returns a list of all the vertices in the graph

Link to this section Types

edge()

edge() :: {vertex, vertex}

t()

vertex()

vertex() :: term

Link to this section Functions

add_edge(g, a, b)

add_edge(t, vertex, vertex) :: t

Example

iex> g = Graph.new |> Graph.add_edge(:a, :b)
...> [:a, :b] = Graph.vertices(g)
...> Graph.edges(g)
[{:a, :b}]

add_edges(g, es)

add_edges(t, [{vertex, vertex}]) ::
  t |
  {:error, {:invalid_vertex_pair, term}}

Like add_edge/3, but takes a list of vertex pairs, and adds an edge to the graph for each pair.

Examples

iex> g = Graph.new |> Graph.add_vertices([:a, :b, :c]) |> Graph.add_edges([{:a, :b}, {:b, :c}])
...> Graph.edges(g)
[{:a, :b}, {:b, :c}]

iex> Graph.new |> Graph.add_vertices([:a, :b, :c]) |> Graph.add_edges([:a, :b])
{:error, {:invalid_vertex_pair, :a}}

add_vertex(g, vertex)

add_vertex(t, vertex) :: t

Adds a new vertex to the graph. If the vertex is already present in the graph, the add is a no-op.

Example

iex> g = Graph.new |> Graph.add_vertex(:a) |> Graph.add_vertex(:a)
...> Graph.vertices(g)
[:a]

add_vertices(g, vs)

add_vertices(t, [vertex]) :: t

Like add_vertex/2, but takes a list of vertices to add to the graph.

Example

iex> g = Graph.new |> Graph.add_vertices([:a, :b, :a])
...> Graph.vertices(g)
[:a, :b]

components(g)

components(t) :: [[vertex]]

Returns a list of connected components, where each component is a list of vertices.

A connected component is a maximal subgraph such that there is a path between each pair of vertices, considering all edges undirected.

A subgraph is a graph whose vertices and edges are a subset of the vertices and edges of the source graph.

A maximal subgraph is a subgraph with property P where all other subgraphs which contain the same vertices do not have that same property P.

See the test suite for an example of how this is used.

delete_edge(g, a, b)

Removes an edge connecting a to b. If no such vertex exits, or the edge does not exist, it is effectively a no-op.

Example

iex> g = Graph.new |> Graph.add_edge(:a, :b) |> Graph.delete_edge(:a, :b)
...> [:a, :b] = Graph.vertices(g)
...> Graph.edges(g)
[]

delete_edges(g, es)

delete_edges(t, [{vertex, vertex}]) ::
  t |
  {:error, {:invalid_vertex_pair, term}}

Like delete_edge/3, but takes a list of vertex pairs, and deletes the corresponding edge from the graph, if it exists.

Examples

iex> g = Graph.new |> Graph.add_vertices([:a, :b, :c]) |> Graph.add_edge(:a, :b)
...> g = Graph.delete_edges(g, [{:a, :b}])
...> Graph.edges(g)
[]

iex> g = Graph.new |> Graph.add_vertices([:a, :b, :c]) |> Graph.add_edge(:a, :b)
...> Graph.delete_edges(g, [:a])
{:error, {:invalid_vertex_pair, :a}}

delete_vertex(g, vertex)

delete_vertex(t, vertex) :: t

Removes a vertex from the graph, as well as any edges which refer to that vertex. If the vertex does not exist in the graph, it is a no-op.

Example

iex> g = Graph.new |> Graph.add_vertex(:a) |> Graph.add_vertex(:b) |> Graph.add_edge(:a, :b)
...> [:a, :b] = Graph.vertices(g)
...> [{:a, :b}] = Graph.edges(g)
...> g = Graph.delete_vertex(g, :b)
...> [:a] = Graph.vertices(g)
...> Graph.edges(g)
[]

delete_vertices(g, vs)

delete_vertices(t, [vertex]) :: t

Like delete_vertex/2, but takes a list of vertices to delete from the graph.

Example

iex> g = Graph.new |> Graph.add_vertices([:a, :b, :c]) |> Graph.delete_vertices([:a, :b])
...> Graph.vertices(g)
[:c]

edges(graph)

edges(t) :: [edge]

Return a list of all the edges, where each edge is expressed as a tuple of {A, B}, where the elements are the vertices involved, and implying the direction of the edge to be from A to B.

NOTE: You should be careful when using this on dense graphs, as it produces lists with whatever you’ve provided as vertices, with likely many copies of each. I’m not sure if those copies are shared in-memory as they are unchanged, so it should be fairly compact in memory, but I have not verified that to be sure.

Example

iex> g = Graph.new |> Graph.add_vertex(:a) |> Graph.add_vertex(:b) |> Graph.add_vertex(:c)
...> g = g |> Graph.add_edge(:a, :c) |> Graph.add_edge(:b, :c)
...> Graph.edges(g)
[{:a, :c}, {:b, :c}]

get_paths(g, a, b)

get_paths(t, vertex, vertex) :: [[vertex]]

Builds a list of paths between vertex a and vertex b.

The algorithm used here is a depth-first search, which evaluates the whole graph until all paths are found. Order is guaranteed to be deterministic, but not guaranteed to be in any meaningful order (i.e. shortest to longest).

Example usages can be found in the test suite.

get_shortest_path(g, a, b)

get_shortest_path(t, vertex, vertex) :: [vertex]

Gets the shortest path between a and b. If there are multiple paths of the same length, where all are “shortest”, this implementation simply takes the first one encountered.

The algorithm used here is a breadth-first search, which stops evaluation as soon as the shortest path to b is found. It follows very closely the implementation of get_short_path from :digraph, the only difference being in how some lookups, etc. are performed.

Example usages can be found in the test suite.

in_degree(graph, v)

Returns the in-degree of vertex v of graph g.

The in-degree of a vertex is the number of edges directed inbound towards that vertex.

in_neighbors(graph, v)

in_neighbors(t, vertex) :: [vertex]

Returns a list of vertices which all have edges coming in to the given vertex v.

info(g)

info(t) :: %{num_edges: non_neg_integer, num_vertices: non_neg_integer, memory: non_neg_integer}

Returns a map of summary information about this graph.

is_acyclic?(g)

is_acyclic?(t) :: boolean

Returns true if and only if the graph g is acyclic.

is_arborescence?(g)

is_arborescence?(t) :: boolean

Returns true if the graph is an aborescence, a directed acyclic graph, where the root, a vertex, of the arborescence has a unique path from itself to every other vertex in the graph.

is_cyclic?(g)

is_cyclic?(t) :: boolean

Returns true if the graph g is not acyclic.

is_tree?(g)

is_tree?(t) :: boolean

Returns true if and only if the graph g is a tree.

loop_vertices(g)

loop_vertices(t) :: [vertex]

Returns a list of vertices from graph g which are included in a loop.

new()

new() :: t

Creates a new graph.

out_degree(graph, v)

out_degree(t, vertex) :: non_neg_integer

Returns the out-degree of vertex v of graph g.

The out-degree of a vertex is the number of edges directed outbound from that vertex.

out_neighbors(graph, v)

out_neighbors(t, vertex) :: [vertex]

Returns a list of vertices which the given vertex v has edges going to.

postorder(g)

postorder(t) :: [vertex]

preorder(g)

preorder(t) :: [vertex]

Returns all vertices of graph g. The order is given by a depth-first traversal of the graph, collecting visited vertices in preorder.

reachable(g, vs)

reachable(t, [vertex]) :: [[vertex]]

Returns an unsorted list of vertices from the graph, such that for each vertex in the list (call it v), there is a path in the graph from some vertex of vs to v.

As paths of length zero are allowed, the vertices of vs are also included in the returned list.

reachable_neighbors(g, vs)

reachable_neighbors(t, [vertex]) :: [[vertex]]

Returns an unsorted list of vertices from the graph, such that for each vertex in the list (call it v), there is a path in the graph of length one or more from some vertex of vs to v.

As a consequence, only those vertices of vs that are included in some cycle are returned.

reaching(g, vs)

reaching(t, [vertex]) :: [[vertex]]

Returns an unsorted list of vertices from the graph, such that for each vertex in the list (call it v), there is a path from v to some vertex of vs.

As paths of length zero are allowed, the vertices of vs are also included in the returned list.

reaching_neighbors(g, vs)

reaching_neighbors(t, [vertex]) :: [[vertex]]

Returns an unsorted list of vertices from the graph, such that for each vertex in the list (call it v), there is a path of length one or more from v to some vertex of vs.

As a consequence, only those vertices of vs that are included in some cycle are returned.

strong_components(g)

strong_components(t) :: [[vertex]]

Returns a list of strongly connected components, where each component is a list of vertices.

A strongly connected component is a maximal subgraph such that there is a path between each pair of vertices.

See components/1 for the definitions of subgraph and maximal subgraph if you are unfamiliar with the terminology.

See the test suite for an example of how this is used.

subgraph(g, vs)

subgraph(t, [vertex]) :: t

Builds a maximal subgraph of g which includes all of the vertices in vs and the edges which connect them.

See the test suite for example usage.

topsort(g)

topsort(t) :: [vertex]

Returns a topological ordering of the vertices of graph g, if such an ordering exists, otherwise it returns false. For each vertex in the returned list, no out-neighbors occur earlier in the list.

vertices(graph)

vertices(t) :: vertex

Returns a list of all the vertices in the graph.

NOTE: You should be careful when using this on large graphs, as the list it produces contains every vertex on the graph. I have not yet verified whether Erlang ensures that they are a shared reference with the original, or copies, but if the latter it could result in running out of memory if the graph is too large.

Example

iex> g = Graph.new |> Graph.add_vertex(:a) |> Graph.add_vertex(:b)
...> Graph.vertices(g)
[:a, :b]