Program Dependence Graph for Elixir and Erlang.
Reach analyzes Elixir and Erlang source code and builds a graph that captures what depends on what: which expressions produce values consumed by others (data dependence), and which expressions control whether others execute (control dependence).
Building a graph
# Elixir (default)
{:ok, graph} = Reach.string_to_graph("""
def foo(x) do
y = x + 1
if y > 10, do: :big, else: :small
end
""")
# Erlang
{:ok, graph} = Reach.string_to_graph(source, language: :erlang)
# Auto-detected from file extension
{:ok, graph} = Reach.file_to_graph("lib/my_module.ex")
{:ok, graph} = Reach.file_to_graph("src/my_module.erl")Querying
Reach.backward_slice(graph, node_id)
Reach.forward_slice(graph, node_id)
Reach.independent?(graph, id_a, id_b)
Reach.nodes(graph, type: :call, module: Enum)
Reach.nodes(graph, type: :call, module: Enum, function: :map, arity: 2)
Reach.data_flows?(graph, source_id, sink_id)Inspecting nodes
node = Reach.node(graph, some_id)
node.type #=> :call
node.meta #=> %{module: Enum, function: :map, arity: 2}
node.source_span #=> %{file: "lib/foo.ex", start_line: 5, ...}
Reach.pure?(node) #=> true
Summary
Types
A program dependence graph. Built by string_to_graph/2, file_to_graph/2, etc.
Functions
Builds a graph from an already-parsed Elixir AST.
Returns all node IDs that affect the given node (backward slice).
Returns the call graph as a Graph.t().
Returns the children of a block in canonical order.
Returns node IDs on all paths from source to sink.
Returns the effect classification of a node.
Compiles an Elixir source string and builds a graph from the expanded bytecode.
Performs a context-sensitive backward slice through call boundaries.
Returns the control dependencies of a node.
Returns true if controller has a control-dependence edge to target.
Returns the data dependencies of a node.
Returns true if there's a data-dependence path from source to sink.
Returns nodes whose values are never used and have no side effects.
Returns true if there's any dependence path between two nodes.
Returns all dependence edges in the graph.
Reads a source file and builds a program dependence graph.
Same as file_to_graph/2 but raises on error.
Returns all node IDs affected by the given node (forward slice).
Returns the per-function PDG for a {module, function, arity} tuple.
Returns true if the node has data dependents (its value is used elsewhere).
Returns true if two expressions are independent.
Builds a graph from a compiled module (loaded in the VM).
Returns node IDs directly connected to node_id.
Returns the IR node for a given ID, or nil.
Returns all IR nodes, optionally filtered.
Returns true if the path from source to sink passes through
any node matching predicate.
Returns true if the node is pure (no side effects).
Parses a source string and builds a program dependence graph.
Same as string_to_graph/2 but raises on parse error.
Finds data flow paths from taint sources to dangerous sinks.
Exports the graph to DOT format for Graphviz visualization.
Returns the underlying Graph.t() (libgraph) for direct traversal.
Types
@type graph() :: struct()
A program dependence graph. Built by string_to_graph/2, file_to_graph/2, etc.
@type node_id() :: Reach.IR.Node.id()
Functions
Builds a graph from an already-parsed Elixir AST.
Useful when you already have the AST (e.g. from Credo or ExDNA) and don't want to re-parse source.
Returns all node IDs that affect the given node (backward slice).
The backward slice answers: "what does this expression depend on?"
Returns the call graph as a Graph.t().
Vertices are {module, function, arity} tuples.
@spec canonical_order(graph(), Reach.IR.Node.id()) :: [ {Reach.IR.Node.id(), Reach.IR.Node.t()} ]
Returns the children of a block in canonical order.
Independent sibling expressions are sorted by structural hash so that reordered-but-equivalent blocks produce the same sequence. Dependent expressions preserve their relative order.
Returns a list of {node_id, ir_node} pairs.
Example
# These two blocks produce the same canonical order:
# a = 1; b = 2; c = a + b
# b = 2; a = 1; c = a + b
# Because a=1 and b=2 are independent, they get sorted,
# while c=a+b stays last (depends on both).
Returns node IDs on all paths from source to sink.
The chop answers: "how does A influence B?"
@spec classify_effect(Reach.IR.Node.t()) :: Reach.Effects.effect()
Returns the effect classification of a node.
Possible values: :pure, :read, :write, :io, :send,
:receive, :exception, :nif, :unknown.
@spec compiled_to_graph( String.t() | [{module(), binary()}], keyword() ) :: {:ok, graph()} | {:error, term()}
Compiles an Elixir source string and builds a graph from the expanded bytecode.
Unlike string_to_graph/2, this compiles the code first, so macro-expanded
constructs (try/rescue inside macros, use callbacks, etc.) are visible.
The source must define complete modules.
@spec context_sensitive_slice(graph(), Reach.IR.Node.id()) :: [Reach.IR.Node.id()]
Performs a context-sensitive backward slice through call boundaries.
Uses the Horwitz-Reps-Binkley two-phase algorithm to avoid impossible paths through call sites.
@spec control_deps(graph(), Reach.IR.Node.id()) :: [{Reach.IR.Node.id(), term()}]
Returns the control dependencies of a node.
Each entry is {controller_node_id, label}.
Returns true if controller has a control-dependence edge to target.
@spec data_deps(graph(), Reach.IR.Node.id()) :: [{Reach.IR.Node.id(), atom()}]
Returns the data dependencies of a node.
Each entry is {source_node_id, variable_name}.
Returns true if there's a data-dependence path from source to sink.
@spec dead_code(graph()) :: [Reach.IR.Node.t()]
Returns nodes whose values are never used and have no side effects.
A node is dead if:
- It is pure (no side effects)
- No observable output depends on it (return values or effectful calls)
Returns true if there's any dependence path between two nodes.
@spec edges(graph()) :: [Graph.Edge.t()]
Returns all dependence edges in the graph.
Reads a source file and builds a program dependence graph.
The language is auto-detected from the file extension (.ex / .exs
for Elixir, .erl / .hrl for Erlang), or can be set explicitly
via the :language option.
Returns {:ok, graph} or {:error, reason}.
Same as file_to_graph/2 but raises on error.
Returns all node IDs affected by the given node (forward slice).
The forward slice answers: "what does this expression influence?"
@spec function_graph(graph(), {module() | nil, atom(), non_neg_integer()}) :: graph() | nil
Returns the per-function PDG for a {module, function, arity} tuple.
Returns true if the node has data dependents (its value is used elsewhere).
Returns true if two expressions are independent.
Two expressions are independent when:
- No data flows between them in either direction
- They execute under the same conditions (same control dependencies)
- Their side effects don't conflict
Independent expressions can be safely reordered.
Builds a graph from a compiled module (loaded in the VM).
Analyzes the macro-expanded Erlang abstract forms from the BEAM bytecode.
This captures code injected by use, defmacro, and other macros that
the source-level frontend misses.
Requires the module to be compiled with debug info (the default).
@spec neighbors(graph(), Reach.IR.Node.id(), term() | nil) :: [Reach.IR.Node.id()]
Returns node IDs directly connected to node_id.
With no label filter, returns all neighbors (both incoming and outgoing). With a label, returns only neighbors connected by edges with that label.
# All direct neighbors
Reach.neighbors(graph, node_id)
# Only nodes connected by :state_read edges
Reach.neighbors(graph, node_id, :state_read)
# Only data dependencies
Reach.neighbors(graph, node_id, {:data, :x})
@spec node(graph(), Reach.IR.Node.id()) :: Reach.IR.Node.t() | nil
Returns the IR node for a given ID, or nil.
@spec nodes( graph(), keyword() ) :: [Reach.IR.Node.t()]
Returns all IR nodes, optionally filtered.
Options
:type— filter by node type (:call,:match,:var, etc.):module— filter calls by module:function— filter calls by function name:arity— filter by arity
Examples
Reach.nodes(graph, type: :call)
Reach.nodes(graph, type: :call, module: Enum)
Reach.nodes(graph, type: :call, module: Enum, function: :map, arity: 2)
Returns true if the path from source to sink passes through
any node matching predicate.
Useful for taint analysis — check if sanitization occurs between a source and sink.
@spec pure?(Reach.IR.Node.t()) :: boolean()
Returns true if the node is pure (no side effects).
Parses a source string and builds a program dependence graph.
Returns {:ok, graph} or {:error, reason}.
Options
:language—:elixir(default) or:erlang:file— filename for source locations (default:"nofile"):module— module name for call graph resolution
Same as string_to_graph/2 but raises on parse error.
Finds data flow paths from taint sources to dangerous sinks.
Returns a list of %{source: node, sink: node, path: [node_id], sanitized: boolean}
for each source→sink pair where data flows.
Sources, sinks, and sanitizers can be specified as keyword filters
(same format as nodes/2) or as predicate functions.
Options
:sources— keyword filter or predicate identifying taint sources:sinks— keyword filter or predicate identifying dangerous sinks:sanitizers— keyword filter or predicate identifying sanitization points (optional)
Examples
Reach.taint_analysis(graph,
sources: [type: :call, function: :get_param],
sinks: [type: :call, module: System, function: :cmd, arity: 2],
sanitizers: [type: :call, function: :sanitize]
)
# Predicates also work
Reach.taint_analysis(graph,
sources: &(&1.meta[:function] in [:params, :body_params]),
sinks: [type: :call, module: Ecto.Adapters.SQL]
)
Exports the graph to DOT format for Graphviz visualization.
Returns the underlying Graph.t() (libgraph) for direct traversal.
Use this when you need graph operations that Reach doesn't expose — path finding, subgraphs, BFS/DFS, topological sort, etc.
raw = Reach.to_graph(graph)
Graph.vertices(raw) |> length()
Graph.get_shortest_path(raw, id_a, id_b)