End-to-End Memory Networks (Sukhbaatar et al., 2015).
Memory Networks perform iterative reasoning over a set of memory slots by repeatedly attending to (reading from) memory and updating an internal query state. Each "hop" refines the query, enabling multi-step inference.
How It Works
Given a query q and memories M:
- Compute attention: p_i = softmax(q^T A m_i)
- Read memory: o = Σ_i p_i (C m_i)
- Update query: q' = q + o
- Repeat for K hops
Different embedding matrices A (for attention) and C (for output) at each hop allow the network to focus on different aspects of memory.
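The hop computation above can be sketched numerically. The following is an illustrative numpy sketch, not the library's actual layer code; the per-hop matrices A[k] and C[k] and all dimensions are assumptions chosen for the example:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D vector
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
d, n, K = 4, 5, 3                    # feature dim, memory slots, hops

q = rng.normal(size=d)               # query
M = rng.normal(size=(n, d))          # memory slots, one per row
A = rng.normal(size=(K, d, d))       # hop-specific attention embeddings
C = rng.normal(size=(K, d, d))       # hop-specific output embeddings

for k in range(K):
    p = softmax((M @ A[k].T) @ q)    # attention over memories: p_i = softmax(q^T A m_i)
    o = p @ (M @ C[k].T)             # weighted read: o = sum_i p_i (C m_i)
    q = q + o                        # residual query update: q' = q + o

print(q.shape)  # (4,)
```

Because A[k] and C[k] differ per hop, the attention distribution p can shift between hops even over the same memories, which is what enables multi-step inference.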
Architecture
Query [batch, input_dim]          Memories [batch, num_memories, input_dim]
          |                                        |
          v                                        v
    +-----------+                           +-------------+
    |  Embed A  |                           |   Embed A   |   (attention embedding)
    +-----------+                           +-------------+
          |                                        |
          +-------------> Attention <--------------+
                              |
                       +-------------+
                       |   Embed C   |   (output embedding)
                       +-------------+
                              |
                              v
                       Weighted Sum = o
                              |
                              v
                     q' = q + o  (update query)
                              |
                              v
                       (repeat K hops)
                              |
                              v
                   Output [batch, output_dim]
Usage
model = MemoryNetwork.build(
  input_dim: 128,
  memory_dim: 128,
  num_hops: 3
)
References
- Sukhbaatar et al., "End-To-End Memory Networks" (2015)
- https://arxiv.org/abs/1503.08895
Summary
Functions
Build an End-to-End Memory Network.
Stack multiple memory hops for iterative memory reading.
Perform a single memory hop: attend to memories, read, update query.
Types
@type build_opt() ::
        {:input_dim, pos_integer()}
        | {:memory_dim, pos_integer()}
        | {:num_hops, pos_integer()}
        | {:num_memories, pos_integer()}
        | {:output_dim, pos_integer()}
Options for build/1.
Functions
Build an End-to-End Memory Network.
Options
:input_dim - Input/query feature dimension (required)
:memory_dim - Internal memory embedding dimension (default: 128)
:num_hops - Number of memory reading iterations (default: 3)
:output_dim - Output dimension (default: same as input_dim)
:num_memories - Expected number of memory slots, nil for dynamic (default: nil)
Returns
An Axon model taking query [batch, input_dim] and memories
[batch, num_memories, input_dim], producing [batch, output_dim].
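The shape contract described above can be checked with a toy forward pass. This is a hedged numpy sketch of the data flow, not the Axon model; the projection names (W_in, W_out) and the use of shared A and C across hops are assumptions made for brevity:

```python
import numpy as np

batch, input_dim, memory_dim, output_dim, n_mem, hops = 2, 6, 4, 3, 5, 3
rng = np.random.default_rng(1)

q = rng.normal(size=(batch, input_dim))          # query input
M = rng.normal(size=(batch, n_mem, input_dim))   # memory slots

W_in = rng.normal(size=(input_dim, memory_dim))  # query projection (assumed)
A = rng.normal(size=(input_dim, memory_dim))     # attention embedding
C = rng.normal(size=(input_dim, memory_dim))     # output embedding
W_out = rng.normal(size=(memory_dim, output_dim))

u = q @ W_in                                     # [batch, memory_dim]
for _ in range(hops):
    scores = np.einsum('bnd,bd->bn', M @ A, u)   # attention logits per slot
    p = np.exp(scores - scores.max(-1, keepdims=True))
    p /= p.sum(-1, keepdims=True)                # softmax over memories
    o = np.einsum('bn,bnd->bd', p, M @ C)        # weighted read
    u = u + o                                    # residual update

out = u @ W_out
print(out.shape)  # (2, 3)
```

The final projection maps the refined query from memory_dim to output_dim, matching the documented output shape [batch, output_dim].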
Stack multiple memory hops for iterative memory reading.
Each hop refines the query by attending to memories and incorporating the read result. Later hops can focus on different memory aspects because the query has been updated by previous hops.
Parameters
query - Initial query [batch, memory_dim]
memories - Memory slots [batch, num_memories, input_dim]
Options
:num_hops - Number of hops (default: 3)
:memory_dim - Memory embedding dimension (default: 128)
Returns
Final query after all hops [batch, memory_dim]
Perform a single memory hop: attend to memories, read, update query.
One hop consists of:
- Compute attention weights over memories using current query
- Read a weighted sum from memories (output embedding)
- Add the read result to the query (residual update)
Parameters
query - Current query state [batch, memory_dim]
memories - Memory slots [batch, num_memories, input_dim]
Options
:memory_dim - Memory embedding dimension
:hop_idx - Index of this hop (for naming)
:name - Layer name prefix
Returns
Updated query [batch, memory_dim]
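A single hop of this kind can be sketched as a standalone function. This is a numpy illustration with hypothetical names; the actual function composes Axon layers rather than operating on arrays directly:

```python
import numpy as np

def memory_hop(query, memories, A, C):
    """One hop: attend to memories, read, residual-update the query.

    query: [memory_dim], memories: [num_memories, input_dim],
    A, C: [input_dim, memory_dim] (illustrative, single-example shapes).
    """
    scores = (memories @ A) @ query          # [num_memories] attention logits
    p = np.exp(scores - scores.max())
    p /= p.sum()                             # softmax over memory slots
    o = p @ (memories @ C)                   # [memory_dim] weighted read
    return query + o                         # residual update

rng = np.random.default_rng(2)
d_in, d_m, n = 6, 4, 5
q = rng.normal(size=d_m)
M = rng.normal(size=(n, d_in))
A = rng.normal(size=(d_in, d_m))
C = rng.normal(size=(d_in, d_m))
print(memory_hop(q, M, A, C).shape)  # (4,)
```

The residual form of the update means a hop can only add information to the query; it never discards the state accumulated by earlier hops.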