Edifice.Meta.Capsule (Edifice v0.2.0)

Capsule Networks with dynamic routing (Sabour et al., 2017).

Capsule Networks replace scalar neuron activations with vector "capsules" that encode both the probability of an entity's existence (vector length) and its instantiation parameters (vector direction). This preserves spatial hierarchies that CNNs lose through max-pooling.

Key Concepts

Capsule: A group of neurons whose activity vector represents an entity. Vector length = probability of entity, direction = entity properties.
Squash: Non-linear activation that preserves direction but squashes length into [0, 1): v = (||s||^2 / (1 + ||s||^2)) * (s / ||s||)
Dynamic Routing: Agreement-based routing where lower-level capsules send more of their output to the higher-level capsules whose outputs "agree" with their predictions.

Architecture

Input [batch, height, width, channels]
      |
      v
+----------------------------+
|    Conv Layer              |
+----------------------------+
      |
      v
+----------------------------+
| Primary Capsule Layer      |
| (Conv -> reshape to caps)  |
+----------------------------+
      |
      v
+----------------------------+
| Dynamic Routing            |
| (routing by agreement)     |
+----------------------------+
      |
      v
+----------------------------+
| Digit/Output Capsules      |
+----------------------------+
      |
      v
Output: capsule vectors [batch, num_digit_caps, digit_cap_dim]
Length of each capsule = class probability (build/1 returns these lengths as [batch, num_digit_caps])

Usage

alias Edifice.Meta.Capsule

model = Capsule.build(
  input_shape: {nil, 28, 28, 1},
  num_primary_caps: 32,
  primary_cap_dim: 8,
  num_digit_caps: 10,
  digit_cap_dim: 16,
  routing_iterations: 3
)

References

Sabour et al., "Dynamic Routing Between Capsules" (2017)
https://arxiv.org/abs/1710.09829

Summary

Types

build_opt()

Options for build/1.

Functions

build(opts \\ [])

Build a Capsule Network (CapsNet).

dynamic_routing(input_caps, num_output_caps, output_cap_dim, opts \\ [])

Dynamic routing by agreement between capsule layers.

primary_capsule_layer(input, num_caps, cap_dim, opts \\ [])

Build a primary capsule layer.

squash(tensor)

Squash activation function for capsule vectors.

Types

build_opt()

@type build_opt() ::
  {:conv_channels, pos_integer()}
  | {:conv_kernel, pos_integer()}
  | {:digit_cap_dim, pos_integer()}
  | {:input_shape, tuple()}
  | {:num_digit_caps, pos_integer()}
  | {:num_primary_caps, pos_integer()}
  | {:primary_cap_dim, pos_integer()}
  | {:primary_kernel, pos_integer()}
  | {:primary_strides, pos_integer()}
  | {:routing_iterations, pos_integer()}

Options for build/1.

Functions

build(opts \\ [])

@spec build([build_opt()]) :: Axon.t()

Build a Capsule Network (CapsNet).

Options

:input_shape - Input shape as {nil, height, width, channels} (required)
:num_primary_caps - Number of primary capsule types (default: 32)
:primary_cap_dim - Dimension of each primary capsule (default: 8)
:num_digit_caps - Number of output capsules (default: 10)
:digit_cap_dim - Dimension of each output capsule (default: 16)
:routing_iterations - Number of dynamic routing iterations (default: 3)
:conv_channels - Initial convolution channels (default: 256)
:conv_kernel - Initial convolution kernel size (default: 9)
:primary_kernel - Primary capsule convolution kernel size (default: 9)
:primary_strides - Primary capsule convolution strides (default: 2)

Returns

An Axon model producing capsule norms [batch, num_digit_caps] representing class probabilities.
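Because the output is one norm per class, a predicted label can be read off with an argmax over the class axis. The snippet below is a minimal sketch of that step only: the `norms` tensor is a hand-written stand-in for the model output, not something produced by `build/1`.

```elixir
Mix.install([{:nx, "~> 0.7"}])

# Hypothetical stand-in for the model output: [batch = 2, num_digit_caps = 3].
# Each value is a capsule length, read as the probability that the class is present.
norms = Nx.tensor([[0.05, 0.91, 0.10], [0.80, 0.02, 0.30]])

# The predicted class is the capsule with the largest length in each row.
predicted = Nx.argmax(norms, axis: 1)

IO.inspect(Nx.to_flat_list(predicted))
# prints [1, 0]
```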

dynamic_routing(input_caps, num_output_caps, output_cap_dim, opts \\ [])

@spec dynamic_routing(Axon.t(), pos_integer(), pos_integer(), keyword()) :: Axon.t()

Dynamic routing by agreement between capsule layers.

Lower-level capsules predict the output of higher-level capsules via learned transformation matrices. Routing coefficients are iteratively updated based on agreement between predictions and actual outputs.

Algorithm

1. Initialize routing logits b_ij = 0
2. For each iteration:
   a. Compute routing coefficients: c_ij = softmax(b_ij)
   b. Compute weighted prediction sum: s_j = sum(c_ij * u_hat_ij)
   c. Apply squash: v_j = squash(s_j)
   d. Update logits: b_ij += u_hat_ij . v_j (agreement)
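The routing loop above can be sketched directly on tensors with Nx. This is an illustration under stated assumptions, not Edifice's implementation: the module `RoutingSketch` and its functions are hypothetical, the predictions `u_hat` are assumed to be already computed with shape [batch, num_input_caps, num_output_caps, output_cap_dim], and the softmax is taken over the output-capsule axis as in Sabour et al. (2017).

```elixir
Mix.install([{:nx, "~> 0.7"}])

defmodule RoutingSketch do
  # Hypothetical helper: squash along the last axis (small eps avoids 0/0).
  def squash(s) do
    sq_norm = Nx.sum(Nx.multiply(s, s), axes: [-1], keep_axes: true)
    norm = Nx.sqrt(Nx.add(sq_norm, 1.0e-8))
    Nx.multiply(Nx.divide(sq_norm, Nx.add(sq_norm, 1.0)), Nx.divide(s, norm))
  end

  # Numerically stable softmax along one axis.
  defp softmax(t, axis) do
    shifted = Nx.subtract(t, Nx.reduce_max(t, axes: [axis], keep_axes: true))
    e = Nx.exp(shifted)
    Nx.divide(e, Nx.sum(e, axes: [axis], keep_axes: true))
  end

  # u_hat: [batch, num_in, num_out, out_dim] (ASSUMED precomputed predictions).
  def route(u_hat, iterations \\ 3) do
    {batch, num_in, num_out, _dim} = Nx.shape(u_hat)
    # Step 1: routing logits b_ij start at zero.
    b0 = Nx.broadcast(0.0, {batch, num_in, num_out})

    {_b, v} =
      Enum.reduce(1..iterations, {b0, nil}, fn _i, {b, _v} ->
        # a. c_ij = softmax of b_ij over the output capsules j
        c = softmax(b, 2)
        # b. s_j = sum over input capsules i of c_ij * u_hat_ij
        s = Nx.sum(Nx.multiply(Nx.new_axis(c, -1), u_hat), axes: [1])
        # c. v_j = squash(s_j)
        v = squash(s)
        # d. b_ij += u_hat_ij . v_j (dot product over the capsule dimension)
        agreement = Nx.sum(Nx.multiply(u_hat, Nx.new_axis(v, 1)), axes: [-1])
        {Nx.add(b, agreement), v}
      end)

    v
  end
end

# Deterministic toy predictions: batch 2, 6 input capsules, 3 output capsules, dim 4.
u_hat = Nx.iota({2, 6, 3, 4}, type: :f32) |> Nx.divide(100) |> Nx.sin()
v = RoutingSketch.route(u_hat)

IO.inspect(Nx.shape(v))
# prints {2, 3, 4}
```

The logits are updated after the final iteration as well, which is harmless here because only the last `v` is returned.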

Parameters

input_caps - Axon node with input capsules [batch, num_input_caps, input_cap_dim]
num_output_caps - Number of output capsules
output_cap_dim - Dimension of each output capsule

Options

:routing_iterations - Number of routing iterations (default: 3)
:name - Layer name prefix

Returns

An Axon node with shape [batch, num_output_caps, output_cap_dim]

primary_capsule_layer(input, num_caps, cap_dim, opts \\ [])

@spec primary_capsule_layer(Axon.t(), pos_integer(), pos_integer(), keyword()) ::
  Axon.t()

Build a primary capsule layer.

Converts a standard convolutional feature map into capsule vectors. Uses convolution to produce num_caps * cap_dim channels, then reshapes into capsule vectors and applies the squash activation.

Parameters

input - Axon node with conv features [batch, height, width, channels]
num_caps - Number of capsule types
cap_dim - Dimension of each capsule vector

Options

:kernel_size - Convolution kernel size (default: 9)
:strides - Convolution strides (default: 2)
:name - Layer name prefix

Returns

An Axon node with shape [batch, total_num_capsules, cap_dim] where total_num_capsules = num_caps * spatial_positions
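As a worked example of total_num_capsules, the sketch below computes the count for a 28x28 input with the documented defaults. It assumes unpadded ("valid") convolutions and a stride of 1 for the initial convolution, which matches the original paper but is not confirmed for Edifice here. `CapsShapeSketch` is a hypothetical helper, not part of the library.

```elixir
defmodule CapsShapeSketch do
  # Output size of an unpadded ("valid") convolution along one spatial axis.
  def conv_out(size, kernel, stride), do: div(size - kernel, stride) + 1

  # Total primary capsules = num_caps * spatial_positions.
  # ASSUMPTION: the initial conv uses stride 1 and neither conv pads its input.
  def total_primary_caps(height, width, opts \\ []) do
    conv_kernel = Keyword.get(opts, :conv_kernel, 9)
    primary_kernel = Keyword.get(opts, :primary_kernel, 9)
    primary_strides = Keyword.get(opts, :primary_strides, 2)
    num_primary_caps = Keyword.get(opts, :num_primary_caps, 32)

    h = height |> conv_out(conv_kernel, 1) |> conv_out(primary_kernel, primary_strides)
    w = width |> conv_out(conv_kernel, 1) |> conv_out(primary_kernel, primary_strides)

    num_primary_caps * h * w
  end
end

# 28 -> 20 after the 9x9 conv, 20 -> 6 after the 9x9 stride-2 conv, so 32 * 6 * 6.
IO.inspect(CapsShapeSketch.total_primary_caps(28, 28))
# prints 1152
```

If the library pads its convolutions, the spatial sizes and therefore the capsule count will differ.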

squash(tensor)

@spec squash(Nx.Tensor.t()) :: Nx.Tensor.t()

Squash activation function for capsule vectors.

Non-linear "squashing" that preserves the direction of the vector but scales its magnitude to be between 0 and 1.

v = (||s||^2 / (1 + ||s||^2)) * (s / ||s||)

Short vectors are shrunk to almost zero length, and long vectors to a length just below 1. Direction is preserved in both cases.

Parameters

tensor - Input tensor [..., cap_dim]

Returns

Squashed tensor with same shape, magnitudes in [0, 1)
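The formula can be checked numerically with a small Nx sketch. The module `SquashSketch` is hypothetical and only mirrors the equation above. The `eps` term is an assumption added to avoid dividing by zero for all-zero capsules, and Edifice's own `squash/1` may handle that case differently.

```elixir
Mix.install([{:nx, "~> 0.7"}])

defmodule SquashSketch do
  # v = (||s||^2 / (1 + ||s||^2)) * (s / ||s||), applied along the last axis.
  def squash(s, eps \\ 1.0e-8) do
    sq_norm = Nx.sum(Nx.multiply(s, s), axes: [-1], keep_axes: true)
    norm = Nx.sqrt(Nx.add(sq_norm, eps))
    scale = Nx.divide(sq_norm, Nx.add(sq_norm, 1.0))
    Nx.multiply(scale, Nx.divide(s, norm))
  end

  # Euclidean length of each capsule vector.
  def lengths(v), do: Nx.sqrt(Nx.sum(Nx.multiply(v, v), axes: [-1]))
end

# One short capsule (length 0.1) and one long capsule (length 5).
caps = Nx.tensor([[0.1, 0.0], [3.0, 4.0]])
squashed = SquashSketch.squash(caps)

IO.inspect(SquashSketch.lengths(squashed))
```

The short capsule ends up with length 0.01 / 1.01, roughly 0.0099, and the long one with length 25 / 26, roughly 0.9615, so both stay inside [0, 1).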