Edifice.Meta.Capsule (Edifice v0.2.0)

Capsule Networks with dynamic routing (Sabour et al., 2017).

Capsule Networks replace scalar neuron activations with vector "capsules" that encode both the probability of an entity's existence (vector length) and its instantiation parameters (vector direction). This preserves spatial hierarchies that CNNs lose through max-pooling.

Key Concepts

  • Capsule: A group of neurons whose activity vector represents an entity. Vector length = probability of entity, direction = entity properties.
  • Squash: Non-linear activation that preserves direction but squashes length into [0, 1): v = (||s||^2 / (1 + ||s||^2)) * (s / ||s||)
  • Dynamic Routing: Agreement-based routing where lower capsules send output to higher capsules that "agree" with their predictions.

Architecture

Input [batch, height, width, channels]
      |
      v
+----------------------------+
|    Conv Layer              |
+----------------------------+
      |
      v
+----------------------------+
| Primary Capsule Layer      |
| (Conv -> reshape to caps)  |
+----------------------------+
      |
      v
+----------------------------+
| Dynamic Routing            |
| (routing by agreement)     |
+----------------------------+
      |
      v
+----------------------------+
| Digit/Output Capsules      |
+----------------------------+
      |
      v
Digit capsule vectors: [batch, num_digit_caps, digit_cap_dim]
Output of build/1: length of each capsule [batch, num_digit_caps] = class probability

Usage

alias Edifice.Meta.Capsule

# MNIST-sized input: 28x28 grayscale images, any batch size.
model = Capsule.build(
  input_shape: {nil, 28, 28, 1},
  num_primary_caps: 32,
  primary_cap_dim: 8,
  num_digit_caps: 10,
  digit_cap_dim: 16,
  routing_iterations: 3
)

References

  • Sabour, Frosst, and Hinton, "Dynamic Routing Between Capsules" (2017), arXiv:1710.09829

Summary

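Types

build_opt()
Options for build/1.

Functions

build(opts \\ [])
Build a Capsule Network (CapsNet).

dynamic_routing(input_caps, num_output_caps, output_cap_dim, opts \\ [])
Dynamic routing by agreement between capsule layers.

primary_capsule_layer(input, num_caps, cap_dim, opts \\ [])
Build a primary capsule layer.

squash(tensor)
Squash activation function for capsule vectors.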

Types

build_opt()

@type build_opt() ::
  {:conv_channels, pos_integer()}
  | {:conv_kernel, pos_integer()}
  | {:digit_cap_dim, pos_integer()}
  | {:input_shape, tuple()}
  | {:num_digit_caps, pos_integer()}
  | {:num_primary_caps, pos_integer()}
  | {:primary_cap_dim, pos_integer()}
  | {:primary_kernel, pos_integer()}
  | {:primary_strides, pos_integer()}
  | {:routing_iterations, pos_integer()}

Options for build/1.

Functions

build(opts \\ [])

@spec build([build_opt()]) :: Axon.t()

Build a Capsule Network (CapsNet).

Options

  • :input_shape - Input shape as {nil, height, width, channels} (required)
  • :num_primary_caps - Number of primary capsule types (default: 32)
  • :primary_cap_dim - Dimension of each primary capsule (default: 8)
  • :num_digit_caps - Number of output capsules (default: 10)
  • :digit_cap_dim - Dimension of each output capsule (default: 16)
  • :routing_iterations - Number of dynamic routing iterations (default: 3)
  • :conv_channels - Initial convolution channels (default: 256)
  • :conv_kernel - Initial convolution kernel size (default: 9)
  • :primary_kernel - Primary capsule convolution kernel size (default: 9)
  • :primary_strides - Primary capsule convolution strides (default: 2)

Returns

An Axon model producing capsule norms [batch, num_digit_caps] representing class probabilities.
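To illustrate how capsule norms relate to class predictions, here is a minimal plain-list sketch. It is not Edifice's implementation: the module CapsuleNormSketch and its functions are hypothetical, and the capsule values are made up for the example. It computes the length of each output capsule for one sample and picks the class whose capsule is longest.

```elixir
defmodule CapsuleNormSketch do
  # Hypothetical helper, not part of Edifice.
  # caps: capsule vectors for one sample, one vector per class.
  def norms(caps) do
    Enum.map(caps, fn vec ->
      :math.sqrt(Enum.reduce(vec, 0.0, fn x, acc -> acc + x * x end))
    end)
  end

  # Index of the longest capsule, i.e. the predicted class.
  def predict(caps) do
    caps
    |> norms()
    |> Enum.with_index()
    |> Enum.max_by(fn {norm, _idx} -> norm end)
    |> elem(1)
  end
end

# Three made-up 3-dimensional output capsules for a single sample.
caps = [[0.1, 0.0, 0.2], [0.6, 0.0, 0.7], [0.0, 0.3, 0.0]]
IO.inspect(CapsuleNormSketch.norms(caps))
IO.inspect(CapsuleNormSketch.predict(caps))
```

With an actual model the same reduction would be applied along the last axis of the capsule tensor rather than over lists.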

dynamic_routing(input_caps, num_output_caps, output_cap_dim, opts \\ [])

@spec dynamic_routing(Axon.t(), pos_integer(), pos_integer(), keyword()) :: Axon.t()

Dynamic routing by agreement between capsule layers.

Lower-level capsules predict the output of higher-level capsules via learned transformation matrices. Routing coefficients are iteratively updated based on agreement between predictions and actual outputs.

Algorithm

  1. Initialize routing logits b_ij = 0
  2. For each iteration:
       a. Compute routing coefficients: c_ij = softmax(b_ij)
       b. Compute weighted prediction sum: s_j = sum(c_ij * u_hat_ij)
       c. Apply squash: v_j = squash(s_j)
       d. Update logits: b_ij += u_hat_ij . v_j (agreement)
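The routing steps above can be sketched on plain Elixir lists. This is only an illustration under stated assumptions, not the Nx/Axon code that Edifice uses: the module RoutingSketch is hypothetical, the prediction vectors u_hat are passed in directly instead of being produced by learned transformation matrices, and the softmax is assumed to run over the output capsules j for each input capsule i, as in Sabour et al.

```elixir
defmodule RoutingSketch do
  # Hypothetical plain-list sketch, not Edifice's implementation.
  # u_hat is indexed as u_hat[i][j]: the prediction vector that input
  # capsule i makes for output capsule j.
  @eps 1.0e-8

  def route(u_hat, iterations \\ 3) do
    num_out = length(hd(u_hat))
    # Step 1: every routing logit b_ij starts at zero.
    b0 = Enum.map(u_hat, fn _ -> List.duplicate(0.0, num_out) end)

    {_b, v} =
      Enum.reduce(1..iterations, {b0, nil}, fn _iter, {b, _v} ->
        # Step 2a: coefficients c_ij (softmax over j, an assumption).
        c = Enum.map(b, &softmax/1)

        # Steps 2b and 2c: weighted sum s_j, then v_j = squash(s_j).
        v =
          for j <- 0..(num_out - 1) do
            u_hat
            |> Enum.zip(c)
            |> Enum.map(fn {preds, coeffs} ->
              scale(Enum.at(preds, j), Enum.at(coeffs, j))
            end)
            |> Enum.reduce(&add/2)
            |> squash()
          end

        # Step 2d: raise b_ij by the agreement u_hat_ij . v_j.
        new_b =
          Enum.zip(u_hat, b)
          |> Enum.map(fn {preds, logits} ->
            Enum.zip([preds, logits, v])
            |> Enum.map(fn {u, b_ij, v_j} -> b_ij + dot(u, v_j) end)
          end)

        {new_b, v}
      end)

    v
  end

  def norm(vec), do: :math.sqrt(dot(vec, vec))

  defp squash(vec) do
    sq = dot(vec, vec)
    # eps guards against division by zero for an all-zero vector.
    factor = sq / (1.0 + sq) / (:math.sqrt(sq) + @eps)
    scale(vec, factor)
  end

  defp softmax(row) do
    max = Enum.max(row)
    exps = Enum.map(row, &:math.exp(&1 - max))
    total = Enum.sum(exps)
    Enum.map(exps, &(&1 / total))
  end

  defp dot(a, b), do: Enum.zip(a, b) |> Enum.reduce(0.0, fn {x, y}, acc -> acc + x * y end)
  defp add(a, b), do: Enum.zip(a, b) |> Enum.map(fn {x, y} -> x + y end)
  defp scale(vec, k), do: Enum.map(vec, &(&1 * k))
end

# Two input capsules, two output capsules. Both inputs agree on output 0
# and make opposite predictions for output 1.
u_hat = [[[1.0, 0.0], [1.0, 0.0]], [[1.0, 0.0], [-1.0, 0.0]]]
[v0, v1] = RoutingSketch.route(u_hat, 3)
IO.inspect({RoutingSketch.norm(v0), RoutingSketch.norm(v1)})
```

In this toy input the agreeing predictions reinforce output capsule 0 across iterations, while the conflicting predictions for output capsule 1 cancel, so its length stays at zero.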

Parameters

  • input_caps - Axon node with input capsules [batch, num_input_caps, input_cap_dim]
  • num_output_caps - Number of output capsules
  • output_cap_dim - Dimension of each output capsule

Options

  • :routing_iterations - Number of routing iterations (default: 3)
  • :name - Layer name prefix

Returns

An Axon node with shape [batch, num_output_caps, output_cap_dim]

primary_capsule_layer(input, num_caps, cap_dim, opts \\ [])

@spec primary_capsule_layer(Axon.t(), pos_integer(), pos_integer(), keyword()) ::
  Axon.t()

Build a primary capsule layer.

Converts a standard convolutional feature map into capsule vectors. Uses convolution to produce num_caps * cap_dim channels, then reshapes into capsule vectors and applies the squash activation.

Parameters

  • input - Axon node with conv features [batch, height, width, channels]
  • num_caps - Number of capsule types
  • cap_dim - Dimension of each capsule vector

Options

  • :kernel_size - Convolution kernel size (default: 9)
  • :strides - Convolution strides (default: 2)
  • :name - Layer name prefix

Returns

An Axon node with shape [batch, total_num_capsules, cap_dim] where total_num_capsules = num_caps * spatial_positions
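As a worked example of total_num_capsules, the sketch below computes the spatial size after each convolution for the 28x28 input from the Usage section. It rests on assumptions that this page does not state: both convolutions use no padding, and the initial convolution uses a stride of 1. CapsShapeSketch is a hypothetical helper, not part of Edifice.

```elixir
defmodule CapsShapeSketch do
  # Hypothetical helper, not part of Edifice.
  # Output size of a convolution along one spatial axis, assuming no padding.
  def conv_out(size, kernel, stride), do: div(size - kernel, stride) + 1
end

# Initial convolution: kernel 9, stride 1 (stride is an assumption).
after_conv = CapsShapeSketch.conv_out(28, 9, 1)

# Primary capsule convolution: kernel 9, stride 2 (the documented defaults).
after_primary = CapsShapeSketch.conv_out(after_conv, 9, 2)

# num_caps * spatial_positions, with num_caps = 32.
total = 32 * after_primary * after_primary

IO.inspect({after_conv, after_primary, total})
# prints {20, 6, 1152}
```

If the layers pad their inputs or use different strides, the spatial sizes and therefore the capsule count change accordingly.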

squash(tensor)

@spec squash(Nx.Tensor.t()) :: Nx.Tensor.t()

Squash activation function for capsule vectors.

Non-linear "squashing" that preserves the direction of the vector but scales its magnitude to be between 0 and 1.

v = (||s||^2 / (1 + ||s||^2)) * (s / ||s||)

Short vectors are shrunk to nearly zero length; long vectors are shrunk to a length just below 1. Direction is preserved in both cases.

Parameters

  • tensor - Input tensor [..., cap_dim]

Returns

Squashed tensor with same shape, magnitudes in [0, 1)
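The behaviour described above can be checked with a small plain-list sketch of the formula. This is not the library's Nx implementation: SquashSketch is a hypothetical module, it handles a single capsule vector given as a list of floats, and the small epsilon added to the norm is an assumption made here to avoid dividing by zero.

```elixir
defmodule SquashSketch do
  # Hypothetical helper, not part of Edifice.
  @eps 1.0e-8

  def norm(vec), do: :math.sqrt(Enum.reduce(vec, 0.0, fn x, acc -> acc + x * x end))

  def squash(vec) do
    sq = Enum.reduce(vec, 0.0, fn x, acc -> acc + x * x end)
    # factor = (||s||^2 / (1 + ||s||^2)) / ||s||, with eps guarding ||s|| = 0.
    factor = sq / (1.0 + sq) / (:math.sqrt(sq) + @eps)
    Enum.map(vec, &(&1 * factor))
  end
end

short = SquashSketch.squash([0.03, 0.04])  # input length 0.05
long = SquashSketch.squash([30.0, 40.0])   # input length 50

# The short vector ends up near zero length, the long one just below 1.
IO.inspect(SquashSketch.norm(short))
IO.inspect(SquashSketch.norm(long))
```

Both outputs keep the 3:4 ratio between their components, which is what "direction is preserved" means here.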