Capsule Networks with dynamic routing (Sabour et al., 2017).
Capsule Networks replace scalar neuron activations with vector "capsules"
that encode both the probability of an entity's existence (vector length)
and its instantiation parameters (vector direction). Keeping these
parameters in the vector preserves the spatial relationships between
parts and wholes that CNNs lose through max-pooling.
Key Concepts
- Capsule: A group of neurons whose activity vector represents an entity.
Vector length = probability of entity, direction = entity properties.
- Squash: Non-linear activation that preserves direction but squashes
length to [0, 1):
    v = (||s||^2 / (1 + ||s||^2)) * (s / ||s||)
- Dynamic Routing: Agreement-based routing where lower capsules send
output to higher capsules that "agree" with their predictions.
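
The squash and routing steps above can be sketched in plain Elixir. This is a minimal illustration under stated assumptions, not this module's implementation: it uses nested lists of floats instead of tensors, handles a single example with no batch axis, and follows the routing procedure described by Sabour et al. (2017). The module name `RoutingSketch`, its function names, and the `eps` stabilizer are hypothetical.

```elixir
defmodule RoutingSketch do
  # Hypothetical helper module for illustration only.
  # Vectors are lists of floats; one example, no batch axis.

  def dot(a, b) do
    a |> Enum.zip(b) |> Enum.reduce(0.0, fn {x, y}, acc -> acc + x * y end)
  end

  def norm(v), do: :math.sqrt(dot(v, v))

  # Squash: keeps the direction of s and maps its length into [0, 1).
  # eps is an added assumption that avoids dividing by zero for a zero vector.
  def squash(s, eps \\ 1.0e-8) do
    sq_norm = dot(s, s)
    scale = sq_norm / (1.0 + sq_norm) / :math.sqrt(sq_norm + eps)
    Enum.map(s, fn x -> x * scale end)
  end

  # Numerically stable softmax over a list of logits.
  def softmax(logits) do
    m = Enum.max(logits)
    exps = Enum.map(logits, fn x -> :math.exp(x - m) end)
    total = Enum.sum(exps)
    Enum.map(exps, fn e -> e / total end)
  end

  # u_hat[i][j] is the prediction vector that lower capsule i makes for
  # upper capsule j. Returns the list of upper capsule outputs v[j].
  def route(u_hat, iterations \\ 3) when is_integer(iterations) and iterations >= 1 do
    num_upper = u_hat |> hd() |> length()

    # Routing logits b[i][j] start at zero.
    b0 = Enum.map(u_hat, fn _ -> List.duplicate(0.0, num_upper) end)

    {_b, v} =
      Enum.reduce(1..iterations, {b0, nil}, fn _iter, {b, _v} ->
        # Coupling coefficients: softmax over upper capsules, per lower capsule.
        c = Enum.map(b, &softmax/1)

        # Each upper capsule squashes the coupling-weighted sum of predictions.
        v =
          for j <- 0..(num_upper - 1) do
            u_hat
            |> Enum.zip(c)
            |> Enum.map(fn {preds_i, c_i} ->
              w = Enum.at(c_i, j)
              Enum.map(Enum.at(preds_i, j), fn x -> x * w end)
            end)
            |> Enum.zip_with(&Enum.sum/1)
            |> squash()
          end

        # Agreement update: raise b[i][j] by the dot product of the
        # prediction u_hat[i][j] with the current output v[j].
        new_b =
          Enum.zip_with(u_hat, b, fn preds_i, b_i ->
            Enum.zip_with([preds_i, b_i, v], fn [u_ij, b_ij, v_j] ->
              b_ij + dot(u_ij, v_j)
            end)
          end)

        {new_b, v}
      end)

    v
  end
end
```

For example, `RoutingSketch.squash([3.0, 4.0])` returns a vector pointing in the same direction as the input with length close to 25/26. When two lower capsules make the same prediction for an upper capsule, the agreement update increases their coupling to it on the next iteration.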
Architecture
Input [batch, height, width, channels]
      |
      v
+----------------------------+
| Conv Layer                 |
+----------------------------+
      |
      v
+----------------------------+
| Primary Capsule Layer      |
|  (Conv -> reshape to caps) |
+----------------------------+
      |
      v
+----------------------------+
| Dynamic Routing            |
|  (routing by agreement)    |
+----------------------------+
      |
      v
+----------------------------+
| Digit/Output Capsules      |
+----------------------------+
      |
      v
Output: capsule vectors [batch, num_digit_caps, digit_cap_dim]
Length of each capsule = class probability
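
Reading a prediction off the output capsules can be sketched as follows. This is an illustrative helper under stated assumptions, not part of this module's API: `CapsuleReadout` and its functions are hypothetical, and `capsules` is one example's output as a nested list of shape [num_digit_caps][digit_cap_dim] with no batch axis.

```elixir
defmodule CapsuleReadout do
  # Hypothetical helper module for illustration only.

  # Euclidean length of each output capsule, read as a per-class score.
  def lengths(capsules) do
    Enum.map(capsules, fn v ->
      v |> Enum.reduce(0.0, fn x, acc -> acc + x * x end) |> :math.sqrt()
    end)
  end

  # Index of the longest capsule, taken as the predicted class.
  def predict(capsules) do
    capsules
    |> lengths()
    |> Enum.with_index()
    |> Enum.max_by(fn {len, _idx} -> len end)
    |> elem(1)
  end
end
```

Each length is an independent score below 1 for its own class, so the lengths across classes need not sum to 1.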
Usage
model = Capsule.build(
  input_shape: {nil, 28, 28, 1},
  num_primary_caps: 32,
  primary_cap_dim: 8,
  num_digit_caps: 10,
  digit_cap_dim: 16,
  routing_iterations: 3
)
References