Edifice.Blocks.DepthwiseConv (Edifice v0.2.0)

1D depthwise separable convolution block for sequence models.

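Depthwise separable convolution factorizes a standard convolution into a depthwise convolution (one filter per channel) followed by a pointwise 1x1 convolution. This reduces the parameter count from O(C_in * C_out * K) to O(C_in * K + C_in * C_out), where C_in and C_out are the input and output channel counts and K is the kernel size.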
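
To make the saving concrete, here is a minimal sketch of the weight-count arithmetic for the configuration used in the usage example (256 channels, kernel size 31), assuming the output width equals the input width. The `ParamCount` module is a hypothetical helper written for illustration and is not part of Edifice; biases and normalization parameters are ignored.

```elixir
defmodule ParamCount do
  # Hypothetical helper for illustration only; not part of Edifice.
  # Counts convolution weights; biases are ignored.

  # Standard 1D convolution: one K-wide filter per (input, output) channel pair.
  def standard(c_in, c_out, k), do: c_in * c_out * k

  # Depthwise separable: one K-wide filter per input channel,
  # then a 1x1 pointwise projection from c_in to c_out channels.
  def separable(c_in, c_out, k), do: c_in * k + c_in * c_out
end

standard = ParamCount.standard(256, 256, 31)
separable = ParamCount.separable(256, 256, 31)

IO.puts("standard:  #{standard}")   # prints "standard:  2031616"
IO.puts("separable: #{separable}")  # prints "separable: 73472"
```

For this configuration the factorized form needs roughly 28 times fewer weights, and the gap widens as the kernel size grows.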

Used by: Conformer, Mega, StripedHyena, and other hybrid models that need local pattern extraction alongside attention or SSM layers.

Architecture

Input [batch, seq_len, channels]
      |
Depthwise Conv1D (groups = channels)
      |
Optional BatchNorm / LayerNorm
      |
Activation (SiLU by default)
      |
Pointwise Conv1D (1x1)
      |
Output [batch, seq_len, out_channels]
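
The pipeline above can be approximated with plain Axon layers. The following is a sketch under stated assumptions, not the Edifice implementation: it assumes Axon's default channels-last layout, layer normalization for the optional norm step, and left-only padding of `kernel_size - 1` for the causal case. The layer names (`sketch_dw` and so on) and the Axon version pin are illustrative choices.

```elixir
# Assumption: a recent Axon release; adjust the version to match your project.
Mix.install([{:axon, "~> 0.7"}])

channels = 256
out_channels = 256
kernel_size = 31

# Channels-last input [batch, seq_len, channels]; batch and seq_len stay dynamic.
input = Axon.input("input", shape: {nil, nil, channels})

model =
  input
  # Depthwise step: feature_group_size == channels gives each channel its own filter.
  # Padding only on the left by kernel_size - 1 preserves seq_len and keeps the
  # convolution causal (no position sees future time steps).
  |> Axon.conv(channels,
    kernel_size: kernel_size,
    padding: [{kernel_size - 1, 0}],
    feature_group_size: channels,
    name: "sketch_dw"
  )
  # Optional normalization step (layer norm assumed here).
  |> Axon.layer_norm(name: "sketch_norm")
  |> Axon.activation(:silu, name: "sketch_act")
  # Pointwise step: a 1x1 convolution that mixes information across channels.
  |> Axon.conv(out_channels, kernel_size: 1, name: "sketch_pw")
```

The grouped convolution does the per-channel temporal filtering, and all cross-channel mixing is deferred to the cheap 1x1 projection, which is where the parameter saving comes from.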

Usage

output = DepthwiseConv.layer(input, 256, 31, name: "dw_conv")

Summary

Functions

layer(input, channels, kernel_size \\ 31, opts \\ [])

Build a depthwise separable 1D convolution Axon layer.

Functions

layer(input, channels, kernel_size \\ 31, opts \\ [])

@spec layer(Axon.t(), pos_integer(), pos_integer(), keyword()) :: Axon.t()

Build a depthwise separable 1D convolution Axon layer.

Parameters

  • input - Axon node with shape [batch, seq_len, channels]
  • channels - Number of input/depthwise channels
  • kernel_size - Convolution kernel size (default: 31)

Options

  • :out_channels - Output channels for pointwise conv (default: same as channels)
  • :activation - Activation function (default: :silu)
  • :use_norm - Apply layer norm after depthwise conv (default: true)
  • :padding - Padding mode: :causal or :same (default: :causal)
  • :name - Layer name prefix (default: "depthwise_conv")
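
The two `:padding` modes differ in where the zeros go. The sketch below assumes the conventional definitions for a stride-1 convolution (causal pads only on the left by `kernel_size - 1`, same splits the padding across both sides); the source does not spell out Edifice's exact amounts, and `PaddingSketch` is a hypothetical helper, not part of the library.

```elixir
defmodule PaddingSketch do
  # Hypothetical helper for illustration only; not part of Edifice.
  # Returns {left, right} zero-padding under the conventional definitions.

  # Causal: all padding on the left, so output at step t depends only on steps <= t.
  def amounts(:causal, k), do: {k - 1, 0}

  # Same: padding split across both sides, with any odd remainder on the right.
  def amounts(:same, k) do
    total = k - 1
    left = div(total, 2)
    {left, total - left}
  end

  # Output length of a stride-1 convolution over the padded sequence.
  def out_len(seq_len, k, {left, right}), do: seq_len + left + right - k + 1
end

for mode <- [:causal, :same] do
  pad = PaddingSketch.amounts(mode, 31)
  IO.inspect({mode, pad, PaddingSketch.out_len(100, 31, pad)})
end
# prints {:causal, {30, 0}, 100}
# prints {:same, {15, 15}, 100}
```

Both modes preserve `seq_len`; the causal mode is the one to pick for autoregressive models, since the same mode lets each position see up to 15 future steps with the default kernel size.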