1D depthwise separable convolution block for sequence models.
Depthwise separable convolution factorizes a standard convolution into a
depthwise convolution (per-channel) followed by a pointwise 1x1 convolution.
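This reduces the parameter count from O(C_in * C_out * K) to O(C_in * K + C_in * C_out), where C_in and C_out are the input and output channel counts and K is the kernel size.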
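To make the saving concrete, here is a small worked count for the sizes used in the usage example on this page (256 channels, kernel size 31). It is a sketch only: it assumes the output width equals the input width, ignores bias terms and normalization parameters, and `ParamCountSketch` is a hypothetical helper, not part of this library.

```elixir
# Hypothetical helper for illustration; not part of the documented module.
defmodule ParamCountSketch do
  # Standard 1D convolution: one K-tap filter per (input, output) channel pair.
  def standard(c_in, c_out, k), do: c_in * c_out * k

  # Depthwise stage: one K-tap filter per input channel.
  # Pointwise stage: a 1x1 convolution mixing c_in channels into c_out.
  def separable(c_in, c_out, k), do: c_in * k + c_in * c_out
end

standard = ParamCountSketch.standard(256, 256, 31)
separable = ParamCountSketch.separable(256, 256, 31)

IO.puts("standard:  #{standard}")   # 2031616
IO.puts("separable: #{separable}")  # 73472
IO.puts("ratio:     #{Float.round(standard / separable, 1)}")
```

Under these assumptions the separable form needs roughly 28 times fewer weights, and the gap widens as the kernel size grows.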
Used by: Conformer, Mega, StripedHyena, and other hybrid models that need local pattern extraction alongside attention or SSM layers.
Architecture
Input [batch, seq_len, channels]
|
Depthwise Conv1D (groups = channels)
|
Optional BatchNorm / LayerNorm
|
Activation (SiLU by default)
|
Pointwise Conv1D (1x1)
|
Output [batch, seq_len, out_channels]
Usage
output = DepthwiseConv.layer(input, 256, 31, name: "dw_conv")
Functions
@spec layer(Axon.t(), pos_integer(), pos_integer(), keyword()) :: Axon.t()
Build a depthwise separable 1D convolution Axon layer.
Parameters
input - Axon node with shape [batch, seq_len, channels]
channels - Number of input/depthwise channels
kernel_size - Convolution kernel size (default: 31)
Options
:out_channels - Output channels for pointwise conv (default: same as channels)
:activation - Activation function (default: :silu)
:use_norm - Apply layer norm after depthwise conv (default: true)
:padding - Padding mode: :causal or :same (default: :causal)
:name - Layer name prefix (default: "depthwise_conv")
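The two padding modes differ only in where the padding positions are placed along the sequence axis. The sketch below assumes the conventional definitions for a stride-1, undilated convolution: causal padding puts all kernel_size - 1 positions on the left so that no output step sees future inputs, while same padding splits them around each step. `PaddingSketch` is a hypothetical helper, and the library's internal handling may differ.

```elixir
# Hypothetical helper for illustration; not part of the documented module.
defmodule PaddingSketch do
  # Returns {left, right} padding along the sequence axis.
  # Assumes stride 1 and no dilation.
  def pads(kernel_size, :causal), do: {kernel_size - 1, 0}

  def pads(kernel_size, :same) do
    total = kernel_size - 1
    left = div(total, 2)
    # For even kernel sizes the extra position goes on the right.
    {left, total - left}
  end
end

IO.inspect(PaddingSketch.pads(31, :causal)) # {30, 0}
IO.inspect(PaddingSketch.pads(31, :same))   # {15, 15}
```

In both cases the total padding is kernel_size - 1, so the output keeps the input's seq_len.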