Bumblebee.Vision.ConvNext (Bumblebee v0.4.2)
ConvNeXT model family.
Architectures
:base
- plain ConvNeXT without any head on top
:for_image_classification
- ConvNeXT with a classification head. The head consists of a single dense layer on top of the pooled features
Inputs
"pixel_values"
- {batch_size, height, width, num_channels}
Featurized image pixel values (224x224).
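The input layout can be illustrated with a minimal sketch, assuming Nx is installed via Mix.install and the snippet is run as a standalone script. The zero-filled tensor is only a stand-in for real featurized pixels; in practice a featurizer would produce these values.

```elixir
# Assumption: run as a standalone script; Nx is fetched on first run.
Mix.install([{:nx, "~> 0.6"}])

# A dummy batch with one 224x224 RGB image in channels-last layout:
# {batch_size, height, width, num_channels}
pixel_values = Nx.broadcast(0.0, {1, 224, 224, 3})

# The model expects a map keyed by the input name
inputs = %{"pixel_values" => pixel_values}

IO.inspect(Nx.shape(inputs["pixel_values"]))
# => {1, 224, 224, 3}
```

Note that the channel dimension comes last, so the final axis should match :num_channels.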
Configuration
:num_channels
- the number of channels in the input. Defaults to 3
:patch_size
- the size of the patch spatial dimensions. Defaults to 4
:hidden_sizes
- the dimensionality of hidden layers at each stage. Defaults to [96, 192, 384, 768]
:depths
- the depth (number of residual blocks) at each stage. Defaults to [3, 3, 9, 3]
:activation
- the activation function. Defaults to :gelu
:scale_initial_value
- the initial value for scaling layers. Defaults to 1.0e-6
:drop_path_rate
- the drop path rate used for stochastic depth. Defaults to 0.0
:layer_norm_epsilon
- the epsilon used by the layer normalization layers. Defaults to 1.0e-12
:initializer_scale
- the standard deviation of the normal initializer used for initializing kernel parameters. Defaults to 0.02
:output_hidden_states
- whether the model should return all hidden states. Defaults to false
:num_labels
- the number of labels to use in the last layer for the classification task. Defaults to 2
:id_to_label
- a map from class index to label. Defaults to %{}
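As a rough sketch of how these options might be set, the example below builds a specification with Bumblebee.configure/2 and turns it into a model with Bumblebee.build_model/1. Both functions belong to Bumblebee's top-level API rather than this module, and the label names are hypothetical. It assumes a standalone script with Bumblebee fetched via Mix.install; the resulting model has freshly initialized, untrained parameters.

```elixir
# Assumption: run as a standalone script; dependencies are fetched on first run.
Mix.install([{:bumblebee, "~> 0.4.2"}])

# Select the architecture and override a few options.
# Options not listed here keep the defaults documented above.
spec =
  Bumblebee.configure(Bumblebee.Vision.ConvNext,
    architecture: :for_image_classification,
    num_labels: 3,
    # Hypothetical labels, used only for illustration
    id_to_label: %{0 => "cat", 1 => "dog", 2 => "bird"}
  )

# Build an Axon model from the specification (no pretrained weights)
model = Bumblebee.build_model(spec)

IO.inspect(spec.architecture)
IO.inspect(spec.num_labels)
IO.inspect(spec.hidden_sizes)
```

To start from pretrained weights instead, a checkpoint would typically be loaded with Bumblebee.load_model/2, optionally passing an adjusted spec.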