Bumblebee.Vision.ResNet (Bumblebee v0.4.2)

ResNet model family.

Architectures

:base - plain ResNet without any head on top
:for_image_classification - ResNet with a classification head. The head consists of a single dense layer on top of the pooled features and returns logits corresponding to the possible classes

Inputs

"pixel_values" - {batch_size, height, width, num_channels}
Featurized image pixel values (224x224).
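
As a rough illustration of the channels-last layout described above, the sketch below turns a single channels-first image into a batch of one. It assumes the `nx` package is available (fetched here with `Mix.install`); the zero-filled tensor is only a stand-in for real featurizer output, and the variable names are hypothetical.

```elixir
# Assumption: Nx is fetched on the fly; in a Mix project it would be a regular dependency.
Mix.install([{:nx, "~> 0.6"}])

# Placeholder image in channels-first layout: {num_channels, height, width}.
channels_first = Nx.broadcast(0.0, {3, 224, 224})

pixel_values =
  channels_first
  # Move channels to the last axis: {height, width, num_channels}.
  |> Nx.transpose(axes: [1, 2, 0])
  # Add a leading batch axis: {batch_size, height, width, num_channels}.
  |> Nx.new_axis(0)

# prints {1, 224, 224, 3}
IO.inspect(Nx.shape(pixel_values))
```

In practice a featurizer would normally produce this tensor, so a manual conversion like this is mainly useful for checking shapes.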

Configuration

:num_channels - the number of channels in the input. Defaults to 3
:embedding_size - the dimensionality of the embedding layer. Defaults to 64
:hidden_sizes - the dimensionality of hidden layers at each stage. Defaults to [256, 512, 1024, 2048]
:depths - the depth (number of residual blocks) at each stage. Defaults to [3, 4, 6, 3]
:residual_block_type - the residual block to use, either :basic (used for smaller models, like ResNet-18 or ResNet-34) or :bottleneck (used for larger models, like ResNet-50 and above). Defaults to :bottleneck
:activation - the activation function. Defaults to :relu
:downsample_in_first_stage - whether the first stage should downsample the inputs using a stride of 2. Defaults to false
:output_hidden_states - whether the model should return all hidden states. Defaults to false
:num_labels - the number of labels to use in the last layer for the classification task. Defaults to 2
:id_to_label - a map from class index to label. Defaults to %{}
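
The options above can be combined into a model specification. The following is a minimal sketch, assuming `Bumblebee.configure/2` and `Bumblebee.build_model/1` behave as in Bumblebee v0.4.2; the ResNet-18-style sizes and the three-label task are illustrative choices, not values taken from this page.

```elixir
# Assumption: Bumblebee is fetched on the fly; in a Mix project it would be a regular dependency.
Mix.install([{:bumblebee, "~> 0.4.2"}])

# Start from the module defaults and override the options that differ for a
# ResNet-18-style network. The labels below are made up for illustration.
spec =
  Bumblebee.configure(Bumblebee.Vision.ResNet,
    architecture: :for_image_classification,
    residual_block_type: :basic,
    hidden_sizes: [64, 128, 256, 512],
    depths: [2, 2, 2, 2],
    num_labels: 3,
    id_to_label: %{0 => "cat", 1 => "dog", 2 => "bird"}
  )

# Build the Axon model for this spec; no parameters are initialized or loaded here.
model = Bumblebee.build_model(spec)

IO.inspect(spec.depths)
```

A model built this way has no trained parameters. To adapt a pretrained checkpoint instead, the same options can likely be applied to a spec obtained from `Bumblebee.load_spec/2` and passed to `Bumblebee.load_model/2` through the `:spec` option.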