Bumblebee.Vision.ResNet (Bumblebee v0.5.3)

ResNet model family.

Architectures

  • :base - plain ResNet without any head on top

  • :for_image_classification - ResNet with a classification head. The head is a single dense layer on top of the pooled features; it returns logits, one per possible class
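
As a minimal sketch of how the two architectures might be selected, the example below configures one spec for each and builds the corresponding Axon models. It assumes the general-purpose `Bumblebee.configure/2` and `Bumblebee.build_model/1` functions; the value `num_labels: 10` is purely illustrative and not taken from this page.

```elixir
Mix.install([{:bumblebee, "~> 0.5.3"}])

# Headless variant: plain ResNet without any head on top.
base_spec = Bumblebee.configure(Bumblebee.Vision.ResNet, architecture: :base)

# Classification variant: adds the dense head that returns logits.
# `num_labels: 10` is an illustrative value (assumption), not a recommendation.
classifier_spec =
  Bumblebee.configure(Bumblebee.Vision.ResNet,
    architecture: :for_image_classification,
    num_labels: 10
  )

# Build the Axon models from the specs.
# No parameters are loaded or initialized at this point.
base_model = Bumblebee.build_model(base_spec)
classifier_model = Bumblebee.build_model(classifier_spec)
```

Building a model this way only defines the network. Pretrained parameters would typically be obtained separately, for example with `Bumblebee.load_model/2` pointed at a ResNet checkpoint.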

Inputs

  • "pixel_values" - {batch_size, height, width, num_channels}

    Featurized image pixel values (224x224).
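
To make the expected input layout concrete, here is a small sketch that builds a placeholder batch in the channels-last shape described above. The all-zeros tensor and the batch size of 2 are assumptions for illustration; real inputs would normally come from a featurizer applied to actual images.

```elixir
Mix.install([{:bumblebee, "~> 0.5.3"}])

# Placeholder batch (assumption): 2 images, 224x224 pixels, 3 channels,
# laid out as {batch_size, height, width, num_channels}.
pixel_values = Nx.broadcast(0.0, {2, 224, 224, 3})

# The model expects a map keyed by the input name.
inputs = %{"pixel_values" => pixel_values}

{batch_size, height, width, num_channels} = Nx.shape(inputs["pixel_values"])
IO.inspect({batch_size, height, width, num_channels}, label: "input shape")
```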

Configuration

  • :num_channels - the number of channels in the input. Defaults to 3

  • :embedding_size - the dimensionality of the embedding layer. Defaults to 64

  • :hidden_sizes - the dimensionality of hidden layers at each stage. Defaults to [256, 512, 1024, 2048]

  • :depths - the depth (number of residual blocks) at each stage. Defaults to [3, 4, 6, 3]

  • :residual_block_type - the residual block to use, either :basic (used for smaller models, like ResNet-18 or ResNet-34) or :bottleneck (used for larger models, like ResNet-50 and above). Defaults to :bottleneck

  • :activation - the activation function. Defaults to :relu

  • :downsample_in_first_stage - whether the first stage should downsample the inputs using a stride of 2. Defaults to false

  • :output_hidden_states - whether the model should return all hidden states. Defaults to false

  • :num_labels - the number of labels to use in the last layer for the classification task. Defaults to 2

  • :id_to_label - a map from class index to label. Defaults to %{}
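
The options above can be combined to describe a smaller network. The sketch below configures a ResNet-18-style classifier, assuming `Bumblebee.configure/2` accepts these options as keywords. The stage sizes and depths follow the usual ResNet-18 layout; the three labels are hypothetical and exist only to show how :num_labels and :id_to_label fit together.

```elixir
Mix.install([{:bumblebee, "~> 0.5.3"}])

spec =
  Bumblebee.configure(Bumblebee.Vision.ResNet,
    architecture: :for_image_classification,
    # Basic residual blocks, as used by the smaller ResNet variants.
    residual_block_type: :basic,
    # ResNet-18-style layout: four stages with two blocks each.
    hidden_sizes: [64, 128, 256, 512],
    depths: [2, 2, 2, 2],
    # Hypothetical label set (assumption); the map size matches :num_labels.
    num_labels: 3,
    id_to_label: %{0 => "cat", 1 => "dog", 2 => "bird"}
  )

# Options that were not overridden keep their documented defaults.
IO.inspect(spec.activation, label: "activation")
IO.inspect(spec.num_channels, label: "num_channels")
```

Options left out of the call, such as :embedding_size and :output_hidden_states, keep the defaults listed above.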