Bumblebee.Diffusion.VaeKl (Bumblebee v0.5.3)

Variational autoencoder (VAE) with Kullback–Leibler divergence (KL) loss.

Architectures

  • :base - the entire VAE model

  • :encoder - just the encoder part of the base model

  • :decoder - just the decoder part of the base model
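
As a minimal sketch of how an architecture might be selected, the snippet below builds a spec for two of the architectures listed above with `Bumblebee.configure/2` and turns one of them into an Axon model with `Bumblebee.build_model/1`. It assumes Bumblebee ~> 0.5.3 can be fetched with `Mix.install/1`; the variable names are illustrative, and no pretrained parameters are loaded.

```elixir
# Assumption: network access is available so Mix.install can fetch Bumblebee.
Mix.install([{:bumblebee, "~> 0.5.3"}])

# Build a spec per architecture; every other option keeps its default.
base_spec = Bumblebee.configure(Bumblebee.Diffusion.VaeKl, architecture: :base)
decoder_spec = Bumblebee.configure(Bumblebee.Diffusion.VaeKl, architecture: :decoder)

# Turn the decoder spec into an Axon model (graph only, no parameters).
decoder_model = Bumblebee.build_model(decoder_spec)

IO.inspect(base_spec.architecture, label: "base")
IO.inspect(decoder_spec.architecture, label: "decoder")
```

Building only the `:decoder` can be useful when latents come from elsewhere and the encoder is not needed.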

Inputs

  • "sample" - {batch_size, sample_size, sample_size, in_channels}

    Sample input with two spatial dimensions. Note that for the :decoder model, the input usually has lower dimensionality.

  • "sample_posterior" - {}

    When true, the decoder input is sampled from the encoder output distribution. Otherwise, the mode of the distribution is used. This input is only relevant for the :base model. Defaults to false.
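
The input shapes above could be assembled as in the following sketch, which uses only Nx. The shape values mirror the documented defaults; encoding the boolean "sample_posterior" flag as a scalar `u8` tensor is an assumption, not something this page specifies.

```elixir
# Assumption: network access is available so Mix.install can fetch Nx.
Mix.install([{:nx, "~> 0.7"}])

batch_size = 1
sample_size = 32
in_channels = 3

inputs = %{
  # {batch_size, sample_size, sample_size, in_channels}, channels last.
  "sample" => Nx.broadcast(0.0, {batch_size, sample_size, sample_size, in_channels}),
  # Scalar (shape {}) flag. Assumption: 1 stands for true, 0 for false.
  "sample_posterior" => Nx.tensor(1, type: :u8)
}

IO.inspect(Nx.shape(inputs["sample"]), label: "sample shape")
IO.inspect(Nx.shape(inputs["sample_posterior"]), label: "sample_posterior shape")
```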

Configuration

  • :sample_size - the size of the input spatial dimensions. Defaults to 32

  • :in_channels - the number of channels in the input. Defaults to 3

  • :out_channels - the number of channels in the output. Defaults to 3

  • :latent_channels - the number of channels in the latent space. Defaults to 4

  • :hidden_sizes - the dimensionality of hidden layers in each upsample/downsample block. Defaults to [64]

  • :depth - the number of residual blocks in each upsample/downsample block. Defaults to 1

  • :down_block_types - a list of downsample block types. Currently the only supported type is :down_block. Defaults to [:down_block]

  • :up_block_types - a list of upsample block types. Currently the only supported type is :up_block. Defaults to [:up_block]

  • :activation - the activation function. Defaults to :silu
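
The options above can be passed to `Bumblebee.configure/2`. The sketch below first inspects the defaults and then overrides a few of them. The override values are purely illustrative, and using one block type per entry in `:hidden_sizes` is an assumption. Note that a single-element integer list such as the default `:hidden_sizes` may be printed as a charlist (`~c"@"`) by `inspect`, because 64 is the code point of `@`; passing `charlists: :as_lists` shows it as a list.

```elixir
# Assumption: network access is available so Mix.install can fetch Bumblebee.
Mix.install([{:bumblebee, "~> 0.5.3"}])

# Default configuration.
default_spec = Bumblebee.configure(Bumblebee.Diffusion.VaeKl)

# Print the integer list as a list rather than as a charlist.
IO.inspect(default_spec.hidden_sizes, charlists: :as_lists, label: "hidden_sizes")

# Illustrative overrides (hypothetical values, not taken from any checkpoint).
custom_spec =
  Bumblebee.configure(Bumblebee.Diffusion.VaeKl,
    sample_size: 64,
    hidden_sizes: [32, 64],
    # Assumption: one block type per hidden size.
    down_block_types: [:down_block, :down_block],
    up_block_types: [:up_block, :up_block],
    depth: 2
  )

IO.inspect(custom_spec.hidden_sizes, charlists: :as_lists, label: "custom hidden_sizes")
```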

References