# `Edifice.Vision.UNet`
[🔗](https://github.com/blasphemetheus/edifice/blob/main/lib/edifice/vision/unet.ex#L1)

U-Net encoder-decoder architecture with skip connections.

Originally designed for biomedical image segmentation, U-Net uses a symmetric
encoder-decoder structure with skip connections that concatenate encoder features
at each level with decoder features, preserving fine-grained spatial information.

This implementation uses genuine 2D convolutions, max-pooling for downsampling,
and transposed convolutions for upsampling, staying faithful to the original paper.
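Concretely, each encoder level pairs a double 3x3 convolution block with a 2x2 max-pool. A minimal sketch of one such level using Axon's layer API (the `conv_block`/`encoder_level` helpers are illustrative, not this module's actual internals):

```elixir
defmodule UNetSketch do
  # Two 3x3 same-padding convolutions, each followed by batch norm + ReLU.
  def conv_block(x, features) do
    x
    |> Axon.conv(features, kernel_size: {3, 3}, padding: :same)
    |> Axon.batch_norm()
    |> Axon.relu()
    |> Axon.conv(features, kernel_size: {3, 3}, padding: :same)
    |> Axon.batch_norm()
    |> Axon.relu()
  end

  # One encoder level: double conv, keep the pre-pool output as the skip
  # connection, then halve the spatial dims with 2x2 max-pooling.
  def encoder_level(x, features) do
    skip = conv_block(x, features)
    down = Axon.max_pool(skip, kernel_size: {2, 2}, strides: [2, 2])
    {down, skip}
  end
end
```

The matching decoder level would upsample with `Axon.conv_transpose`, concatenate the saved `skip` along the channel axis, and apply the same double-conv block.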

## Architecture

```
Image [batch, channels, height, width]
      |
+-----v---------------------+
| Transpose to NHWC         |  [batch, H, W, C]
+---------------------------+
      |
+-----v---------------------+      Skip Connections
| Encoder Level 1           |  ---------+
|   Conv 3x3 + BN + ReLU    |           |
|   Conv 3x3 + BN + ReLU    |           |
|   MaxPool 2x2             |           |
+---------------------------+           |
      |                                 |
+-----v---------------------+           |
| Encoder Level 2           |  ----+    |
|   Conv 3x3 + BN + ReLU    |      |    |
|   Conv 3x3 + BN + ReLU    |      |    |
|   MaxPool 2x2             |      |    |
+---------------------------+      |    |
      |                            |    |
      ... (depth levels)           |    |
      |                            |    |
+-----v---------------------+      |    |
| Bottleneck                |      |    |
|   Conv 3x3 + BN + ReLU    |      |    |
|   Conv 3x3 + BN + ReLU    |      |    |
+---------------------------+      |    |
      |                            |    |
+-----v---------------------+      |    |
| Decoder Level 2           |      |    |
|   ConvTranspose 2x2 (up)  |      |    |
|   Concat skip <-----------+------+    |
|   Conv 3x3 + BN + ReLU    |           |
|   Conv 3x3 + BN + ReLU    |           |
+---------------------------+           |
      |                                 |
+-----v---------------------+           |
| Decoder Level 1           |           |
|   ConvTranspose 2x2 (up)  |           |
|   Concat skip <-----------+-----------+
|   Conv 3x3 + BN + ReLU    |
|   Conv 3x3 + BN + ReLU    |
+---------------------------+
      |
+-----v---------------------+
| Output Conv 1x1           |  [batch, H, W, out_channels]
+---------------------------+
      |
+-----v---------------------+
| Transpose to NCHW         |  [batch, out_channels, H, W]
+---------------------------+
```
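Under the classic U-Net doubling rule (features double and spatial dims halve at each level, as in the original paper), the defaults `base_features: 64`, `depth: 4`, `image_size: 256` give the following per-level NHWC shapes. A quick back-of-the-envelope check:

```elixir
# Per-level spatial size and feature count under the doubling rule
# (illustrative arithmetic; level `depth` is the bottleneck).
base = 64
size = 256
depth = 4

for level <- 0..depth do
  features = base * Integer.pow(2, level)
  spatial = div(size, Integer.pow(2, level))
  {level, {spatial, spatial, features}}
end
# => [{0, {256, 256, 64}}, {1, {128, 128, 128}}, {2, {64, 64, 256}},
#     {3, {32, 32, 512}}, {4, {16, 16, 1024}}]
```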

## Usage

    # Basic U-Net for segmentation
    model = UNet.build(
      in_channels: 3,
      out_channels: 1,
      image_size: 256,
      base_features: 64,
      depth: 4
    )

    # Shallow U-Net for small images
    model = UNet.build(
      in_channels: 1,
      out_channels: 10,
      image_size: 28,
      base_features: 32,
      depth: 3
    )
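A hypothetical end-to-end sketch of building and running the model, assuming Axon's standard `Axon.build/2` init/predict pair and an NCHW input as documented above:

```elixir
model =
  UNet.build(
    in_channels: 3,
    out_channels: 1,
    image_size: 256,
    base_features: 64,
    depth: 4
  )

# Compile the model into an init/predict pair and initialize parameters
# from a shape template.
{init_fn, predict_fn} = Axon.build(model)
params = init_fn.(Nx.template({1, 3, 256, 256}, :f32), %{})

# Run a forward pass on a dummy batch of one image.
image = Nx.broadcast(0.0, {1, 3, 256, 256})
mask = predict_fn.(params, image)
# mask has shape {1, 1, 256, 256}
```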

## References

- "U-Net: Convolutional Networks for Biomedical Image Segmentation"
  (Ronneberger et al., MICCAI 2015)

# `build_opt`

```elixir
@type build_opt() ::
  {:base_features, pos_integer()}
  | {:depth, pos_integer()}
  | {:dropout, float()}
  | {:image_size, pos_integer()}
  | {:in_channels, pos_integer()}
  | {:out_channels, pos_integer()}
  | {:use_attention, boolean()}
```

Options for `build/1`.

# `build`

```elixir
@spec build([build_opt()]) :: Axon.t()
```

Build a U-Net model.

## Options

  - `:in_channels` - Number of input channels (default: 3)
  - `:out_channels` - Number of output channels (default: 1)
  - `:image_size` - Input image size, square (default: 256)
  - `:base_features` - Feature count at first encoder level (default: 64)
  - `:depth` - Number of encoder/decoder levels (default: 4)
  - `:dropout` - Dropout rate (default: 0.0)
  - `:use_attention` - Add attention at bottleneck (default: false)

## Returns

  An Axon model outputting `[batch, out_channels, image_size, image_size]`.

# `output_size`

```elixir
@spec output_size(keyword()) :: pos_integer()
```

Get the output size of a UNet model.

Returns `out_channels * image_size * image_size` (flattened spatial output).
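For example, with the `build/1` defaults (assuming `output_size/1` accepts the same option keys as `build/1`):

```elixir
UNet.output_size(out_channels: 1, image_size: 256)
# => 65_536   (1 * 256 * 256)
```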

---

*Consult [api-reference.md](api-reference.md) for the complete listing.*
