Edifice.Vision.NeRF (Edifice v0.2.0)

NeRF: Neural Radiance Fields network (Mildenhall et al., 2020).

Maps 3D coordinates (and optionally viewing directions) to color and density values using Fourier positional encoding followed by an MLP with skip connections. This is the core network architecture used in Neural Radiance Fields for novel view synthesis.

Architecture

Coordinates [batch, 3]
      |
+-----v--------------------+
| Fourier Encoding          |  gamma(p) = [p, sin(2^0*pi*p), cos(2^0*pi*p), ...]
+---------------------------+
      |
      v
[batch, 3 * (2*L + 1)]
      |
+-----v--------------------+
| Dense Layer 1             |  ReLU
| Dense Layer 2             |  ReLU
| ...                       |
| Dense Layer K (skip_layer)|  Concatenate encoded input, ReLU
| ...                       |
| Dense Layer N             |  ReLU
+---------------------------+
      |
      +---> Density sigma [batch, 1]
      |
      +---> Feature -> concat(directions_encoded) -> Dense -> RGB [batch, 3]
      |
      v
Output [batch, 4]  (RGB + density)
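
The Fourier encoding step in the diagram can be illustrated with a small dependency-free sketch. The module name `FourierSketch` and its functions are hypothetical and are not part of Edifice. The ordering of the sine and cosine terms is an assumption; the library may lay them out differently. The sketch only shows how a point with `d` components and `L` frequency bands becomes a vector of `d * (2*L + 1)` values.

```elixir
defmodule FourierSketch do
  @moduledoc false

  # Size of the encoded vector: the raw input plus one sin and one cos
  # term per frequency band, for each of the `dim` input components.
  def encoded_size(dim, num_frequencies), do: dim * (2 * num_frequencies + 1)

  # Encodes one point (a list of floats) as
  # [p, sin(2^0*pi*p), cos(2^0*pi*p), ..., sin(2^(L-1)*pi*p), cos(2^(L-1)*pi*p)].
  # Term ordering is an assumption made for this sketch.
  def encode(point, num_frequencies) when is_list(point) do
    bands =
      for k <- 0..(num_frequencies - 1)//1,
          fun <- [&:math.sin/1, &:math.cos/1],
          p <- point do
        fun.(:math.pow(2, k) * :math.pi() * p)
      end

    point ++ bands
  end
end

# A 3D point with the default 10 frequency bands.
encoded = FourierSketch.encode([0.5, 0.0, -0.5], 10)
IO.inspect(length(encoded), label: "encoded length")
```

With the default `coord_dim: 3` and `num_frequencies: 10`, the encoded length is 3 * (2 * 10 + 1) = 63, which matches the `[batch, 3 * (2*L + 1)]` shape in the diagram.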

Differences from Vision Models

Unlike other vision models in Edifice, NeRF does not take image inputs. Instead, it takes raw 3D coordinates and optional viewing directions, making it fundamentally a network that maps coordinates to color and density.

Usage

alias Edifice.Vision.NeRF

# With viewing directions
model = NeRF.build(
  hidden_size: 256,
  num_layers: 8,
  skip_layer: 4,
  num_frequencies: 10,
  use_viewdir: true
)

# Without viewing directions
model = NeRF.build(use_viewdir: false)
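
A built model can then be run like any other Axon model. The following is a minimal sketch, not a confirmed recipe: it assumes Edifice, Axon (0.7 or later, for `Axon.ModelState.empty/0`) and Nx are already dependencies of the project, that both inputs have shape `[batch, 3]`, and that the four output channels are ordered RGB first and density last. The input names `"coordinates"` and `"directions"` are taken from the Returns section of `build/1`.

```elixir
# Assumes Edifice, Axon (0.7+) and Nx are already available in the project.
alias Edifice.Vision.NeRF

model = NeRF.build(use_viewdir: true)

# A batch of two sample points and their viewing directions, each [batch, 3].
inputs = %{
  "coordinates" => Nx.tensor([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]),
  "directions" => Nx.tensor([[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
}

# Compile the model and initialise parameters from input templates.
{init_fn, predict_fn} = Axon.build(model)

templates = %{
  "coordinates" => Nx.template({2, 3}, :f32),
  "directions" => Nx.template({2, 3}, :f32)
}

params = init_fn.(templates, Axon.ModelState.empty())

# Per the Returns section, the output has shape {batch, 4}.
output = predict_fn.(params, inputs)

# Assumed channel order: RGB first, density last.
rgb = Nx.slice_along_axis(output, 0, 3, axis: 1)
sigma = Nx.slice_along_axis(output, 3, 1, axis: 1)
```

When the model is built with `use_viewdir: false`, only the `"coordinates"` entry should be needed in both the templates and the inputs.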

Summary

Types

build_opt()

Options for build/1.

Functions

build(opts \\ [])

Build a NeRF network.

output_size(opts \\ [])

Get the output size of a NeRF model.

Types

build_opt()

@type build_opt() ::
  {:coord_dim, pos_integer()}
  | {:dir_dim, pos_integer()}
  | {:hidden_size, pos_integer()}
  | {:num_frequencies, pos_integer()}
  | {:num_layers, pos_integer()}
  | {:skip_layer, pos_integer()}
  | {:use_viewdir, boolean()}

Options for build/1.

Functions

build(opts \\ [])

@spec build([build_opt()]) :: Axon.t()

Build a NeRF network.

Options

  • :coord_dim - Coordinate input dimension (default: 3)
  • :dir_dim - Viewing direction dimension (default: 3)
  • :hidden_size - Hidden layer size (default: 256)
  • :num_layers - Number of MLP layers (default: 8)
  • :skip_layer - Layer index for skip connection (default: 4)
  • :num_frequencies - Number of Fourier frequency bands (default: 10)
  • :use_viewdir - Whether to use viewing direction input (default: true)

Returns

An Axon model that takes "coordinates" (and optionally "directions") inputs and outputs [batch, 4] (RGB + density).

output_size(opts \\ [])

@spec output_size(keyword()) :: pos_integer()

Get the output size of a NeRF model.

Always returns 4 (RGB + density).