# `Edifice.Feedforward.KAN`
[🔗](https://github.com/blasphemetheus/edifice/blob/main/lib/edifice/feedforward/kan.ex#L1)

KAN: Kolmogorov-Arnold Networks with learnable activation functions.

Implements KAN from "KAN: Kolmogorov-Arnold Networks" (Liu et al., 2024).
Based on the Kolmogorov-Arnold representation theorem: any continuous
multivariate function can be written as a finite composition of continuous
univariate functions and addition.

## Key Innovation: Learnable Edge Activations

Unlike MLPs with fixed activations on nodes, KAN has learnable activations on edges:

```
MLP:  y = W2 * sigma(W1 * x)           # Fixed sigma (ReLU, etc.)
KAN:  y = Sum_j Phi_j(x_j)             # Learnable Phi_j per edge
```

Each edge activation is parameterized as:
```
Phi(x) = w_base * SiLU(x) + w_spline * spline(x)
```
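
The parameterization above can be sketched for a single scalar edge in plain Elixir (the real layer operates on Nx tensors; the weights and frequencies below are illustrative values, not the library's defaults):

```elixir
# Hedged sketch: one scalar edge activation Phi(x) with a sine "spline" term,
# mirroring Phi(x) = w_base * SiLU(x) + w_spline * spline(x).
defmodule EdgePhi do
  # SiLU (a.k.a. swish): x * sigmoid(x)
  def silu(x), do: x / (1.0 + :math.exp(-x))

  # Toy spline term: a small bank of sine basis functions, as in the :sine basis.
  # The frequencies here are hypothetical; in the layer they are learned.
  def spline(x, freqs \\ [1.0, 2.0, 3.0]) do
    freqs |> Enum.map(fn w -> :math.sin(w * x) end) |> Enum.sum()
  end

  def phi(x, w_base \\ 0.5, w_spline \\ 0.5) do
    w_base * silu(x) + w_spline * spline(x)
  end
end
```

Note that `Phi(0) = 0` here, since both SiLU and every sine term vanish at the origin.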

## Basis Function Options

This implementation supports multiple basis functions:

| Basis | Formula | Params | Speed |
|-------|---------|--------|-------|
| `:bspline` (default) | Sum c_k*B_k(x) (cubic B-spline) | O(oig) | Medium |
| `:sine` | Sum A_k*sin(omega_k*x + phi_k) | O(oig) | Fast |
| `:chebyshev` | Sum c_n*T_n(x) | O(oig) | Fast |
| `:fourier` | Sum (a_k*cos(k*x) + b_k*sin(k*x)) | O(2oig) | Medium |
| `:rbf` | Sum w_k*exp(-||x-mu_k||^2/(2*sigma^2)) | O(oig) | Medium |

## Architecture

```
Input [batch, seq_len, embed_dim]
      |
      v
+--------------------------------------+
|              KAN Block               |
|  LayerNorm -> KAN Layer -> Residual  |
|  LayerNorm -> KAN Layer -> Residual  |
+--------------------------------------+
      | (repeat for num_layers)
      v
Output [batch, hidden_size]
```

## Usage

    # Build KAN backbone
    model = KAN.build(
      embed_dim: 287,
      hidden_size: 256,
      num_layers: 4,
      grid_size: 8,
      basis: :sine
    )

## Comparison with MLP

| Aspect | MLP | KAN |
|--------|-----|-----|
| Activation | Fixed on nodes | Learnable on edges |
| Interpretability | Low | High (visualizable) |
| Parameters | O(n^2) | O(n^2*g) where g = grid size |
| Best for | General tasks | Symbolic/scientific |
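
The O(n^2*g) scaling can be made concrete with a back-of-the-envelope count (a sketch only; the library's `param_count/1` may also count biases, norms, and projection layers):

```elixir
# Rough KAN parameter estimate: each of the in*out edges carries g basis
# coefficients plus the w_base/w_spline mixing pair. Illustrative only.
defmodule KanCount do
  def edges(n_in, n_out), do: n_in * n_out

  # Coefficients per edge: g spline coefficients + 2 mixing weights.
  def per_edge(grid_size), do: grid_size + 2

  def layer_params(n_in, n_out, grid_size) do
    edges(n_in, n_out) * per_edge(grid_size)
  end
end
```

With `hidden_size: 256` and `grid_size: 8`, a single square KAN layer carries 256 * 256 * 10 = 655,360 parameters under this count, versus 256 * 256 = 65,536 for the equivalent MLP weight matrix.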

## References
- Paper: https://arxiv.org/abs/2404.19756
- SineKAN: https://www.frontiersin.org/articles/10.3389/frai.2024.1462952
- GitHub: https://github.com/KindXiaoming/pykan

# `build_opt`

```elixir
@type build_opt() ::
  {:dropout, float()}
  | {:embed_dim, pos_integer()}
  | {:hidden_size, pos_integer()}
  | {:num_layers, pos_integer()}
  | {:seq_len, pos_integer()}
  | {:window_size, pos_integer()}
```

Options for `build/1`.

# `build`

```elixir
@spec build([build_opt()]) :: Axon.t()
```

Build a KAN model for sequence processing.

## Options
  - `:embed_dim` - Size of input embedding per frame (required)
  - `:hidden_size` - Internal hidden dimension (default: 256)
  - `:num_layers` - Number of KAN blocks (default: 4)
  - `:grid_size` - Number of basis functions per edge (default: 8)
  - `:basis` - Basis function type: `:bspline`, `:sine`, `:chebyshev`, `:fourier`, or `:rbf` (default: `:bspline`)
  - `:dropout` - Dropout rate (default: 0.0)
  - `:window_size` - Expected sequence length (default: 60)
  - `:base_weight` - Weight for base SiLU activation (default: 0.5)

## Returns
  An Axon model that processes sequences and outputs the last hidden state.

# `build_kan_block`

```elixir
@spec build_kan_block(
  Axon.t(),
  keyword()
) :: Axon.t()
```

Build a single KAN block.

KAN block structure:
1. LayerNorm -> KAN Layer -> Residual
2. LayerNorm -> KAN Layer (wider) -> Residual
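
The two residual sub-steps above can be sketched as plain function composition (scalar stand-ins for illustration only; the actual block wires Axon graph nodes, not functions on numbers):

```elixir
# Hedged sketch of the block wiring: x -> x + kan(norm(x)), applied twice.
# `norm` and `kan` are hypothetical stand-ins for the real layers.
defmodule KanBlockSketch do
  def block(x, norm, kan) do
    h = x + kan.(norm.(x))   # 1. LayerNorm -> KAN Layer -> Residual
    h + kan.(norm.(h))       # 2. LayerNorm -> KAN Layer -> Residual
  end
end
```

With an identity norm and a doubling stand-in for the KAN layer, `block(1, ...)` evaluates to 1 + 2 = 3 after the first residual and 3 + 6 = 9 after the second.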

# `build_kan_layer`

```elixir
@spec build_kan_layer(Axon.t(), pos_integer(), keyword()) :: Axon.t()
```

Build a KAN layer with learnable edge activations.

KAN layer computes:
```
y_i = Sum_j Phi_ij(x_j)
```

Where each Phi_ij is approximated as:
```
Phi(x) = w_base * SiLU(x) + w_spline * Sum_k sin(omega_k * x)
```

This implementation uses a combination of:
1. Base activation: SiLU(x) for gradient flow
2. Learnable activation: Multi-frequency sine basis projected through dense layers

# `chebyshev_basis`

Compute Chebyshev polynomial basis functions.

ChebyKAN: y = Sum c_n * T_n(x)
where T_0(x) = 1, T_1(x) = x, T_{n+1}(x) = 2x*T_n(x) - T_{n-1}(x)
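
The recurrence can be evaluated directly; a minimal pure-Elixir sketch (the library version operates on Nx tensors and learns the coefficients c_n):

```elixir
# Chebyshev polynomials of the first kind via the three-term recurrence:
# T_0(x) = 1, T_1(x) = x, T_{n+1}(x) = 2x*T_n(x) - T_{n-1}(x).
defmodule Cheby do
  # Returns [T_0(x), T_1(x), ..., T_{n-1}(x)] — the n basis values at x.
  def basis(x, n) when n >= 2 do
    Enum.reduce(2..(n - 1)//1, [x, 1.0], fn _, [tn, tn1 | _] = acc ->
      [2.0 * x * tn - tn1 | acc]
    end)
    |> Enum.reverse()
  end

  # ChebyKAN output: Sum_n c_n * T_n(x)
  def eval(x, coeffs) do
    basis(x, length(coeffs))
    |> Enum.zip(coeffs)
    |> Enum.map(fn {t, c} -> c * t end)
    |> Enum.sum()
  end
end
```

For example, at x = 0.5 the first four basis values are [1.0, 0.5, -0.5, -1.0].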

# `default_basis`

```elixir
@spec default_basis() :: atom()
```

Default basis function type.

# `default_dropout`

```elixir
@spec default_dropout() :: float()
```

Default dropout rate.

# `default_grid_size`

```elixir
@spec default_grid_size() :: pos_integer()
```

Default grid size (number of basis functions).

# `default_hidden_size`

```elixir
@spec default_hidden_size() :: pos_integer()
```

Default hidden dimension.

# `default_num_layers`

```elixir
@spec default_num_layers() :: pos_integer()
```

Default number of layers.

# `eps`

```elixir
@spec eps() :: float()
```

Epsilon for numerical stability.

# `output_size`

```elixir
@spec output_size(keyword()) :: non_neg_integer()
```

Get the output size of a KAN model.

# `param_count`

```elixir
@spec param_count(keyword()) :: non_neg_integer()
```

Calculate approximate parameter count for a KAN model.

# `rbf_basis`

Compute RBF (Radial Basis Function) basis.

y = Sum w_k * exp(-||x - mu_k||^2 / (2*sigma^2))
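
As a scalar sketch (the centers, width, and weights below are hypothetical; in the layer the weights w_k are learned per edge):

```elixir
# Gaussian RBF basis: each center mu_k contributes exp(-(x - mu_k)^2 / (2*sigma^2)),
# and the output is the weighted sum of those contributions.
defmodule Rbf do
  def basis(x, centers, sigma) do
    Enum.map(centers, fn mu ->
      :math.exp(-:math.pow(x - mu, 2) / (2.0 * sigma * sigma))
    end)
  end

  def eval(x, centers, weights, sigma) do
    basis(x, centers, sigma)
    |> Enum.zip(weights)
    |> Enum.map(fn {b, w} -> w * b end)
    |> Enum.sum()
  end
end
```

When x coincides with a center, that center's basis value is exactly 1.0; farther centers decay toward zero at a rate set by sigma.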

# `recommended_defaults`

```elixir
@spec recommended_defaults() :: keyword()
```

Get recommended defaults for sequence processing.

# `sine_basis`

Compute sine basis functions.

SineKAN: y = Sum_k A_k * sin(omega_k * x + phi_k)

---

*Consult [api-reference.md](api-reference.md) for the complete listing.*
