Edifice.Meta.MixtureOfAgents (Edifice v0.2.0)

Mixture of Agents: N proposer models feed into an aggregator.

Implements a multi-agent architecture in which N independent "proposer" transformer stacks process the same input in parallel. Their outputs are concatenated along the feature dimension, projected to the aggregator's hidden size, and fed into a larger "aggregator" transformer stack that combines the proposals.

Architecture

Input [batch, seq, embed_dim]
      |
      +----+----+----+----+
      |    |    |    |    |
      v    v    v    v    v
     P1   P2   P3   P4  ...  (Proposer stacks)
      |    |    |    |    |
      v    v    v    v    v
Concatenate along feature dim
      |
      v
[batch, seq, num_proposers * proposer_hidden]
      |
      v
Dense projection to aggregator_hidden
      |
      v
+-----------------------------+
|   Aggregator Transformer    |
|   (larger, combines all)    |
+-----------------------------+
      |
      v
Final Norm -> Last Timestep
      |
      v
Output [batch, aggregator_hidden_size]
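The shape changes in the diagram can be traced with plain arithmetic. The following is a minimal sketch in dependency-free Elixir; `MoAShapeSketch` and `trace/3` are hypothetical names used only for illustration and are not part of Edifice. It assumes each proposer emits `[batch, seq, proposer_hidden_size]`, which is what the concatenated width `num_proposers * proposer_hidden` in the diagram implies.

```elixir
defmodule MoAShapeSketch do
  # Hypothetical helper, not part of Edifice: it only mirrors the shapes
  # annotated in the architecture diagram and builds no model.
  def trace(batch, seq, opts) do
    embed_dim = Keyword.fetch!(opts, :embed_dim)
    num_proposers = Keyword.fetch!(opts, :num_proposers)
    proposer_hidden = Keyword.fetch!(opts, :proposer_hidden_size)
    aggregator_hidden = Keyword.fetch!(opts, :aggregator_hidden_size)

    [
      input: {batch, seq, embed_dim},
      # Assumption: every proposer keeps the sequence axis and emits proposer_hidden features
      each_proposer: {batch, seq, proposer_hidden},
      # Concatenation along the feature axis multiplies the width by the proposer count
      concatenated: {batch, seq, num_proposers * proposer_hidden},
      # Dense projection to the aggregator width
      projected: {batch, seq, aggregator_hidden},
      # Final norm, then only the last timestep is kept
      output: {batch, aggregator_hidden}
    ]
  end
end

opts = [
  embed_dim: 287,
  num_proposers: 4,
  proposer_hidden_size: 128,
  aggregator_hidden_size: 256
]

shapes = MoAShapeSketch.trace(32, 16, opts)
IO.inspect(shapes[:concatenated], label: "concatenated")
IO.inspect(shapes[:output], label: "output")
```

With four proposers of hidden size 128, the concatenated feature width is 512 before the projection brings it down to 256.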

Design

Each proposer is a lightweight transformer stack that can specialize in a different aspect of the input. The aggregator is typically larger and learns to combine the diverse proposals into a unified representation.
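The propose, concatenate, aggregate flow can be illustrated with a toy sketch over nested lists. Everything here is an assumption made for illustration: `MoAFlowSketch` is a hypothetical module, the "proposers" and "aggregator" are simple per-timestep functions rather than transformer stacks, and the dense projection and final norm are omitted. It shows only how proposals are joined along the feature axis and how the last timestep is selected.

```elixir
defmodule MoAFlowSketch do
  # Hypothetical toy, not the Edifice implementation.
  # A sequence is a list of timesteps; each timestep is a list of floats.

  # Run every proposer over the same sequence, then join the proposals
  # timestep by timestep along the feature axis.
  def propose_and_concat(sequence, proposers) do
    proposers
    |> Enum.map(fn proposer -> Enum.map(sequence, proposer) end)
    |> Enum.zip_with(&Enum.concat/1)
  end

  # Aggregate the concatenated features and keep only the last timestep.
  def run(sequence, proposers, aggregator) do
    sequence
    |> propose_and_concat(proposers)
    |> Enum.map(aggregator)
    |> List.last()
  end
end

sequence = [[1.0, 2.0], [3.0, 4.0]]

# Two stand-in proposers that see the same input but transform it differently
proposers = [
  fn features -> Enum.map(features, &(&1 * 2)) end,
  fn features -> Enum.map(features, &(&1 + 1)) end
]

# Stand-in aggregator: collapses the concatenated features into one value
aggregator = fn features -> [Enum.sum(features)] end

concatenated = MoAFlowSketch.propose_and_concat(sequence, proposers)
result = MoAFlowSketch.run(sequence, proposers, aggregator)
IO.inspect(concatenated, label: "concatenated")
IO.inspect(result, label: "last timestep")
```

In the real model each stack attends across timesteps, so the per-timestep functions above stand in for the data layout only, not for the computation.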

Usage

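alias Edifice.Meta.MixtureOfAgents

# Four 2-layer proposers (hidden size 128) feeding a 2-layer aggregator (hidden size 256)
model = MixtureOfAgents.build(
  embed_dim: 287,
  num_proposers: 4,
  proposer_hidden_size: 128,
  aggregator_hidden_size: 256,
  proposer_layers: 2,
  aggregator_layers: 2
)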

References

Wang et al., "Mixture-of-Agents Enhances Large Language Model Capabilities" (2024)

Summary

Types

build_opt()

Options for build/1.

Functions

build(opts \\ [])

Build a Mixture of Agents model.

output_size(opts \\ [])

Get the output size of a MixtureOfAgents model.

recommended_defaults()

Get recommended defaults for MixtureOfAgents.