Nasty.Semantic.Coreference.Neural.SpanModel (Nasty v0.3.0)

End-to-end span-based model for coreference resolution.

This model jointly learns mention detection and coreference resolution in a single end-to-end architecture. It consists of:

Shared BiLSTM encoder
Span scorer head (mention detection)
Pairwise scorer head (coreference resolution)

The model is trained with a joint loss function that combines both tasks.

Architecture

Text → Token Embeddings → BiLSTM → Span Representations
                                     ↓                  ↓
                              Span Scorer         Pair Scorer
                              (mention?)          (coref?)

Example

# Build model
model = SpanModel.build_model(
  vocab_size: 10000,
  embed_dim: 100,
  hidden_dim: 256
)

# Initialize parameters
params = SpanModel.init_params(model, template_input)

# Forward pass
{span_scores, coref_scores} = SpanModel.forward(
  model,
  params,
  token_ids,
  spans
)

Summary

Functions

build_encoder(vocab_size, embed_dim, hidden_dim, dropout)

Build the shared BiLSTM encoder.

build_model(opts)

Build the full end-to-end span model.

build_pair_scorer(pair_dim, hidden_layers, dropout)

Build pairwise scorer head.

build_span_scorer(span_dim, hidden_layers, dropout)

Build span scorer head.

build_width_embeddings(max_width, embed_dim)

Build learned width embeddings.

compute_loss(span_scores, coref_scores, gold_span_labels, gold_coref_labels, opts \\ [])

Compute joint loss.

extract_pair_features(span1, span2, tokens \\ nil)

Extract pairwise features between two spans.

forward(models, params, token_ids, spans)

Forward pass through the full model.

Functions

build_encoder(vocab_size, embed_dim, hidden_dim, dropout)

Build the shared BiLSTM encoder.

Parameters

vocab_size - Vocabulary size
embed_dim - Embedding dimension
hidden_dim - LSTM hidden dimension
dropout - Dropout rate

Returns

Axon model

build_model(opts)

Build the full end-to-end span model.

Parameters

opts - Model options

Options

:vocab_size - Vocabulary size (required)
:embed_dim - Token embedding dimension (default: 100)
:hidden_dim - LSTM hidden dimension (default: 256)
:width_emb_dim - Span width embedding dimension (default: 20)
:max_span_width - Maximum span width (default: 10)
:span_scorer_hidden - Span scorer hidden layers (default: [256, 128])
:pair_scorer_hidden - Pair scorer hidden layers (default: [512, 256])
:dropout - Dropout rate (default: 0.3)

Returns

Map with :encoder, :span_scorer, and :pair_scorer models

build_pair_scorer(pair_dim, hidden_layers, dropout)

Build pairwise scorer head.

Scores whether two spans are coreferent.

Parameters

pair_dim - Pair representation dimension
hidden_layers - Hidden layer sizes
dropout - Dropout rate

Returns

Axon model

build_span_scorer(span_dim, hidden_layers, dropout)

Build span scorer head.

Scores whether a span is a valid mention.

Parameters

span_dim - Span representation dimension
hidden_layers - Hidden layer sizes
dropout - Dropout rate

Returns

Axon model

build_width_embeddings(max_width, embed_dim)

Build learned width embeddings.

Parameters

max_width - Maximum span width
embed_dim - Embedding dimension

Returns

Axon model

compute_loss(span_scores, coref_scores, gold_span_labels, gold_coref_labels, opts \\ [])

Compute joint loss.

Parameters

span_scores - Predicted span scores
coref_scores - Predicted coreference scores
gold_span_labels - Gold span labels (1 = mention, 0 = non-mention)
gold_coref_labels - Gold coreference labels
opts - Loss options

Options

:span_loss_weight - Weight for span loss (default: 0.3)
:coref_loss_weight - Weight for coref loss (default: 0.7)

Returns

Total loss scalar

extract_pair_features(span1, span2, tokens \\ nil)

Extract pairwise features between two spans.

Features include:

Distance (sentence, token)
String match (exact, partial, head match)
Span properties (lengths, positions)

Parameters

span1 - First span
span2 - Second span
tokens - Document tokens (optional, for string matching)

Returns

Feature tensor [20]

forward(models, params, token_ids, spans)

Forward pass through the full model.

Parameters

models - Model map (encoder, scorers)
params - Parameters map
token_ids - Token ID tensor [batch, seq_len]
spans - List of span structs

Returns

{span_scores, coref_scores} tuple