Nasty.Semantic.Coreference.Neural.SpanModel (Nasty v0.3.0)

View Source

End-to-end span-based model for coreference resolution.

This model jointly learns mention detection and coreference resolution in a single end-to-end architecture. It consists of:

  1. Shared BiLSTM encoder
  2. Span scorer head (mention detection)
  3. Pairwise scorer head (coreference resolution)

The model is trained with a joint loss function that combines both tasks.

Architecture

Text  Token Embeddings  BiLSTM  Span Representations
                                                       
                              Span Scorer         Pair Scorer
                              (mention?)          (coref?)

Example

# Build model
model = SpanModel.build_model(
  vocab_size: 10000,
  embed_dim: 100,
  hidden_dim: 256
)

# Initialize parameters
params = SpanModel.init_params(model, template_input)

# Forward pass
{span_scores, coref_scores} = SpanModel.forward(
  model,
  params,
  token_ids,
  spans
)

Summary

Functions

build_encoder(vocab_size, embed_dim, hidden_dim, dropout)

Build the shared BiLSTM encoder.

Parameters

  • vocab_size - Vocabulary size
  • embed_dim - Embedding dimension
  • hidden_dim - LSTM hidden dimension
  • dropout - Dropout rate

Returns

  • Axon model

build_model(opts)

Build the full end-to-end span model.

Parameters

  • opts - Model options

Options

  • :vocab_size - Vocabulary size (required)
  • :embed_dim - Token embedding dimension (default: 100)
  • :hidden_dim - LSTM hidden dimension (default: 256)
  • :width_emb_dim - Span width embedding dimension (default: 20)
  • :max_span_width - Maximum span width (default: 10)
  • :span_scorer_hidden - Span scorer hidden layers (default: [256, 128])
  • :pair_scorer_hidden - Pair scorer hidden layers (default: [512, 256])
  • :dropout - Dropout rate (default: 0.3)

Returns

  • Map with :encoder, :span_scorer, and :pair_scorer models

build_pair_scorer(pair_dim, hidden_layers, dropout)

Build pairwise scorer head.

Scores whether two spans are coreferent.

Parameters

  • pair_dim - Pair representation dimension
  • hidden_layers - Hidden layer sizes
  • dropout - Dropout rate

Returns

  • Axon model

build_span_scorer(span_dim, hidden_layers, dropout)

Build span scorer head.

Scores whether a span is a valid mention.

Parameters

  • span_dim - Span representation dimension
  • hidden_layers - Hidden layer sizes
  • dropout - Dropout rate

Returns

  • Axon model

build_width_embeddings(max_width, embed_dim)

Build learned width embeddings.

Parameters

  • max_width - Maximum span width
  • embed_dim - Embedding dimension

Returns

  • Axon model

compute_loss(span_scores, coref_scores, gold_span_labels, gold_coref_labels, opts \\ [])

Compute joint loss.

Parameters

  • span_scores - Predicted span scores
  • coref_scores - Predicted coreference scores
  • gold_span_labels - Gold span labels (1 = mention, 0 = non-mention)
  • gold_coref_labels - Gold coreference labels
  • opts - Loss options

Options

  • :span_loss_weight - Weight for span loss (default: 0.3)
  • :coref_loss_weight - Weight for coref loss (default: 0.7)

Returns

  • Total loss scalar

extract_pair_features(span1, span2, tokens \\ nil)

Extract pairwise features between two spans.

Features include:

  • Distance (sentence, token)
  • String match (exact, partial, head match)
  • Span properties (lengths, positions)

Parameters

  • span1 - First span
  • span2 - Second span
  • tokens - Document tokens (optional, for string matching)

Returns

  • Feature tensor [20]

forward(models, params, token_ids, spans)

Forward pass through the full model.

Parameters

  • models - Model map (encoder, scorers)
  • params - Parameters map
  • token_ids - Token ID tensor [batch, seq_len]
  • spans - List of span structs

Returns

  • {span_scores, coref_scores} tuple