Nasty.Semantic.Coreference.Neural.SpanModel (Nasty v0.3.0)
View SourceEnd-to-end span-based model for coreference resolution.
This model jointly learns mention detection and coreference resolution in a single end-to-end architecture. It consists of:
- Shared BiLSTM encoder
- Span scorer head (mention detection)
- Pairwise scorer head (coreference resolution)
The model is trained with a joint loss function that combines both tasks.
Architecture
Text → Token Embeddings → BiLSTM → Span Representations
↓ ↓
Span Scorer Pair Scorer
(mention?) (coref?)Example
# Build model
model = SpanModel.build_model(
vocab_size: 10000,
embed_dim: 100,
hidden_dim: 256
)
# Initialize parameters
params = SpanModel.init_params(model, template_input)
# Forward pass
{span_scores, coref_scores} = SpanModel.forward(
model,
params,
token_ids,
spans
)
Summary
Functions
Build the shared BiLSTM encoder.
Build the full end-to-end span model.
Build pairwise scorer head.
Build span scorer head.
Build learned width embeddings.
Compute joint loss.
Extract pairwise features between two spans.
Forward pass through the full model.
Functions
Build the shared BiLSTM encoder.
Parameters
vocab_size- Vocabulary sizeembed_dim- Embedding dimensionhidden_dim- LSTM hidden dimensiondropout- Dropout rate
Returns
- Axon model
Build the full end-to-end span model.
Parameters
opts- Model options
Options
:vocab_size- Vocabulary size (required):embed_dim- Token embedding dimension (default: 100):hidden_dim- LSTM hidden dimension (default: 256):width_emb_dim- Span width embedding dimension (default: 20):max_span_width- Maximum span width (default: 10):span_scorer_hidden- Span scorer hidden layers (default: [256, 128]):pair_scorer_hidden- Pair scorer hidden layers (default: [512, 256]):dropout- Dropout rate (default: 0.3)
Returns
- Map with
:encoder,:span_scorer, and:pair_scorermodels
Build pairwise scorer head.
Scores whether two spans are coreferent.
Parameters
pair_dim- Pair representation dimensionhidden_layers- Hidden layer sizesdropout- Dropout rate
Returns
- Axon model
Build span scorer head.
Scores whether a span is a valid mention.
Parameters
span_dim- Span representation dimensionhidden_layers- Hidden layer sizesdropout- Dropout rate
Returns
- Axon model
Build learned width embeddings.
Parameters
max_width- Maximum span widthembed_dim- Embedding dimension
Returns
- Axon model
Compute joint loss.
Parameters
span_scores- Predicted span scorescoref_scores- Predicted coreference scoresgold_span_labels- Gold span labels (1 = mention, 0 = non-mention)gold_coref_labels- Gold coreference labelsopts- Loss options
Options
:span_loss_weight- Weight for span loss (default: 0.3):coref_loss_weight- Weight for coref loss (default: 0.7)
Returns
- Total loss scalar
Extract pairwise features between two spans.
Features include:
- Distance (sentence, token)
- String match (exact, partial, head match)
- Span properties (lengths, positions)
Parameters
span1- First spanspan2- Second spantokens- Document tokens (optional, for string matching)
Returns
- Feature tensor [20]
Forward pass through the full model.
Parameters
models- Model map (encoder, scorers)params- Parameters maptoken_ids- Token ID tensor [batch, seq_len]spans- List of span structs
Returns
{span_scores, coref_scores}tuple