Nasty.Semantic.Coreference.Neural.PairScorer (Nasty v0.3.0)

Neural pairwise coreference scorer.

Scores pairs of mentions for coreference likelihood using a feedforward network over mention representations and hand-crafted features.

Architecture

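  1. Concatenate the two mention encodings [m1, m2]
  2. Extract hand-crafted features for the pair
  3. Concatenate the encodings with the hand-crafted features
  4. Pass the result through a feedforward network (2-3 hidden layers)
  5. Apply a sigmoid output to produce a coreference probability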
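
The five steps above can be sketched directly in Axon. This is a minimal illustration under stated assumptions, not the library's implementation: the module name `PairScorerSketch`, the input names, the ReLU activation, and the placement of dropout after each hidden layer are all hypothetical. The Axon version pin is likewise an assumption.

```elixir
# Hypothetical sketch; not Nasty's actual implementation.
Mix.install([{:axon, "~> 0.6.0"}])

defmodule PairScorerSketch do
  # Builds a feedforward pair scorer: two mention encodings plus a
  # hand-crafted feature vector in, one probability out.
  def build(mention_dim, feature_dim, hidden_dims, dropout) do
    m1 = Axon.input("mention1", shape: {nil, mention_dim})
    m2 = Axon.input("mention2", shape: {nil, mention_dim})
    features = Axon.input("features", shape: {nil, feature_dim})

    # Steps 1-3: join both encodings and the features along the last axis.
    joined = Axon.concatenate([m1, m2, features])

    # Step 4: hidden layers (ReLU and per-layer dropout are assumptions).
    hidden =
      Enum.reduce(hidden_dims, joined, fn units, layer ->
        layer
        |> Axon.dense(units, activation: :relu)
        |> Axon.dropout(rate: dropout)
      end)

    # Step 5: a single sigmoid unit yields the coreference probability.
    Axon.dense(hidden, 1, activation: :sigmoid)
  end
end

model = PairScorerSketch.build(256, 20, [512, 256], 0.3)
{init_fn, predict_fn} = Axon.build(model)

# Placeholder inputs with a batch size of 1.
inputs = %{
  "mention1" => Nx.broadcast(0.1, {1, 256}),
  "mention2" => Nx.broadcast(0.2, {1, 256}),
  "features" => Nx.broadcast(0.0, {1, 20})
}

# Parameters are randomly initialized here, so the score is meaningless;
# a real scorer would use trained parameters instead.
template = Map.new(inputs, fn {name, tensor} -> {name, Nx.to_template(tensor)} end)
params = init_fn.(template, %{})
output = predict_fn.(params, inputs)

IO.inspect(Nx.shape(output), label: "output shape")
```

Because the final layer is a sigmoid, every output value lies between 0.0 and 1.0 regardless of the parameter values.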

Example

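alias Nasty.Semantic.Coreference.Neural.PairScorer

# Build the model
model = PairScorer.build_model(
  mention_dim: 256,
  feature_dim: 20,
  hidden_dims: [512, 256]
)

# Score one pair. `params` holds the trained model parameters; the
# encodings are tensors of size mention_dim and `features` is a tensor
# of size feature_dim (for example, from PairScorer.extract_features/3).
score = PairScorer.score_pair(
  model,
  params,
  mention1_encoding,
  mention2_encoding,
  features
)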

Summary

Functions

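batch_score_pairs(model, params, pairs)
Batch score multiple mention pairs.

build_model(opts \\ [])
Build the pair scorer model.

extract_features(mention1, mention2, document \\ nil)
Extract hand-crafted features from a mention pair.

feature_dim()
Get the feature dimension (number of features extracted).

score_pair(model, params, mention1_encoding, mention2_encoding, features)
Score a pair of mentions for coreference.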

Types

features()

@type features() :: Nx.Tensor.t()

model()

@type model() :: Axon.t()

params()

@type params() :: map()

Functions

batch_score_pairs(model, params, pairs)

@spec batch_score_pairs(
  model(),
  params(),
  [{Nx.Tensor.t(), Nx.Tensor.t(), Nx.Tensor.t()}]
) :: Nx.Tensor.t()

Batch score multiple mention pairs.

Parameters

  • model - Trained model
  • params - Model parameters
  • pairs - List of {m1_encoding, m2_encoding, features} tuples

Returns

Tensor of coreference probabilities with shape [batch_size], one per input pair
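
One plausible way to turn a list of `{m1_encoding, m2_encoding, features}` tuples into batched inputs is to stack each tuple position along a new leading axis. The sketch below shows only that stacking step with toy dimensions. Whether `batch_score_pairs/3` does this internally is an assumption, and the values are illustrative.

```elixir
Mix.install([{:nx, "~> 0.7"}])

# Toy pairs with mention_dim = 2 and feature_dim = 3 (illustrative values).
pairs = [
  {Nx.tensor([1.0, 2.0]), Nx.tensor([3.0, 4.0]), Nx.tensor([0.0, 1.0, 0.5])},
  {Nx.tensor([5.0, 6.0]), Nx.tensor([7.0, 8.0]), Nx.tensor([1.0, 0.0, 0.2])}
]

# Stack the element at `index` of every tuple along a new batch axis.
stack_column = fn index ->
  pairs |> Enum.map(&elem(&1, index)) |> Nx.stack()
end

m1_batch = stack_column.(0)
m2_batch = stack_column.(1)
feature_batch = stack_column.(2)

IO.inspect(Nx.shape(m1_batch), label: "m1 batch")
IO.inspect(Nx.shape(feature_batch), label: "feature batch")
```

With two pairs, `m1_batch` and `m2_batch` have shape `{2, 2}` and `feature_batch` has shape `{2, 3}`, so a single forward pass can score both pairs.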

build_model(opts \\ [])

@spec build_model(keyword()) :: model()

Build the pair scorer model.

Options

  • :mention_dim - Dimension of mention encodings (required)
  • :feature_dim - Number of hand-crafted features (default: 20)
  • :hidden_dims - List of hidden layer dimensions (default: [512, 256])
  • :dropout - Dropout rate (default: 0.3)

Returns

Axon model that takes a mention pair and its features, and returns a coreference probability
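
The option handling described above (one required key, three defaults) can be sketched in plain Elixir. `BuildOptsSketch` is a hypothetical helper for illustration only. How `build_model/1` actually validates its options is not documented here, so raising `KeyError` for a missing `:mention_dim` is an assumption.

```elixir
defmodule BuildOptsSketch do
  # Defaults as listed in the options above.
  @defaults [feature_dim: 20, hidden_dims: [512, 256], dropout: 0.3]

  # Merges caller options over the defaults and returns them as a map.
  def resolve(opts) do
    # :mention_dim has no default, so a missing key raises KeyError.
    _mention_dim = Keyword.fetch!(opts, :mention_dim)

    @defaults
    |> Keyword.merge(opts)
    |> Map.new()
  end
end

resolved = BuildOptsSketch.resolve(mention_dim: 256, hidden_dims: [128])
IO.inspect(resolved)
```

Here `:hidden_dims` is overridden to `[128]`, while `:feature_dim` and `:dropout` keep their defaults of 20 and 0.3.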

extract_features(mention1, mention2, document \\ nil)

@spec extract_features(
  Nasty.AST.Semantic.Mention.t(),
  Nasty.AST.Semantic.Mention.t(),
  map() | nil
) ::
  Nx.Tensor.t()

Extract hand-crafted features from a mention pair.

Features include:

  • Distance features (sentence, token, mention)
  • String match features (exact, partial, head)
  • Mention type features (pronoun, name, nominal)
  • Agreement features (gender, number)
  • Syntactic features (same sentence, same paragraph)

Parameters

  • mention1 - First mention
  • mention2 - Second mention
  • document - Document context (optional, for additional features)

Returns

Feature vector as tensor [feature_dim]
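
To make the feature groups concrete, here is a small sketch that computes a handful of them from plain maps. It is not the library's feature set: the map keys (`:text`, `:head`, `:type`, `:gender`, `:number`, `:sentence_idx`) are hypothetical stand-ins for whatever `Nasty.AST.Semantic.Mention` provides, and only 7 features are shown rather than the full vector.

```elixir
defmodule FeatureSketch do
  # Converts a boolean into a 0.0 / 1.0 indicator.
  defp flag(true), do: 1.0
  defp flag(false), do: 0.0

  # Returns a plain list of floats; Nx.tensor/1 could wrap it in a tensor.
  def extract(m1, m2) do
    [
      # Distance: number of sentences between the two mentions.
      abs(m2.sentence_idx - m1.sentence_idx) * 1.0,
      # String match: exact and head-word match, case-insensitive.
      flag(String.downcase(m1.text) == String.downcase(m2.text)),
      flag(String.downcase(m1.head) == String.downcase(m2.head)),
      # Mention type: at least one mention is a pronoun.
      flag(m1.type == :pronoun or m2.type == :pronoun),
      # Agreement: gender and number.
      flag(m1.gender == m2.gender),
      flag(m1.number == m2.number),
      # Position: both mentions in the same sentence.
      flag(m1.sentence_idx == m2.sentence_idx)
    ]
  end
end

m1 = %{text: "Marie Curie", head: "Curie", type: :name,
       gender: :female, number: :singular, sentence_idx: 0}
m2 = %{text: "she", head: "she", type: :pronoun,
       gender: :female, number: :singular, sentence_idx: 1}

IO.inspect(FeatureSketch.extract(m1, m2))
# → [1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0]
```

The pair differs in surface form but agrees in gender and number, which is the kind of signal the scorer can combine with the learned mention encodings.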

feature_dim()

@spec feature_dim() :: pos_integer()

Returns the feature dimension, i.e. the number of hand-crafted features in the vector produced by extract_features/3.

score_pair(model, params, mention1_encoding, mention2_encoding, features)

@spec score_pair(model(), params(), Nx.Tensor.t(), Nx.Tensor.t(), Nx.Tensor.t()) ::
  float()

Score a pair of mentions for coreference.

Parameters

  • model - Trained model
  • params - Model parameters
  • mention1_encoding - Encoding of first mention [mention_dim]
  • mention2_encoding - Encoding of second mention [mention_dim]
  • features - Hand-crafted features [feature_dim]

Returns

Probability that mentions corefer (0.0 to 1.0)
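
One possible way to use the returned probability is to link a mention to its best-scoring candidate antecedent when that score clears a threshold. The sketch below assumes the scores have already been computed (for instance with `score_pair/5`). The `LinkSketch` module, the 0.5 threshold, and the example scores are all hypothetical and not part of this module.

```elixir
defmodule LinkSketch do
  # Returns {antecedent, score} for the best candidate at or above
  # `threshold`, or nil when no candidate qualifies.
  def best_antecedent(scored_candidates, threshold \\ 0.5) do
    scored_candidates
    |> Enum.filter(fn {_antecedent, score} -> score >= threshold end)
    |> Enum.max_by(fn {_antecedent, score} -> score end, fn -> nil end)
  end
end

# Made-up scores for three candidate antecedents of one mention.
scored = [{"Marie Curie", 0.91}, {"the laboratory", 0.12}, {"Pierre", 0.47}]

IO.inspect(LinkSketch.best_antecedent(scored))
# → {"Marie Curie", 0.91}
IO.inspect(LinkSketch.best_antecedent(scored, 0.95))
# → nil
```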