Nasty.Semantic.Coreference.Scorer (Nasty v0.3.0)


Generic scoring module for coreference resolution.

Scores pairs of mentions for coreference likelihood using multiple features:

  • Sentence distance (recency)
  • Gender and number agreement
  • String matching (exact and partial)
  • Entity type compatibility
  • Mention type patterns (pronoun-name, etc.)

All weights are configurable to allow tuning for different languages and domains.

Summary

Functions

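distance_score(m1, m2, max_distance, weight)

Scores based on sentence distance (recency).

entity_type_score(m1, m2, weight)

Scores based on entity type match.

gender_agreement_score(m1, m2, weight)

Scores based on gender agreement.

number_agreement_score(m1, m2, weight)

Scores based on number agreement.

partial_match_score(m1, m2, weight)

Scores based on partial string match.

pronoun_name_boost(m1, m2, weight)

Boost score for pronoun-name pairs.

score_cluster_pair(cluster1, cluster2, opts \\ [])

Scores a pair of mention clusters for merging.

score_mention_pair(m1, m2, opts \\ [])

Scores a pair of mentions for coreference likelihood.

string_match_score(m1, m2, weight)

Scores based on exact string match (case-insensitive).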

Functions

distance_score(m1, m2, max_distance, weight)

Scores based on sentence distance (recency).

Mentions that are closer together get higher scores. Mentions more than max_distance sentences apart score 0.
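The exact decay curve is not documented here. As a minimal sketch, assuming a linear decay and taking sentence indices directly instead of Mention structs (the module name DistanceSketch and this formula are illustrative, not the library's implementation):

```elixir
defmodule DistanceSketch do
  # Hypothetical stand-in for distance_score/4: takes sentence indices
  # directly instead of Mention structs.
  def distance_score(s1, s2, max_distance, weight) do
    distance = abs(s1 - s2)

    if distance > max_distance do
      # Too far apart: no recency credit.
      0.0
    else
      # Linear decay (an assumption): full weight at distance 0,
      # shrinking as the mentions move apart.
      weight * (1.0 - distance / (max_distance + 1))
    end
  end
end

DistanceSketch.distance_score(4, 4, 3, 0.5) # same sentence: 0.5
DistanceSketch.distance_score(1, 9, 3, 0.5) # beyond max_distance: 0.0
```

Dividing by max_distance + 1 keeps a small positive score for mentions exactly max_distance sentences apart, so only mentions strictly beyond the limit drop to zero.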

entity_type_score(m1, m2, weight)

@spec entity_type_score(
  Nasty.AST.Semantic.Mention.t(),
  Nasty.AST.Semantic.Mention.t(),
  float()
) ::
  float()

Scores based on entity type match.

Returns weight if both mentions have the same entity type, 0 otherwise.

gender_agreement_score(m1, m2, weight)

@spec gender_agreement_score(
  Nasty.AST.Semantic.Mention.t(),
  Nasty.AST.Semantic.Mention.t(),
  float()
) ::
  float()

Scores based on gender agreement.

Returns weight if genders agree, 0 otherwise.

number_agreement_score(m1, m2, weight)

@spec number_agreement_score(
  Nasty.AST.Semantic.Mention.t(),
  Nasty.AST.Semantic.Mention.t(),
  float()
) ::
  float()

Scores based on number agreement.

Returns weight if numbers agree, 0 otherwise.
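The agreement features follow an all-or-nothing pattern: full weight on agreement, zero otherwise. A minimal sketch, assuming mentions are plain maps with hypothetical :gender and :number keys (the real Mention struct may differ, and how unknown values are treated is not specified here):

```elixir
defmodule AgreementSketch do
  # Hypothetical helper: mentions are plain maps here, and `key` is
  # :gender or :number. The real Mention struct may differ.
  def agreement_score(m1, m2, key, weight) do
    if Map.get(m1, key) == Map.get(m2, key), do: weight, else: 0.0
  end
end

john = %{text: "John", gender: :male, number: :singular}
he = %{text: "he", gender: :male, number: :singular}
they = %{text: "they", gender: :unknown, number: :plural}

AgreementSketch.agreement_score(john, he, :number, 0.2)   # numbers agree: 0.2
AgreementSketch.agreement_score(john, they, :number, 0.2) # numbers differ: 0.0
```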

partial_match_score(m1, m2, weight)

@spec partial_match_score(
  Nasty.AST.Semantic.Mention.t(),
  Nasty.AST.Semantic.Mention.t(),
  float()
) ::
  float()

Scores based on partial string match.

Returns weight if one text contains the other, 0 otherwise.
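A containment check of this kind can be sketched as follows. The sketch works on raw strings rather than Mention structs, and lowercasing before the comparison is an assumption; PartialMatchSketch is a hypothetical module name.

```elixir
defmodule PartialMatchSketch do
  # Hypothetical stand-in: works on raw strings rather than Mention structs.
  # Lowercasing before comparison is an assumption.
  def partial_match_score(text1, text2, weight) do
    a = String.downcase(text1)
    b = String.downcase(text2)

    # Containment is checked in both directions, so argument order
    # does not matter.
    if String.contains?(a, b) or String.contains?(b, a), do: weight, else: 0.0
  end
end

PartialMatchSketch.partial_match_score("President Obama", "Obama", 0.3) # 0.3
PartialMatchSketch.partial_match_score("Obama", "Biden", 0.3)           # 0.0
```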

pronoun_name_boost(m1, m2, weight)

@spec pronoun_name_boost(
  Nasty.AST.Semantic.Mention.t(),
  Nasty.AST.Semantic.Mention.t(),
  float()
) ::
  float()

Boost score for pronoun-name pairs.

Pronoun-name pairs are a common coreference pattern (e.g., "John... he"). Returns weight if one mention is a pronoun and the other is a proper name, 0 otherwise.
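The check is symmetric in its arguments. A minimal sketch, assuming mentions carry a :type key with values :pronoun and :proper_name (both the key and the values are illustrative, not taken from the Mention struct):

```elixir
defmodule PronounNameSketch do
  # Hypothetical: the :type key and the :pronoun / :proper_name values are
  # illustrative, not taken from the Mention struct.
  def pronoun_name_boost(%{type: t1}, %{type: t2}, weight) do
    case {t1, t2} do
      {:pronoun, :proper_name} -> weight
      {:proper_name, :pronoun} -> weight
      _ -> 0.0
    end
  end
end

john = %{text: "John", type: :proper_name}
he = %{text: "he", type: :pronoun}

PronounNameSketch.pronoun_name_boost(john, he, 0.2) # name then pronoun: 0.2
PronounNameSketch.pronoun_name_boost(he, he, 0.2)   # two pronouns: 0.0
```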

score_cluster_pair(cluster1, cluster2, opts \\ [])

@spec score_cluster_pair(
  [Nasty.AST.Semantic.Mention.t()],
  [Nasty.AST.Semantic.Mention.t()],
  keyword()
) ::
  float()

Scores a pair of mention clusters for merging.

Uses average linkage by default: the cluster score is the mean of the scores of all mention pairs across the two clusters. Other linkages can be selected with :merge_strategy.

Options

  • :merge_strategy - Linkage type (default: :average)
    • :average - Average of all pairwise scores
    • :best - Maximum pairwise score
    • :worst - Minimum pairwise score
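The three linkage strategies can be sketched as follows. This is an illustration under stated assumptions, not the library's code: LinkageSketch is a hypothetical module, the pairwise scorer is passed in as a function, and returning 0.0 for an empty cluster is a choice made only for this sketch.

```elixir
defmodule LinkageSketch do
  # Hypothetical helper: `score_fun` stands in for a pairwise scorer
  # such as score_mention_pair/3.
  def score_cluster_pair(cluster1, cluster2, score_fun, strategy \\ :average) do
    # Score every cross-cluster pair of mentions.
    scores = for m1 <- cluster1, m2 <- cluster2, do: score_fun.(m1, m2)

    case {scores, strategy} do
      # Empty-cluster handling is an assumption of this sketch.
      {[], _} -> 0.0
      {_, :average} -> Enum.sum(scores) / length(scores)
      {_, :best} -> Enum.max(scores)
      {_, :worst} -> Enum.min(scores)
    end
  end
end

# Toy scorer: 1.0 for identical strings, 0.0 otherwise.
toy = fn a, b -> if a == b, do: 1.0, else: 0.0 end

LinkageSketch.score_cluster_pair(["John", "he"], ["John"], toy, :average) # 0.5
LinkageSketch.score_cluster_pair(["John", "he"], ["John"], toy, :best)    # 1.0
LinkageSketch.score_cluster_pair(["John", "he"], ["John"], toy, :worst)   # 0.0
```

With :best, a single strong pair is enough to merge two clusters; :worst requires every pair to score well; :average sits between the two.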

score_mention_pair(m1, m2, opts \\ [])

@spec score_mention_pair(
  Nasty.AST.Semantic.Mention.t(),
  Nasty.AST.Semantic.Mention.t(),
  keyword()
) ::
  float()

Scores a pair of mentions for coreference likelihood.

Returns a float between 0.0 and roughly 2.0; higher scores indicate stronger coreference evidence.

Parameters

  • m1 - First mention
  • m2 - Second mention
  • opts - Scoring options
    • :max_distance - Maximum sentence distance (default: 3)
    • :weights - Custom weight configuration (default: @default_weights)

Examples

iex> alias Nasty.Semantic.Coreference.Scorer
iex> score = Scorer.score_mention_pair(m1, m2, max_distance: 3)
0.85
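How the individual features are combined is not spelled out here. One plausible reading is a weighted sum with caller-supplied weights overriding defaults. The sketch below assumes exactly that. The module CombineSketch, the weight keys :string_match and :number, their values, and the plain-map mentions are all hypothetical and do not reflect the library's @default_weights.

```elixir
defmodule CombineSketch do
  # Hypothetical defaults; the library's real @default_weights are not shown here.
  @default_weights %{string_match: 0.5, number: 0.25}

  def score_pair(m1, m2, opts \\ []) do
    # Caller-supplied weights override the defaults key by key.
    weights = Map.merge(@default_weights, Keyword.get(opts, :weights, %{}))

    same_text? = String.downcase(m1.text) == String.downcase(m2.text)
    exact = if same_text?, do: weights.string_match, else: 0.0
    number = if m1.number == m2.number, do: weights.number, else: 0.0

    # Weighted sum of the features (an assumption about the combination rule).
    exact + number
  end
end

a = %{text: "Alice", number: :singular}
b = %{text: "alice", number: :singular}

CombineSketch.score_pair(a, b)                                # 0.75
CombineSketch.score_pair(a, b, weights: %{string_match: 1.0}) # 1.25
```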

string_match_score(m1, m2, weight)

@spec string_match_score(
  Nasty.AST.Semantic.Mention.t(),
  Nasty.AST.Semantic.Mention.t(),
  float()
) ::
  float()

Scores based on exact string match (case-insensitive).

Returns weight if texts match exactly, 0 otherwise.