Nasty.Semantic.Coreference.MentionDetector (Nasty v0.3.0)

Generic mention detection for coreference resolution.

Extracts three types of mentions from documents:

  1. Pronouns - personal, possessive, reflexive
  2. Proper names - from entity recognition
  3. Definite noun phrases - noun phrases introduced by a definite determiner such as "the", "this", or "that"

The detector is language-agnostic and accepts callbacks for language-specific classification (pronoun types, gender inference, etc.).

Summary

Functions

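extract_mentions(document, config)
Extracts all mentions from a document.

extract_mentions_from_sentence(sentence, sent_idx, config)
Extracts mentions from a single sentence.

extract_tokens_from_clause(clause)
Extracts all tokens from a clause.

extract_tokens_from_np(np)
Extracts tokens from a noun phrase.

extract_tokens_from_vp(arg1)
Extracts tokens from a verb phrase.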

Types

language_config()

@type language_config() :: %{
  pronoun?: (Nasty.AST.Token.t() -> boolean()),
  classify_pronoun: (String.t() -> {atom(), atom()}),
  infer_gender: (String.t(), atom() -> atom()),
  definite_determiner?: (String.t() -> boolean()),
  plural_marker?: (String.t() -> boolean())
}
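
As a rough illustration of a map that satisfies this type, the sketch below builds a tiny English configuration. Everything in it is hypothetical: the module name `EnglishConfigSketch`, the word lists, the gender and number atoms, the order of the tuple returned by `classify_pronoun`, and the assumption that a token exposes its surface form under a `:text` key. It is not the configuration Nasty ships.

```elixir
defmodule EnglishConfigSketch do
  # Hypothetical pronoun table. The {gender, number} tuple order is an assumption.
  @pronouns %{
    "he" => {:male, :singular},
    "she" => {:female, :singular},
    "it" => {:neutral, :singular},
    "they" => {:unknown, :plural}
  }

  @definite_determiners ["the", "this", "that", "these", "those"]
  @plural_markers ["these", "those"]

  # Assumes the token carries its surface form under :text.
  def pronoun?(%{text: text}), do: Map.has_key?(@pronouns, String.downcase(text))

  # Falls back to unknown features for words that are not in the table.
  def classify_pronoun(text) do
    Map.get(@pronouns, String.downcase(text), {:unknown, :unknown})
  end

  # Toy lookup for person names; every other entity type is treated as neutral.
  def infer_gender(name, :person) do
    case String.downcase(name) do
      "john" -> :male
      "mary" -> :female
      _ -> :unknown
    end
  end

  def infer_gender(_name, _entity_type), do: :neutral

  def definite_determiner?(text), do: String.downcase(text) in @definite_determiners

  def plural_marker?(text), do: String.downcase(text) in @plural_markers

  # Bundles the callbacks into a map with the keys language_config() requires.
  def config do
    %{
      pronoun?: &pronoun?/1,
      classify_pronoun: &classify_pronoun/1,
      infer_gender: &infer_gender/2,
      definite_determiner?: &definite_determiner?/1,
      plural_marker?: &plural_marker?/1
    }
  end
end
```

Keeping the callbacks in a plain map means a second language only has to supply its own five functions; the detector itself does not change.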

Functions

extract_mentions(document, config)

@spec extract_mentions(Nasty.AST.Document.t(), language_config()) :: [
  Nasty.AST.Semantic.Mention.t()
]

Extracts all mentions from a document.

Parameters

  • document - Document AST to extract mentions from
  • config - Language-specific configuration with callback functions
    • :pronoun? - Function that checks whether a token is a pronoun
    • :classify_pronoun - Function that maps pronoun text to a two-atom tuple of agreement features (gender/number)
    • :infer_gender - Function that infers gender from a name and its entity type
    • :definite_determiner? - Function that checks whether text is a definite determiner
    • :plural_marker? - Function that checks whether text indicates a plural

Returns

List of Mention structs with position, type, and agreement features.

Examples

iex> config = %{
...>   pronoun?: &EnglishConfig.pronoun?/1,
...>   classify_pronoun: &EnglishConfig.classify_pronoun/1,
...>   ...
...> }
iex> mentions = MentionDetector.extract_mentions(document, config)
[%Mention{text: "John", type: :proper_name}, ...]
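
Once the list is returned, a caller will often want to bucket mentions by kind. The sketch below shows one way to do that with plain maps standing in for Mention structs. Only the `:text` and `:type` fields from the example above are used. The module name is hypothetical, and the `:pronoun` and `:definite_np` type atoms used in the comment are assumptions.

```elixir
defmodule MentionGroupingSketch do
  # Groups mention texts by their :type field.
  # Works on any map or struct exposing :text and :type.
  def texts_by_type(mentions) do
    Enum.group_by(mentions, & &1.type, & &1.text)
  end
end

# Stand-in data; real input would be the list returned by extract_mentions/2.
mentions = [
  %{text: "John", type: :proper_name},
  %{text: "he", type: :pronoun},
  %{text: "the book", type: :definite_np}
]

MentionGroupingSketch.texts_by_type(mentions)
```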

extract_mentions_from_sentence(sentence, sent_idx, config)

@spec extract_mentions_from_sentence(
  Nasty.AST.Sentence.t(),
  non_neg_integer(),
  language_config()
) :: [
  Nasty.AST.Semantic.Mention.t()
]

Extracts mentions from a single sentence.

Returns pronoun, entity, and definite NP mentions. sent_idx is the sentence's index, given as a non-negative integer.
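
To make the three mention kinds concrete, here is a minimal sketch of a per-sentence pass. It does not reproduce the library's implementation. The sentence is a plain map with hypothetical `:tokens`, `:entities`, and `:noun_phrases` keys, mentions are plain maps, and the `:definite_np` type atom and `:sentence_idx` field are assumptions. Only the `:pronoun?` and `:definite_determiner?` callbacks from the config are used.

```elixir
defmodule SentenceMentionSketch do
  # `sentence` is a stand-in map, not a real Nasty.AST.Sentence:
  #   %{tokens: [%{text: ...}],
  #     entities: [%{text: ...}],
  #     noun_phrases: [%{determiner: %{text: ...} | nil, head: %{text: ...}}]}
  def mentions(sentence, sent_idx, config) do
    # 1. Pronouns: every token the language callback accepts.
    pronouns =
      for token <- sentence.tokens, config[:pronoun?].(token) do
        mention(token.text, :pronoun, sent_idx)
      end

    # 2. Proper names: taken directly from already recognized entities.
    names =
      for entity <- sentence.entities do
        mention(entity.text, :proper_name, sent_idx)
      end

    # 3. Definite NPs: noun phrases whose determiner is definite.
    #    Phrases with a nil determiner do not match the pattern and are skipped.
    definite_nps =
      for %{determiner: %{text: det}, head: head} <- sentence.noun_phrases,
          config[:definite_determiner?].(det) do
        mention(det <> " " <> head.text, :definite_np, sent_idx)
      end

    pronouns ++ names ++ definite_nps
  end

  defp mention(text, type, sent_idx) do
    %{text: text, type: type, sentence_idx: sent_idx}
  end
end
```

Recording the sentence index on each mention is one plausible reason the function takes `sent_idx`: later resolution steps can then compare how far apart two mentions are.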

extract_tokens_from_clause(clause)

@spec extract_tokens_from_clause(Nasty.AST.Clause.t()) :: [Nasty.AST.Token.t()]

Extracts all tokens from a clause.

Recursively extracts tokens from the subject NP and the predicate VP.

extract_tokens_from_np(np)

@spec extract_tokens_from_np(Nasty.AST.NounPhrase.t()) :: [Nasty.AST.Token.t()]

Extracts tokens from a noun phrase.

Includes determiner, modifiers, and head.

extract_tokens_from_vp(arg1)

@spec extract_tokens_from_vp(map()) :: [Nasty.AST.Token.t()]

Extracts tokens from a verb phrase.

Includes auxiliaries and main verb head.
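
The three token helpers can be pictured together as a flattening step over the clause structure. The sketch below uses plain maps in place of the Nasty AST structs. All field names (`:subject`, `:predicate`, `:determiner`, `:modifiers`, `:head`, `:auxiliaries`) are assumptions inferred from the descriptions above, and the module and function names are hypothetical.

```elixir
defmodule TokenExtractionSketch do
  # Clause: subject NP tokens followed by predicate VP tokens.
  def from_clause(%{subject: subject, predicate: predicate}) do
    from_np(subject) ++ from_vp(predicate)
  end

  # A missing subject (for example, an imperative) contributes no tokens.
  def from_np(nil), do: []

  # NP: optional determiner, then modifiers, then the head.
  # List.wrap/1 turns a nil determiner into [] and a token into [token].
  def from_np(%{head: head} = np) do
    List.wrap(Map.get(np, :determiner)) ++ Map.get(np, :modifiers, []) ++ [head]
  end

  # VP: auxiliaries first, then the main verb head.
  def from_vp(%{head: head} = vp) do
    Map.get(vp, :auxiliaries, []) ++ [head]
  end
end

clause = %{
  subject: %{determiner: %{text: "the"}, modifiers: [%{text: "old"}], head: %{text: "dog"}},
  predicate: %{auxiliaries: [%{text: "has"}], head: %{text: "barked"}}
}

TokenExtractionSketch.from_clause(clause) |> Enum.map(& &1.text)
```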