Nasty.Utils.Query (Nasty v0.3.0)
High-level query API for extracting information from an AST.
Provides convenient functions for common AST queries without requiring explicit traversal logic.
Examples
# Find all noun phrases
iex> Nasty.Utils.Query.find_all(document, :noun_phrase)
[%Nasty.AST.NounPhrase{}, ...]
# Extract entities
iex> Nasty.Utils.Query.extract_entities(document, type: :PERSON)
[%Nasty.AST.Semantic.Entity{text: "John Smith", type: :PERSON}, ...]
# Find subject of sentence
iex> Nasty.Utils.Query.find_subject(sentence)
%Nasty.AST.NounPhrase{head: %Nasty.AST.Token{text: "cat"}}
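To illustrate what "without requiring explicit traversal logic" saves the caller from, here is a minimal sketch of a type-based search over a toy tree. The `ToyQuery` module and the plain-map node shape (`:type` plus an optional `:children` list) are assumptions made for this example; they are not Nasty's structs or its actual implementation.

```elixir
defmodule ToyQuery do
  # Hypothetical node shape: %{type: atom} with an optional :children list.
  # Returns matching nodes in pre-order (parent before children).
  def find_all(node, type) do
    here = if node.type == type, do: [node], else: []
    children = Map.get(node, :children, [])
    here ++ Enum.flat_map(children, &find_all(&1, type))
  end
end

doc = %{
  type: :document,
  children: [
    %{
      type: :sentence,
      children: [
        %{
          type: :noun_phrase,
          children: [
            %{type: :token, text: "the"},
            %{type: :token, text: "cat"}
          ]
        },
        %{type: :token, text: "runs"}
      ]
    }
  ]
}

doc |> ToyQuery.find_all(:token) |> Enum.map(& &1.text)
# => ["the", "cat", "runs"]
```

A query module of this kind hides the recursion behind one call, so callers only name the node type they want.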
Functions
Checks if all nodes of a type match a predicate.
Examples
iex> lowercase? = fn %Nasty.AST.Token{text: text} -> text == String.downcase(text) end
iex> tokens = Nasty.Utils.Query.find_all(document, :token)
iex> Enum.all?(tokens, lowercase?)
false
Checks if any node in the tree matches a predicate.
Examples
iex> has_verb? = &match?(%Nasty.AST.Token{pos_tag: :verb}, &1)
iex> Nasty.Utils.Query.any?(document, has_verb?)
true
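A predicate check over a tree can stop at the first match instead of collecting every node. The sketch below shows one way to do that. `ToyAny` and the map-based node shape are hypothetical stand-ins, and the predicate is assumed to return a strict boolean, as `match?/2` does.

```elixir
defmodule ToyAny do
  # Hypothetical node shape: a map with an optional :children list.
  # `or` short-circuits, so children are only visited when the node itself
  # does not match; Enum.any?/2 stops at the first matching child.
  def any?(node, predicate) do
    predicate.(node) or
      Enum.any?(Map.get(node, :children, []), &any?(&1, predicate))
  end
end

tree = %{
  type: :sentence,
  children: [
    %{type: :token, pos_tag: :det},
    %{type: :token, pos_tag: :verb}
  ]
}

has_verb? = &match?(%{pos_tag: :verb}, &1)
ToyAny.any?(tree, has_verb?)
# => true
```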
@spec content_words(term()) :: [Nasty.AST.Token.t()]
Gets all content words (nouns, verbs, adjectives, adverbs).
Examples
iex> Nasty.Utils.Query.content_words(document)
[%Nasty.AST.Token{text: "cat", pos_tag: :noun}, ...]
@spec count(term(), atom()) :: non_neg_integer()
Counts nodes of a specific type in the tree.
Examples
iex> Nasty.Utils.Query.count(document, :token)
42
iex> Nasty.Utils.Query.count(document, :sentence)
7
@spec extract_entities(term(), keyword()) :: [Nasty.AST.Semantic.Entity.t()]
Extracts all named entities from the document.
Options
:type - Filter by entity type (e.g., :PERSON, :ORG, :LOC)
Examples
iex> Nasty.Utils.Query.extract_entities(document)
[%Nasty.AST.Semantic.Entity{text: "John", type: :PERSON}, ...]
iex> Nasty.Utils.Query.extract_entities(document, type: :PERSON)
[%Nasty.AST.Semantic.Entity{text: "John", type: :PERSON}, ...]
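The optional `:type` filter follows a common keyword-option pattern: when the key is absent, everything is returned. A minimal sketch of that pattern is shown here. `ToyEntities` and the plain maps standing in for entity structs are assumptions for illustration, not the library's code.

```elixir
defmodule ToyEntities do
  # `entities` is a list of maps with :text and :type keys
  # (a stand-in for entity structs).
  def filter_by_type(entities, opts \\ []) do
    case Keyword.get(opts, :type) do
      nil -> entities
      type -> Enum.filter(entities, &(&1.type == type))
    end
  end
end

entities = [
  %{text: "John", type: :PERSON},
  %{text: "Acme", type: :ORG}
]

ToyEntities.filter_by_type(entities)
# => both entities, unchanged

ToyEntities.filter_by_type(entities, type: :PERSON)
# => [%{text: "John", type: :PERSON}]
```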
Extracts text spans for all nodes matching a predicate.
Returns a list of {text, span} tuples.
Examples
iex> is_noun? = &match?(%Nasty.AST.Token{pos_tag: :noun}, &1)
iex> Nasty.Utils.Query.extract_spans(document, source_text, is_noun?)
[{"cat", %{start_pos: {1, 4}, end_pos: {1, 7}, ...}}, ...]
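The span in the example above is consistent with 1-based line numbers, 0-based columns, and an exclusive end column ("cat" occupies columns 4 to 6 of "The cat runs"). Under that assumption, and only for spans that stay on one line, the text for a span could be recovered roughly as follows. `ToySpans` is a hypothetical helper, not part of Nasty.

```elixir
defmodule ToySpans do
  # Assumes 1-based lines, 0-based columns, an exclusive end column,
  # and a span that starts and ends on the same line (enforced by
  # binding `line` twice in the pattern).
  def slice(source_text, %{start_pos: {line, start_col}, end_pos: {line, end_col}}) do
    source_text
    |> String.split("\n")
    |> Enum.at(line - 1)
    |> String.slice(start_col, end_col - start_col)
  end
end

ToySpans.slice("The cat runs", %{start_pos: {1, 4}, end_pos: {1, 7}})
# => "cat"
```

Multi-line spans would need to join the intermediate lines, which this sketch deliberately leaves out.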
Filters nodes by a custom predicate function.
Examples
iex> is_question? = &match?(%Nasty.AST.Sentence{function: :interrogative}, &1)
iex> Nasty.Utils.Query.filter(document, is_question?)
[%Nasty.AST.Sentence{function: :interrogative}, ...]
Finds all nodes of a specific type.
Examples
iex> Nasty.Utils.Query.find_all(document, :noun_phrase)
[%Nasty.AST.NounPhrase{}, ...]
iex> Nasty.Utils.Query.find_all(document, :token)
[%Nasty.AST.Token{}, ...]
@spec find_by_lemma(term(), String.t()) :: [Nasty.AST.Token.t()]
Finds all tokens with a specific lemma.
Examples
iex> Nasty.Utils.Query.find_by_lemma(document, "run")
[%Nasty.AST.Token{text: "runs", lemma: "run"}, ...]
@spec find_by_pos(term(), atom()) :: [Nasty.AST.Token.t()]
Finds all tokens with a specific POS tag.
Examples
iex> Nasty.Utils.Query.find_by_pos(document, :noun)
[%Nasty.AST.Token{text: "cat", pos_tag: :noun}, ...]
iex> Nasty.Utils.Query.find_by_pos(document, :verb)
[%Nasty.AST.Token{text: "runs", pos_tag: :verb}, ...]
@spec find_by_text(term(), String.t() | Regex.t()) :: [Nasty.AST.Token.t()]
Finds all tokens matching a text pattern.
Examples
iex> Nasty.Utils.Query.find_by_text(document, "cat")
[%Nasty.AST.Token{text: "cat"}, ...]
iex> Nasty.Utils.Query.find_by_text(document, ~r/^run/)
[%Nasty.AST.Token{text: "run"}, %Nasty.AST.Token{text: "runs"}, ...]
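Because the pattern may be either a string or a regex, the matching step has to dispatch on the pattern's type. One way to sketch that with function clauses is shown below. `ToyText` is hypothetical, and treating a string pattern as an exact, case-sensitive match is an assumption; the library may match differently.

```elixir
defmodule ToyText do
  # Assumed semantics: a binary pattern means exact equality,
  # a Regex pattern means a regex match against the token text.
  def matches?(text, pattern) when is_binary(pattern), do: text == pattern
  def matches?(text, %Regex{} = pattern), do: Regex.match?(pattern, text)

  # `tokens` is a list of maps with a :text key (a stand-in for token structs).
  def find_by_text(tokens, pattern) do
    Enum.filter(tokens, &matches?(&1.text, pattern))
  end
end

tokens = [%{text: "cat"}, %{text: "run"}, %{text: "runs"}, %{text: "rerun"}]

tokens |> ToyText.find_by_text("cat") |> Enum.map(& &1.text)
# => ["cat"]

tokens |> ToyText.find_by_text(~r/^run/) |> Enum.map(& &1.text)
# => ["run", "runs"]
```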
@spec find_main_verb(Nasty.AST.Sentence.t() | Nasty.AST.Clause.t() | Nasty.AST.VerbPhrase.t()) :: Nasty.AST.Token.t() | nil
Finds the main verb of a sentence, clause, or verb phrase.
Returns the head verb token if present, otherwise nil.
Examples
iex> sentence = %Nasty.AST.Sentence{...}
iex> Nasty.Utils.Query.find_main_verb(sentence)
%Nasty.AST.Token{text: "runs", pos_tag: :verb}
@spec find_objects(Nasty.AST.VerbPhrase.t() | Nasty.AST.Clause.t() | Nasty.AST.Sentence.t()) :: [term()]
Finds all objects (complements) of a verb phrase, clause, or sentence.
Examples
iex> vp = %Nasty.AST.VerbPhrase{complements: [obj1, obj2]}
iex> Nasty.Utils.Query.find_objects(vp)
[obj1, obj2]
@spec find_subject(Nasty.AST.Sentence.t() | Nasty.AST.Clause.t()) :: Nasty.AST.NounPhrase.t() | nil
Finds the subject of a sentence or clause.
Returns the subject noun phrase if present, otherwise nil.
Examples
iex> sentence = %Nasty.AST.Sentence{...}
iex> Nasty.Utils.Query.find_subject(sentence)
%Nasty.AST.NounPhrase{head: %Nasty.AST.Token{text: "cat"}}
@spec function_words(term()) :: [Nasty.AST.Token.t()]
Gets all function words (determiners, prepositions, conjunctions, etc.).
Examples
iex> Nasty.Utils.Query.function_words(document)
[%Nasty.AST.Token{text: "the", pos_tag: :det}, ...]
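The split between content words and function words can be pictured as a partition of tokens by POS tag. The sketch below assumes a tag set of `:noun`, `:verb`, `:adj`, and `:adv` for content words. Only `:noun`, `:verb`, and `:det` appear in the examples here, so the other tag names, like the `ToyWords` module itself, are hypothetical.

```elixir
defmodule ToyWords do
  # Assumed tag set; the tags Nasty actually treats as content words may differ.
  @content_tags [:noun, :verb, :adj, :adv]

  # `tokens` is a list of maps with :text and :pos_tag keys
  # (a stand-in for token structs).
  def content_words(tokens), do: Enum.filter(tokens, &(&1.pos_tag in @content_tags))
  def function_words(tokens), do: Enum.reject(tokens, &(&1.pos_tag in @content_tags))
end

tokens = [
  %{text: "the", pos_tag: :det},
  %{text: "cat", pos_tag: :noun},
  %{text: "runs", pos_tag: :verb}
]

tokens |> ToyWords.content_words() |> Enum.map(& &1.text)
# => ["cat", "runs"]

tokens |> ToyWords.function_words() |> Enum.map(& &1.text)
# => ["the"]
```

Defining function words as the complement of content words keeps the two lists disjoint and exhaustive, at the cost of lumping punctuation and unknown tags in with function words.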
@spec sentences(Nasty.AST.Document.t()) :: [Nasty.AST.Sentence.t()]
Gets all sentences from a document.
Examples
iex> Nasty.Utils.Query.sentences(document)
[%Nasty.AST.Sentence{}, ...]
@spec tokens(term()) :: [Nasty.AST.Token.t()]
Gets all tokens from any node.
Examples
iex> Nasty.Utils.Query.tokens(document)
[%Nasty.AST.Token{}, ...]