Nasty.Language.English.SentenceParser (Nasty v0.3.0)
Sentence and clause parser for English.
Builds Clause and Sentence structures from phrases.
Approaches
- Rule-based parsing (default): Subject (NP) + Predicate (VP)
- PCFG parsing: Statistical phrase structure parsing
Examples
# Rule-based (default)
iex> tokens = [...] # "The cat sat."
iex> SentenceParser.parse_sentences(tokens)
{:ok, [sentence]}
# PCFG-based
iex> SentenceParser.parse_sentences(tokens, model: :pcfg)
{:ok, [sentence]}
Summary
Functions
parse_clause/1 - Parses a clause from tokens, detecting coordination and subordination.
parse_sentence/1 - Parses a single sentence from tokens.
parse_sentences/2 - Parses tokens into a list of sentences.
parse_sentences_pcfg/2 - PCFG-based sentence parsing using statistical phrase structure grammar.
parse_sentences_rule_based/1 - Rule-based sentence parsing (original implementation).
Functions
@spec parse_clause([Nasty.AST.Token.t()]) :: {:ok, Nasty.AST.Clause.t() | [Nasty.AST.Clause.t()]} | :error
Parses a clause from tokens, detecting coordination and subordination.
Grammar:
- Simple: (NP) VP
- Coordinated: Clause CoordConj Clause
- Subordinate: SubordConj Clause
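The three grammar shapes above can be illustrated with a small, self-contained sketch. It is not Nasty's implementation: the module name `ClauseGrammarSketch`, the `{text, tag}` token tuples, and the tag atoms (`:sconj`, `:cconj`, `:verb`) are all assumptions made for illustration, and the "at least one verb" check is a crude stand-in for real VP parsing.

```elixir
defmodule ClauseGrammarSketch do
  @moduledoc false
  # Hypothetical illustration only: tokens are plain {text, tag} tuples,
  # not Nasty.AST.Token structs, and the tag names are assumptions.

  @type token :: {String.t(), atom()}

  @spec classify([token()]) :: {:ok, tuple()} | :error
  def classify([]), do: :error

  # Subordinate: SubordConj Clause
  def classify([{_text, :sconj} | rest]) do
    with {:ok, clause} <- classify(rest), do: {:ok, {:subordinate, clause}}
  end

  def classify(tokens) do
    # Split at the first coordinating conjunction, if there is one.
    case Enum.split_while(tokens, fn {_text, tag} -> tag != :cconj end) do
      # Coordinated: Clause CoordConj Clause
      {left, [{_conj, :cconj} | right]} when left != [] and right != [] ->
        with {:ok, l} <- classify(left), {:ok, r} <- classify(right) do
          {:ok, {:coordinated, l, r}}
        end

      # Simple: (NP) VP, approximated here as "contains at least one verb"
      {^tokens, []} ->
        if Enum.any?(tokens, fn {_text, tag} -> tag == :verb end),
          do: {:ok, {:simple, Enum.map(tokens, &elem(&1, 0))}},
          else: :error

      # A conjunction with nothing on one side is not a valid clause.
      _ ->
        :error
    end
  end
end
```

Like the documented spec, the sketch returns `:error` rather than an `{:error, reason}` tuple when no shape matches.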
@spec parse_sentence([Nasty.AST.Token.t()]) :: Nasty.AST.Sentence.t() | nil
Parses a single sentence from tokens.
Grammar: NP VP (simplified for Phase 3)
@spec parse_sentences([Nasty.AST.Token.t()], keyword()) :: {:ok, [Nasty.AST.Sentence.t()]} | {:error, term()}
Parses tokens into a list of sentences.
Identifies sentence boundaries and parses each sentence separately.
Options
- :model - Model type: :rule_based (default) or :pcfg
- :pcfg_model - Trained PCFG model (optional; loaded from the registry if not provided)
Returns
- {:ok, sentences} - List of parsed sentences
- {:error, reason} - Parsing failed
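As a rough illustration of how the :model option and the tagged return values fit together, here is a minimal, self-contained sketch. `ParseOptionsSketch` and its stub parsers are hypothetical and do not reflect Nasty's internals. In particular, returning `{:error, {:unknown_model, other}}` for an unrecognized model is an assumption, since the documentation does not say how such values are handled.

```elixir
defmodule ParseOptionsSketch do
  @moduledoc false
  # Hypothetical stand-in for the option handling described above; the two
  # private parsers are stubs, not Nasty's implementations.

  @spec parse_sentences([term()], keyword()) :: {:ok, [term()]} | {:error, term()}
  def parse_sentences(tokens, opts \\ []) do
    # :model defaults to :rule_based when the option is absent.
    case Keyword.get(opts, :model, :rule_based) do
      :rule_based -> rule_based(tokens)
      :pcfg -> pcfg(tokens, Keyword.get(opts, :pcfg_model))
      # Assumption: unknown model names are reported as an error tuple.
      other -> {:error, {:unknown_model, other}}
    end
  end

  # Stub parsers that only record which path was taken.
  defp rule_based(tokens), do: {:ok, [{:rule_based, tokens}]}
  defp pcfg(tokens, model), do: {:ok, [{:pcfg, model, tokens}]}
end
```

Callers can then match on the result with `case`, treating `{:ok, sentences}` as success and any `{:error, reason}` as a failure to report or recover from.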
@spec parse_sentences_pcfg([Nasty.AST.Token.t()], keyword()) :: {:ok, [Nasty.AST.Sentence.t()]} | {:error, term()}
PCFG-based sentence parsing using statistical phrase structure grammar.
If no model is provided via the :pcfg_model option, this function
attempts to load the latest PCFG model from the registry. It falls
back to rule-based parsing if no model is available.
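The resolution order described here (explicit option, then registry, then rule-based fallback) can be sketched as follows. This is an assumption-laden illustration: `PcfgFallbackSketch` is hypothetical, the `registry` argument is a plain map standing in for the real model registry, the `:latest_pcfg` key is invented, and both parsers are stubs.

```elixir
defmodule PcfgFallbackSketch do
  @moduledoc false
  # Hypothetical sketch: `registry` is a plain map standing in for the model
  # registry, and both parsers are stubs that only record the path taken.

  def parse_sentences_pcfg(tokens, opts \\ [], registry \\ %{}) do
    case resolve_model(opts, registry) do
      {:ok, model} -> {:ok, [{:pcfg, model, tokens}]}
      # No model anywhere: fall back to the rule-based parser.
      :none -> parse_sentences_rule_based(tokens)
    end
  end

  def parse_sentences_rule_based(tokens), do: {:ok, [{:rule_based, tokens}]}

  # Precedence: explicit :pcfg_model option, then the registry, then nothing.
  defp resolve_model(opts, registry) do
    case Keyword.fetch(opts, :pcfg_model) do
      {:ok, model} when not is_nil(model) ->
        {:ok, model}

      _ ->
        case Map.fetch(registry, :latest_pcfg) do
          {:ok, model} -> {:ok, model}
          :error -> :none
        end
    end
  end
end
```

Passing the registry as an argument keeps the sketch free of global state; a real implementation would presumably query the registry directly.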
@spec parse_sentences_rule_based([Nasty.AST.Token.t()]) :: {:ok, [Nasty.AST.Sentence.t()]} | {:error, term()}
Rule-based sentence parsing (original implementation).