Nasty.Language.Catalan.PhraseParser (Nasty v0.3.0)

Phrase structure parser for Catalan.

Builds syntactic phrases (NounPhrase, VerbPhrase, etc.) from POS-tagged tokens using bottom-up pattern matching with Catalan word order.

Catalan-Specific Features

  • Post-nominal adjectives: "la casa vermella" (the red house)
  • Pre-nominal quantifiers: "molts llibres" (many books)
  • Flexible word order: SVO by default, with other orders permitted
  • Interpunct words: words written with the interpunct (e.g. "col·legi") are treated as single lexical units
  • Clitic pronouns: em, et, es, el, la

Grammar Rules (Simplified CFG)

NP   → Det? QuantAdj* Noun Adj* PP*
VP   → Aux* MainVerb NP? PP* Adv*
PP   → Prep NP
AdjP → Adv? Adj
AdvP → Adv

Summary

Functions

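parse_noun_phrase(tokens, start_pos)
Parses a Catalan noun phrase starting at the given position.

parse_prep_phrase(tokens, start_pos)
Parses a prepositional phrase.

parse_verb_phrase(tokens, start_pos)
Parses a Catalan verb phrase starting at the given position.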

Functions

parse_noun_phrase(tokens, start_pos)

@spec parse_noun_phrase([Nasty.AST.Token.t()], non_neg_integer()) ::
  {:ok, Nasty.AST.NounPhrase.t(), non_neg_integer()} | :error

Parses a Catalan noun phrase starting at the given position.

Grammar: Det? QuantAdj* (Noun | PropN | Pron) Adj* PP*

Returns {:ok, noun_phrase, next_pos} or :error
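The noun phrase rule above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: tokens are modelled as hypothetical {text, tag} tuples rather than Nasty.AST.Token structs, the result is a plain map rather than a Nasty.AST.NounPhrase, the tag atoms are invented for the example, and trailing PP* is left out to keep the sketch short.

```elixir
defmodule NPSketch do
  # Hypothetical stand-in for Nasty's token type: a {text, pos_tag} tuple.
  # The tag atoms (:det, :quant, :noun, :propn, :pron, :adj) are assumptions.
  @heads [:noun, :propn, :pron]

  # Sketch of: Det? QuantAdj* (Noun | PropN | Pron) Adj*
  # Follows the documented convention: {:ok, phrase, next_pos} or :error.
  def parse_noun_phrase(tokens, start_pos) do
    rest = Enum.drop(tokens, start_pos)
    {det, rest} = take_optional(rest, :det)
    {quants, rest} = take_while_tag(rest, :quant)

    case rest do
      [{_, tag} = head | rest] when tag in @heads ->
        # Post-nominal adjectives, e.g. "vermella" in "la casa vermella".
        {adjs, _rest} = take_while_tag(rest, :adj)
        consumed = length(det) + length(quants) + 1 + length(adjs)

        np = %{
          determiner: List.first(det),
          quantifiers: quants,
          head: head,
          modifiers: adjs
        }

        {:ok, np, start_pos + consumed}

      _ ->
        # No head noun, proper noun, or pronoun at this position.
        :error
    end
  end

  # Consumes one token if it carries the given tag.
  defp take_optional([{_, tag} = tok | rest], tag), do: {[tok], rest}
  defp take_optional(rest, _tag), do: {[], rest}

  # Consumes zero or more consecutive tokens carrying the given tag.
  defp take_while_tag(tokens, tag) do
    Enum.split_while(tokens, fn {_, t} -> t == tag end)
  end
end
```

With the tokens [{"la", :det}, {"casa", :noun}, {"vermella", :adj}] and start_pos 0, this sketch consumes all three tokens, so the returned position is 3.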

parse_prep_phrase(tokens, start_pos)

@spec parse_prep_phrase([Nasty.AST.Token.t()], non_neg_integer()) ::
  {:ok, Nasty.AST.PrepositionalPhrase.t(), non_neg_integer()} | :error

Parses a prepositional phrase.

Grammar: Prep NP

Returns {:ok, prep_phrase, next_pos} or :error
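As a rough illustration of how Prep NP composes with a noun phrase parser, the sketch below takes the noun phrase parser as an argument. Everything here is an assumption made for the example: the {text, tag} token tuples, the :prep tag, the extra np_parser parameter, and the result map are not part of Nasty's API.

```elixir
defmodule PPSketch do
  # Sketch of: Prep NP
  # `np_parser` is any 2-arity function that follows the documented
  # {:ok, phrase, next_pos} | :error convention (hypothetical parameter).
  def parse_prep_phrase(tokens, start_pos, np_parser) do
    with {_, :prep} = prep <- Enum.at(tokens, start_pos),
         {:ok, np, next_pos} <- np_parser.(tokens, start_pos + 1) do
      {:ok, %{head: prep, object: np}, next_pos}
    else
      # Either no preposition at start_pos or no noun phrase after it.
      _ -> :error
    end
  end
end
```

Passing the parser in keeps the sketch self-contained; the real module presumably calls its own parse_noun_phrase/2 directly.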

parse_verb_phrase(tokens, start_pos)

@spec parse_verb_phrase([Nasty.AST.Token.t()], non_neg_integer()) ::
  {:ok, Nasty.AST.VerbPhrase.t(), non_neg_integer()} | :error

Parses a Catalan verb phrase starting at the given position.

Grammar: Aux* MainVerb NP? PP* Adv*

Returns {:ok, verb_phrase, next_pos} or :error
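All three functions share the same return shape, so a caller can thread next_pos from one call into the next. The sketch below shows one possible way to do that; the ChunkSketch module and its collect function are hypothetical helpers, not part of Nasty, and the choice to skip a single token on :error is an assumption of this example.

```elixir
defmodule ChunkSketch do
  # Repeatedly applies `parser`, a 2-arity function following the
  # {:ok, phrase, next_pos} | :error convention, and collects the phrases.
  # On :error (or if the parser makes no progress) one token is skipped.
  def collect(tokens, parser, pos \\ 0, acc \\ [])

  def collect(tokens, _parser, pos, acc) when pos >= length(tokens) do
    Enum.reverse(acc)
  end

  def collect(tokens, parser, pos, acc) do
    case parser.(tokens, pos) do
      {:ok, phrase, next_pos} when next_pos > pos ->
        collect(tokens, parser, next_pos, [phrase | acc])

      _ ->
        collect(tokens, parser, pos + 1, acc)
    end
  end
end
```

With real tokens, the parser argument could be a capture such as &Nasty.Language.Catalan.PhraseParser.parse_noun_phrase/2, since that function has the matching arity and return shape.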