Nasty.Language.Spanish.PhraseParser (Nasty v0.3.0)

Phrase structure parser for Spanish.

Builds syntactic phrases (NounPhrase, VerbPhrase, etc.) from POS-tagged tokens using bottom-up pattern matching with Spanish word order.

Spanish-Specific Features

  • Post-nominal adjectives: "la casa roja" (the red house)
  • Pre-nominal quantifiers: "muchos libros" (many books)
  • Flexible word order: SVO is the default, but other orders occur
  • Clitic pronouns: already attached to verbs by the tokenizer

Grammar Rules (Simplified CFG)

NP   → Det? QuantAdj* Noun Adj* PP*
VP   → Aux* MainVerb NP? PP* Adv*
PP   → Prep NP
AdjP → Adv? Adj
AdvP → Adv

Examples

iex> alias Nasty.AST.Token
iex> alias Nasty.Language.Spanish.PhraseParser
iex> tokens = [
...>   %Token{text: "la", pos_tag: :det},
...>   %Token{text: "casa", pos_tag: :noun},
...>   %Token{text: "roja", pos_tag: :adj}
...> ]
iex> PhraseParser.parse_noun_phrase(tokens, 0)
{:ok, noun_phrase, 3}  # Consumed 3 tokens
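
Every parser in this module returns either `{:ok, phrase, next_pos}` or `:error`, so callers can chain parsers by feeding each `next_pos` into the next call. The following is a minimal, self-contained sketch of that pattern. It does not use Nasty itself: `PositionThreadingSketch`, `expect/3`, and `det_then_noun/2` are hypothetical names, and tokens are plain maps rather than `Nasty.AST.Token` structs.

```elixir
defmodule PositionThreadingSketch do
  # Hypothetical stand-in parser: accepts one token with the given tag.
  # Tokens are plain maps here, not Nasty.AST.Token structs.
  def expect(tokens, pos, tag) do
    case Enum.at(tokens, pos) do
      %{pos_tag: ^tag} = token -> {:ok, token, pos + 1}
      _ -> :error
    end
  end

  # Chain two parsers: the second starts where the first stopped.
  # If either step returns :error, `with` returns that :error unchanged.
  def det_then_noun(tokens, start_pos) do
    with {:ok, det, pos} <- expect(tokens, start_pos, :det),
         {:ok, noun, next_pos} <- expect(tokens, pos, :noun) do
      {:ok, {det.text, noun.text}, next_pos}
    end
  end
end

tokens = [%{text: "la", pos_tag: :det}, %{text: "casa", pos_tag: :noun}]
IO.inspect(PositionThreadingSketch.det_then_noun(tokens, 0))
```

Because a failed step falls straight through `with`, the caller only has to handle the two documented return shapes.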

Summary

Functions

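parse_adjectival_phrase(tokens, start_pos)
Parses a Spanish adjectival phrase starting at the given position.

parse_adverbial_phrase(tokens, start_pos)
Parses a Spanish adverbial phrase (simple adverb for now).

parse_noun_phrase(tokens, start_pos)
Parses a Spanish noun phrase starting at the given position.

parse_prepositional_phrase(tokens, start_pos)
Parses a Spanish prepositional phrase starting at the given position.

parse_relative_clause(tokens, start_pos)
Parses a Spanish relative clause starting at the given position.

parse_verb_phrase(tokens, start_pos)
Parses a Spanish verb phrase starting at the given position.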

Functions

parse_adjectival_phrase(tokens, start_pos)

@spec parse_adjectival_phrase([Nasty.AST.Token.t()], non_neg_integer()) ::
  {:ok, Nasty.AST.AdjectivalPhrase.t(), non_neg_integer()} | :error

Parses a Spanish adjectival phrase starting at the given position.

Grammar: Adv? Adj

Examples: "muy bonita" (very pretty), "bastante grande" (quite big)

Returns {:ok, adj_phrase, next_pos} or :error
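
As an illustration of the `Adv? Adj` rule, here is a small stand-alone sketch. It is not the library's implementation: `AdjPSketch` is a hypothetical module, tokens are plain maps, the `:adv` tag is an assumption (only `:det`, `:noun`, and `:adj` appear in the documentation above), and the result is a plain map rather than a `Nasty.AST.AdjectivalPhrase` struct.

```elixir
defmodule AdjPSketch do
  # Grammar: Adv? Adj. The :adv tag name is an assumption.
  def parse(tokens, start_pos) do
    case Enum.drop(tokens, start_pos) do
      # Optional adverb present: consume two tokens.
      [%{pos_tag: :adv} = adv, %{pos_tag: :adj} = adj | _] ->
        {:ok, %{intensifier: adv.text, head: adj.text}, start_pos + 2}

      # Bare adjective: consume one token.
      [%{pos_tag: :adj} = adj | _] ->
        {:ok, %{intensifier: nil, head: adj.text}, start_pos + 1}

      _ ->
        :error
    end
  end
end

tokens = [%{text: "muy", pos_tag: :adv}, %{text: "bonita", pos_tag: :adj}]
IO.inspect(AdjPSketch.parse(tokens, 0))
```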

parse_adverbial_phrase(tokens, start_pos)

@spec parse_adverbial_phrase([Nasty.AST.Token.t()], non_neg_integer()) ::
  {:ok, Nasty.AST.AdverbialPhrase.t(), non_neg_integer()} | :error

Parses a Spanish adverbial phrase (simple adverb for now).

Grammar: Adv

Returns {:ok, adv_phrase, next_pos} or :error

parse_noun_phrase(tokens, start_pos)

@spec parse_noun_phrase([Nasty.AST.Token.t()], non_neg_integer()) ::
  {:ok, Nasty.AST.NounPhrase.t(), non_neg_integer()} | :error

Parses a Spanish noun phrase starting at the given position.

Grammar: Det? QuantAdj* (Noun | PropN | Pron) Adj* PP*

Spanish adjectives typically come AFTER the noun (post-nominal), but quantifying adjectives come before (e.g., "muchos", "pocos").

Returns {:ok, noun_phrase, next_pos} or :error
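
The pre-nominal versus post-nominal split can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not Nasty's code: `NPSketch` is hypothetical, tokens are plain maps, quantifiers are assumed to be tagged `:adj` and recognized from a small word list (the documentation names only "muchos" and "pocos"), the `:propn` and `:pron` tags are assumed, and the trailing `PP*` is left out for brevity.

```elixir
defmodule NPSketch do
  # Hypothetical quantifier list; the source only names "muchos" and "pocos".
  @quantifiers ~w(muchos muchas pocos pocas)
  # Assumed tag names for Noun | PropN | Pron.
  @head_tags [:noun, :propn, :pron]

  # Grammar covered: Det? QuantAdj* (Noun | PropN | Pron) Adj*  (PP* omitted)
  def parse(tokens, start_pos) do
    rest = Enum.drop(tokens, start_pos)
    {det, rest} = take_optional(rest, &(&1.pos_tag == :det))
    {quants, rest} = Enum.split_while(rest, &quantifier?/1)

    case rest do
      [%{pos_tag: tag} = head | after_head] when tag in @head_tags ->
        # Adjectives after the head are the post-nominal modifiers.
        {adjs, _} = Enum.split_while(after_head, &(&1.pos_tag == :adj))
        consumed = length(det) + length(quants) + 1 + length(adjs)
        phrase = %{det: texts(det), pre: texts(quants), head: head.text, post: texts(adjs)}
        {:ok, phrase, start_pos + consumed}

      _ ->
        :error
    end
  end

  # Consume at most one token that satisfies the predicate.
  defp take_optional([first | rest] = all, pred) do
    if pred.(first), do: {[first], rest}, else: {[], all}
  end

  defp take_optional([], _pred), do: {[], []}

  defp quantifier?(token) do
    token.pos_tag == :adj and String.downcase(token.text) in @quantifiers
  end

  defp texts(tokens), do: Enum.map(tokens, & &1.text)
end

casa = [
  %{text: "la", pos_tag: :det},
  %{text: "casa", pos_tag: :noun},
  %{text: "roja", pos_tag: :adj}
]

IO.inspect(NPSketch.parse(casa, 0))
```

In this sketch "roja" lands in `post` and a leading "muchos" would land in `pre`, mirroring the ordering described above.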

parse_prepositional_phrase(tokens, start_pos)

@spec parse_prepositional_phrase([Nasty.AST.Token.t()], non_neg_integer()) ::
  {:ok, Nasty.AST.PrepositionalPhrase.t(), non_neg_integer()} | :error

Parses a Spanish prepositional phrase starting at the given position.

Grammar: Prep NP

Spanish prepositions: a, ante, bajo, con, contra, de, desde, en, entre, hacia, hasta, para, por, según, sin, sobre, tras

Returns {:ok, prep_phrase, next_pos} or :error
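
One way to recognize the `Prep` position is a lexical lookup against the list above. Whether the parser relies on such a lookup or on the POS tag alone is not stated here, so treat this as an assumption; `PrepositionSketch` and `preposition?/1` are hypothetical names.

```elixir
defmodule PrepositionSketch do
  # The preposition list given in the documentation above.
  @prepositions ~w(a ante bajo con contra de desde en entre hacia hasta para por según sin sobre tras)

  # Case-insensitive membership test on a surface form.
  def preposition?(text) when is_binary(text) do
    String.downcase(text) in @prepositions
  end
end

words = ~w(La casa de Ana está En Madrid)
IO.inspect(Enum.filter(words, &PrepositionSketch.preposition?/1))
```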

parse_relative_clause(tokens, start_pos)

@spec parse_relative_clause([Nasty.AST.Token.t()], non_neg_integer()) ::
  {:ok, Nasty.AST.RelativeClause.t(), non_neg_integer()} | :error

Parses a Spanish relative clause starting at the given position.

Grammar: RelPron/RelAdv Clause

Relative pronouns: que, quien, quienes, cual, cuales, cuyo

Relative adverbs: donde, cuando, como

Returns {:ok, relative_clause, next_pos} or :error
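
The first step of this rule, deciding whether a token can open a relative clause, can be sketched as a lookup over the two word lists above. `RelativeMarkerSketch` and `classify/1` are hypothetical names, and the returned atoms are illustrative rather than values used by Nasty.

```elixir
defmodule RelativeMarkerSketch do
  # Word lists taken from the documentation above.
  @pronouns ~w(que quien quienes cual cuales cuyo)
  @adverbs ~w(donde cuando como)

  # Classify a surface form; the result atoms are illustrative only.
  def classify(text) when is_binary(text) do
    word = String.downcase(text)

    cond do
      word in @pronouns -> :relative_pronoun
      word in @adverbs -> :relative_adverb
      true -> :none
    end
  end
end

IO.inspect(Enum.map(~w(que donde casa), &RelativeMarkerSketch.classify/1))
```

Accented forms such as "qué" or "dónde" are interrogatives, not relatives, so in this sketch they fall through to `:none`.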

parse_verb_phrase(tokens, start_pos)

@spec parse_verb_phrase([Nasty.AST.Token.t()], non_neg_integer()) ::
  {:ok, Nasty.AST.VerbPhrase.t(), non_neg_integer()} | :error

Parses a Spanish verb phrase starting at the given position.

Grammar: Aux* MainVerb NP? PP* Adv*

Spanish verb phrases are similar to English, with:

  • Auxiliaries (haber, ser, estar) before main verb
  • Object NP after verb
  • PPs and adverbs as complements

Returns {:ok, verb_phrase, next_pos} or :error
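
A reduced version of this rule, covering only `Aux* MainVerb Adv*`, might look like the sketch below. It is illustrative and not the library's code: `VPSketch` is hypothetical, tokens are plain maps, the `:aux`, `:verb`, and `:adv` tag names are assumptions, and the `NP?` and `PP*` slots are omitted to keep the example short.

```elixir
defmodule VPSketch do
  # Grammar covered: Aux* MainVerb Adv*  (NP? and PP* omitted for brevity)
  def parse(tokens, start_pos) do
    rest = Enum.drop(tokens, start_pos)
    # Zero or more auxiliaries before the main verb.
    {auxes, rest} = Enum.split_while(rest, &(&1.pos_tag == :aux))

    case rest do
      [%{pos_tag: :verb} = verb | after_verb] ->
        # Zero or more adverbs directly after the main verb.
        {advs, _} = Enum.split_while(after_verb, &(&1.pos_tag == :adv))
        phrase = %{aux: texts(auxes), head: verb.text, adv: texts(advs)}
        {:ok, phrase, start_pos + length(auxes) + 1 + length(advs)}

      _ ->
        :error
    end
  end

  defp texts(tokens), do: Enum.map(tokens, & &1.text)
end

# "ha comido bien" (has eaten well)
tokens = [
  %{text: "ha", pos_tag: :aux},
  %{text: "comido", pos_tag: :verb},
  %{text: "bien", pos_tag: :adv}
]

IO.inspect(VPSketch.parse(tokens, 0))
```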