Nasty.AST.Token (Nasty v0.3.0)

Token node representing a single word or punctuation mark.

Uses the Universal Dependencies POS tag set for cross-linguistic consistency.

Summary

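Types

morphology()

Morphological features following Universal Dependencies.

pos_tag()

Universal Dependencies POS tags.

t()

Functions

content_word?(pos_tag)

Checks whether a POS tag denotes a content word (open class).

function_word?(pos_tag)

Checks whether a POS tag denotes a function word (closed class).

new(text, pos_tag, language, span, opts \\ [])

Creates a new token.

pos_tags()

Returns all supported Universal Dependencies POS tags.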

Types

morphology()

@type morphology() :: %{required(atom()) => atom()}

Morphological features following Universal Dependencies.

Common features:

  • number: :singular | :plural

  • tense: :past | :present | :future

  • person: :first | :second | :third

  • case: :nominative | :accusative | :genitive, among others

  • gender: :masculine | :feminine | :neuter

  • mood: :indicative | :subjunctive | :imperative

  • voice: :active | :passive

Reference: https://universaldependencies.org/u/feat/
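
Because morphology() is a plain map from feature atoms to value atoms, it can be inspected with ordinary Map and Enum functions. The following is a minimal sketch of an agreement check between two feature maps. MorphologySketch and agree?/2 are hypothetical names used only for illustration and are not part of Nasty.

```elixir
defmodule MorphologySketch do
  # Hypothetical helper, not part of Nasty: two feature maps "agree" when
  # every feature present in both maps has the same value. Features that
  # appear in only one of the maps are ignored.
  def agree?(a, b) when is_map(a) and is_map(b) do
    Enum.all?(a, fn {feature, value} ->
      case Map.fetch(b, feature) do
        {:ok, other} -> other == value
        :error -> true
      end
    end)
  end
end

# Feature maps shaped like morphology(): atom keys, atom values.
subject = %{number: :plural, person: :third}
verb = %{number: :plural, person: :third, tense: :past}

MorphologySketch.agree?(subject, verb)
# => true
```

Ignoring features that only one side carries is a design choice of this sketch; a stricter check could require both maps to define the same keys.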

pos_tag()

@type pos_tag() ::
  :adj
  | :adp
  | :adv
  | :aux
  | :cconj
  | :det
  | :intj
  | :noun
  | :num
  | :part
  | :pron
  | :propn
  | :punct
  | :sconj
  | :sym
  | :verb
  | :x

Universal Dependencies POS tags.

Open Class Words (content)

  • :adj - Adjective
  • :adv - Adverb
  • :intj - Interjection
  • :noun - Noun
  • :propn - Proper noun
  • :verb - Verb

Closed Class Words (function)

  • :adp - Adposition (preposition/postposition)
  • :aux - Auxiliary verb
  • :cconj - Coordinating conjunction
  • :det - Determiner
  • :num - Numeral
  • :part - Particle
  • :pron - Pronoun
  • :sconj - Subordinating conjunction

Other

  • :punct - Punctuation
  • :sym - Symbol
  • :x - Other (foreign words, typos, etc.)

Reference: https://universaldependencies.org/u/pos/
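
The three groups listed above can be written down as a small lookup. This is a self-contained sketch that mirrors the lists in this section; PosClassSketch and class/1 are hypothetical names and do not claim to reflect how Nasty implements content_word?/1 or function_word?/1.

```elixir
defmodule PosClassSketch do
  # Tag lists copied from the open class, closed class, and other groups above.
  @open ~w(adj adv intj noun propn verb)a
  @closed ~w(adp aux cconj det num part pron sconj)a
  @other ~w(punct sym x)a

  # Hypothetical helper: map a POS tag to the group it is listed under.
  def class(tag) when tag in @open, do: :open
  def class(tag) when tag in @closed, do: :closed
  def class(tag) when tag in @other, do: :other
end

# Group a few tags by class.
grouped = Enum.group_by([:det, :noun, :verb, :punct], &PosClassSketch.class/1)
# => %{closed: [:det], open: [:noun, :verb], other: [:punct]}
```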

t()

@type t() :: %Nasty.AST.Token{
  language: Nasty.AST.Node.language(),
  lemma: String.t(),
  morphology: morphology(),
  pos_tag: pos_tag(),
  span: Nasty.AST.Node.span(),
  text: String.t()
}
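
Since a token is a struct, its fields can be destructured with pattern matching. The sketch below uses a stand-in struct, TokenSketch, with the same field names so that it runs without the library; with Nasty loaded, the same pattern should apply to %Nasty.AST.Token{}. The helper lemmas_with_tag/2 is hypothetical, and span is left unset because the shape of Nasty.AST.Node.span() is not shown in this section.

```elixir
defmodule TokenSketch do
  # Stand-in for %Nasty.AST.Token{} with the same field names (assumption:
  # used here only so the example is self-contained).
  defstruct [:language, :lemma, :morphology, :pos_tag, :span, :text]

  # Hypothetical helper: collect the lemmas of all tokens with the given tag.
  # Tokens whose pos_tag does not match the pinned tag are skipped.
  def lemmas_with_tag(tokens, tag) do
    for %TokenSketch{pos_tag: ^tag, lemma: lemma} <- tokens, do: lemma
  end
end

tokens = [
  %TokenSketch{text: "The", lemma: "the", pos_tag: :det, language: :en, morphology: %{}},
  %TokenSketch{text: "cats", lemma: "cat", pos_tag: :noun, language: :en, morphology: %{number: :plural}},
  %TokenSketch{text: "slept", lemma: "sleep", pos_tag: :verb, language: :en, morphology: %{tense: :past}}
]

TokenSketch.lemmas_with_tag(tokens, :noun)
# => ["cat"]
```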

Functions

content_word?(pos_tag)

@spec content_word?(pos_tag()) :: boolean()

Checks whether a POS tag denotes a content word (open class).

Examples

iex> Nasty.AST.Token.content_word?(:noun)
true
iex> Nasty.AST.Token.content_word?(:det)
false

function_word?(pos_tag)

@spec function_word?(pos_tag()) :: boolean()

Checks whether a POS tag denotes a function word (closed class).

Examples

iex> Nasty.AST.Token.function_word?(:det)
true
iex> Nasty.AST.Token.function_word?(:noun)
false

new(text, pos_tag, language, span, opts \\ [])

Creates a new token from the given text, POS tag, language, and span.

Examples

iex> span = Nasty.AST.Node.make_span({1, 0}, 0, {1, 3}, 3)
iex> token = Nasty.AST.Token.new("cat", :noun, :en, span)
iex> token.text
"cat"
iex> token.pos_tag
:noun

pos_tags()

@spec pos_tags() :: [pos_tag()]

Returns all supported Universal Dependencies POS tags.