Nasty.Language.English.TransformerPOSTagger (Nasty v0.3.0)
Transformer-based Part-of-Speech tagger for English.
Uses pre-trained transformer models (BERT, RoBERTa, etc.) fine-tuned for POS tagging to achieve state-of-the-art accuracy (98-99%).
The tagger supports multiple transformer models and provides seamless integration with the existing Nasty POS tagging API.
Summary
Functions
label_map()
Gets the label map (ID to UPOS tag).
num_labels()
Returns the number of POS labels.
tag_pos(tokens, opts \\ [])
Tags tokens with POS tags using a transformer model.
tag_to_id()
Gets the tag to ID map (UPOS tag to ID).
Functions
label_map()
Gets the label map (ID to UPOS tag).
Examples
TransformerPOSTagger.label_map()
# => %{0 => "ADJ", 1 => "ADP", ...}
num_labels()
@spec num_labels() :: integer()
Returns the number of POS labels.
Examples
TransformerPOSTagger.num_labels()
# => 17
tag_pos(tokens, opts \\ [])
@spec tag_pos([Nasty.AST.Token.t()], keyword()) ::
        {:ok, [Nasty.AST.Token.t()]} | {:error, term()}
Tags tokens with POS tags using a transformer model.
Options
:model - Model to use: an atom name (e.g., :roberta_base) or :transformer (uses the default model)
:cache_dir - Directory for model caching
:device - Device to use (:cpu or :cuda, default: :cpu)
:use_cache - Whether to use prediction caching (default: true)
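The documented defaults can be pictured as a keyword list that caller options are merged over. The sketch below is only an illustration of that idea, not Nasty's actual implementation: the module `TagPosOptionsSketch`, the function `resolve/1`, and the `{:invalid_device, other}` error shape are hypothetical, and treating `:transformer` as the default value of `:model` is an assumption.

```elixir
defmodule TagPosOptionsSketch do
  # Hypothetical helper, not part of Nasty.
  # :device and :use_cache defaults come from the option list above;
  # using :transformer as the default :model is an assumption.
  @defaults [model: :transformer, device: :cpu, use_cache: true]

  # Merges caller options over the defaults and validates :device.
  def resolve(opts \\ []) do
    merged = Keyword.merge(@defaults, opts)

    case Keyword.fetch!(merged, :device) do
      device when device in [:cpu, :cuda] -> {:ok, merged}
      other -> {:error, {:invalid_device, other}}
    end
  end
end
```

With this sketch, `TagPosOptionsSketch.resolve(device: :cuda)` keeps `use_cache: true` and overrides only the device, while an unknown device is reported as an error tuple rather than raising.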
Examples
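alias Nasty.Language.English.TransformerPOSTagger
# Tokenizer is assumed to be aliased to the English tokenizer module
{:ok, tokens} = Tokenizer.tokenize("The cat sat")
{:ok, tagged} = TransformerPOSTagger.tag_pos(tokens)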
# Use specific model
{:ok, tagged} = TransformerPOSTagger.tag_pos(tokens, model: :bert_base_cased)
# Disable caching for variable inputs
{:ok, tagged} = TransformerPOSTagger.tag_pos(tokens, use_cache: false)
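The examples above match only on `{:ok, tagged}`, but the spec also allows `{:error, term()}`. A minimal sketch of handling both shapes follows. To keep it runnable without a loaded model, the tagger is passed in as a function: `TagPosResultSketch` and `describe/3` are hypothetical names, and in real use the `tagger` argument would be something like `&TransformerPOSTagger.tag_pos/2`.

```elixir
defmodule TagPosResultSketch do
  # Hypothetical helper, not part of Nasty.
  # `tagger` is any 2-arity function with the same return shape as tag_pos/2:
  # {:ok, tagged_tokens} | {:error, reason}.
  def describe(tokens, opts, tagger) do
    case tagger.(tokens, opts) do
      {:ok, tagged} -> "tagged #{length(tagged)} tokens"
      {:error, reason} -> "tagging failed: #{inspect(reason)}"
    end
  end
end
```

Passing the tagger as an argument is only a convenience for the sketch; calling `tag_pos/2` directly inside the `case` works the same way.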
tag_to_id()
Gets the tag to ID map (UPOS tag to ID).
Examples
TransformerPOSTagger.tag_to_id()
# => %{adj: 0, adp: 1, ...}
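The two maps are inverses of each other, and the label count is the size of either one. The sketch below shows that relationship with a standalone module. It is not Nasty's code: `UposLabelsSketch` is a hypothetical name, and the alphabetical ordering of the 17 Universal Dependencies UPOS tags is an assumption, since the examples here only confirm that ID 0 is ADJ and ID 1 is ADP.

```elixir
defmodule UposLabelsSketch do
  # Hypothetical module, not part of Nasty.
  # The 17 Universal Dependencies UPOS tags. Alphabetical ID assignment is an
  # assumption; the documented examples only show 0 => "ADJ" and 1 => "ADP".
  @tags ~w(ADJ ADP ADV AUX CCONJ DET INTJ NOUN NUM PART PRON PROPN PUNCT SCONJ SYM VERB X)

  # ID => UPOS tag string, e.g. %{0 => "ADJ", 1 => "ADP", ...}
  def label_map do
    @tags
    |> Enum.with_index()
    |> Map.new(fn {tag, id} -> {id, tag} end)
  end

  # Lowercase tag atom => ID, e.g. %{adj: 0, adp: 1, ...}
  def tag_to_id do
    Map.new(label_map(), fn {id, tag} ->
      {tag |> String.downcase() |> String.to_atom(), id}
    end)
  end

  # Number of distinct labels.
  def num_labels, do: map_size(label_map())
end
```

A round trip such as looking up `:noun` in `tag_to_id/0` and then that ID in `label_map/0` returns `"NOUN"`, which is the property a decoder relies on when turning model output IDs back into tags.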