Nasty.Language.English.TransformerNER (Nasty v0.3.0)
Transformer-based Named Entity Recognition for English.
Uses pre-trained transformer models fine-tuned for NER to identify and classify named entities (persons, organizations, locations, etc.) using the BIO (Begin-Inside-Outside) tagging scheme.
Expected F1 scores: 93-95% on CoNLL-2003.
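To make the BIO scheme mentioned above concrete, here is a minimal sketch of how per-token BIO tags can be grouped into entity spans. It is not Nasty's implementation: the `BioSketch` module, its `decode/1` function, and the `%{type: ..., text: ...}` result shape are hypothetical names chosen for illustration, and the input is assumed to be a list of `{token_text, bio_tag}` pairs.

```elixir
defmodule BioSketch do
  @moduledoc "Illustrative only: groups BIO-tagged tokens into entity spans."

  # Takes a list of {token_text, bio_tag} pairs and returns a list of
  # %{type: type, text: text} maps, one per entity, in input order.
  def decode(tagged_tokens) do
    {entities, current} =
      Enum.reduce(tagged_tokens, {[], nil}, fn {text, tag}, {done, current} ->
        case {tag, current} do
          # "B-X" always starts a new entity, closing any open one.
          {"B-" <> type, _} ->
            {flush(done, current), {type, [text]}}

          # "I-X" extends the open entity only when the types match.
          {"I-" <> type, {open_type, words}} when type == open_type ->
            {done, {type, [text | words]}}

          # A stray "I-X" with no matching open entity starts a new one.
          {"I-" <> type, _} ->
            {flush(done, current), {type, [text]}}

          # "O" (or any other tag) closes any open entity.
          _ ->
            {flush(done, current), nil}
        end
      end)

    entities |> flush(current) |> Enum.reverse()
  end

  # Moves the open entity (if any) onto the list of finished entities.
  defp flush(done, nil), do: done

  defp flush(done, {type, words}) do
    [%{type: type, text: words |> Enum.reverse() |> Enum.join(" ")} | done]
  end
end

# Example input, hand-tagged for illustration.
BioSketch.decode([{"John", "B-PER"}, {"lives", "O"}, {"in", "O"}, {"Paris", "B-LOC"}])
```

Treating a stray `I-` tag as the start of a new entity is one common convention for handling inconsistent model output; dropping such tokens is an equally valid choice.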
Summary
Functions
label_map/0 - Gets the label map (ID to BIO tag).
num_labels/0 - Returns the number of NER labels.
recognize_entities/2 - Recognizes named entities in tokens using a transformer model.
tag_to_id/0 - Gets the tag to ID map (BIO tag to ID).
Functions
label_map()
Gets the label map (ID to BIO tag).
Examples
TransformerNER.label_map()
# => %{0 => "O", 1 => "B-PER", 2 => "I-PER", ...}
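A map of this shape is typically used to turn a model's predicted class IDs back into BIO tags. The sketch below assumes that usage; the `LabelLookupSketch` module and `ids_to_tags/2` function are hypothetical, and the map contains only the three entries visible in the example above, not the full label set.

```elixir
defmodule LabelLookupSketch do
  # Converts a list of integer class IDs into BIO tag strings.
  # Map.fetch!/2 raises KeyError on an unknown ID, which surfaces a
  # mismatch between the model's output and the label map early.
  def ids_to_tags(ids, label_map) do
    Enum.map(ids, &Map.fetch!(label_map, &1))
  end
end

# Only the entries shown in the documented example; the real map has more.
partial_label_map = %{0 => "O", 1 => "B-PER", 2 => "I-PER"}

# Hypothetical per-token predictions, e.g. the argmax over a model's logits.
LabelLookupSketch.ids_to_tags([1, 2, 0], partial_label_map)
```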
@spec num_labels() :: integer()
Returns the number of NER labels.
Examples
TransformerNER.num_labels()
# => 9
@spec recognize_entities([Nasty.AST.Token.t()], keyword()) ::
  {:ok, [Nasty.AST.Semantic.Entity.t()]} | {:error, term()}
Recognizes named entities in tokens using a transformer model.
Options
:model - Model to use: an atom name (e.g., :roberta_base) or :transformer (uses the default model)
:cache_dir - Directory for model caching
:device - Device to use (:cpu or :cuda, default: :cpu)
:use_cache - Whether to use prediction caching (default: true)
Examples
{:ok, tokens} = Tokenizer.tokenize("John lives in Paris")
{:ok, entities} = TransformerNER.recognize_entities(tokens)
# Use specific model
{:ok, entities} = TransformerNER.recognize_entities(tokens, model: :bert_base_cased)
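The examples above pattern-match on `{:ok, entities}` directly, which raises if the call fails. Because the spec also allows `{:error, term()}`, callers may prefer to handle both shapes. The following is a minimal sketch of that; `NerResultSketch` and `summarize/1` are hypothetical names, and `:model_not_found` is an invented error reason used only for illustration.

```elixir
defmodule NerResultSketch do
  # Handles the two documented return shapes of recognize_entities:
  # {:ok, entities} | {:error, reason}.
  def summarize({:ok, entities}) when is_list(entities) do
    "found #{length(entities)} entities"
  end

  def summarize({:error, reason}) do
    "NER failed: #{inspect(reason)}"
  end
end

# With Nasty available, the result could be piped straight in, for example:
#
#   tokens
#   |> TransformerNER.recognize_entities(device: :cpu, use_cache: false)
#   |> NerResultSketch.summarize()
#
# Stand-alone demonstration with a hypothetical error reason:
NerResultSketch.summarize({:error, :model_not_found})
```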
tag_to_id()
Gets the tag to ID map (BIO tag to ID).
Examples
TransformerNER.tag_to_id()
# => %{o: 0, b_per: 1, i_per: 2, ...}