Nasty.Statistics.Neural.Pretrained (Nasty v0.3.0)
Integration with pre-trained transformer models via Bumblebee.
Provides access to state-of-the-art pre-trained models from HuggingFace for tasks like POS tagging, NER, and text classification.
Supported Models
- BERT (bert-base-uncased, bert-base-cased)
- RoBERTa (roberta-base, roberta-large)
- DistilBERT (distilbert-base-uncased)
- Custom fine-tuned models
Usage
alias Nasty.Statistics.Neural.Pretrained

# Load a pre-trained BERT model for POS tagging
{:ok, model} = Pretrained.load_model("bert-base-uncased", task: :pos_tagging)
# Fine-tune on your data
{:ok, fine_tuned} = Pretrained.fine_tune(model, training_data, epochs: 3)
# Use for prediction
{:ok, tags} = Pretrained.predict(fine_tuned, words)
Note
This module requires downloading models from HuggingFace. Models are cached locally after the first download.
Full implementation requires:
- Model downloading and caching
- Tokenization with Bumblebee tokenizers
- Fine-tuning interface
- Integration with existing pipeline
Future Enhancements
- Support for multilingual models (mBERT, XLM-R)
- Zero-shot classification
- Model quantization for efficiency
- Custom model registration
Summary
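Functions
fine_tune(model, training_data, opts)
Fine-tunes a pre-trained model on task-specific data.
list_models()
Lists available pre-trained models.
load_model(model_name, opts)
Loads a pre-trained model from Bumblebee/HuggingFace.
predict(model, input, opts)
Makes predictions using a pre-trained or fine-tuned model.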
Functions
fine_tune(model, training_data, opts)
Fine-tunes a pre-trained model on task-specific data.
Parameters
- model - Pre-trained model
- training_data - Task-specific training data
- opts - Fine-tuning options
Options
- :epochs - Number of epochs (default: 3)
- :learning_rate - Learning rate (default: 2e-5)
- :batch_size - Batch size (default: 16)
- :warmup_ratio - Warmup ratio (default: 0.1)
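To show how the fine-tuning options relate to each other, here is a minimal sketch that turns them into a step count and a learning-rate value per step. It assumes a linear warmup followed by a linear decay, which is a common choice for BERT-style fine-tuning but is not confirmed by this module's documentation. The WarmupSketch module and its function names are hypothetical; only the option names and defaults come from the list above.

```elixir
defmodule WarmupSketch do
  # Defaults copied from the documented fine-tuning options.
  @defaults [epochs: 3, learning_rate: 2.0e-5, batch_size: 16, warmup_ratio: 0.1]

  # Derives the total number of optimizer steps and the number of warmup steps.
  def plan(num_examples, opts \\ []) do
    opts = Keyword.merge(@defaults, opts)
    steps_per_epoch = ceil(num_examples / opts[:batch_size])
    total_steps = steps_per_epoch * opts[:epochs]
    warmup_steps = round(total_steps * opts[:warmup_ratio])

    %{total_steps: total_steps, warmup_steps: warmup_steps, peak_lr: opts[:learning_rate]}
  end

  # Assumed schedule: ramp linearly from 0 to the peak rate during warmup,
  # then decay linearly back to 0 at the final step.
  def learning_rate(step, %{warmup_steps: w, total_steps: total, peak_lr: peak}) do
    cond do
      w > 0 and step < w -> peak * (step / w)
      total > w -> peak * max(0.0, (total - step) / (total - w))
      true -> peak
    end
  end
end

plan = WarmupSketch.plan(1000)
IO.inspect(plan)
IO.inspect(WarmupSketch.learning_rate(plan.warmup_steps, plan))
```

With the defaults and 1000 training examples, each epoch takes 63 steps, so the run has 189 steps, of which the first 19 are warmup.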
Returns
- {:ok, fine_tuned_model} - Fine-tuned model
- {:error, reason} - Fine-tuning failed
@spec list_models() :: [map()]
Lists available pre-trained models.
Returns
A list of maps, one per available model, each holding the model name and its metadata.
load_model(model_name, opts)
Loads a pre-trained model from Bumblebee/HuggingFace.
Parameters
- model_name - Model identifier (e.g., "bert-base-uncased")
- opts - Loading options
Options
- :task - Task type: :pos_tagging, :ner, :classification
- :cache_dir - Model cache directory (default: ~/.cache/nasty/models)
- :device - Device to load on: :cpu or :cuda (default: :cpu)
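The loading options can be resolved against their defaults before any download starts. The sketch below is one possible way to do that with Keyword.validate/2 from the standard library (Elixir 1.13 or later). The LoadOptsSketch module, the error tuple shape, and the choice to treat :task as required are assumptions made for illustration, not behavior documented for load_model.

```elixir
defmodule LoadOptsSketch do
  # Allowed values taken from the documented options.
  @tasks [:pos_tagging, :ner, :classification]
  @devices [:cpu, :cuda]

  # Merges caller options with the documented defaults and validates them.
  # Assumption: :task has no default, so omitting it is reported as an error.
  def resolve(opts) do
    defaults = [task: nil, cache_dir: "~/.cache/nasty/models", device: :cpu]

    with {:ok, opts} <- Keyword.validate(opts, defaults),
         :ok <- check(:task, opts[:task], @tasks),
         :ok <- check(:device, opts[:device], @devices) do
      # Expand "~" so the cache directory becomes an absolute path.
      {:ok, Keyword.update!(opts, :cache_dir, &Path.expand/1)}
    end
  end

  # Hypothetical error shape: {:invalid_option, key, value}.
  defp check(key, value, allowed) do
    if value in allowed, do: :ok, else: {:error, {:invalid_option, key, value}}
  end
end

IO.inspect(LoadOptsSketch.resolve(task: :ner))
IO.inspect(LoadOptsSketch.resolve(task: :ner, device: :tpu))
```

Unknown keys are rejected by Keyword.validate/2 itself, which returns {:error, unknown_keys}, so a misspelled option such as :cache_path surfaces early instead of being silently ignored.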
Returns
- {:ok, model} - Loaded model
- {:error, reason} - Loading failed
Examples
{:ok, model} = Pretrained.load_model("bert-base-uncased", task: :pos_tagging)
predict(model, input, opts)
Makes predictions using a pre-trained or fine-tuned model.
Parameters
- model - Model (pre-trained or fine-tuned)
- input - Input text or tokens
- opts - Prediction options
Returns
- {:ok, predictions} - Model predictions
- {:error, reason} - Prediction failed
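Because load_model, fine_tune, and predict all return {:ok, value} or {:error, reason}, the three calls can be chained with `with` so that the first failure is returned unchanged rather than raising a MatchError. The sketch below is illustrative only: StubPretrained is a hypothetical stand-in that mimics the documented return shapes so the example runs without downloading a model, and PipelineSketch is not part of Nasty.

```elixir
defmodule StubPretrained do
  # Hypothetical stand-in with the same return shapes as the documented functions.
  def load_model("missing", _opts), do: {:error, :model_not_found}
  def load_model(name, opts), do: {:ok, %{name: name, task: opts[:task]}}

  def fine_tune(model, _training_data, opts) do
    {:ok, Map.put(model, :epochs, opts[:epochs])}
  end

  # Returns a placeholder tag per word; a real model would predict these.
  def predict(_model, words), do: {:ok, Enum.map(words, fn _ -> :noun end)}
end

defmodule PipelineSketch do
  # `pretrained` is any module exposing load_model/2, fine_tune/3 and predict/2.
  def tag(pretrained, model_name, training_data, words) do
    with {:ok, model} <- pretrained.load_model(model_name, task: :pos_tagging),
         {:ok, tuned} <- pretrained.fine_tune(model, training_data, epochs: 3),
         {:ok, tags} <- pretrained.predict(tuned, words) do
      {:ok, tags}
    end
  end
end

IO.inspect(PipelineSketch.tag(StubPretrained, "bert-base-uncased", [], ["cats", "sleep"]))
IO.inspect(PipelineSketch.tag(StubPretrained, "missing", [], ["cats"]))
```

Passing the module as an argument keeps the sketch testable; in application code the first argument would be Nasty.Statistics.Neural.Pretrained.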