Nasty Examples Catalog
Comprehensive catalog of all example scripts demonstrating Nasty's capabilities.
Quick Start
All examples can be run directly:
elixir examples/example_name.exs
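Or make them executable and run them directly (this relies on the #!/usr/bin/env elixir shebang line at the top of each script):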
chmod +x examples/example_name.exs
./examples/example_name.exs
Basic Examples
tokenizer_example.exs
Purpose: Introduction to tokenization
What it demonstrates:
- Basic tokenization with NimbleParsec
- Position tracking (line, column, byte offsets)
- Handling contractions (don't, it's)
- Punctuation as separate tokens
- Sentence boundary detection
Run:
elixir examples/tokenizer_example.exs
Best for: Understanding the first step in the NLP pipeline
hmm_pos_tagger_example.exs
Purpose: Statistical POS tagging with Hidden Markov Models
What it demonstrates:
- Training HMM POS taggers from CoNLL-U data
- Viterbi algorithm for sequence tagging
- Model evaluation and accuracy metrics
- Comparison with rule-based tagging
- Model persistence (save/load)
Run:
elixir examples/hmm_pos_tagger_example.exs
Best for: Learning about statistical NLP models
neural_pos_tagger_example.exs
Purpose: Neural POS tagging with BiLSTM-CRF
What it demonstrates:
- BiLSTM-CRF architecture with Axon/EXLA
- Training neural models on UD corpora
- Character-level embeddings for OOV handling
- GPU acceleration with EXLA
- 97-98% accuracy on benchmark datasets
Run:
elixir examples/neural_pos_tagger_example.exs
Best for: Understanding deep learning for NLP
Language-Specific Examples
spanish_example.exs
Purpose: Spanish language processing
What it demonstrates:
- Spanish tokenization (inverted punctuation ¿? and ¡!; the contractions del and al)
- Spanish POS tagging with morphology
- Gender/number agreement
- Parsing Spanish sentence structure
- Entity recognition with Spanish lexicons
Run:
elixir examples/spanish_example.exs
Best for: Working with Romance languages
catalan_example.exs
Purpose: Catalan language processing
What it demonstrates:
- Catalan-specific tokenization (interpunct l·l, apostrophes)
- All 10 Catalan letters with diacritics (à, è, é, í, ï, ò, ó, ú, ü, ç)
- Article contractions (del, al, pel, cal)
- Catalan morphology and POS tagging
- Entity recognition with Catalan lexicons
- Translation between Catalan and English
Run:
elixir examples/catalan_example.exs
Best for: Catalan NLP applications
Translation Examples
translation_example.exs
Purpose: Basic AST-based translation
What it demonstrates:
- English ↔ Spanish translation
- AST-level translation preserving grammar
- Morphological agreement enforcement
- Word order transformations
- Rendering translated AST to text
Run:
elixir examples/translation_example.exs
Best for: Getting started with translation
roundtrip_translation.exs
Purpose: Translation quality analysis
What it demonstrates:
- English → Spanish → English roundtrips
- English → Catalan → English roundtrips
- Spanish → English → Spanish roundtrips
- Similarity metrics and quality assessment
- Challenging translation cases
- Performance across complexity levels
Run:
elixir examples/roundtrip_translation.exs
Best for: Evaluating translation quality
multilingual_pipeline.exs
Purpose: Side-by-side multilingual comparison
What it demonstrates:
- Processing same content in English, Spanish, Catalan
- Token-level comparison across languages
- POS tagging differences
- Morphological feature comparison
- Translation matrix (all language pairs)
- Performance benchmarking
- Language-specific features summary
Run:
elixir examples/multilingual_pipeline.exs
Best for: Understanding cross-language differences
Advanced NLP Tasks
summarization.exs
Purpose: Extractive text summarization
What it demonstrates:
- Position-weighted sentence scoring
- Entity density calculation
- Discourse marker detection
- Keyword frequency (term frequency, TF)
- MMR (Maximal Marginal Relevance) for diversity
- Summary length set by compression ratio vs. fixed sentence count
Run:
elixir examples/summarization.exs
Best for: Document summarization applications
question_answering.exs
Purpose: Extractive question answering
What it demonstrates:
- Question classification (WHO, WHAT, WHEN, WHERE, WHY, HOW)
- Answer extraction strategies
- Entity type filtering
- Keyword matching with lemmatization
- Confidence scoring
- Multiple answer support
Run:
elixir examples/question_answering.exs
Best for: Building Q&A systems
text_classification.exs
Purpose: Document classification
What it demonstrates:
- Multinomial Naive Bayes classifier
- Feature extraction (bag-of-words, n-grams, POS patterns, entities, lexical features)
- Training on labeled data
- Multi-class classification
- Model evaluation (accuracy, precision, recall, F1)
- Sentiment analysis example
Run:
elixir examples/text_classification.exs
Best for: Text categorization tasks
information_extraction.exs
Purpose: Structured information extraction
What it demonstrates:
- Relation extraction (employment, organization, location)
- Event extraction (acquisitions, foundings, announcements)
- Template-based extraction
- Pattern matching with verb patterns
- Confidence scoring
- Integration with NER and dependencies
Run:
elixir examples/information_extraction.exs
Best for: Knowledge base construction
Code Interoperability
code_generation.exs
Purpose: Natural language to code
What it demonstrates:
- Intent recognition from natural language
- Constraint extraction (comparison, property, range)
- Elixir code generation
- List operations (sort, filter, map, reduce)
- Arithmetic expressions
- Conditional statements
Run:
elixir examples/code_generation.exs
Best for: Natural language programming interfaces
code_explanation.exs
Purpose: Code to natural language
What it demonstrates:
- Elixir AST parsing
- Code explanation generation
- Pipeline explanation
- Function call description
- Variable usage analysis
Run:
elixir examples/code_explanation.exs
Best for: Code documentation and understanding
Neural Network Examples
pretrained_model_usage.exs
Purpose: Using pre-trained transformers
What it demonstrates:
- BERT and RoBERTa via Bumblebee
- Fine-tuning for POS tagging and NER
- Zero-shot classification
- Model quantization (INT8)
- Multilingual models (XLM-RoBERTa)
Run:
elixir examples/pretrained_model_usage.exs
Best for: Leveraging pre-trained models
transformer_pos_example.exs
Purpose: Transformer-based POS tagging
What it demonstrates:
- RoBERTa for POS tagging
- Fine-tuning transformers
- 98-99% accuracy
- Cross-lingual transfer
- Model comparison
Run:
elixir examples/transformer_pos_example.exs
Best for: State-of-the-art accuracy
advanced_neural_features.exs
Purpose: Advanced neural NLP features
What it demonstrates:
- Multiple neural architectures
- Ensemble methods
- Model quantization
- Zero-shot learning
- Cross-lingual transfer
- Performance optimization
Run:
elixir examples/advanced_neural_features.exs
Best for: Production neural NLP systems
Comprehensive Demos
comprehensive_demo.exs
Purpose: Complete NLP pipeline walkthrough
What it demonstrates:
- Full pipeline from tokenization to summarization
- All major NLP tasks
- Entity recognition
- Dependency extraction
- Semantic role labeling
- Coreference resolution
- Information extraction
Run:
./examples/comprehensive_demo.exs
Best for: Overview of all capabilities
Example Selection Guide
By Use Case
Text Analysis:
- tokenizer_example.exs
- hmm_pos_tagger_example.exs
- comprehensive_demo.exs
Machine Learning:
- neural_pos_tagger_example.exs
- transformer_pos_example.exs
- text_classification.exs
- advanced_neural_features.exs
Multilingual:
- spanish_example.exs
- catalan_example.exs
- translation_example.exs
- roundtrip_translation.exs
- multilingual_pipeline.exs
Information Extraction:
- question_answering.exs
- information_extraction.exs
- summarization.exs
Code Integration:
- code_generation.exs
- code_explanation.exs
By Difficulty Level
Beginner:
- tokenizer_example.exs
- spanish_example.exs
- translation_example.exs
- summarization.exs
Intermediate:
- hmm_pos_tagger_example.exs
- catalan_example.exs
- question_answering.exs
- text_classification.exs
- multilingual_pipeline.exs
Advanced:
- neural_pos_tagger_example.exs
- information_extraction.exs
- transformer_pos_example.exs
- advanced_neural_features.exs
- roundtrip_translation.exs
By Processing Time
Fast (<1 second):
- tokenizer_example.exs
- translation_example.exs
- spanish_example.exs
Medium (1-10 seconds):
- catalan_example.exs
- multilingual_pipeline.exs
- summarization.exs
- question_answering.exs
Slow (>10 seconds):
- hmm_pos_tagger_example.exs (if training)
- neural_pos_tagger_example.exs
- transformer_pos_example.exs
- roundtrip_translation.exs
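The buckets above can be checked on your own machine with a small timing helper. This is only a sketch: classify_seconds and time_example are hypothetical names, not part of Nasty, and the measurement uses whole seconds, so a run that takes just under a second may occasionally be reported as Medium. The first run of a script may also include dependency setup time.

```shell
# Hypothetical helper: map an elapsed time in whole seconds
# to the catalog's buckets (<1 s, 1-10 s, >10 s).
classify_seconds() {
    if [ "$1" -lt 1 ]; then
        echo "Fast"
    elif [ "$1" -le 10 ]; then
        echo "Medium"
    else
        echo "Slow"
    fi
}

# Hypothetical helper: run a command, discard its output,
# and print the bucket for its wall-clock time (1 s resolution).
time_example() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    classify_seconds $((end - start))
}
```

For instance, time_example elixir examples/tokenizer_example.exs prints one of Fast, Medium, or Slow.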
Running Multiple Examples
Run all basic examples:
for example in tokenizer_example spanish_example translation_example; do
  echo "Running ${example}..."
  elixir "examples/${example}.exs"
  echo "---"
done
Run all translation examples:
for example in translation_example roundtrip_translation multilingual_pipeline; do
  elixir "examples/${example}.exs"
done
Run all language-specific examples:
elixir examples/spanish_example.exs
elixir examples/catalan_example.exs
elixir examples/multilingual_pipeline.exs
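The loops above keep going when a script fails but do not report which one failed. A minimal sketch of a runner that records failures and prints a summary follows. The function name run_examples and the RUNNER and EXAMPLES_DIR variables are assumptions made for illustration; they are not provided by Nasty.

```shell
# Hypothetical helper, not part of Nasty.
# RUNNER is the command used to execute each script (default: elixir).
# EXAMPLES_DIR is the directory holding the .exs files (default: examples).
RUNNER="${RUNNER:-elixir}"
EXAMPLES_DIR="${EXAMPLES_DIR:-examples}"

# Run each named example (given without the .exs suffix),
# continue past failures, and list the failed ones at the end.
run_examples() {
    failed=""
    for example in "$@"; do
        echo "Running ${example}..."
        if ! "$RUNNER" "${EXAMPLES_DIR}/${example}.exs"; then
            failed="${failed} ${example}"
        fi
        echo "---"
    done
    if [ -n "$failed" ]; then
        echo "Failed:${failed}"
        return 1
    fi
    echo "All examples passed"
}
```

Calling run_examples tokenizer_example spanish_example translation_example runs the basic set. The non-zero return status on failure makes the helper usable in CI scripts.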
Expected Output
Typical Output Format
Most examples output:
- Section headers: Clearly marked sections
- Input text: What's being processed
- Results: Parsed output, tags, entities, etc.
- Statistics: Counts, accuracy, timing
- Summary: Key takeaways
Example Output Snippet
========================================
Spanish Language Processing Demo
========================================
1. Tokenization
---------------
Input: El gato duerme en el sofá.
Tokens:
El (1:1)
gato (1:4)
duerme (1:9)
...
2. POS Tagging
--------------
Tagged tokens:
El → det
gato → noun
duerme → verb
...
Troubleshooting
Common Issues
Example won't run:
# Make sure dependencies are installed
mix deps.get
mix compile
# Check file permissions
chmod +x examples/example_name.exs
Missing models: Some examples (neural, transformer) require trained models. See TRAINING_NEURAL.md for training instructions.
Out of memory: Neural/transformer examples may need more memory. Reduce batch size or use smaller models.
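When an example will not run directly, the file-level checks can be scripted. The sketch below uses check_example, a hypothetical helper name that is not part of Nasty, and assumes that directly executed scripts start with a shebang line like the one in the template in this document.

```shell
# Hypothetical pre-flight check for one example script.
# Prints a one-line diagnosis and returns non-zero on the first problem found.
check_example() {
    script="$1"
    if [ ! -f "$script" ]; then
        echo "missing: $script"
        return 1
    fi
    if [ ! -x "$script" ]; then
        echo "not executable: $script (run: chmod +x $script)"
        return 1
    fi
    # Direct execution relies on a shebang such as '#!/usr/bin/env elixir'.
    first_line=$(head -n 1 "$script")
    case "$first_line" in
        '#!'*) ;;
        *) echo "no shebang line: $script"
           return 1 ;;
    esac
    echo "ok: $script"
}
```

A script that fails the last two checks can still be run with elixir examples/example_name.exs, since that form needs neither the executable bit nor the shebang.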
Creating Your Own Examples
Template for new examples:
#!/usr/bin/env elixir
# Your Example Name
#
# Brief description of what this example demonstrates
Mix.install([
  {:nasty, path: Path.expand("..", __DIR__)}
])
alias Nasty.Language.English
IO.puts("\n========================================")
IO.puts("Your Example Title")
IO.puts("========================================\n")
# Example 1: First concept
IO.puts("1. First Section")
IO.puts("----------------")
# Your code here
# Example 2: Second concept
IO.puts("\n2. Second Section")
IO.puts("-----------------")
# Your code here
IO.puts("\n========================================")
IO.puts("Example Complete!")
IO.puts("========================================\n")
See Also
- GETTING_STARTED.md - Tutorial for beginners
- USER_GUIDE.md - Comprehensive usage guide
- API.md - API reference
- TRANSLATION.md - Translation system guide