Nasty Examples Catalog

View Source

Comprehensive catalog of all example scripts demonstrating Nasty's capabilities.

Quick Start

All examples can be run directly:

elixir examples/example_name.exs

Or make them executable:

chmod +x examples/example_name.exs
./examples/example_name.exs

Basic Examples

tokenizer_example.exs

Purpose: Introduction to tokenization

What it demonstrates:

  • Basic tokenization with NimbleParsec
  • Position tracking (line, column, byte offsets)
  • Handling contractions (don't, it's)
  • Punctuation as separate tokens
  • Sentence boundary detection

Run:

elixir examples/tokenizer_example.exs

Best for: Understanding the first step in the NLP pipeline


hmm_pos_tagger_example.exs

Purpose: Statistical POS tagging with Hidden Markov Models

What it demonstrates:

  • Training HMM POS taggers from CoNLL-U data
  • Viterbi algorithm for sequence tagging
  • Model evaluation and accuracy metrics
  • Comparison with rule-based tagging
  • Model persistence (save/load)

Run:

elixir examples/hmm_pos_tagger_example.exs

Best for: Learning about statistical NLP models


neural_pos_tagger_example.exs

Purpose: Neural POS tagging with BiLSTM-CRF

What it demonstrates:

  • BiLSTM-CRF architecture with Axon/EXLA
  • Training neural models on UD corpora
  • Character-level embeddings for OOV handling
  • GPU acceleration with EXLA
  • 97-98% accuracy on benchmark datasets

Run:

elixir examples/neural_pos_tagger_example.exs

Best for: Understanding deep learning for NLP


Language-Specific Examples

spanish_example.exs

Purpose: Spanish language processing

What it demonstrates:

  • Spanish tokenization (¿?, ¡!, del, al contractions)
  • Spanish POS tagging with morphology
  • Gender/number agreement
  • Parsing Spanish sentence structure
  • Entity recognition with Spanish lexicons

Run:

elixir examples/spanish_example.exs

Best for: Working with Romance languages


catalan_example.exs

Purpose: Catalan language processing

What it demonstrates:

  • Catalan-specific tokenization (interpunct l·l, apostrophes)
  • All 10 Catalan diacritics (à, è, é, í, ï, ò, ó, ú, ü, ç)
  • Article contractions (del, al, pel, cal)
  • Catalan morphology and POS tagging
  • Entity recognition with Catalan lexicons
  • Translation between Catalan and English

Run:

elixir examples/catalan_example.exs

Best for: Catalan NLP applications


Translation Examples

translation_example.exs

Purpose: Basic AST-based translation

What it demonstrates:

  • English ↔ Spanish translation
  • AST-level translation preserving grammar
  • Morphological agreement enforcement
  • Word order transformations
  • Rendering translated AST to text

Run:

elixir examples/translation_example.exs

Best for: Getting started with translation


roundtrip_translation.exs

Purpose: Translation quality analysis

What it demonstrates:

  • English → Spanish → English roundtrips
  • English → Catalan → English roundtrips
  • Spanish → English → Spanish roundtrips
  • Similarity metrics and quality assessment
  • Challenging translation cases
  • Performance across complexity levels

Run:

elixir examples/roundtrip_translation.exs

Best for: Evaluating translation quality


multilingual_pipeline.exs

Purpose: Side-by-side multilingual comparison

What it demonstrates:

  • Processing same content in English, Spanish, Catalan
  • Token-level comparison across languages
  • POS tagging differences
  • Morphological feature comparison
  • Translation matrix (all language pairs)
  • Performance benchmarking
  • Language-specific features summary

Run:

elixir examples/multilingual_pipeline.exs

Best for: Understanding cross-language differences


Advanced NLP Tasks

summarization.exs

Purpose: Extractive text summarization

What it demonstrates:

  • Position-weighted sentence scoring
  • Entity density calculation
  • Discourse marker detection
  • Keyword frequency (TF)
  • MMR (Maximal Marginal Relevance) for diversity
  • Compression ratio vs. fixed sentence count

Run:

elixir examples/summarization.exs

Best for: Document summarization applications


question_answering.exs

Purpose: Extractive question answering

What it demonstrates:

  • Question classification (WHO, WHAT, WHEN, WHERE, WHY, HOW)
  • Answer extraction strategies
  • Entity type filtering
  • Keyword matching with lemmatization
  • Confidence scoring
  • Multiple answer support

Run:

elixir examples/question_answering.exs

Best for: Building Q&A systems


text_classification.exs

Purpose: Document classification

What it demonstrates:

  • Multinomial Naive Bayes classifier
  • Feature extraction (BOW, n-grams, POS patterns, entities, lexical)
  • Training on labeled data
  • Multi-class classification
  • Model evaluation (accuracy, precision, recall, F1)
  • Sentiment analysis example

Run:

elixir examples/text_classification.exs

Best for: Text categorization tasks


information_extraction.exs

Purpose: Structured information extraction

What it demonstrates:

  • Relation extraction (employment, organization, location)
  • Event extraction (acquisitions, foundings, announcements)
  • Template-based extraction
  • Pattern matching with verb patterns
  • Confidence scoring
  • Integration with NER and dependencies

Run:

elixir examples/information_extraction.exs

Best for: Knowledge base construction


Code Interoperability

code_generation.exs

Purpose: Natural language to code

What it demonstrates:

  • Intent recognition from natural language
  • Constraint extraction (comparison, property, range)
  • Elixir code generation
  • List operations (sort, filter, map, reduce)
  • Arithmetic expressions
  • Conditional statements

Run:

elixir examples/code_generation.exs

Best for: Natural language programming interfaces


code_explanation.exs

Purpose: Code to natural language

What it demonstrates:

  • Elixir AST parsing
  • Code explanation generation
  • Pipeline explanation
  • Function call description
  • Variable usage analysis

Run:

elixir examples/code_explanation.exs

Best for: Code documentation and understanding


Neural Network Examples

pretrained_model_usage.exs

Purpose: Using pre-trained transformers

What it demonstrates:

  • BERT and RoBERTa via Bumblebee
  • Fine-tuning for POS tagging and NER
  • Zero-shot classification
  • Model quantization (INT8)
  • Multilingual models (XLM-RoBERTa)

Run:

elixir examples/pretrained_model_usage.exs

Best for: Leveraging pre-trained models


transformer_pos_example.exs

Purpose: Transformer-based POS tagging

What it demonstrates:

  • RoBERTa for POS tagging
  • Fine-tuning transformers
  • 98-99% accuracy
  • Cross-lingual transfer
  • Model comparison

Run:

elixir examples/transformer_pos_example.exs

Best for: State-of-the-art accuracy


advanced_neural_features.exs

Purpose: Advanced neural NLP features

What it demonstrates:

  • Multiple neural architectures
  • Ensemble methods
  • Model quantization
  • Zero-shot learning
  • Cross-lingual transfer
  • Performance optimization

Run:

elixir examples/advanced_neural_features.exs

Best for: Production neural NLP systems


Comprehensive Demos

comprehensive_demo.exs

Purpose: Complete NLP pipeline walkthrough

What it demonstrates:

  • Full pipeline from tokenization to summarization
  • All major NLP tasks
  • Entity recognition
  • Dependency extraction
  • Semantic role labeling
  • Coreference resolution
  • Information extraction

Run:

./examples/comprehensive_demo.exs

Best for: Overview of all capabilities


Example Selection Guide

By Use Case

Text Analysis:

  • tokenizer_example.exs
  • hmm_pos_tagger_example.exs
  • comprehensive_demo.exs

Machine Learning:

  • neural_pos_tagger_example.exs
  • transformer_pos_example.exs
  • text_classification.exs
  • advanced_neural_features.exs

Multilingual:

  • spanish_example.exs
  • catalan_example.exs
  • translation_example.exs
  • roundtrip_translation.exs
  • multilingual_pipeline.exs

Information Extraction:

  • question_answering.exs
  • information_extraction.exs
  • summarization.exs

Code Integration:

  • code_generation.exs
  • code_explanation.exs

By Difficulty Level

Beginner:

  1. tokenizer_example.exs
  2. spanish_example.exs
  3. translation_example.exs
  4. summarization.exs

Intermediate:

  1. hmm_pos_tagger_example.exs
  2. catalan_example.exs
  3. question_answering.exs
  4. text_classification.exs
  5. multilingual_pipeline.exs

Advanced:

  1. neural_pos_tagger_example.exs
  2. information_extraction.exs
  3. transformer_pos_example.exs
  4. advanced_neural_features.exs
  5. roundtrip_translation.exs

By Processing Time

Fast (<1 second):

  • tokenizer_example.exs
  • translation_example.exs
  • spanish_example.exs

Medium (1-10 seconds):

  • catalan_example.exs
  • multilingual_pipeline.exs
  • summarization.exs
  • question_answering.exs

Slow (>10 seconds):

  • hmm_pos_tagger_example.exs (if training)
  • neural_pos_tagger_example.exs
  • transformer_pos_example.exs
  • roundtrip_translation.exs

Running Multiple Examples

Run all basic examples:

for example in tokenizer_example spanish_example translation_example; do
  echo "Running ${example}..."
  elixir examples/${example}.exs
  echo "---"
done

Run all translation examples:

for example in translation_example roundtrip_translation multilingual_pipeline; do
  elixir examples/${example}.exs
done

Run all language-specific examples:

elixir examples/spanish_example.exs
elixir examples/catalan_example.exs
elixir examples/multilingual_pipeline.exs

Expected Output

Typical Output Format

Most examples output:

  1. Section headers: Clearly marked sections
  2. Input text: What's being processed
  3. Results: Parsed output, tags, entities, etc.
  4. Statistics: Counts, accuracy, timing
  5. Summary: Key takeaways

Example Output Snippet

========================================
Spanish Language Processing Demo
========================================

1. Tokenization
---------------
Input: El gato duerme en el sofá.

Tokens:
  El (1:1)
  gato (1:4)
  duerme (1:9)
  ...

2. POS Tagging
--------------
Tagged tokens:
  El  det
  gato  noun
  duerme  verb
  ...

Troubleshooting

Common Issues

Example won't run:

# Make sure dependencies are installed
mix deps.get
mix compile

# Check file permissions
chmod +x examples/example_name.exs

Missing models: Some examples (neural, transformer) require trained models. See TRAINING_NEURAL.md for training instructions.

Out of memory: Neural/transformer examples may need more memory. Reduce batch size or use smaller models.

Creating Your Own Examples

Template for new examples:

#!/usr/bin/env elixir

# Your Example Name
#
# Brief description of what this example demonstrates

Mix.install([
  {:nasty, path: Path.expand("..", __DIR__)}
])

alias Nasty.Language.English

IO.puts("\n========================================")
IO.puts("Your Example Title")
IO.puts("========================================\n")

# Example 1: First concept
IO.puts("1. First Section")
IO.puts("----------------")

# Your code here

# Example 2: Second concept
IO.puts("\n2. Second Section")
IO.puts("-----------------")

# Your code here

IO.puts("\n========================================")
IO.puts("Example Complete!")
IO.puts("========================================\n")

See Also