Nasty.Language.Spanish.Summarizer (Nasty v0.3.0)
Generates summaries of Spanish documents.
Delegates to generic extractive summarization with Spanish-specific configuration.
Extractive Summarization
Ranks sentences by importance using:
- TF-IDF term weights
- Position in document
- Named entity density
- Sentence length
- Spanish discourse markers
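As a rough illustration of how such signals might be combined, the sketch below scores pre-tokenized sentences with a weighted sum of three of the features listed above (a TF-IDF-style term weight, position, and length). The module name ScoreSketch, the weights 0.5/0.3/0.2, and the individual formulas are assumptions made for this example and are not taken from Nasty's implementation; named entity density and discourse markers are left out for brevity.

```elixir
defmodule ScoreSketch do
  # Hypothetical module: illustrates feature-based sentence ranking only.
  # Each sentence is a list of lowercase tokens. Returns {index, score}
  # pairs sorted from the highest score to the lowest.
  def rank(sentences) do
    n = length(sentences)

    # Document frequency: in how many sentences each term appears.
    df =
      sentences
      |> Enum.flat_map(&Enum.uniq/1)
      |> Enum.frequencies()

    sentences
    |> Enum.with_index()
    |> Enum.map(fn {tokens, i} ->
      # Weights 0.5 / 0.3 / 0.2 are assumed values for the sketch.
      score =
        0.5 * tfidf(tokens, df, n) +
          0.3 * position(i, n) +
          0.2 * length_score(tokens)

      {i, score}
    end)
    |> Enum.sort_by(fn {_i, score} -> score end, :desc)
  end

  # Average TF-IDF-style weight of the tokens in one sentence.
  defp tfidf([], _df, _n), do: 0.0

  defp tfidf(tokens, df, n) do
    total =
      tokens
      |> Enum.frequencies()
      |> Enum.reduce(0.0, fn {term, count}, acc ->
        acc + count * :math.log(1 + n / Map.fetch!(df, term))
      end)

    total / length(tokens)
  end

  # Earlier sentences score higher.
  defp position(i, n), do: 1.0 - i / n

  # Longer sentences score higher, capped at 10 tokens.
  defp length_score(tokens), do: min(length(tokens) / 10, 1.0)
end
```

Calling ScoreSketch.rank/1 with one token list per sentence returns the sentence indices ordered by score; when all other features tie, the position term favors earlier sentences.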
Spanish-Specific Features
- Stop words (el, la, de, en, y, etc.)
- Sentence boundaries (., !, ?, ;, ¿, ¡)
- Discourse markers (además, sin embargo, por lo tanto, en conclusión)
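The following sketch shows one way these Spanish-specific pieces could look in isolation. The module SpanishFeaturesSketch and its function names are hypothetical, the word lists contain only the examples listed above (the library's full lists are not shown in this page), and the regex-based splitting is an assumption rather than Nasty's actual tokenizer.

```elixir
defmodule SpanishFeaturesSketch do
  # Hypothetical module; lists contain only the examples from the text above.
  @stop_words ~w(el la de en y)
  @markers ["además", "sin embargo", "por lo tanto", "en conclusión"]

  # Splits after closing sentence punctuation (., !, ?, ;). Opening marks
  # (¿, ¡) stay attached to the sentence they introduce.
  def split_sentences(text) do
    String.split(text, ~r/(?<=[.!?;])\s+/u, trim: true)
  end

  # Lowercases, splits on non-letter characters, and drops stop words.
  def content_words(sentence) do
    sentence
    |> String.downcase()
    |> String.split(~r/[^\p{L}]+/u, trim: true)
    |> Enum.reject(&(&1 in @stop_words))
  end

  # True when the sentence contains one of the discourse markers.
  def discourse_marker?(sentence) do
    lowered = String.downcase(sentence)
    Enum.any?(@markers, &String.contains?(lowered, &1))
  end
end
```

For instance, SpanishFeaturesSketch.content_words("El gato es un animal.") drops "el" but keeps "es" and "un", because this sketch only knows the five stop words named above.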
Example
iex> doc = parse("El gato es un animal. Los gatos son carnívoros. Les gusta dormir.")
iex> {:ok, summary} = Summarizer.summarize(doc, ratio: 0.5)
{:ok, %Document{...}}
Functions
@spec summarize(Nasty.AST.Document.t(), keyword()) :: {:ok, Nasty.AST.Document.t()} | {:error, term()}
Generates an extractive summary of a Spanish document.
Delegates to the Spanish adapter which uses generic extractive summarization with Spanish-specific configuration (stop words, discourse markers, punctuation).
Options
- :ratio - Fraction of sentences to include (default: 0.3)
- :max_sentences - Maximum number of sentences (default: unlimited)
- :min_sentences - Minimum number of sentences (default: 1)
- :method - Selection method: :greedy (default) or :mmr
- :mmr_lambda - MMR lambda parameter (0.0-1.0), default 0.7
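To make the interaction of these options more concrete, here is a minimal sketch of how a target sentence count could be derived from :ratio, :min_sentences, and :max_sentences, and how MMR (maximal marginal relevance) selection differs from picking the top-scoring sentences. The module SelectionSketch, the use of round/1, the order of the clamps, the Jaccard similarity, and the candidate map shape are all assumptions for illustration; the library's actual behavior may differ.

```elixir
defmodule SelectionSketch do
  # Hypothetical module, not part of Nasty.
  # Number of sentences to keep. Rounding with round/1 and the order of
  # the clamps are assumptions, not documented behavior.
  def target_count(total, opts \\ []) do
    ratio = Keyword.get(opts, :ratio, 0.3)
    min_s = Keyword.get(opts, :min_sentences, 1)
    max_s = Keyword.get(opts, :max_sentences, total)

    count = round(total * ratio)
    count |> max(min_s) |> min(max_s) |> min(total)
  end

  # Greedy MMR over maps shaped like %{id: _, relevance: _, tokens: _}.
  # Each step picks the candidate that maximizes
  #   lambda * relevance - (1 - lambda) * max similarity to selected.
  def mmr(candidates, k, lambda \\ 0.7), do: do_mmr(candidates, k, lambda, [])

  defp do_mmr([], _k, _lambda, selected), do: Enum.reverse(selected)

  defp do_mmr(_candidates, k, _lambda, selected) when length(selected) >= k,
    do: Enum.reverse(selected)

  defp do_mmr(candidates, k, lambda, selected) do
    best =
      Enum.max_by(candidates, fn c ->
        redundancy =
          Enum.reduce(selected, 0.0, fn s, acc ->
            max(acc, jaccard(c.tokens, s.tokens))
          end)

        lambda * c.relevance - (1 - lambda) * redundancy
      end)

    do_mmr(List.delete(candidates, best), k, lambda, [best | selected])
  end

  # Jaccard similarity between two token lists (assumed similarity measure).
  defp jaccard(a, b) do
    set_a = MapSet.new(a)
    set_b = MapSet.new(b)
    union = MapSet.size(MapSet.union(set_a, set_b))

    if union == 0, do: 0.0, else: MapSet.size(MapSet.intersection(set_a, set_b)) / union
  end
end
```

In this sketch a higher lambda favors relevance and a lower lambda favors diversity: given two near-identical high-scoring sentences and one distinct lower-scoring sentence, MMR with lambda 0.7 keeps only one of the duplicates and then picks the distinct sentence.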
Examples
iex> {:ok, summary} = Summarizer.summarize(doc, ratio: 0.3)
iex> {:ok, summary} = Summarizer.summarize(doc, max_sentences: 3, method: :mmr)