mix nasty.train.pcfg (Nasty v0.3.0)
Trains a PCFG (Probabilistic Context-Free Grammar) model from treebank data.
Usage
mix nasty.train.pcfg --corpus data/train.conllu --output priv/models/en/pcfg.model
Options
--corpus - Path to training corpus in CoNLL-U format (required)
--test - Path to test corpus for evaluation (optional)
--output - Path to save trained model (required)
--smoothing - Smoothing constant (default: 0.001)
--cnf - Convert grammar to CNF (default: true)
--language - Language code (default: en)
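To give a feel for what a smoothing constant typically does in PCFG training, here is a minimal sketch of additive (add-k) smoothing over rule counts. It is an illustration only, written in Python for brevity, and it is an assumption that Nasty applies `--smoothing` in a comparable way. The function name `estimate_rule_probs` and the `(lhs, rhs)` rule representation are hypothetical and not part of Nasty.

```python
from collections import Counter, defaultdict


def estimate_rule_probs(rules, smoothing=0.001):
    """Estimate P(lhs -> rhs) from observed rules with add-k smoothing.

    `rules` is an iterable of (lhs, rhs) pairs, where rhs is a tuple of
    symbols. Hypothetical helper for illustration; not Nasty's API.
    """
    counts = Counter(rules)

    # Total count per left-hand side, and the distinct expansions seen for it.
    lhs_totals = Counter()
    rhs_by_lhs = defaultdict(set)
    for (lhs, rhs), n in counts.items():
        lhs_totals[lhs] += n
        rhs_by_lhs[lhs].add(rhs)

    probs = {}
    for (lhs, rhs), n in counts.items():
        # Add k to every observed expansion of this lhs, then renormalize,
        # so the probabilities for each lhs still sum to 1.
        denom = lhs_totals[lhs] + smoothing * len(rhs_by_lhs[lhs])
        probs[(lhs, rhs)] = (n + smoothing) / denom
    return probs


if __name__ == "__main__":
    observed = [
        ("NP", ("DET", "NOUN")),
        ("NP", ("DET", "NOUN")),
        ("NP", ("PRON",)),
        ("S", ("NP", "VP")),
    ]
    for rule, p in sorted(estimate_rule_probs(observed).items()):
        print(rule, round(p, 6))
```

In this sketch a larger constant moves probability mass from frequent rules toward rare ones, while a constant of 0 reduces to plain relative-frequency estimation. Real trainers often also reserve mass for unseen rules or unknown words, which this sketch does not attempt.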
Examples
# Train basic PCFG
mix nasty.train.pcfg \
  --corpus data/en_ewt-ud-train.conllu \
  --output priv/models/en/pcfg.model

# Train with evaluation
mix nasty.train.pcfg \
  --corpus data/en_ewt-ud-train.conllu \
  --test data/en_ewt-ud-test.conllu \
  --output priv/models/en/pcfg.model \
  --smoothing 0.0001