Nous.Plugins.InputGuard (nous v0.13.3)
Modular malicious input classifier plugin.
InputGuard detects prompt injection, jailbreak attempts, and other malicious inputs using a composable strategy pattern. Detection backends, aggregation modes, and policy actions are all configurable.
Architecture
    User Input → InputGuard (before_request hook)
                    ├─ Strategy 1: Pattern matching
                    ├─ Strategy 2: LLM Judge
                    ├─ Strategy N: Custom function
                    ↓
              Aggregator (any / majority / all)
                    ↓
              Policy (block / warn / log / callback)
                    ↓
              Modified Context (or halted execution)

Configuration
Store configuration in deps under the :input_guard_config key:
    agent = Nous.new("openai:gpt-4",
      plugins: [Nous.Plugins.InputGuard]
    )

    {:ok, result} = Nous.run(agent, "Hello",
      deps: %{
        input_guard_config: %{
          strategies: [
            {Nous.Plugins.InputGuard.Strategies.Pattern, []},
            {Nous.Plugins.InputGuard.Strategies.LLMJudge, model: "openai:gpt-4o-mini"},
            {MyApp.InputGuard.Blocklist, words: ["hack", "exploit"]}
          ],
          policy: %{suspicious: :warn, blocked: :block},
          aggregation: :any,
          short_circuit: false,
          on_violation: &MyApp.log_violation/1,
          skip_empty: true
        }
      }
    )

Configuration Options
:strategies - List of {module, keyword_opts} tuples. Each module must implement Nous.Plugins.InputGuard.Strategy. Default: [{Strategies.Pattern, []}]
:policy - Map of severity to action. Default: %{suspicious: :warn, blocked: :block}
:aggregation - How to combine results from multiple strategies. :any (default) flags if any strategy flags, :majority if more than half flag, :all only if every strategy flags.
:short_circuit - When true, stops running strategies on the first :blocked result. Default: false
:on_violation - Optional callback function fn result -> ... end, called when input is flagged.
:skip_empty - Skip checking empty or whitespace-only messages. Default: true
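
The three :aggregation modes can be illustrated with a small standalone sketch. This is not InputGuard's actual implementation: AggregationSketch and flagged?/2 are hypothetical names, each strategy result is reduced to a plain boolean (true means the strategy flagged the input), and treating an empty strategy list as "not flagged" is an assumption made here.

```elixir
defmodule AggregationSketch do
  @moduledoc false
  # Hypothetical helper, NOT part of Nous. Each verdict is a boolean:
  # true means that strategy flagged the input.

  @spec flagged?([boolean()], :any | :majority | :all) :: boolean()
  # Assumption: with no verdicts there is nothing to flag. This clause also
  # avoids Enum.all?([]) returning true for the :all mode.
  def flagged?([], _mode), do: false

  # :any flags as soon as at least one strategy flagged.
  def flagged?(verdicts, :any), do: Enum.any?(verdicts)

  # :majority flags only when strictly more than half of the strategies flagged.
  def flagged?(verdicts, :majority) do
    flags = Enum.count(verdicts, & &1)
    flags * 2 > length(verdicts)
  end

  # :all flags only when every strategy flagged.
  def flagged?(verdicts, :all), do: Enum.all?(verdicts)
end

AggregationSketch.flagged?([true, false, false], :any)      # true
AggregationSketch.flagged?([true, false, false], :majority) # false
AggregationSketch.flagged?([true, true, false], :majority)  # true
AggregationSketch.flagged?([true, true, false], :all)       # false
```

Under this reading, :any is the most sensitive mode and :all the most conservative; with two strategies, :majority behaves like :all because one flag out of two is not more than half.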
Streaming Limitation
InputGuard operates via the before_request plugin hook, which is not invoked
during run_stream in AgentRunner. When streaming, InputGuard is therefore never
applied, so validate the input yourself before calling run_stream if needed.
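
One way to validate input before streaming is a small pre-check wrapped around the call that starts the stream. The sketch below is an assumption-laden illustration, not part of Nous: StreamPrecheckSketch, check/1, and guarded_stream/2 are hypothetical names, the two regular expressions are illustrative only and are not InputGuard's built-in patterns, and the streaming call itself is passed in as a function so the example runs without the library.

```elixir
defmodule StreamPrecheckSketch do
  @moduledoc false
  # Hypothetical pre-check, NOT part of Nous.

  # Illustrative patterns only; a real deployment would need a broader set.
  defp suspicious_patterns do
    [
      ~r/ignore (all )?(previous|prior) instructions/i,
      ~r/reveal (your )?system prompt/i
    ]
  end

  @spec check(String.t()) :: :ok | {:error, :suspicious_input}
  def check(input) when is_binary(input) do
    if Enum.any?(suspicious_patterns(), &Regex.match?(&1, input)) do
      {:error, :suspicious_input}
    else
      :ok
    end
  end

  # `stream_fun` stands in for whatever starts the stream in your application.
  # It is only called when the input passes the check.
  @spec guarded_stream(String.t(), (String.t() -> term())) ::
          term() | {:error, :suspicious_input}
  def guarded_stream(input, stream_fun) when is_function(stream_fun, 1) do
    with :ok <- check(input), do: stream_fun.(input)
  end
end

# Usage with a stub in place of the real streaming call:
StreamPrecheckSketch.guarded_stream("Hello", fn text -> {:streaming, text} end)
```

In an application, the function passed to guarded_stream/2 would wrap the run_stream call, so flagged input never reaches the model.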