LlmGuard.Stage (LlmGuard v0.3.1)
Pipeline stage for LLM security guardrails.
This module integrates LlmGuard with CrucibleIR pipelines by providing a
stage implementation that validates inputs and outputs according to
CrucibleIR.Reliability.Guardrail configuration.
Context Requirements
The stage expects the following in the context:
experiment.reliability.guardrails - CrucibleIR.Reliability.Guardrail struct
Either inputs or outputs - Content to validate (string or list of strings)
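The content shapes above can be illustrated with a minimal sketch. ContextContentSketch is a hypothetical helper, not part of LlmGuard. It assumes that a single string is treated like a one-element list and that :inputs is checked before :outputs; the stage itself may resolve these differently.

```elixir
defmodule ContextContentSketch do
  # Hypothetical helper, not part of LlmGuard.
  # Returns which key carries the content, plus the content as a list of strings.

  # Assumption: :inputs is checked first when both keys are present.
  def content(%{inputs: inputs}), do: {:inputs, List.wrap(inputs)}
  def content(%{outputs: outputs}), do: {:outputs, List.wrap(outputs)}

  # Neither key is present, so there is nothing to validate.
  def content(_context), do: :none
end

# A single string and a list of strings end up in the same shape.
IO.inspect(ContextContentSketch.content(%{inputs: "User input to validate"}))
IO.inspect(ContextContentSketch.content(%{outputs: ["first reply", "second reply"]}))
```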
Configuration
The stage reads guardrail configuration from the experiment and converts it to LlmGuard configuration. Supported guardrail options:
prompt_injection_detection - Enable prompt injection detection
jailbreak_detection - Enable jailbreak detection
pii_detection - Enable PII detection
pii_redaction - Enable PII redaction (implies pii_detection)
content_moderation - Enable content moderation
fail_on_detection - Return error on threat detection (vs warning)
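The rule that pii_redaction implies pii_detection can be sketched as follows. GuardrailOptionsSketch is a hypothetical module that works on plain maps rather than the real structs, and it assumes that any option left unset is disabled. It is not how from_ir_config/1 is implemented.

```elixir
defmodule GuardrailOptionsSketch do
  # Hypothetical helper, not part of LlmGuard: normalizes a plain map of the
  # documented guardrail options into a complete set of boolean flags.

  @flags [
    :prompt_injection_detection,
    :jailbreak_detection,
    :pii_detection,
    :pii_redaction,
    :content_moderation,
    :fail_on_detection
  ]

  def normalize(opts) when is_map(opts) do
    # Assumption: a flag that is missing or not exactly `true` counts as disabled.
    flags = Map.new(@flags, fn flag -> {flag, Map.get(opts, flag, false) == true} end)

    # Documented rule: enabling redaction also enables detection.
    if flags.pii_redaction do
      %{flags | pii_detection: true}
    else
      flags
    end
  end
end

normalized = GuardrailOptionsSketch.normalize(%{pii_redaction: true})
IO.inspect(normalized.pii_detection)
# => true
```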
Usage
# Create a guardrail configuration
guardrail = %CrucibleIR.Reliability.Guardrail{
  profiles: [:default],
  prompt_injection_detection: true,
  jailbreak_detection: true,
  pii_detection: true,
  pii_redaction: false,
  fail_on_detection: true
}

# Add to experiment context
context = %{
  experiment: %{
    reliability: %{
      guardrails: guardrail
    }
  },
  inputs: "User input to validate"
}

# Run the stage. With fail_on_detection: true, a detected threat
# yields {:error, reason} and this match will fail.
{:ok, updated_context} = LlmGuard.Stage.run(context)

# Check results
results = updated_context.guardrails
# => %{
#   status: :safe | :detected | :error,
#   validated_inputs: [...],
#   detections: [...],
#   ...
# }

Results
The stage adds a :guardrails key to the context with validation results:
status - Overall status (:safe, :detected, :error)
validated_inputs or validated_outputs - Sanitized content
detections - List of detected threats (if any)
errors - List of errors (if any)
config - LlmGuard config used for validation
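A downstream stage might branch on the status field like this. GuardrailResultSketch is a hypothetical helper, and the sample detections are placeholder atoms, because the shape of individual detection and error entries is not specified here.

```elixir
defmodule GuardrailResultSketch do
  # Hypothetical helper, not part of LlmGuard.
  # Takes the map stored under the :guardrails key and returns a short summary.

  def summarize(%{status: :safe}), do: "safe"

  def summarize(%{status: :detected} = results) do
    # Assumption: :detections may be absent, so default to an empty list.
    count = results |> Map.get(:detections, []) |> length()
    "detected #{count} threat(s)"
  end

  def summarize(%{status: :error} = results) do
    # Assumption: :errors may be absent, so default to an empty list.
    count = results |> Map.get(:errors, []) |> length()
    "failed with #{count} error(s)"
  end
end

# Placeholder entries; real detections are likely richer terms.
sample = %{status: :detected, detections: [:placeholder_a, :placeholder_b]}
IO.puts(GuardrailResultSketch.summarize(sample))
# prints "detected 2 threat(s)"
```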
Error Handling
If fail_on_detection is true, the stage returns {:error, reason} when
threats are detected. Otherwise, it returns {:ok, context} with detection
details in context.guardrails.
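A caller can cover both return shapes with pattern matching. The sketch below uses a hypothetical StageOutcomeSketch module that receives the value returned by LlmGuard.Stage.run/2. The :continue, :continue_with_warning, and :halt tags are illustrative names, not part of any pipeline contract, and the reason term is treated as opaque.

```elixir
defmodule StageOutcomeSketch do
  # Hypothetical helper, not part of LlmGuard.
  # Maps the stage's return value to an illustrative pipeline decision.

  # Threats were found but fail_on_detection was off: keep going, surface a warning.
  def handle({:ok, %{guardrails: %{status: :detected, detections: detections}} = context}) do
    {:continue_with_warning, context, detections}
  end

  # Any other successful result is passed through unchanged.
  def handle({:ok, context}), do: {:continue, context}

  # Validation error, or a threat with fail_on_detection enabled.
  def handle({:error, reason}), do: {:halt, reason}
end

# Stand-in values shaped like the documented results, not real stage output.
safe_context = %{guardrails: %{status: :safe, detections: []}}
IO.inspect(StageOutcomeSketch.handle({:ok, safe_context}))
IO.inspect(StageOutcomeSketch.handle({:error, :placeholder_reason}))
```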
Functions
@spec describe(stage_opts()) :: map()
Describes the stage for pipeline introspection.
Returns a description of what this stage does and its configuration.
Parameters
opts - Stage options (currently unused)
Returns
A map describing the stage.
Examples
iex> LlmGuard.Stage.describe()
%{
  name: "LlmGuard Security Stage",
  description: "Validates inputs/outputs for security threats",
  type: :security
}
@spec from_ir_config(struct()) :: LlmGuard.Config.t()
Converts CrucibleIR Guardrail configuration to LlmGuard Config.
Maps CrucibleIR guardrail settings to LlmGuard's configuration format.
Parameters
guardrail - CrucibleIR.Reliability.Guardrail struct
Returns
LlmGuard.Config struct
Examples
iex> guardrail = %CrucibleIR.Reliability.Guardrail{
...>   prompt_injection_detection: true,
...>   pii_detection: true
...> }
iex> config = LlmGuard.Stage.from_ir_config(guardrail)
iex> config.prompt_injection_detection
true
@spec run(context(), stage_opts()) :: stage_result()
Runs security checks on inputs or outputs.
Expects context with:
experiment.reliability.guardrails - Guardrail configuration
inputs or outputs - Content to validate
Returns the updated context with :guardrails results, or an error if
fail_on_detection is enabled and threats are detected.
Parameters
context - Pipeline context map
opts - Stage options (currently unused, for future extensibility)
Returns
{:ok, updated_context} - Validation completed, results in context.guardrails
{:error, reason} - Validation error or threat detected (if fail_on_detection: true)
Examples
iex> guardrail = %CrucibleIR.Reliability.Guardrail{
...>   prompt_injection_detection: true,
...>   fail_on_detection: false
...> }
iex> context = %{
...>   experiment: %{reliability: %{guardrails: guardrail}},
...>   inputs: "Safe message"
...> }
iex> {:ok, result} = LlmGuard.Stage.run(context)
iex> result.guardrails.status
:safe