GenServer for managing Kreuzberg plugins.
This module provides a centralized registry for managing different types of plugins:
- Post-processors: Transform extracted text content
- Validators: Validate extraction configuration parameters
- OCR backends: Provide OCR functionality
The registry maintains plugin metadata including module references, configuration, priorities, stages, and language support. All operations go through the GenServer interface, so they are serialized in a single process and are safe to call from concurrent processes.
State Structure
The internal state is a map with four top-level keys:
- :post_processors: Maps processor name to %{module: ..., config: ..., stage: ...}
- :validators: Maps validator name to %{module: ..., priority: ...}
- :sorted_validators: Pre-sorted list of validators by priority (descending), cached for performance
- :ocr_backends: Maps backend name to %{module: ..., languages: ...}
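As a rough illustration of the four keys described above, the state could look like the map below. This is a sketch only: the plugin names (:cleanup, :strict, :lenient, :my_ocr) and module atoms are invented for the example, and how the registry derives a plugin's name from its module is not covered here.

```elixir
# Illustrative state only; plugin names and module atoms are made up.
validators = %{
  strict: %{module: MyStrictValidator, priority: 10},
  lenient: %{module: MyLenientValidator, priority: 0}
}

state = %{
  post_processors: %{
    cleanup: %{module: MyPostProcessor, config: %{enabled: true}, stage: :pre}
  },
  validators: validators,
  # Cached list of {name, metadata} tuples, highest priority first.
  sorted_validators:
    Enum.sort_by(validators, fn {_name, meta} -> meta.priority end, :desc),
  ocr_backends: %{
    my_ocr: %{module: MyOCRBackend, languages: ["en", "de"]}
  }
}
```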
Usage
Typically used during application startup to register available plugins:
{:ok, _pid} = Kreuzberg.Plugin.Registry.start_link([])
Kreuzberg.Plugin.Registry.register_post_processor(MyPostProcessor, %{enabled: true}, :pre)
Kreuzberg.Plugin.Registry.register_validator(MyValidator, priority: 10)
Kreuzberg.Plugin.Registry.register_ocr_backend(MyOCRBackend, languages: ["en", "de"])
Functions
Returns a specification to start this module under a supervisor.
See Supervisor.
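For readers unfamiliar with child specifications, the following sketch shows how a GenServer with a start_link/1 function is typically placed under a supervisor. DemoSupervisedRegistry and the name :demo_registry are hypothetical stand-ins so the example runs on its own; they are not part of Kreuzberg.

```elixir
defmodule DemoSupervisedRegistry do
  # Hypothetical stand-in; `use GenServer` generates a default child_spec/1.
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, %{}, opts)

  @impl true
  def init(state), do: {:ok, state}
end

# The {module, arg} tuple makes the supervisor call module.child_spec(arg),
# whose default start entry is {module, :start_link, [arg]}.
children = [{DemoSupervisedRegistry, [name: :demo_registry]}]
{:ok, _sup} = Supervisor.start_link(children, strategy: :one_for_one)
```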
@spec clear_ocr_backends(GenServer.server() | nil) :: :ok
Clear all registered OCR backends.
Parameters
- server - GenServer name/pid (optional)
Returns
:ok
Examples
Kreuzberg.Plugin.Registry.clear_ocr_backends()
@spec clear_post_processors(GenServer.server() | nil) :: :ok
Clear all registered post-processors.
Parameters
- server - GenServer name/pid (optional)
Returns
:ok
Examples
Kreuzberg.Plugin.Registry.clear_post_processors()
@spec clear_validators(GenServer.server() | nil) :: :ok
Clear all registered validators.
Parameters
- server - GenServer name/pid (optional)
Returns
:ok
Examples
Kreuzberg.Plugin.Registry.clear_validators()
@spec get_ocr_backend(atom() | String.t(), GenServer.server() | nil) :: {:ok, map()} | {:error, String.t()}
Get a specific OCR backend by name.
Parameters
- name - The OCR backend name
- server - GenServer name/pid (optional)
Returns
- {:ok, metadata} - OCR backend metadata
- {:error, "Not found"} - OCR backend not found
Examples
{:ok, metadata} = Kreuzberg.Plugin.Registry.get_ocr_backend(:my_ocr)
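The return shape described above can be pictured as a map lookup that converts a missing key into an error tuple. The sketch below is an assumption about how such a lookup might be written, not the registry's actual code; the lookup function and the sample plugins map are hypothetical.

```elixir
# Hypothetical lookup helper returning the documented tuple shapes.
lookup = fn plugins, name ->
  case Map.fetch(plugins, name) do
    {:ok, metadata} -> {:ok, metadata}
    :error -> {:error, "Not found"}
  end
end

# Sample data in the shape described for OCR backends.
plugins = %{my_ocr: %{module: MyOCRBackend, languages: ["en"]}}

found = lookup.(plugins, :my_ocr)
missing = lookup.(plugins, :unknown)
```

Callers can then match on both outcomes with a case expression instead of asserting {:ok, metadata} directly, which raises a MatchError when the plugin is absent.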
@spec get_ocr_backends_by_language(String.t(), GenServer.server() | nil) :: map()
Get OCR backends that support a specific language.
Filters the registered backends in a single pass with a reduce, so no intermediate collection is built.
Parameters
- language - The language code (string)
- server - GenServer name/pid (optional)
Returns
A map of backend names to metadata for backends supporting the language
Examples
en_backends = Kreuzberg.Plugin.Registry.get_ocr_backends_by_language("en")
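A single-pass reduce over backend metadata might look like the sketch below. It assumes the %{module: ..., languages: ...} metadata shape described for OCR backends; the by_language function, backend names, and module atoms are invented for illustration and are not the library's implementation.

```elixir
# Hypothetical backend metadata in the documented shape.
backends = %{
  backend_a: %{module: MyBackendA, languages: ["en", "de"]},
  backend_b: %{module: MyBackendB, languages: ["zh"]},
  backend_c: %{module: MyBackendC, languages: ["en"]}
}

# One pass: keep an entry only if its language list contains the code.
by_language = fn all_backends, language ->
  Enum.reduce(all_backends, %{}, fn {name, meta}, acc ->
    if language in meta.languages, do: Map.put(acc, name, meta), else: acc
  end)
end

en_only = by_language.(backends, "en")
none = by_language.(backends, "fr")
```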
@spec get_post_processor(atom() | String.t(), GenServer.server() | nil) :: {:ok, map()} | {:error, String.t()}
Get a specific post-processor by name.
Parameters
- name - The post-processor name
- server - GenServer name/pid (optional)
Returns
- {:ok, metadata} - Post-processor metadata
- {:error, "Not found"} - Post-processor not found
Examples
{:ok, metadata} = Kreuzberg.Plugin.Registry.get_post_processor(:my_processor)
@spec get_post_processors_by_stage(atom(), GenServer.server() | nil) :: map()
Get post-processors for a specific processing stage.
Parameters
- stage - The processing stage to filter by (atom)
- server - GenServer name/pid (optional)
Returns
A map of post-processor names to metadata for the specified stage
Examples
pre_processors = Kreuzberg.Plugin.Registry.get_post_processors_by_stage(:pre)
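Filtering by stage amounts to keeping the entries whose :stage matches the requested atom. The sketch below assumes the %{module: ..., config: ..., stage: ...} metadata shape; the by_stage function and the sample processors are hypothetical.

```elixir
# Hypothetical post-processor metadata in the documented shape.
processors = %{
  normalize: %{module: MyNormalizer, config: %{}, stage: :pre},
  redact: %{module: MyRedactor, config: %{enabled: true}, stage: :post}
}

# Keep only entries registered for the requested stage.
by_stage = fn all_processors, stage ->
  all_processors
  |> Enum.filter(fn {_name, meta} -> meta.stage == stage end)
  |> Map.new()
end

pre_only = by_stage.(processors, :pre)
cleanup_only = by_stage.(processors, :cleanup)
```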
@spec get_validator(atom() | String.t(), GenServer.server() | nil) :: {:ok, map()} | {:error, String.t()}
Get a specific validator by name.
Parameters
- name - The validator name
- server - GenServer name/pid (optional)
Returns
- {:ok, metadata} - Validator metadata
- {:error, "Not found"} - Validator not found
Examples
{:ok, metadata} = Kreuzberg.Plugin.Registry.get_validator(:my_validator)
@spec get_validators_by_priority(GenServer.server() | nil) :: [{atom(), map()}]
Get validators sorted by priority (highest first).
This is the primary way to retrieve validators for execution. The list is pre-calculated and cached in the state for performance.
Parameters
- server - GenServer name/pid (optional)
Returns
A list of {name, metadata} tuples sorted by priority descending
Examples
validators = Kreuzberg.Plugin.Registry.get_validators_by_priority()
Enum.each(validators, fn {_name, metadata} ->
  apply(metadata.module, :validate, [...])
end)
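One way to keep such a pre-sorted list is to rebuild it whenever a validator is added, so reads never sort. The sketch below illustrates that idea under the state shape described earlier; DemoValidatorCache, its put/4 function, and the validator names are hypothetical and not taken from the library.

```elixir
defmodule DemoValidatorCache do
  # Hypothetical helper: updates the map and refreshes the cached sorted list.
  def put(state, name, module, priority) do
    validators = Map.put(state.validators, name, %{module: module, priority: priority})
    sorted = Enum.sort_by(validators, fn {_name, meta} -> meta.priority end, :desc)
    %{state | validators: validators, sorted_validators: sorted}
  end
end

cache = %{validators: %{}, sorted_validators: []}
cache = DemoValidatorCache.put(cache, :low, MyLowValidator, 1)
cache = DemoValidatorCache.put(cache, :high, MyHighValidator, 10)

# Reads use the cached list directly; no sorting happens here.
ordered_names = Enum.map(cache.sorted_validators, fn {name, _meta} -> name end)
```

The trade-off is a sort on every write in exchange for constant-time reads, which suits a registry that is populated at startup and read on every extraction.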
@spec list_ocr_backends(GenServer.server() | nil) :: map()
List all registered OCR backends.
Returns a map of OCR backend names to their metadata.
Parameters
- server - GenServer name/pid (optional)
Returns
A map where keys are backend names and values are metadata maps containing:
- :module - The OCR backend module
- :languages - List of supported language codes
Examples
backends = Kreuzberg.Plugin.Registry.list_ocr_backends()
@spec list_post_processors(GenServer.server() | nil) :: map()
List all registered post-processors.
Returns a map of post-processor names to their metadata.
Parameters
- server - GenServer name/pid (optional)
Returns
A map where keys are processor names and values are metadata maps containing:
- :module - The processor module
- :config - The processor configuration
- :stage - The processing stage
Examples
processors = Kreuzberg.Plugin.Registry.list_post_processors()
IO.inspect(processors)
@spec list_validators(GenServer.server() | nil) :: map()
List all registered validators.
Returns an unsorted map of validator names to their metadata.
For validators ordered by priority, use get_validators_by_priority/1.
Parameters
- server - GenServer name/pid (optional)
Returns
A map where keys are validator names and values are metadata maps containing:
- :module - The validator module
- :priority - The validation priority
Examples
validators = Kreuzberg.Plugin.Registry.list_validators()
@spec register_ocr_backend(module(), keyword() | nil, GenServer.server() | nil) :: :ok | {:error, String.t()}
Register an OCR backend plugin.
Parameters
- module - The module implementing the OCR backend behavior
- opts - Options keyword list with:
  - :languages - Supported language codes (list of strings, optional, defaults to [])
- server - GenServer name/pid (optional)
Returns
- :ok on success
- {:error, reason} on failure
Examples
Kreuzberg.Plugin.Registry.register_ocr_backend(MyOCR)
Kreuzberg.Plugin.Registry.register_ocr_backend(MyOCR, languages: ["en", "de", "fr"])
@spec register_post_processor(atom() | module(), map() | module() | nil, atom() | nil, GenServer.server() | nil) :: :ok | {:error, String.t()}
Register a post-processor plugin.
Parameters
- module - The module implementing the post-processor behavior
- config - Configuration map for the post-processor (optional, defaults to %{})
- stage - Processing stage (atom), e.g., :pre, :post, :cleanup (optional, defaults to :post)
- server - GenServer name/pid (optional, defaults to default registry)
Returns
- :ok on success
- {:error, reason} on failure
Examples
Kreuzberg.Plugin.Registry.register_post_processor(MyProcessor)
Kreuzberg.Plugin.Registry.register_post_processor(MyProcessor, %{enabled: true}, :pre)
@spec register_validator(module(), keyword() | nil, GenServer.server() | nil) :: :ok | {:error, String.t()}
Register a validator plugin.
Parameters
- module - The module implementing the validator behavior
- opts - Options keyword list with:
  - :priority - Validation priority (integer, higher runs first, optional, defaults to 0)
- server - GenServer name/pid (optional)
Returns
- :ok on success
- {:error, reason} on failure
Examples
Kreuzberg.Plugin.Registry.register_validator(MyValidator)
Kreuzberg.Plugin.Registry.register_validator(MyValidator, priority: 10)
@spec start_link(keyword()) :: GenServer.on_start()
Start the registry GenServer.
Options are passed directly to GenServer.start_link/3.
Examples
{:ok, pid} = Kreuzberg.Plugin.Registry.start_link([])
{:ok, pid} = Kreuzberg.Plugin.Registry.start_link(name: :plugin_registry)
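Because options are forwarded to GenServer.start_link/3, a :name option registers the process under that name, and the name can then be passed wherever a server argument is accepted. The sketch below demonstrates the general pattern with a hypothetical DemoNamedRegistry module, showing that two named instances hold independent state; it does not reproduce the library's internals.

```elixir
defmodule DemoNamedRegistry do
  # Hypothetical minimal registry used only to illustrate named servers.
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, %{}, opts)
  def put(server, key, value), do: GenServer.call(server, {:put, key, value})
  def list(server), do: GenServer.call(server, :list)

  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_call({:put, key, value}, _from, state) do
    {:reply, :ok, Map.put(state, key, value)}
  end

  def handle_call(:list, _from, state), do: {:reply, state, state}
end

{:ok, _pid_a} = DemoNamedRegistry.start_link(name: :registry_a)
{:ok, _pid_b} = DemoNamedRegistry.start_link(name: :registry_b)

# Each call is handled one at a time by the addressed process.
:ok = DemoNamedRegistry.put(:registry_a, :my_ocr, %{languages: ["en"]})
a_entries = DemoNamedRegistry.list(:registry_a)
b_entries = DemoNamedRegistry.list(:registry_b)
```

Separate named instances are mainly useful in tests, where each test can start its own registry without touching shared state.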
@spec unregister_ocr_backend(atom() | String.t(), GenServer.server() | nil) :: :ok
Unregister an OCR backend plugin by name.
Parameters
- name - The name of the OCR backend
- server - GenServer name/pid (optional)
Returns
:ok
Examples
Kreuzberg.Plugin.Registry.unregister_ocr_backend(:my_ocr)
@spec unregister_post_processor(atom() | String.t(), GenServer.server() | nil) :: :ok
Unregister a post-processor plugin by name.
Parameters
- name - The name of the post-processor (atom or string)
- server - GenServer name/pid (optional)
Returns
:ok
Examples
Kreuzberg.Plugin.Registry.unregister_post_processor(:my_processor)
@spec unregister_validator(atom() | String.t(), GenServer.server() | nil) :: :ok
Unregister a validator plugin by name.
Parameters
- name - The name of the validator
- server - GenServer name/pid (optional)
Returns
:ok
Examples
Kreuzberg.Plugin.Registry.unregister_validator(:my_validator)