Kreuzberg.Plugin.Registry (kreuzberg v4.4.5)


GenServer for managing Kreuzberg plugins.

This module provides a centralized registry for managing different types of plugins:

  • Post-processors: Transform extracted text content
  • Validators: Validate extraction configuration parameters
  • OCR backends: Provide OCR functionality

The registry maintains plugin metadata including module references, configuration, priorities, stages, and language support. All operations go through the GenServer process, which handles one message at a time, so concurrent callers always see a consistent state.

State Structure

The internal state is a map with four top-level keys:

  • :post_processors: Maps processor name to %{module: ..., config: ..., stage: ...}
  • :validators: Maps validator name to %{module: ..., priority: ...}
  • :sorted_validators: Pre-sorted list of validators by priority (descending) for performance
  • :ocr_backends: Maps backend name to %{module: ..., languages: ...}
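As a rough illustration of the shape described above, the following sketch builds such a map by hand. The plugin names and modules (`SizeCheck`, `MimeCheck`, `Normalize`, `MyOCR`) are hypothetical placeholders, and deriving `:sorted_validators` with `Enum.sort_by/3` is an assumption about how the cached list could be produced, not the registry's actual code.

```elixir
# Hypothetical validators, keyed by name; the modules do not need to exist here.
validators = %{
  size_check: %{module: SizeCheck, priority: 10},
  mime_check: %{module: MimeCheck, priority: 50}
}

state = %{
  post_processors: %{
    normalize: %{module: Normalize, config: %{enabled: true}, stage: :pre}
  },
  validators: validators,
  # Stored next to :validators so that reads do not have to sort on every call.
  sorted_validators:
    Enum.sort_by(validators, fn {_name, meta} -> meta.priority end, :desc),
  ocr_backends: %{
    my_ocr: %{module: MyOCR, languages: ["en", "de"]}
  }
}

sorted_names = Enum.map(state.sorted_validators, fn {name, _meta} -> name end)
IO.inspect(sorted_names)
# => [:mime_check, :size_check]
```

Keeping both the map and the sorted list trades a little extra work on registration for cheaper reads, which fits a registry that is written at startup and read on every extraction.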

Usage

Typically used during application startup to register available plugins:

{:ok, _pid} = Kreuzberg.Plugin.Registry.start_link([])
Kreuzberg.Plugin.Registry.register_post_processor(MyPostProcessor, %{enabled: true}, :pre)
Kreuzberg.Plugin.Registry.register_validator(MyValidator, priority: 10)
Kreuzberg.Plugin.Registry.register_ocr_backend(MyOCRBackend, languages: ["en", "de"])

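Summary

Functions

  • child_spec(init_arg): Returns a specification to start this module under a supervisor.
  • clear_ocr_backends(server \\ nil): Clear all registered OCR backends.
  • clear_post_processors(server \\ nil): Clear all registered post-processors.
  • clear_validators(server \\ nil): Clear all registered validators.
  • get_ocr_backend(name, server \\ nil): Get a specific OCR backend by name.
  • get_ocr_backends_by_language(language, server \\ nil): Get OCR backends that support a specific language.
  • get_post_processor(name, server \\ nil): Get a specific post-processor by name.
  • get_post_processors_by_stage(stage, server \\ nil): Get post-processors for a specific processing stage.
  • get_validator(name, server \\ nil): Get a specific validator by name.
  • get_validators_by_priority(server \\ nil): Get validators sorted by priority (highest first).
  • list_ocr_backends(server \\ nil): List all registered OCR backends.
  • list_post_processors(server \\ nil): List all registered post-processors.
  • list_validators(server \\ nil): List all registered validators.
  • register_ocr_backend(module, opts \\ nil, server \\ nil): Register an OCR backend plugin.
  • register_post_processor(name_or_module, config_or_module \\ nil, stage \\ nil, server \\ nil): Register a post-processor plugin.
  • register_validator(module, opts \\ nil, server \\ nil): Register a validator plugin.
  • start_link(opts \\ []): Start the registry GenServer.
  • unregister_ocr_backend(name, server \\ nil): Unregister an OCR backend plugin by name.
  • unregister_post_processor(name, server \\ nil): Unregister a post-processor plugin by name.
  • unregister_validator(name, server \\ nil): Unregister a validator plugin by name.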
Functions

child_spec(init_arg)

Returns a specification to start this module under a supervisor.

See Supervisor.
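To show what a child specification is used for, here is a minimal, self-contained sketch with a hypothetical `DemoRegistry` module standing in for the registry. It relies only on the fact that `use GenServer` defines `child_spec/1`; the module name, the empty initial state, and the `:name` option are assumptions made for the example.

```elixir
defmodule DemoRegistry do
  # Hypothetical stand-in for a registry; `use GenServer` defines child_spec/1.
  use GenServer

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, %{}, opts)
  end

  @impl true
  def init(state), do: {:ok, state}
end

# The {module, arg} tuple makes the supervisor call DemoRegistry.child_spec(arg),
# and the resulting spec starts the process through DemoRegistry.start_link(arg).
children = [{DemoRegistry, name: DemoRegistry}]
{:ok, sup} = Supervisor.start_link(children, strategy: :one_for_one)

IO.inspect(Supervisor.count_children(sup).active)
# => 1
```

Listing the module in a supervision tree this way means the process is restarted automatically if it crashes, at the cost of losing whatever was registered before the crash unless registration is repeated on startup.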

clear_ocr_backends(server \\ nil)

@spec clear_ocr_backends(GenServer.server() | nil) :: :ok

Clear all registered OCR backends.

Parameters

  • server - GenServer name/pid (optional)

Returns

  • :ok

Examples

Kreuzberg.Plugin.Registry.clear_ocr_backends()

clear_post_processors(server \\ nil)

@spec clear_post_processors(GenServer.server() | nil) :: :ok

Clear all registered post-processors.

Parameters

  • server - GenServer name/pid (optional)

Returns

  • :ok

Examples

Kreuzberg.Plugin.Registry.clear_post_processors()

clear_validators(server \\ nil)

@spec clear_validators(GenServer.server() | nil) :: :ok

Clear all registered validators.

Parameters

  • server - GenServer name/pid (optional)

Returns

  • :ok

Examples

Kreuzberg.Plugin.Registry.clear_validators()

get_ocr_backend(name, server \\ nil)

@spec get_ocr_backend(atom() | String.t(), GenServer.server() | nil) ::
  {:ok, map()} | {:error, String.t()}

Get a specific OCR backend by name.

Parameters

  • name - The OCR backend name
  • server - GenServer name/pid (optional)

Returns

  • {:ok, metadata} - OCR backend metadata
  • {:error, "Not found"} - OCR backend not found

Examples

{:ok, metadata} = Kreuzberg.Plugin.Registry.get_ocr_backend(:my_ocr)
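The strict match in the example above raises a `MatchError` when the name is not registered. Where a missing backend is an expected case, a `case` expression over both documented return shapes is usually safer. The sketch below is self-contained: `LookupSketch.fetch/2` is a hypothetical helper that only mimics the documented `{:ok, metadata} | {:error, "Not found"}` result over a plain map, and the backend names are invented.

```elixir
defmodule LookupSketch do
  # Hypothetical helper mirroring the documented return shape of the get_* functions.
  def fetch(backends, name) do
    case Map.fetch(backends, name) do
      {:ok, meta} -> {:ok, meta}
      :error -> {:error, "Not found"}
    end
  end

  # Falls back to an empty language list instead of crashing on a missing backend.
  def languages_or_default(backends, name) do
    case fetch(backends, name) do
      {:ok, %{languages: languages}} -> languages
      {:error, _reason} -> []
    end
  end
end

backends = %{my_ocr: %{module: MyOCR, languages: ["en", "de"]}}

IO.inspect(LookupSketch.languages_or_default(backends, :my_ocr))
IO.inspect(LookupSketch.languages_or_default(backends, :missing))
```

The same pattern applies to get_post_processor/2 and get_validator/2, which document the same two return shapes.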

get_ocr_backends_by_language(language, server \\ nil)

@spec get_ocr_backends_by_language(String.t(), GenServer.server() | nil) :: map()

Get OCR backends that support a specific language.

Uses a single-pass reduce, so the map of registered backends is traversed only once per call.

Parameters

  • language - The language code (string)
  • server - GenServer name/pid (optional)

Returns

A map of backend names to metadata for backends supporting the language

Examples

en_backends = Kreuzberg.Plugin.Registry.get_ocr_backends_by_language("en")
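For readers unfamiliar with the pattern, a single-pass filter over a backend map might look like the sketch below. It is an illustration under assumptions, not the library's implementation: `LanguageFilterSketch`, the backend names, and the placeholder modules are all hypothetical.

```elixir
defmodule LanguageFilterSketch do
  @doc "Keeps only the backends whose :languages list contains `language`."
  def by_language(backends, language) do
    # One traversal: each entry is either copied into the accumulator or skipped.
    Enum.reduce(backends, %{}, fn {name, meta}, acc ->
      if language in meta.languages do
        Map.put(acc, name, meta)
      else
        acc
      end
    end)
  end
end

# Hypothetical registered backends in the documented metadata shape.
backends = %{
  multi_lang: %{module: BackendA, languages: ["en", "de"]},
  french_only: %{module: BackendB, languages: ["fr"]}
}

en_backends = LanguageFilterSketch.by_language(backends, "en")
IO.inspect(Map.keys(en_backends))
# => [:multi_lang]
```

A language with no matching backend yields an empty map rather than an error, which matches the documented `map()` return type.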

get_post_processor(name, server \\ nil)

@spec get_post_processor(atom() | String.t(), GenServer.server() | nil) ::
  {:ok, map()} | {:error, String.t()}

Get a specific post-processor by name.

Parameters

  • name - The post-processor name
  • server - GenServer name/pid (optional)

Returns

  • {:ok, metadata} - Post-processor metadata
  • {:error, "Not found"} - Post-processor not found

Examples

{:ok, metadata} = Kreuzberg.Plugin.Registry.get_post_processor(:my_processor)

get_post_processors_by_stage(stage, server \\ nil)

@spec get_post_processors_by_stage(atom(), GenServer.server() | nil) :: map()

Get post-processors for a specific processing stage.

Parameters

  • stage - The processing stage to filter by (atom)
  • server - GenServer name/pid (optional)

Returns

A map of post-processor names to metadata for the specified stage

Examples

pre_processors = Kreuzberg.Plugin.Registry.get_post_processors_by_stage(:pre)

get_validator(name, server \\ nil)

@spec get_validator(atom() | String.t(), GenServer.server() | nil) ::
  {:ok, map()} | {:error, String.t()}

Get a specific validator by name.

Parameters

  • name - The validator name
  • server - GenServer name/pid (optional)

Returns

  • {:ok, metadata} - Validator metadata
  • {:error, "Not found"} - Validator not found

Examples

{:ok, metadata} = Kreuzberg.Plugin.Registry.get_validator(:my_validator)

get_validators_by_priority(server \\ nil)

@spec get_validators_by_priority(GenServer.server() | nil) :: [{atom(), map()}]

Get validators sorted by priority (highest first).

This is the primary way to retrieve validators for execution. The list is pre-calculated and cached in the state for performance.

Parameters

  • server - GenServer name/pid (optional)

Returns

A list of {name, metadata} tuples sorted by priority descending

Examples

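validators = Kreuzberg.Plugin.Registry.get_validators_by_priority()

# Validators arrive highest priority first, so they run in that order.
Enum.each(validators, fn {_name, metadata} ->
  # `[...]` stands for the argument list your validator expects.
  apply(metadata.module, :validate, [...])
end)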
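One common way to consume a priority-ordered list is to stop at the first failing validator. The sketch below shows that idea with `Enum.reduce_while/3`. Everything specific in it is an assumption: the two validator modules are hypothetical, and so is the `validate/1` contract returning `:ok` or `{:error, reason}`, since the source does not spell out the arguments or return value of `validate`.

```elixir
defmodule RequireLanguage do
  # Hypothetical validator: assumes validate/1 takes a config map.
  def validate(config) do
    if Map.has_key?(config, :language), do: :ok, else: {:error, "language is required"}
  end
end

defmodule RequirePositiveDpi do
  # Hypothetical validator: a missing :dpi is treated as valid.
  def validate(config) do
    if Map.get(config, :dpi, 1) > 0, do: :ok, else: {:error, "dpi must be positive"}
  end
end

# Same shape as the documented return value: {name, metadata}, highest priority first.
validators = [
  {:require_language, %{module: RequireLanguage, priority: 50}},
  {:require_positive_dpi, %{module: RequirePositiveDpi, priority: 10}}
]

run = fn config ->
  Enum.reduce_while(validators, :ok, fn {name, meta}, :ok ->
    case meta.module.validate(config) do
      :ok -> {:cont, :ok}
      # Halt on the first failure and report which validator rejected the config.
      {:error, reason} -> {:halt, {:error, name, reason}}
    end
  end)
end

IO.inspect(run.(%{language: "en", dpi: 300}))
IO.inspect(run.(%{dpi: 0}))
```

Because the list is already sorted, the highest-priority validator decides first; in the second call the language check fails before the DPI check is ever reached.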

list_ocr_backends(server \\ nil)

@spec list_ocr_backends(GenServer.server() | nil) :: map()

List all registered OCR backends.

Returns a map of OCR backend names to their metadata.

Parameters

  • server - GenServer name/pid (optional)

Returns

A map where keys are backend names and values are metadata maps containing:

  • :module - The OCR backend module
  • :languages - List of supported language codes

Examples

backends = Kreuzberg.Plugin.Registry.list_ocr_backends()

list_post_processors(server \\ nil)

@spec list_post_processors(GenServer.server() | nil) :: map()

List all registered post-processors.

Returns a map of post-processor names to their metadata.

Parameters

  • server - GenServer name/pid (optional)

Returns

A map where keys are processor names and values are metadata maps containing:

  • :module - The processor module
  • :config - The processor configuration
  • :stage - The processing stage

Examples

processors = Kreuzberg.Plugin.Registry.list_post_processors()
IO.inspect(processors)

list_validators(server \\ nil)

@spec list_validators(GenServer.server() | nil) :: map()

List all registered validators.

Returns a map of validator names to their metadata, not sorted. For sorted validators, use get_validators_by_priority/1.

Parameters

  • server - GenServer name/pid (optional)

Returns

A map where keys are validator names and values are metadata maps containing:

  • :module - The validator module
  • :priority - The validation priority

Examples

validators = Kreuzberg.Plugin.Registry.list_validators()

register_ocr_backend(module, opts \\ nil, server \\ nil)

@spec register_ocr_backend(module(), keyword() | nil, GenServer.server() | nil) ::
  :ok | {:error, String.t()}

Register an OCR backend plugin.

Parameters

  • module - The module implementing the OCR backend behavior
  • opts - Options keyword list with:
    • :languages - Supported language codes (list of strings, optional, defaults to [])
  • server - GenServer name/pid (optional)

Returns

  • :ok on success
  • {:error, reason} on failure

Examples

Kreuzberg.Plugin.Registry.register_ocr_backend(MyOCR)
Kreuzberg.Plugin.Registry.register_ocr_backend(MyOCR, languages: ["en", "de", "fr"])

register_post_processor(name_or_module, config_or_module \\ nil, stage \\ nil, server \\ nil)

@spec register_post_processor(
  atom() | module(),
  map() | module() | nil,
  atom() | nil,
  GenServer.server() | nil
) :: :ok | {:error, String.t()}

Register a post-processor plugin.

Parameters

  • module - The module implementing the post-processor behavior (passed as the name_or_module argument)
  • config - Configuration map for the post-processor (passed as the config_or_module argument; optional, defaults to %{})
  • stage - Processing stage (atom), e.g., :pre, :post, :cleanup (optional, defaults to :post)
  • server - GenServer name/pid (optional, defaults to default registry)

Returns

  • :ok on success
  • {:error, reason} on failure

Examples

Kreuzberg.Plugin.Registry.register_post_processor(MyProcessor)
Kreuzberg.Plugin.Registry.register_post_processor(MyProcessor, %{enabled: true}, :pre)

register_validator(module, opts \\ nil, server \\ nil)

@spec register_validator(module(), keyword() | nil, GenServer.server() | nil) ::
  :ok | {:error, String.t()}

Register a validator plugin.

Parameters

  • module - The module implementing the validator behavior
  • opts - Options keyword list with:
    • :priority - Validation priority (integer, higher runs first, optional, defaults to 0)
  • server - GenServer name/pid (optional)

Returns

  • :ok on success
  • {:error, reason} on failure

Examples

Kreuzberg.Plugin.Registry.register_validator(MyValidator)
Kreuzberg.Plugin.Registry.register_validator(MyValidator, priority: 10)

start_link(opts \\ [])

@spec start_link(keyword()) :: GenServer.on_start()

Start the registry GenServer.

Options are passed directly to GenServer.start_link/3.

Examples

{:ok, pid} = Kreuzberg.Plugin.Registry.start_link([])
{:ok, pid} = Kreuzberg.Plugin.Registry.start_link(name: :plugin_registry)

unregister_ocr_backend(name, server \\ nil)

@spec unregister_ocr_backend(atom() | String.t(), GenServer.server() | nil) :: :ok

Unregister an OCR backend plugin by name.

Parameters

  • name - The name of the OCR backend
  • server - GenServer name/pid (optional)

Returns

  • :ok

Examples

Kreuzberg.Plugin.Registry.unregister_ocr_backend(:my_ocr)

unregister_post_processor(name, server \\ nil)

@spec unregister_post_processor(atom() | String.t(), GenServer.server() | nil) :: :ok

Unregister a post-processor plugin by name.

Parameters

  • name - The name of the post-processor (atom or string)
  • server - GenServer name/pid (optional)

Returns

  • :ok

Examples

Kreuzberg.Plugin.Registry.unregister_post_processor(:my_processor)

unregister_validator(name, server \\ nil)

@spec unregister_validator(atom() | String.t(), GenServer.server() | nil) :: :ok

Unregister a validator plugin by name.

Parameters

  • name - The name of the validator
  • server - GenServer name/pid (optional)

Returns

  • :ok

Examples

Kreuzberg.Plugin.Registry.unregister_validator(:my_validator)