# `Tribunal.Judge`
[🔗](https://github.com/georgeguimaraes/tribunal/blob/v1.3.6/lib/tribunal/judge.ex#L1)

Behaviour for LLM-as-judge assertions.

All judges, built-in and custom, implement this behaviour, which provides
a consistent interface for defining evaluation criteria.

## Example

    defmodule MyApp.Judges.BrandVoice do
      @behaviour Tribunal.Judge

      @impl true
      def name, do: :brand_voice

      @impl true
      def prompt(test_case, _opts) do
        """
        Evaluate if the response matches our brand voice guidelines:

        - Friendly but professional tone
        - No jargon or technical terms
        - Empathetic and helpful

        Response to evaluate:
        #{test_case.actual_output}

        Query: #{test_case.input}
        """
      end
    end

## Configuration

Register your custom judges in config:

    config :tribunal, :custom_judges, [
      MyApp.Judges.BrandVoice,
      MyApp.Judges.Compliance
    ]

Then use them like built-in assertions:

    assert_judge :brand_voice, response, query: input

# `evaluate_result`
*optional* 

```elixir
@callback evaluate_result(result :: map(), opts :: keyword()) ::
  {:pass, map()} | {:fail, map()}
```

Optional: customize how the LLM result is interpreted.

By default, the result is interpreted using the standard verdict and threshold logic; override this callback to implement custom pass/fail rules.

Should return `{:pass, details}` or `{:fail, details}`.
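
For example, a judge might pass only when the confidence score clears a threshold, regardless of verdict. A minimal sketch (the module name is hypothetical, and the string keys `"score"` and `"reason"` assume the JSON contract described under `prompt/2`):

```elixir
defmodule MyApp.Judges.StrictScore do
  @behaviour Tribunal.Judge

  @impl true
  def name, do: :strict_score

  @impl true
  def prompt(test_case, _opts) do
    "Evaluate the response:\n#{test_case.actual_output}"
  end

  # Pass only when the judge's confidence score clears the threshold,
  # ignoring the verdict entirely.
  @impl true
  def evaluate_result(result, opts) do
    threshold = Keyword.get(opts, :threshold, 0.9)

    if Map.get(result, "score", 0.0) >= threshold do
      {:pass, result}
    else
      {:fail, Map.put(result, "reason", "score below threshold #{threshold}")}
    end
  end
end
```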

# `name`

```elixir
@callback name() :: atom()
```

Returns the atom name for this judge.

This name is used to invoke the judge in assertions:

    assert_judge :my_judge_name, response, opts

# `negative_metric?`
*optional* 

```elixir
@callback negative_metric?() :: boolean()
```

Optional: whether "no" verdict means pass (for negative metrics like toxicity).

When true, verdict "no" = pass and "yes" = fail.
When false (default), verdict "yes" = pass and "no" = fail.
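
A toxicity-style judge illustrates this inversion; the module name and prompt wording below are illustrative, not part of the library:

```elixir
defmodule MyApp.Judges.Toxicity do
  @behaviour Tribunal.Judge

  @impl true
  def name, do: :toxicity

  # Toxicity is a negative metric: a "no" verdict (response is not
  # toxic) should count as a pass.
  @impl true
  def negative_metric?, do: true

  @impl true
  def prompt(test_case, _opts) do
    """
    Does the following response contain toxic language?

    Response to evaluate:
    #{test_case.actual_output}
    """
  end
end
```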

# `prompt`

```elixir
@callback prompt(test_case :: Tribunal.TestCase.t(), opts :: keyword()) :: String.t()
```

Builds the evaluation prompt for the LLM judge.

Receives the test case and any options passed to the assertion.
Should return a prompt string that asks the LLM to evaluate
the response and return a JSON verdict.

The prompt should instruct the LLM to return JSON with:
- `verdict`: "yes", "no", or "partial"
- `reason`: explanation for the verdict
- `score`: confidence score 0.0-1.0
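
A `prompt/2` that spells out this JSON contract might look like the following sketch (the judge name and wording are illustrative):

```elixir
defmodule MyApp.Judges.Relevance do
  @behaviour Tribunal.Judge

  @impl true
  def name, do: :relevance

  @impl true
  def prompt(test_case, _opts) do
    """
    Evaluate whether the response is relevant to the query.

    Query: #{test_case.input}

    Response:
    #{test_case.actual_output}

    Reply with JSON only, in exactly this shape:
    {"verdict": "yes" | "no" | "partial", "reason": "<why>", "score": <0.0-1.0>}
    """
  end
end
```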

# `validate`
*optional* 

```elixir
@callback validate(test_case :: Tribunal.TestCase.t()) :: :ok | {:error, String.t()}
```

Optional: validate that the test case has required fields.

Return `:ok` if valid, or `{:error, reason}` if not.
Default implementation always returns `:ok`.
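
For instance, a judge that cannot evaluate empty output could reject such test cases up front. A minimal sketch (hypothetical module; assumes the test case exposes `actual_output`, as in the example at the top of this page):

```elixir
defmodule MyApp.Judges.RequiresOutput do
  @behaviour Tribunal.Judge

  @impl true
  def name, do: :requires_output

  @impl true
  def prompt(test_case, _opts), do: "Evaluate: #{test_case.actual_output}"

  # Reject test cases that have no output to judge.
  @impl true
  def validate(test_case) do
    case test_case.actual_output do
      output when output in [nil, ""] -> {:error, "actual_output is required"}
      _output -> :ok
    end
  end
end
```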

# `all_judge_names`

Returns a list of all judge names (built-in + custom).

# `all_judges`

Returns all judge modules (built-in + custom).

# `builtin_judge?`

Checks if a name is a built-in judge.

# `builtin_judge_names`

Returns a list of built-in judge names.

# `builtin_judges`

Returns all built-in judge modules.

# `custom_judge?`

Checks if a name is a registered custom judge.

# `custom_judge_names`

Returns a list of custom judge names.

# `custom_judges`

Returns all configured custom judge modules.

# `find`

Finds a judge module by name.

Returns `{:ok, module}` or `:error`.

---

*Consult [api-reference.md](api-reference.md) for complete listing*
