Puck.Compaction.Summarize (Puck v0.2.11)

LLM-based summarization compaction strategy.

This strategy uses an LLM to summarize older conversation history while preserving the most recent messages verbatim. It is similar to Claude Code's /compact command.

The strategy supports both the ReqLLM and BAML backends transparently:

  • ReqLLM: Pass a :client option with a Puck.Client
  • BAML: Pass a :client_registry option to use Puck's built-in BAML function
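
As a rough illustration of how a config map could select the backend, the sketch below dispatches on which key is present. The module name `BackendSketch`, the function `backend/1`, the returned atoms, and the choice to prefer :client when both keys are given are all assumptions made for this example; they are not taken from Puck's source.

```elixir
defmodule BackendSketch do
  # Hypothetical helper, not part of Puck.
  # Picks a backend tag from the keys present in the config map.

  # A :client key means the ReqLLM path (assumed to win if both keys are set).
  def backend(%{client: _client}), do: {:ok, :req_llm}

  # A :client_registry key means the BAML path.
  def backend(%{client_registry: _registry}), do: {:ok, :baml}

  # Neither key is present, so no backend can be chosen.
  def backend(_config), do: {:error, :no_backend_configured}
end
```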

Configuration

Common options:

  • :keep_last - Number of recent messages to preserve verbatim (default: 3)
  • :max_tokens (required for auto-compaction) - Token threshold; should_compact?/2 returns false unless this is set
  • :prompt - Custom summarization prompt (optional)
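
The role of :max_tokens can be pictured with a small threshold check. This is a minimal sketch, not Puck's implementation: the module `ThresholdSketch` is hypothetical, it takes a precomputed token count rather than a context, and treating a count equal to the threshold as "compact" is an assumption.

```elixir
defmodule ThresholdSketch do
  # Hypothetical helper, not part of Puck.
  # `token_count` is assumed to be tracked or estimated by the caller.
  def should_compact?(token_count, config) when is_integer(token_count) do
    case Map.get(config, :max_tokens) do
      # Without :max_tokens there is no threshold, so never compact.
      nil -> false
      # Assumption: compaction triggers once the count reaches the threshold.
      max when is_integer(max) -> token_count >= max
    end
  end
end
```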

ReqLLM-specific:

  • :client (required for ReqLLM) - Puck.Client to use for summarization calls

BAML-specific:

  • :client_registry (required for BAML) - Client registry map for LLM provider configuration

How It Works

  1. Splits the messages into older messages to summarize and the last :keep_last messages to keep verbatim
  2. Formats older messages as a conversation transcript
  3. Calls the LLM with a summarization prompt
  4. Returns new context: [summary_message] ++ last_k_messages
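
The four steps above can be sketched in plain Elixir. This is an illustration under stated assumptions, not the library's code: the module `CompactionSketch`, the message shape `%{role: ..., content: ...}`, the `:system` role used for the summary message, and the `summarize_fun` callback standing in for the LLM call are all hypothetical.

```elixir
defmodule CompactionSketch do
  # Hypothetical sketch, not part of Puck.
  # Messages are assumed to be maps with :role and :content keys.
  def compact(messages, keep_last, summarize_fun)
      when is_list(messages) and is_integer(keep_last) and keep_last >= 0 do
    # Step 1: split into older messages and the last `keep_last` messages.
    split_at = max(length(messages) - keep_last, 0)
    {older, recent} = Enum.split(messages, split_at)

    if older == [] do
      # Nothing to summarize, so return the messages unchanged.
      messages
    else
      # Step 2: format the older messages as a transcript.
      transcript =
        Enum.map_join(older, "\n", fn %{role: role, content: content} ->
          "#{role}: #{content}"
        end)

      # Step 3: call the summarizer (a stand-in for the LLM call).
      summary = summarize_fun.(transcript)

      # Step 4: new context is the summary message followed by the recent ones.
      # The :system role for the summary is an assumption.
      [%{role: :system, content: summary} | recent]
    end
  end
end

# Usage with a stub summarizer in place of a real LLM call.
messages = [
  %{role: :user, content: "Hi"},
  %{role: :assistant, content: "Hello"},
  %{role: :user, content: "What is Elixir?"},
  %{role: :assistant, content: "A functional language."}
]

stub = fn transcript -> "Summary of #{length(String.split(transcript, "\n"))} messages" end
compacted = CompactionSketch.compact(messages, 2, stub)
```

Passing the summarizer as a function keeps the sketch runnable without network access and keeps the split and recombine logic separate from the LLM call.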

Examples

ReqLLM:

client = Puck.Client.new({Puck.Backends.ReqLLM, "anthropic:claude-sonnet-4-5"})

{:ok, compacted} = Puck.Compaction.compact(context, {Puck.Compaction.Summarize, %{
  client: client,
  keep_last: 3
}})

BAML (auto-detected when the BAML backend is used with auto_compaction):

registry = %{
  primary: "claude",
  clients: [%{
    name: "claude",
    provider: "anthropic",
    options: %{model: "claude-sonnet-4-5", api_key: System.get_env("ANTHROPIC_API_KEY")}
  }]
}

{:ok, compacted} = Puck.Compaction.compact(context, {Puck.Compaction.Summarize, %{
  client_registry: registry,
  keep_last: 3
}})