LangChain.Chains.SummarizeConversationChain (LangChain v0.3.1)
When an AI conversation has many back-and-forth messages (from user to assistant to user to assistant, etc.), the number of messages and the total token count can be large. Large token counts present the following problems:
- Increased cost/price per generation
- Increased generation times
- Risk of exceeding the total token limit, resulting in an error
This chain is run as a separate process to summarize and condense a separate conversation chain. It is assumed that the chain the user sees in the UI retains all of their original messages; the summarized chain is the one submitted to the LLM, so the model is not given the full, raw message list.
We don't want to perform more work than necessary, so the summary process only kicks off once the number of messages reaches the configured `:threshold_count`. A configured number of the most recent messages (`:keep_count`) is then retained to preserve continuity in the conversation. For example, with `threshold_count: 6` and `keep_count: 2`, a conversation holding 10 user/assistant messages has its first 8 condensed into a summary while the final 2 are kept as-is.
Options
- `:llm` - The LLM to use for performing the summarization. There is no need for streaming.
- `:keep_count` - The number of raw messages to retain. These are the most recent messages, and it defaults to 2 (a user and an assistant message).
- `:threshold_count` - The total number of messages (excluding the system message) that must be present before the summarizing operation is performed. Running the summarization on a short conversation chain returns the chain unchanged and makes no calls to an LLM.
- `:override_system_prompt` - When the system prompt's instructions for how to summarize should be customized, this provides a replacement for the default system prompt.
- `:messages` - When explicit control of multiple messages is needed, they can be provided as a list. They can be `LangChain.PromptTemplate`s, and the concatenated list of messages will be available in the `@conversation` param. When this is used, any value in `:override_system_prompt` is ignored. See the sketch after this list.
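For example, a minimal sketch of supplying custom `:messages` (the prompt wording is hypothetical; assumes `ChatOpenAI`, `LangChain.PromptTemplate`, and `SummarizeConversationChain` are aliased):

    summarizer =
      SummarizeConversationChain.new!(%{
        llm: ChatOpenAI.new!(%{model: "gpt-4o-mini", stream: false}),
        keep_count: 2,
        threshold_count: 6,
        # Custom summarization instructions; :override_system_prompt would be ignored here.
        messages: [
          PromptTemplate.new!(%{
            role: :system,
            text: "You condense conversations, keeping key facts and decisions."
          }),
          PromptTemplate.new!(%{
            role: :user,
            text: "Summarize the following conversation:\n\n<%= @conversation %>"
          })
        ]
      })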
Examples
A basic example that processes the messages in a separate LLMChain, returning an updated chain with summarized contents.
    summarized_chain =
      %{
        llm: ChatOpenAI.new!(%{model: "gpt-4o-mini", stream: false}),
        keep_count: 2,
        threshold_count: 6
      }
      |> SummarizeConversationChain.new!()
      |> SummarizeConversationChain.summarize(chain_to_summarize)
Using the `:with_fallbacks` option to still attempt the summarization if the primary LLM errors, falling back to an Azure-hosted OpenAI deployment:
    # Azure-configured OpenAI LLM to fall back to
    fallback_llm =
      ChatOpenAI.new!(%{
        stream: false,
        endpoint: System.fetch_env!("AZURE_OPENAI_ENDPOINT"),
        api_key: System.fetch_env!("AZURE_OPENAI_KEY")
      })

    summarized_chain =
      %{
        llm: ChatOpenAI.new!(%{model: "gpt-4o-mini", stream: false}),
        keep_count: 2,
        threshold_count: 6
      }
      |> SummarizeConversationChain.new!()
      |> SummarizeConversationChain.summarize(chain_to_summarize, with_fallbacks: [fallback_llm])
Summary

Functions

- `combine_messages_for_summary_text/2` - Create a single text message to represent the current set of messages being summarized from the `to_summarize` chain. Uses the settings from SummarizeConversationChain. A `nil` is returned when the threshold has not been reached for running the summary procedure.
- `new/1` - Start a new SummarizeConversationChain configuration.
- `new!/1` - Start a new SummarizeConversationChain and return it or raise an error if invalid.
- `run/3` - Run a SummarizeConversationChain to summarize a text representation of a sequence of user and assistant messages.
- `summarize/3` - Summarize the `to_summarize` LLMChain using the `%SummarizeConversationChain{}` configuration and `opts`. Returns a new, potentially modified LLMChain after completing the summarization process.
Types

t()

Functions
combine_messages_for_summary_text/2

    @spec combine_messages_for_summary_text(t(), LangChain.Chains.LLMChain.t()) ::
            nil | String.t()

Create a single text message to represent the current set of messages being summarized from the `to_summarize` chain. Uses the settings from SummarizeConversationChain.

A `nil` is returned when the threshold has not been reached for running the summary procedure.

This combines the user and assistant messages into a single string that can be summarized. It does not summarize a system message and does not include the last `n` messages for the `keep_count`.
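Given the spec, a usage sketch (assumes `summarizer` is a configured `%SummarizeConversationChain{}` and `chain` is an LLMChain with accumulated messages):

    case SummarizeConversationChain.combine_messages_for_summary_text(summarizer, chain) do
      nil ->
        # Threshold not reached; nothing to summarize yet.
        :noop

      text when is_binary(text) ->
        # Combined user/assistant text, excluding the system message
        # and the most recent :keep_count messages.
        IO.puts(text)
    end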
new/1

    @spec new(attrs :: map()) :: {:ok, t()} | {:error, Ecto.Changeset.t()}

Start a new SummarizeConversationChain configuration.

    {:ok, summarizer} =
      SummarizeConversationChain.new(%{
        llm: %ChatOpenAI{model: "gpt-3.5-turbo", stream: false},
        keep_count: 2,
        threshold_count: 6
      })
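Because invalid attributes return an `Ecto.Changeset` in an `:error` tuple, a sketch of handling both outcomes (omitting the `:llm` here purely for illustration):

    case SummarizeConversationChain.new(%{keep_count: 2, threshold_count: 6}) do
      {:ok, summarizer} ->
        summarizer

      {:error, %Ecto.Changeset{} = changeset} ->
        # The changeset carries the validation errors.
        changeset.errors
    end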
new!/1

Start a new SummarizeConversationChain and return it or raise an error if invalid.

    chain = SummarizeConversationChain.new!(%{
      llm: %ChatOpenAI{model: "gpt-3.5-turbo", stream: false},
      keep_count: 2,
      threshold_count: 6
    })
run/3

    @spec run(t(), String.t(), opts :: Keyword.t()) ::
            {:ok, LangChain.Chains.LLMChain.t()}
            | {:error, LangChain.Chains.LLMChain.t(), LangChain.LangChainError.t()}

Run a SummarizeConversationChain to summarize a text representation of a sequence of user and assistant messages.

    {:ok, updated_chain} =
      SummarizeConversationChain.new!(%{
        llm: %ChatOpenAI{model: "gpt-3.5-turbo", stream: false},
        keep_count: 2,
        threshold_count: 6
      })
      |> SummarizeConversationChain.run(text_to_summarize)
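As a sketch of how `run/3` pairs with `combine_messages_for_summary_text/2`, which produces the text it consumes (assumes `summarizer` and a populated `chain`; reading the summary off `last_message` is an assumption, not documented behavior):

    with text when is_binary(text) <-
           SummarizeConversationChain.combine_messages_for_summary_text(summarizer, chain),
         {:ok, run_chain} <- SummarizeConversationChain.run(summarizer, text) do
      # Assumes the generated summary arrives as the chain's last message.
      run_chain.last_message.content
    end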
summarize/3

    @spec summarize(t(), LangChain.Chains.LLMChain.t(), opts :: Keyword.t()) ::
            LangChain.Chains.LLMChain.t()

Summarize the `to_summarize` LLMChain using the `%SummarizeConversationChain{}` configuration and `opts`. Returns a new, potentially modified LLMChain after completing the summarization process.

If the `threshold_count` is greater than the current number of summarizable messages (i.e. user and assistant roles), then nothing is modified and the original LLMChain is returned.
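A short sketch of that threshold behavior (the `short_chain` below is hypothetical, holding fewer user/assistant messages than `:threshold_count`):

    result = SummarizeConversationChain.summarize(summarizer, short_chain)
    # Below the threshold the chain comes back unchanged and no LLM call is made.
    true = result == short_chain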
Options

- `:with_fallbacks` - An optional set of fallback LLMs to use if the summarization process fails. See `LangChain.Chains.LLMChain.run/2` for details.