LangChain.Trajectory (LangChain v0.6.3)


Captures the structured sequence of messages and tool calls produced during an LLMChain run for inspection, serialization, and comparison.

A trajectory provides a first-class API for extracting the decision-making path from a chain run — which tools were called, in what order, with what arguments — enabling golden-file testing, logging, and correctness assertions for agent workflows.

Why trajectories matter

When building agent systems, the final answer is only part of the story. Two agents can produce the same answer through very different reasoning paths — one might make a single efficient tool call while another makes five redundant ones. Trajectories let you evaluate the process, not just the outcome.

This is especially important for:

  • Regression testing — catch when a prompt change causes the agent to take a different (possibly worse) path even if the final answer is correct
  • Cost control — detect unnecessary tool calls that waste tokens and time
  • Safety — verify that dangerous tools were NOT called
  • Debugging — understand exactly what the agent did and why

Usage

trajectory = Trajectory.from_chain(chain)

# Serialize for logging or golden-file comparison
map = Trajectory.to_map(trajectory)

# Deserialize back from stored map
trajectory = Trajectory.from_map(map)

# Compare against expected tool call sequence
Trajectory.matches?(trajectory, [
  %{name: "search", arguments: %{"query" => "weather"}},
  %{name: "get_forecast", arguments: nil}
])

# Filter tool calls by name
Trajectory.calls_by_name(trajectory, "search")

# Group tool calls by conversation turn
Trajectory.calls_by_turn(trajectory)

Metadata

Each trajectory captures metadata about the chain run, including the model name and LLM module. You can also add custom metadata. The metadata captured from the chain is available on the struct:

trajectory = Trajectory.from_chain(chain)
trajectory.metadata
#=> %{model: "gpt-4", llm_module: LangChain.ChatModels.ChatOpenAI}
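
The example above only reads metadata. One plausible way to attach custom entries is a plain update of the metadata map, sketched below. MetadataSketch, put_metadata/3, and the :run_id key are hypothetical names, and a plain map stands in for the Trajectory struct so the snippet runs on its own; the library may offer a dedicated option instead.

```elixir
defmodule MetadataSketch do
  # Hypothetical helper, not part of LangChain: returns a copy of the
  # trajectory (any map or struct with a :metadata key) with one extra entry.
  def put_metadata(%{metadata: metadata} = trajectory, key, value) do
    %{trajectory | metadata: Map.put(metadata, key, value)}
  end
end

# A plain map standing in for a %LangChain.Trajectory{} struct.
trajectory = %{metadata: %{model: "gpt-4"}, tool_calls: []}

tagged = MetadataSketch.put_metadata(trajectory, :run_id, "run-123")
IO.inspect(tagged.metadata)
```

The update syntax leaves the original trajectory untouched, so the tagged copy can be logged without affecting later comparisons.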

Evaluation patterns

Golden-file testing

Save a known-good trajectory and compare future runs against it:

# Save the golden file
golden = chain |> Trajectory.from_chain() |> Trajectory.to_map()
File.write!("test/fixtures/weather_agent.json", Jason.encode!(golden))

# In your test
golden = "test/fixtures/weather_agent.json" |> File.read!() |> Jason.decode!()
expected = Trajectory.from_map(golden)
actual = Trajectory.from_chain(chain)
assert Trajectory.matches?(actual, expected)

Verifying tools were NOT called

Use refute_trajectory with mode: :superset to ensure dangerous tools weren't invoked:

# Using Trajectory.Assertions
use LangChain.Trajectory.Assertions

refute_trajectory trajectory, [
  %{name: "delete_all", arguments: nil}
], mode: :superset

Flexible matching

When you care about which tools were called but not exact arguments:

Trajectory.matches?(trajectory, [
  %{name: "search", arguments: nil},
  %{name: "summarize", arguments: nil}
])

When you care that certain tools were called but allow extra calls:

Trajectory.matches?(trajectory, [
  %{name: "search", arguments: nil}
], mode: :superset)

Comparison modes

The matches?/3 function supports three modes via the :mode option:

  • :strict (default) — same tool calls in the same order and count
  • :unordered — same tool calls in any order, same count
  • :superset — actual contains at least all expected calls

And two argument comparison modes via the :args option:

  • :exact (default) — arguments must match exactly
  • :subset — expected arguments must be a subset of actual arguments
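
To make the mode and argument options concrete, here is a minimal sketch of how such a comparison could work on bare lists of tool call maps. It is not LangChain's implementation: MatchSketch is a hypothetical module, and treating :superset as order-insensitive is an assumption, since the description above does not say whether order matters in that mode.

```elixir
defmodule MatchSketch do
  # Illustrative only: an assumed reading of the documented modes,
  # not LangChain's implementation.
  def matches?(actual, expected, opts \\ []) do
    mode = Keyword.get(opts, :mode, :strict)
    args = Keyword.get(opts, :args, :exact)

    case mode do
      :strict -> same_length?(actual, expected) and pairwise?(actual, expected, args)
      :unordered -> same_length?(actual, expected) and covers?(actual, expected, args)
      :superset -> covers?(actual, expected, args)
    end
  end

  defp same_length?(a, b), do: length(a) == length(b)

  # Position-by-position comparison for :strict.
  defp pairwise?(actual, expected, args) do
    actual
    |> Enum.zip(expected)
    |> Enum.all?(fn {a, e} -> call_matches?(a, e, args) end)
  end

  # True when every expected call can be paired with a distinct actual call.
  # Backtracks so a wildcard entry cannot claim a call that a later entry needs.
  defp covers?(_actual, [], _args), do: true

  defp covers?(actual, [e | rest], args) do
    actual
    |> Enum.with_index()
    |> Enum.any?(fn {a, idx} ->
      call_matches?(a, e, args) and covers?(List.delete_at(actual, idx), rest, args)
    end)
  end

  # nil expected arguments act as a wildcard for that tool name.
  defp call_matches?(%{name: n}, %{name: n, arguments: nil}, _args), do: true
  defp call_matches?(%{name: n, arguments: a}, %{name: n, arguments: e}, :exact), do: a == e

  defp call_matches?(%{name: n, arguments: a}, %{name: n, arguments: e}, :subset)
       when is_map(a) and is_map(e) do
    Enum.all?(e, fn {key, value} -> Map.fetch(a, key) == {:ok, value} end)
  end

  defp call_matches?(_actual, _expected, _args), do: false
end

actual = [
  %{name: "search", arguments: %{"query" => "weather", "limit" => 5}},
  %{name: "get_forecast", arguments: %{"city" => "Paris"}}
]

# Extra calls are tolerated in :superset mode.
IO.inspect(MatchSketch.matches?(actual, [%{name: "get_forecast", arguments: nil}], mode: :superset))
```

In this sketch, :exact fails for an expected %{"query" => "weather"} because the actual call also carries "limit", while args: :subset accepts it.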

External references

For more on trajectory-based evaluation of agent systems, see:

Arguments use string keys

Tool call arguments come from JSON decoding and use string keys (e.g. %{"city" => "Paris"} not %{city: "Paris"}). Expected arguments in matches?/3 should use string keys as well.
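
A quick illustration of why the key type matters, using plain Elixir maps. ArgKeysSketch and stringify_keys/1 are hypothetical names introduced for this sketch; the helper only converts top-level keys.

```elixir
defmodule ArgKeysSketch do
  # Hypothetical helper: converts top-level keys to strings so hand-written
  # expectations line up with JSON-decoded arguments. Nested maps are untouched.
  def stringify_keys(args) when is_map(args) do
    Map.new(args, fn {key, value} -> {to_string(key), value} end)
  end
end

# Shape produced by JSON decoding.
decoded = %{"city" => "Paris"}
# Easy to write by mistake in a test expectation.
written = %{city: "Paris"}

IO.inspect(written == decoded)
#=> false
IO.inspect(ArgKeysSketch.stringify_keys(written) == decoded)
#=> true
```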

Summary

Types

t()

tool_call_map()

A simplified tool call map with the tool name and its arguments.

Functions

calls_by_name(trajectory, name)

Return all tool calls matching the given tool name.

calls_by_turn(trajectory)

Group tool calls by conversation turn (assistant message index).

from_chain(llm_chain)

Build a Trajectory from a chain's exchanged_messages.

from_map(map)

Deserialize a trajectory from a plain map previously produced by to_map/1.

matches?(actual, expected, opts \\ [])

Compare a trajectory's tool calls against an expected sequence.

to_map(trajectory)

Serialize a trajectory to plain maps for logging, storage, or golden-file comparison.

Types

t()

@type t() :: %LangChain.Trajectory{
  messages: [LangChain.Message.t()],
  metadata: map(),
  token_usage: LangChain.TokenUsage.t() | nil,
  tool_calls: [tool_call_map()]
}

tool_call_map()

@type tool_call_map() :: %{name: String.t(), arguments: map() | nil}

A simplified tool call map with the tool name and its arguments.

Functions

calls_by_name(trajectory, name)

@spec calls_by_name(t(), String.t()) :: [tool_call_map()]

Return all tool calls matching the given tool name.

Example

Trajectory.calls_by_name(trajectory, "search")
#=> [%{name: "search", arguments: %{"query" => "weather"}}]

calls_by_turn(trajectory)

@spec calls_by_turn(t()) :: [{non_neg_integer(), [tool_call_map()]}]

Group tool calls by conversation turn (assistant message index).

Returns a list of {turn_index, [tool_call_map]} tuples where turn_index is the 0-based position of the assistant message among all assistant messages that contained tool calls.

Example

Trajectory.calls_by_turn(trajectory)
#=> [{0, [%{name: "search", arguments: %{"query" => "weather"}}]},
#    {1, [%{name: "get_forecast", arguments: %{"city" => "Paris"}}]}]
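
The indexing rule can be sketched on plain maps as follows. This is an illustrative stand-in, not the library's code: TurnSketch is a hypothetical module and the messages are bare maps with :role and :tool_calls keys rather than LangChain.Message structs.

```elixir
defmodule TurnSketch do
  # Illustrative only. Keeps assistant messages that carry tool calls,
  # then numbers them from 0 in the order they appear.
  def calls_by_turn(messages) do
    messages
    |> Enum.filter(fn msg -> msg.role == :assistant and msg.tool_calls != [] end)
    |> Enum.with_index()
    |> Enum.map(fn {msg, turn} -> {turn, msg.tool_calls} end)
  end
end

messages = [
  %{role: :assistant, tool_calls: [%{name: "search", arguments: %{"query" => "weather"}}]},
  %{role: :tool, tool_calls: []},
  %{role: :assistant, tool_calls: []},
  %{role: :assistant, tool_calls: [%{name: "get_forecast", arguments: %{"city" => "Paris"}}]}
]

IO.inspect(TurnSketch.calls_by_turn(messages))
```

The third message is an assistant reply without tool calls, so it does not consume an index: the get_forecast call lands in turn 1, not turn 2.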

from_chain(llm_chain)

@spec from_chain(LangChain.Chains.LLMChain.t()) :: t()

Build a Trajectory from a chain's exchanged_messages.

Uses exchanged_messages (the messages added during the chain run) rather than messages, which also includes pre-loaded system and user messages. This focuses the trajectory on the agent's actual decision-making path.

Extracts tool calls into a flat list and aggregates token usage across all assistant messages.
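
As a rough sketch of that extraction step, the snippet below flattens tool calls and sums token usage over assistant messages. ExtractSketch is a hypothetical module, the messages are plain maps, and the :usage shape (a map with :input and :output counts, or nil) is an assumption made for illustration rather than the library's actual message layout.

```elixir
defmodule ExtractSketch do
  # Illustrative only: plain maps stand in for LangChain.Message structs.
  def extract(messages) do
    assistant = Enum.filter(messages, &(&1.role == :assistant))

    %{
      # One flat list of tool calls, in message order.
      tool_calls: Enum.flat_map(assistant, & &1.tool_calls),
      # Running totals across every assistant message that reports usage.
      token_usage: Enum.reduce(assistant, %{input: 0, output: 0}, &add_usage/2)
    }
  end

  defp add_usage(%{usage: nil}, acc), do: acc

  defp add_usage(%{usage: %{input: i, output: o}}, acc) do
    %{input: acc.input + i, output: acc.output + o}
  end
end

messages = [
  %{
    role: :assistant,
    tool_calls: [%{name: "search", arguments: %{"query" => "weather"}}],
    usage: %{input: 120, output: 15}
  },
  %{role: :tool, tool_calls: [], usage: nil},
  %{role: :assistant, tool_calls: [], usage: %{input: 180, output: 40}}
]

result = ExtractSketch.extract(messages)
IO.inspect(result.token_usage)
```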

Important: call immediately after run/2

LLMChain.run/2 clears exchanged_messages at the start of each invocation. This means from_chain/1 captures only the messages from the most recent run call. If you need to capture a trajectory, call from_chain/1 immediately after run/2 returns — before any subsequent run call on the same chain.

Example

trajectory = Trajectory.from_chain(chain)

from_map(map)

@spec from_map(map()) :: t()

Deserialize a trajectory from a plain map previously produced by to_map/1.

Restores tool_calls and token_usage but stores messages as raw maps since full Message struct reconstruction requires schema context that plain maps don't carry.

Note: After a JSON roundtrip (Jason.encode! followed by Jason.decode!), atom keys become strings and module names become string representations. This means metadata fields like :llm_module will differ between a fresh trajectory and one restored from JSON. The matches?/3 function compares only tool calls, so this does not affect matching.

Example

map = Trajectory.to_map(trajectory)
restored = Trajectory.from_map(map)
restored.tool_calls == trajectory.tool_calls

matches?(actual, expected, opts \\ [])

@spec matches?(
  t() | LangChain.Chains.LLMChain.t() | [tool_call_map()],
  t() | [tool_call_map()],
  keyword()
) :: boolean()

Compare a trajectory's tool calls against an expected sequence.

actual can be a Trajectory struct, an LLMChain, or a bare list of tool call maps. expected can be a Trajectory struct or a bare list of %{name: ..., arguments: ...} maps for inline test expectations.

When arguments is nil in an expected entry, it matches any arguments for that tool name.

Options

  • :mode — comparison mode (default :strict)

    • :strict — same tool calls in the same order and count
    • :unordered — same tool calls in any order, same count
    • :superset — actual contains at least all expected calls
  • :args — argument comparison (default :exact)

    • :exact — arguments must match exactly
    • :subset — expected arguments are a subset of actual arguments

Examples

# Strict order and exact arguments
Trajectory.matches?(trajectory, [
  %{name: "search", arguments: %{"query" => "weather"}}
])

# Allow extra calls; expected arguments only need to be a subset of the actual ones
Trajectory.matches?(trajectory, expected, mode: :superset, args: :subset)

to_map(trajectory)

@spec to_map(t()) :: map()

Serialize a trajectory to plain maps for logging, storage, or golden-file comparison.

Messages are converted to maps with :role, :content, :tool_calls, and :tool_results keys. Content is normalized to strings via ContentPart.content_to_string/1.