Captures the structured sequence of messages and tool calls produced during
an LLMChain run for inspection, serialization, and comparison.
A trajectory provides a first-class API for extracting the decision-making path from a chain run — which tools were called, in what order, with what arguments — enabling golden-file testing, logging, and correctness assertions for agent workflows.
Why trajectories matter
When building agent systems, the final answer is only part of the story. Two agents can produce the same answer through very different reasoning paths — one might make a single efficient tool call while another makes five redundant ones. Trajectories let you evaluate the process, not just the outcome.
This is especially important for:
- Regression testing — catch when a prompt change causes the agent to take a different (possibly worse) path even if the final answer is correct
- Cost control — detect unnecessary tool calls that waste tokens and time
- Safety — verify that dangerous tools were NOT called
- Debugging — understand exactly what the agent did and why
Usage
trajectory = Trajectory.from_chain(chain)
# Serialize for logging or golden-file comparison
map = Trajectory.to_map(trajectory)
# Deserialize back from stored map
trajectory = Trajectory.from_map(map)
# Compare against expected tool call sequence
Trajectory.matches?(trajectory, [
%{name: "search", arguments: %{"query" => "weather"}},
%{name: "get_forecast", arguments: nil}
])
# Filter tool calls by name
Trajectory.calls_by_name(trajectory, "search")
# Group tool calls by conversation turn
Trajectory.calls_by_turn(trajectory)
Metadata
Each trajectory captures metadata about the chain run including the model name and LLM module. You can also add custom metadata:
trajectory = Trajectory.from_chain(chain)
trajectory.metadata
#=> %{model: "gpt-4", llm_module: LangChain.ChatModels.ChatOpenAI}
Evaluation patterns
Golden-file testing
Save a known-good trajectory and compare future runs against it:
# Save the golden file
golden = chain |> Trajectory.from_chain() |> Trajectory.to_map()
File.write!("test/fixtures/weather_agent.json", Jason.encode!(golden))
# In your test
golden = "test/fixtures/weather_agent.json" |> File.read!() |> Jason.decode!()
expected = Trajectory.from_map(golden)
actual = Trajectory.from_chain(chain)
assert Trajectory.matches?(actual, expected)
Verifying tools were NOT called
Use refute_trajectory with :superset mode to assert that dangerous tools were never invoked:
# Using Trajectory.Assertions
use LangChain.Trajectory.Assertions
refute_trajectory trajectory, [
%{name: "delete_all", arguments: nil}
], mode: :superset
Flexible matching
When you care about which tools were called but not exact arguments:
Trajectory.matches?(trajectory, [
%{name: "search", arguments: nil},
%{name: "summarize", arguments: nil}
])
When you care that certain tools were called but allow extra calls:
Trajectory.matches?(trajectory, [
%{name: "search", arguments: nil}
], mode: :superset)
Comparison modes
The matches?/3 function supports three modes via the :mode option:
- :strict (default) — same tool calls in the same order and count
- :unordered — same tool calls in any order, same count
- :superset — actual contains at least all expected calls
And two argument comparison modes via the :args option:
- :exact (default) — arguments must match exactly
- :subset — expected arguments must be a subset of actual arguments
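The three :mode semantics can be sketched over bare lists of tool-call maps. The module below is a hypothetical illustration under that assumption, not the library's actual implementation:

```elixir
# Sketch of the :strict / :unordered / :superset comparison modes,
# operating on plain lists of tool-call maps (hypothetical helper).
defmodule TrajectorySketch do
  def matches?(actual, expected, mode \\ :strict)

  # Same calls, same order, same count.
  def matches?(actual, expected, :strict), do: actual == expected

  # Same calls and counts, order ignored (Elixir term ordering on maps).
  def matches?(actual, expected, :unordered),
    do: Enum.sort(actual) == Enum.sort(expected)

  # Every expected call must appear in actual, consuming one match each,
  # so duplicate expectations require duplicate actual calls.
  def matches?(actual, expected, :superset) do
    Enum.reduce_while(expected, actual, fn call, remaining ->
      case Enum.split_while(remaining, &(&1 != call)) do
        {_prefix, []} -> {:halt, :missing}
        {prefix, [_match | rest]} -> {:cont, prefix ++ rest}
      end
    end) != :missing
  end
end
```

Note that :superset here still ignores ordering; it only checks that each expected call can be matched against a distinct actual call.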
External references
For more on trajectory-based evaluation of agent systems, see:
- LangSmith Trajectory Evaluation — trajectory-level evaluators for scoring agent behavior
- AgentEvals — reference implementations of trajectory matching algorithms
Arguments use string keys
Tool call arguments come from JSON decoding and use string keys
(e.g. %{"city" => "Paris"} not %{city: "Paris"}). Expected arguments
in matches?/3 should use string keys as well.
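When test fixtures are written with atom keys, they can be normalized to the string-keyed shape before comparison. This is a hypothetical one-liner, not a library function, and it converts only top-level keys:

```elixir
# Hypothetical helper: convert an atom-keyed fixture map to the
# string-keyed shape that JSON-decoded tool-call arguments have.
# Only top-level keys are converted.
stringify_keys = fn args ->
  Map.new(args, fn {k, v} -> {to_string(k), v} end)
end

stringify_keys.(%{city: "Paris"})
#=> %{"city" => "Paris"}
```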
Summary
Functions
Return all tool calls matching the given tool name.
Group tool calls by conversation turn (assistant message index).
Build a Trajectory from a chain's exchanged_messages.
Deserialize a trajectory from a plain map previously produced by to_map/1.
Compare a trajectory's tool calls against an expected sequence.
Serialize a trajectory to plain maps for logging, storage, or golden-file comparison.
Types
@type t() :: %LangChain.Trajectory{
  messages: [LangChain.Message.t()],
  metadata: map(),
  token_usage: LangChain.TokenUsage.t() | nil,
  tool_calls: [tool_call_map()]
}
A simplified tool call map with the tool name and its arguments.
Functions
@spec calls_by_name(t(), String.t()) :: [tool_call_map()]
Return all tool calls matching the given tool name.
Example
Trajectory.calls_by_name(trajectory, "search")
#=> [%{name: "search", arguments: %{"query" => "weather"}}]
@spec calls_by_turn(t()) :: [{non_neg_integer(), [tool_call_map()]}]
Group tool calls by conversation turn (assistant message index).
Returns a list of {turn_index, [tool_call_map]} tuples where turn_index
is the 0-based position of the assistant message among all assistant messages
that contained tool calls.
Example
Trajectory.calls_by_turn(trajectory)
#=> [{0, [%{name: "search", arguments: %{"query" => "weather"}}]},
# {1, [%{name: "get_forecast", arguments: %{"city" => "Paris"}}]}]
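The grouping above can be sketched with plain maps standing in for Message structs (the :role and :tool_calls fields are assumptions mirroring the real struct):

```elixir
# Sketch: index only the assistant messages that carried tool calls,
# in order of appearance. Plain maps stand in for LangChain.Message.
calls_by_turn = fn messages ->
  messages
  |> Enum.filter(fn m -> m.role == :assistant and m.tool_calls != [] end)
  |> Enum.with_index()
  |> Enum.map(fn {m, turn} -> {turn, m.tool_calls} end)
end
```

A user or tool message never advances the turn index; only tool-calling assistant messages do.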
@spec from_chain(LangChain.Chains.LLMChain.t()) :: t()
Build a Trajectory from a chain's exchanged_messages.
Uses exchanged_messages — the messages added during the chain run — rather
than messages which includes pre-loaded system and user messages. This
focuses the trajectory on the agent's actual decision-making path.
Extracts tool calls into a flat list and aggregates token usage across all assistant messages.
Important: call immediately after run/2
LLMChain.run/2 clears exchanged_messages at the start of each
invocation. This means from_chain/1 captures only the messages from the
most recent run call. If you need to capture a trajectory, call
from_chain/1 immediately after run/2 returns — before any subsequent
run call on the same chain.
Example
trajectory = Trajectory.from_chain(chain)
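The token-usage aggregation mentioned above can be sketched as a reduce over per-message usage maps. The :usage, :input, and :output field names here are assumptions for illustration, not the LangChain.TokenUsage schema:

```elixir
# Sum token usage across messages; messages without usage are skipped.
# Field names are assumed, not the library's actual schema.
aggregate_usage = fn messages ->
  messages
  |> Enum.map(& &1[:usage])
  |> Enum.reject(&is_nil/1)
  |> Enum.reduce(%{input: 0, output: 0}, fn u, acc ->
    %{input: acc.input + u.input, output: acc.output + u.output}
  end)
end
```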
Deserialize a trajectory from a plain map previously produced by to_map/1.
Restores tool_calls and token_usage but stores messages as raw maps
since full Message struct reconstruction requires schema context that
plain maps don't carry.
Note: After a JSON roundtrip (Jason.encode! → Jason.decode!), atom
keys become strings and module names become string representations. This
means metadata fields like :llm_module will differ between a fresh
trajectory and one restored from JSON. The matches?/3 function compares
only tool calls, so this does not affect matching.
Example
map = Trajectory.to_map(trajectory)
restored = Trajectory.from_map(map)
restored.tool_calls == trajectory.tool_calls
@spec matches?(
  t() | LangChain.Chains.LLMChain.t() | [tool_call_map()],
  t() | [tool_call_map()],
  keyword()
) :: boolean()
Compare a trajectory's tool calls against an expected sequence.
expected can be a Trajectory struct or a bare list of
%{name: ..., arguments: ...} maps for inline test expectations.
When arguments is nil in an expected entry, it matches any arguments for
that tool name.
Options
- :mode — comparison mode (default :strict):
  - :strict — same tool calls in the same order and count
  - :unordered — same tool calls in any order
  - :superset — actual contains at least all expected calls
- :args — argument comparison (default :exact):
  - :exact — arguments must match exactly
  - :subset — expected arguments are a subset of actual arguments
Examples
# Strict order and exact arguments
Trajectory.matches?(trajectory, [
%{name: "search", arguments: %{"query" => "weather"}}
])
# Any order, ignore extra calls
Trajectory.matches?(trajectory, expected, mode: :superset, args: :subset)
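The nil-wildcard and :args behavior can be sketched per call. This is a hypothetical helper assuming string-keyed argument maps, not the library's matcher:

```elixir
# Match one actual call against one expected call. A nil expected
# arguments value matches anything; :subset checks only the listed
# key/value pairs. Hypothetical helper for illustration.
call_matches? = fn actual, expected, args_mode ->
  actual.name == expected.name and
    case {expected.arguments, args_mode} do
      {nil, _} -> true
      {exp, :exact} -> actual.arguments == exp
      {exp, :subset} ->
        Enum.all?(exp, fn {k, v} -> Map.get(actual.arguments, k) == v end)
    end
end
```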
Serialize a trajectory to plain maps for logging, storage, or golden-file comparison.
Messages are converted to maps with :role, :content, :tool_calls, and
:tool_results keys. Content is normalized to strings via
ContentPart.content_to_string/1.