Navigator Pattern: Journaled Tasks

Copy Markdown View Source

Use the Navigator pattern to build crash-safe, resumable workflows. Instead of keeping an agent process alive, re-invoke the agent with a journal that records what's already done.

Prerequisites

Core Idea

The Navigator is a stateless agent that "re-navigates" its mission by checking a journal. Each (task "id" expr) call either returns a cached result (if already done) or executes and records the result.

# Turn 1: empty journal
{:ok, step} = SubAgent.run(agent, llm: llm, context: %{}, journal: %{})
# step.journal => %{"fetch-data" => [1, 2, 3]}

# Turn 2: re-invoke with saved journal
{:ok, step} = SubAgent.run(agent, llm: llm, context: %{journal: saved_journal})
# "fetch-data" returns cached result — no duplicate API call

The app owns persistence. Save step.journal to your database between runs; pass it back on re-invocation.

How It Works

  1. The agent receives a journal (possibly empty) via the journal: option
  2. The engine injects a Mission Log into the system prompt showing completed tasks
  3. The LLM sees what's done and generates code for remaining work
  4. (task "id" expr) checks the journal: cache hit skips expr, cache miss evaluates and records
  5. The updated journal is returned in step.journal

The Re-run Pattern

A typical workflow has three phases: initial run, external event, and re-invocation.

Phase 1: Initial Run

agent = SubAgent.new(
  prompt: "Process order {{order_id}}: charge card, then ship item",
  signature: "(order_id :int) -> {status :keyword}",
  tools: %{"charge_card" => &Billing.charge/1, "ship_item" => &Shipping.ship/1},
  max_turns: 5
)

{:ok, step} = SubAgent.run(agent, llm: llm, context: %{order_id: 42}, journal: %{})
# step.journal => %{"charge_order_42" => "tx_abc"}
# step.return => %{status: :waiting}

MyRepo.save_journal(order_id, step.journal)

The LLM generates:

(do
  (task "charge_order_42" (tool/charge_card {:order_id 42}))
  (return {:status :waiting}))

Phase 2: External Event

Something happens outside the agent — a webhook, a human approval, a timer:

journal = MyRepo.get_journal(order_id)
journal = Map.put(journal, "payment_confirmed_42", true)
MyRepo.save_journal(order_id, journal)

The journal is a plain map. Any code can write to it between runs.

Phase 3: Re-invocation

journal = MyRepo.get_journal(order_id)
{:ok, step} = SubAgent.run(agent, llm: llm, context: %{order_id: 42}, journal: journal)
# step.return => %{status: :shipped}

The LLM sees the Mission Log:

## Mission Log (Completed Tasks)
- [done] charge_order_42: "tx_abc"
- [done] payment_confirmed_42: true

It generates code that skips completed work and continues:

(do
  (task "charge_order_42" (tool/charge_card {:order_id 42}))       ;; cached
  (task "payment_confirmed_42" nil)                                 ;; cached
  (task "ship_order_42" (tool/ship_item {:tx_id "tx_abc"}))        ;; executes
  (return {:status :shipped}))

Semantic Task IDs

Task IDs are the contract between the LLM and your application. Follow these guidelines:

Encode Intent and Data

IDs should describe what was done and to what, preventing stale cache hits if the agent re-plans.

BadGoodWhy
"step_1""charge_order_42"Positional IDs break when the plan changes
"fetch""fetch_users_active"Ambiguous IDs cause wrong cache hits
"task_a""send_email_alice_welcome"Semantic IDs are self-documenting

One ID Per Side-Effect

Each task should wrap exactly one side-effect. Avoid nesting tasks or combining multiple side-effects under one ID:

;; Good: one task per side-effect
(do
  (task "charge_order_42" (tool/charge_card {:order_id 42}))
  (task "ship_order_42" (tool/ship_item {:order_id 42})))

;; Bad: nested tasks
(task "process_order"
  (do (task "charge" (tool/charge_card {:order_id 42}))
      (task "ship" (tool/ship_item {:order_id 42}))))

Naming Convention for External Decisions

When the application writes decisions into the journal between runs, use a shared naming convention:

# In your system prompt: "Use task ID manager_decision_RECIPIENT_AMOUNT
#   for approval lookups"
# In your app code:
journal = Map.put(journal, "manager_decision_bob_5000", :approved)

The prompt tells the LLM the naming pattern; the app writes to the same key. Without this contract, the LLM might look up "approval_status" while the app wrote to "manager_decision_bob_5000".

Semantic Progress with Plans

For multi-step workflows, define a plan: to get automatic progress tracking. The LLM reports completion via (step-done "id" "summary") and sees a progress checklist in feedback messages.

agent = SubAgent.new(
  prompt: "Process order {{order_id}}",
  plan: ["Charge card", "Ship item", "Send confirmation"],
  tools: %{"charge_card" => &Billing.charge/1, "ship_item" => &Shipping.ship/1},
  max_turns: 10
)

The plan accepts a string list (auto-numbered IDs "1", "2", ...) or explicit {id, description} tuples:

plan: [{"charge", "Charge card"}, {"ship", "Ship item"}]

The LLM marks steps complete with summaries:

(do
  (task "charge_order_42" (tool/charge_card {:order_id 42}))
  (step-done "charge" (str "Charged " tx_id))
  (task "ship_order_42" (tool/ship_item {:order_id 42}))
  (step-done "ship" "Shipped via FedEx"))

Progress appears as a markdown checklist in feedback messages:

## Progress
- [x] Charge card — Charged tx_abc
- [x] Ship item — Shipped via FedEx
- [ ] Send confirmation

Clearing Cached Tasks

Use (task-reset "id") to clear a cached task result from the journal, forcing re-execution on the next (task "id" expr) call:

(task-reset "fetch-users")  ;; clears journal entry
(task "fetch-users" (tool/get-users {}))  ;; re-executes

task-reset only affects the journal (task cache). Step summaries from step-done are independent.

Deferred Visibility

step-done summaries use deferred visibility: they appear in the progress checklist on the next turn, not the current one. If the current turn errors, its summaries are discarded entirely. This prevents the LLM from seeing "checked off" steps in the same turn it marks them, encouraging it to verify results before reporting completion.

Journal entries from (task ...) are committed immediately — even if a later expression in the same turn fails, the task cache is preserved. This makes tasks reliable checkpoints while keeping semantic progress honest.

Limitations

step-done and task-reset must be called at top level in sequential do blocks. They do not work inside pmap, pcalls, or map closures due to process isolation. Call step-done after parallel work completes.

See PTC-Lisp Specification — sections 5.14 and 5.15 for full reference.

Failure Semantics

  • If expr succeeds, the result is committed to the journal
  • If expr calls (fail ...) or crashes, the result is not committed and the failure propagates
  • Re-running after a failure safely retries the failed task

This makes each (task ...) an atomic checkpoint.

Progressive Enhancement

(task) works without a journal — it executes expr normally without caching. A trace warning is emitted so you know idempotency is inactive. This lets you adopt the pattern incrementally.

Planning Design Space

PtcRunner's planning sits at a specific point in the broader agentic AI landscape. Understanding the alternatives helps choose the right approach for your use case.

Plan Creation

ApproachDescriptionPtcRunner
LLM-generatedA planner agent produces the stepsSupported — run a planner SubAgent, pass result to plan:
Developer-specifiedHard-coded step listSupported — pass a literal list to plan:
Emergent (no plan)Agent decides dynamically each turnDefault when plan: is omitted (ReAct-style)

Plan Mutability

The plan: field is immutable once the agent starts. The LLM adapts by doing out-of-plan work (extra step-done calls appear under "Out-of-Plan Steps"). For full re-planning, run a new SubAgent with an updated plan.

This is a deliberate tradeoff: immutable plans are predictable and auditable. Dynamic re-planning (as in LangGraph or smolagents) is more adaptive but harder to reason about.

Progress Visibility

Most frameworks inject the full plan into the system prompt. PtcRunner renders a progress checklist in user feedback messages instead. This keeps the system prompt cacheable and avoids context pollution as the plan grows.

FrameworkPlan visibility
CrewAIFull plan injected into task context
LangGraphExecutor receives current step + query
PtcRunnerChecklist in feedback messages (cacheable)

Caching vs Reporting

PtcRunner separates two concerns that most frameworks conflate:

  • task — idempotent caching. Prevents re-execution on retry or re-invocation. Raw tool results.
  • step-done — semantic progress. Human-readable summary for the checklist. No caching effect.

Use both together: task for reliability, step-done for visibility.

Checkpoint Recovery

The journal acts as an external checkpoint store. Save step.journal between runs; pass it back on re-invocation. The agent skips completed task calls and continues from where it left off.

This differs from in-process checkpointing (LangGraph's state snapshots) — PtcRunner agents are stateless, and the application owns persistence.

Choosing a Planning Style

ScenarioRecommendation
Well-defined multi-step workflowUse plan: with explicit step IDs
Exploratory or open-ended taskOmit plan:, let the agent decide (ReAct)
Two-phase: plan then executeRun a planner SubAgent first, feed plan.return.steps to executor
Long-running with retriesUse plan: + journal persistence for checkpoint recovery

See Also