Python Parity Fixtures
Milestone 0 focuses on capturing golden event streams from the Python Codex SDK so that every Elixir parity test can replay deterministic transcripts. This document explains how to harvest, review, and maintain those fixtures.
Goals
- Produce JSONL logs that represent the canonical behavior of key workflows (thread lifecycle, tools, structured output, sandbox approvals, error paths).
- Store fixtures under integration/fixtures/python with stable filenames and metadata.
- Regenerate fixtures as the Python SDK evolves, keeping a clear audit trail.
Current Fixtures
- thread_basic.jsonl – Baseline single-turn conversation fixture used by parity tests.
- thread_auto_run_step1.jsonl / thread_auto_run_step2.jsonl – Continuation-aware auto-run scenario exercised by the Elixir auto-run loop tests.
- thread_auto_run_pending.jsonl – Pending continuation fixture validating max-attempt guardrails.
- thread_tool_auto_step1.jsonl / thread_tool_auto_step2.jsonl – Tool invocation scenario with continuation/resumption.
- thread_tool_auto_pending.jsonl – Tool-approval denial transcript used to assert policy handling.
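To make the fixture format concrete, here is a minimal sketch of reading one of these JSONL files back into memory, one event per line. The function name load_fixture is hypothetical, and the sketch assumes only that each non-blank line is a standalone JSON object; it says nothing about the fields the events carry.

```python
import json
from pathlib import Path


def load_fixture(path):
    """Return the events in a JSONL fixture as a list, skipping blank lines."""
    events = []
    with Path(path).open(encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue
            try:
                events.append(json.loads(line))
            except json.JSONDecodeError as exc:
                # Point at the offending line so a broken fixture is easy to fix.
                raise ValueError(
                    f"{path}:{line_number}: invalid JSON ({exc.msg})"
                ) from exc
    return events
```

Failing loudly on a malformed line is a deliberate choice here: a silently skipped event would make a replayed transcript diverge from what was recorded.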
Harvesting Workflow
1. Clone Python SDK
   Check out the openai/codex repository next to this project (or set CODEX_PYTHON_SDK_PATH).

2. Install Dependencies

       python3 -m venv .venv
       source .venv/bin/activate
       pip install -e ".[dev]"

3. Build or Download codex-rs Binary
   Ensure the Python SDK runs against the same codex-rs version we pin in this repo. Point to it via --codex-path when running the harvester if needed.

4. Run Harvester

       python3 scripts/harvest_python_fixtures.py \
         --python-sdk ../openai/codex \
         --output integration/fixtures/python

   Use --scenario to target a subset (e.g., --scenario thread_basic).

5. Review Output
   Inspect the generated .jsonl files and associated schemas. Confirm naming, metadata comments (if any), and absence of secrets.

6. Commit Fixtures
   Add new or updated files under integration/fixtures/. Note the change in the PR and update the parity checklist.
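Part of the review step can be automated. The sketch below walks a fixture directory, checks that every line parses as JSON, and flags lines that look like they contain credentials. The function name review_fixtures and the two regular expressions are assumptions for illustration; they are not part of the harvester and will not catch every kind of secret, so a manual read-through is still needed.

```python
import json
import re
from pathlib import Path

# Hypothetical patterns; extend them to match the credentials your setup uses.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{20,}"),         # API-key-like tokens
    re.compile(r"Bearer\s+[A-Za-z0-9._-]{20,}"),  # bearer tokens in headers
]


def review_fixtures(directory):
    """Return (file name, line number, problem) tuples for every issue found."""
    problems = []
    for path in sorted(Path(directory).glob("*.jsonl")):
        with path.open(encoding="utf-8") as f:
            for line_number, line in enumerate(f, start=1):
                if not line.strip():
                    continue
                try:
                    json.loads(line)
                except json.JSONDecodeError:
                    problems.append((path.name, line_number, "invalid JSON"))
                if any(p.search(line) for p in SECRET_PATTERNS):
                    problems.append((path.name, line_number, "possible secret"))
    return problems
```

An empty result from review_fixtures("integration/fixtures/python") only means nothing matched these particular checks.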
Scenario Modules
The harvester expects the Python repo to provide modules under harvest_scenarios.* with a record(output_path, **kwargs) function. Each function should:
- Execute the relevant workflow using the Python SDK.
- Stream codex events into output_path as JSONL.
- Optionally write structured output schemas under integration/fixtures/schemas.
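On the harvester side, the contract above could be consumed roughly as follows. This is a sketch, not the actual contents of scripts/harvest_python_fixtures.py: the function name run_scenario is hypothetical, and writing one file named after the scenario is an assumption (some scenarios in this repo produce several step files).

```python
import importlib
import sys
from pathlib import Path


def run_scenario(name, python_sdk_path, output_dir, **kwargs):
    """Import harvest_scenarios.<name> from the Python repo and call record()."""
    sdk_path = str(Path(python_sdk_path).resolve())
    if sdk_path not in sys.path:
        # Make the Python repo importable before resolving the scenario module.
        sys.path.insert(0, sdk_path)

    module = importlib.import_module(f"harvest_scenarios.{name}")

    # Assumption: one output file per scenario, named after the scenario.
    output_path = Path(output_dir) / f"{name}.jsonl"
    output_path.parent.mkdir(parents=True, exist_ok=True)

    # Extra keyword arguments (for example codex_path) pass straight through.
    module.record(str(output_path), **kwargs)
    return output_path
```

Passing keyword arguments through unchanged keeps the harvester ignorant of scenario-specific options, which matches the record(output_path, **kwargs) signature described above.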
Example skeleton (in Python repo):
    from codex.client import CodexClient


    def record(output_path, codex_path=None):
        # Run a single-turn conversation against the given codex-rs binary.
        client = CodexClient(codex_binary=codex_path)
        thread = client.start_thread()
        turn = client.run(thread, "hello")

        # Write one JSON-encoded event per line (JSONL).
        with open(output_path, "w", encoding="utf-8") as f:
            for event in turn.events:
                f.write(event.json() + "\n")

Maintenance Checklist
- Update SCENARIOS in scripts/harvest_python_fixtures.py when new workflows need coverage.
- Track harvested scenarios and their freshness in docs/python-parity-checklist.md.
- Regenerate fixtures whenever the Python SDK changes behavior; keep diffs to confirm expected deltas.
- Ensure sensitive data is redacted before committing.
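Redaction can be scripted so that it is applied the same way on every regeneration. The sketch below replaces the values of sensitive keys anywhere in an event, however deeply nested. The key names in SENSITIVE_KEYS and the helper names are hypothetical; which fields actually need redacting depends on the events the SDK emits.

```python
import json

# Hypothetical key names; adjust to the fields that appear in your events.
SENSITIVE_KEYS = {"api_key", "authorization", "token"}


def redact(value):
    """Recursively replace the values of sensitive keys with a placeholder."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if key.lower() in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


def redact_jsonl_line(line):
    """Redact one JSONL line and return it re-serialized as a single line."""
    return json.dumps(redact(json.loads(line)))
```

Note that re-serializing with json.dumps may change whitespace relative to the original line, so run redaction before committing rather than on already-committed fixtures if byte-stable diffs matter.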
Troubleshooting
- Module Not Found: Verify PYTHONPATH includes the Python repo (the harvester adds it automatically).
- codex-rs Mismatch: Rebuild or download the binary pinned in config/native.exs once available.
- Fixture Drift: Re-run harvester and compare diffs. Legitimate changes should be accompanied by updated Elixir tests.
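When investigating fixture drift, a plain text diff can be noisy if events contain values that change on every run. One possible approach, sketched below, is to compare fixtures event by event after dropping such fields. The names in VOLATILE_KEYS are assumptions for illustration, as are the helper names; check the real event schema before relying on them.

```python
import json

# Hypothetical volatile fields; adjust to the real event schema.
VOLATILE_KEYS = {"id", "timestamp"}


def normalize(value):
    """Drop volatile keys so two runs of the same scenario compare equal."""
    if isinstance(value, dict):
        return {k: normalize(v) for k, v in value.items() if k not in VOLATILE_KEYS}
    if isinstance(value, list):
        return [normalize(v) for v in value]
    return value


def diff_fixtures(old_lines, new_lines):
    """Return (index, old_event, new_event) wherever the two fixtures differ."""
    old = [normalize(json.loads(line)) for line in old_lines if line.strip()]
    new = [normalize(json.loads(line)) for line in new_lines if line.strip()]
    differences = []
    for index in range(max(len(old), len(new))):
        # A missing event on either side is reported as None.
        before = old[index] if index < len(old) else None
        after = new[index] if index < len(new) else None
        if before != after:
            differences.append((index, before, after))
    return differences
```

An empty list means the two fixtures agree once volatile fields are ignored; any remaining entries are the deltas that should be explained in the PR and reflected in the Elixir tests.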