Agent Trace Dry Run Guide
This guide documents how agent.trace works in practice, how to run it in cycles,
and how to interpret control outcomes (ok vs blocked) using realistic scenarios.
1) What the skill does
agent.trace is an incremental control skill for agent execution.
Per cycle it:
- Validates incoming runtime events.
- Analyzes execution state and emits structured trace artifacts.
- Monitors thresholds and returns control signals.
Steps use explicit config.depends_on declarations and execute sequentially
through the DAG scheduler (see docs/SCHEDULER.md).
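As a rough illustration of how depends_on declarations resolve into a sequential order, the sketch below uses Python's standard graphlib module. The step ids (validate, analyze, monitor) and the shape of each step record are assumptions made for this example; the real schema is defined by the scheduler (see docs/SCHEDULER.md).

```python
from graphlib import TopologicalSorter
from typing import Dict, List

# Hypothetical step records: ids and layout are illustrative only.
STEPS: List[Dict] = [
    {"id": "validate", "config": {"depends_on": []}},
    {"id": "analyze", "config": {"depends_on": ["validate"]}},
    {"id": "monitor", "config": {"depends_on": ["analyze"]}},
]


def execution_order(steps: List[Dict]) -> List[str]:
    """Return step ids in an order that satisfies every depends_on edge."""
    graph = {
        step["id"]: set(step["config"].get("depends_on", []))
        for step in steps
    }
    # static_order() raises graphlib.CycleError if the declarations form a cycle.
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(execution_order(STEPS))
```

For a simple chain like this one, the order is fully determined by the declarations, regardless of the order in which the steps are listed.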
Main outputs used by orchestrators:
- updated_trace_state
- trace_session_id
- decision_graph
- assumptions
- alternative_paths
- risk_candidates
- confidence
- control_status
- risk_flags
- alerts
2) Execution model (important)
Cycles are not hardcoded in the skill.
- The skill contract is stable.
- The orchestrator decides how many cycles to run.
- State continuity is external, carried via trace_state + trace_session_id.
Typical pattern:
- Call cycle N with new events.
- Read updated_trace_state.
- Call cycle N+1 using previous state/session.
- Use control_status to continue, replan, or stop.
- End with mode: finalize.
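The pattern above can be sketched as a small orchestrator loop. This is a minimal sketch, not the project's orchestrator: the request layout (events, trace_state, trace_session_id, mode as top-level keys) is an assumption, and fake_trace is a hypothetical stand-in for the real agent.trace call.

```python
from typing import Any, Callable, Dict, List, Optional

TraceCall = Callable[[Dict[str, Any]], Dict[str, Any]]


def run_cycles(call_trace: TraceCall,
               event_batches: List[List[dict]]) -> List[Dict[str, Any]]:
    """Run one cycle per event batch, carrying state and session externally."""
    state: Optional[dict] = None
    session_id: Optional[str] = None
    outputs: List[Dict[str, Any]] = []
    for index, events in enumerate(event_batches):
        is_last = index == len(event_batches) - 1
        request: Dict[str, Any] = {
            "events": events,
            "trace_state": state,
            "trace_session_id": session_id,
        }
        if is_last:
            request["mode"] = "finalize"  # final cycle closes the session
        output = call_trace(request)
        outputs.append(output)
        # Continuity lives outside the skill: feed these into the next cycle.
        state = output["updated_trace_state"]
        session_id = output["trace_session_id"]
        if output["control_status"] == "blocked" and not is_last:
            break  # stop early so the caller can replan
    return outputs


def fake_trace(request: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical stub standing in for agent.trace (always returns ok)."""
    state = request["trace_state"] or {"cycles": 0}
    return {
        "updated_trace_state": {"cycles": state["cycles"] + 1},
        "trace_session_id": request["trace_session_id"] or "session-1",
        "control_status": "ok",
        "mode": request.get("mode"),  # echoed only so the stub is inspectable
    }


if __name__ == "__main__":
    for out in run_cycles(fake_trace, [[{"type": "a"}], [{"type": "b"}]]):
        print(out)
```

Stopping on the first blocked result is one possible policy; an orchestrator could equally replan in place and resume with the same trace_session_id.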
3) Local instance model used in this project
This project runs trace dry-runs using a local host instance under artifacts:
- Local service config: artifacts/trace-instance/.agent-skills/services.yaml
- Local active bindings: artifacts/trace-instance/.agent-skills/active_bindings.json
- Local analyze binding: artifacts/trace-instance/.agent-skills/bindings/local/ops.trace.analyze/python_ops_trace_analyze_openai_local.yaml
- Local monitor binding: artifacts/trace-instance/.agent-skills/bindings/local/ops.trace.monitor/python_ops_trace_monitor_local.yaml
- Local implementation: artifacts/trace-instance/modules/local_instance/trace_openai_service.py
This keeps experimentation instance-scoped and avoids modifying official runtime services.
4) NPM dry-run scripts
Directory:
artifacts/trace-instance/npm-dry-run/
Scripts:
- npm run dry-run: baseline risk scenario
- npm run dry-run:mitigated: mitigation scenario
- npm run dry-run:real-agent: uses OpenAI to generate iterative recommendations and feeds each step into agent.trace
PowerShell launcher (same terminal where your OpenAI key is already active):
$env:PATH="C:\Program Files\nodejs;" + $env:PATH
& "C:\Program Files\nodejs\npm.cmd" run dry-run --prefix "c:\Users\Usuario\agent-skills\artifacts\trace-instance\npm-dry-run"
& "C:\Program Files\nodejs\npm.cmd" run dry-run:mitigated --prefix "c:\Users\Usuario\agent-skills\artifacts\trace-instance\npm-dry-run"
& "C:\Program Files\nodejs\npm.cmd" run dry-run:real-agent --prefix "c:\Users\Usuario\agent-skills\artifacts\trace-instance\npm-dry-run"
5) Baselines captured
Blocked baseline snapshot:
artifacts/trace-instance/npm-dry-run/baselines/2026-03-15-openai-blocked-v1/
Mitigated baseline snapshot:
artifacts/trace-instance/npm-dry-run/baselines/2026-03-15-openai-mitigated-v1/
Real-agent blocked baseline snapshot:
artifacts/trace-instance/npm-dry-run/baselines/2026-03-15-openai-real-agent-blocked-v1/
Each baseline folder includes:
- cycle1.input.json, cycle1.output.json
- cycle2.input.json, cycle2.output.json
- cycle3.input.json, cycle3.output.json
- baseline_summary.json
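A quick completeness check for a baseline folder might look like the sketch below. The file names follow the list above; the helper names and the default of three cycles are choices made for this example.

```python
from pathlib import Path
from typing import List


def expected_baseline_files(cycles: int = 3) -> List[str]:
    """List the file names a baseline folder should contain."""
    names: List[str] = []
    for n in range(1, cycles + 1):
        names.append(f"cycle{n}.input.json")
        names.append(f"cycle{n}.output.json")
    names.append("baseline_summary.json")
    return names


def missing_baseline_files(folder: str, cycles: int = 3) -> List[str]:
    """Return the expected files that are absent from the given folder."""
    root = Path(folder)
    return [
        name for name in expected_baseline_files(cycles)
        if not (root / name).is_file()
    ]


if __name__ == "__main__":
    folder = ("artifacts/trace-instance/npm-dry-run/baselines/"
              "2026-03-15-openai-blocked-v1")
    print(missing_baseline_files(folder))
```

An empty list means the snapshot has every expected file; the check does not validate the JSON content itself.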
6) How to read outcomes
control_status is a control decision, not just analytics.
- blocked: execution should replan before continuing.
- ok: execution can proceed under configured thresholds.
risk_flags can be non-empty even when control_status is ok.
This means risk is present but currently within the configured threshold policy.
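One way an orchestrator could act on these two fields is sketched below. It assumes risk_flags is a list and that ok and blocked are the only status values, which is all this guide describes; anything else is treated as an error rather than silently continued.

```python
from typing import Any, Dict, Tuple


def next_action(output: Dict[str, Any]) -> str:
    """Map control_status to an orchestrator action."""
    status = output.get("control_status")
    if status == "blocked":
        return "replan"
    if status == "ok":
        return "proceed"
    # Assumption: unknown values are rejected instead of defaulting to proceed.
    raise ValueError(f"unexpected control_status: {status!r}")


def describe(output: Dict[str, Any]) -> Tuple[str, int]:
    """Return the action together with the number of reported risk flags."""
    flags = output.get("risk_flags") or []
    return next_action(output), len(flags)


if __name__ == "__main__":
    # ok with visible risk: proceed, but keep the flags in view.
    print(describe({"control_status": "ok", "risk_flags": ["example_flag"]}))
```

Keeping the flag count next to the action makes the "ok with risk" case visible in logs instead of collapsing it into a plain success.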
7) Observed behavior in this project
Baseline scenario:
- Session continuity preserved across cycles.
- Risk persisted and final status remained blocked.
Mitigated scenario:
- Session continuity preserved across cycles.
- Added mitigation events and validation evidence.
- Final status moved to ok while still exposing non-zero risk flags.
Real-agent scenario (OpenAI-generated steps):
- Session continuity preserved across all 3 cycles.
- analysis_source stayed in openai mode across cycles.
- Final status remained blocked, with persistent risk flags requiring governance replan.
This is expected and desirable for governance: no hidden risk, but controlled progression.
8) Recommended operational policy
For first production usage:
- Treat blocked as mandatory replan.
- Require explicit mitigation evidence events before retry.
- Persist cycle inputs/outputs for audit trails.
- Keep thresholds explicit per scenario instead of relying on implicit defaults.
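Two of these policies, the audit trail and the retry gate, can be sketched as follows. The event field type and the value mitigation_evidence are hypothetical names chosen for illustration; the actual event schema is whatever agent.trace validates. File names mirror the cycleN.input.json / cycleN.output.json convention used in the baseline folders.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def persist_cycle(audit_dir: str, cycle: int,
                  request: Dict[str, Any], output: Dict[str, Any]) -> None:
    """Write one cycle's input and output as JSON files for later audit."""
    root = Path(audit_dir)
    root.mkdir(parents=True, exist_ok=True)
    for suffix, payload in (("input", request), ("output", output)):
        path = root / f"cycle{cycle}.{suffix}.json"
        path.write_text(json.dumps(payload, indent=2), encoding="utf-8")


def may_retry(previous_output: Dict[str, Any],
              new_events: List[Dict[str, Any]]) -> bool:
    """Allow a retry after blocked only if mitigation evidence is supplied."""
    if previous_output.get("control_status") != "blocked":
        return True
    # Hypothetical event shape: {"type": "mitigation_evidence", ...}
    return any(e.get("type") == "mitigation_evidence" for e in new_events)


if __name__ == "__main__":
    blocked = {"control_status": "blocked"}
    print(may_retry(blocked, [{"type": "note"}]))
    print(may_retry(blocked, [{"type": "mitigation_evidence"}]))
```

The gate only checks that evidence events are present; whether the evidence is sufficient remains the skill's decision in the next cycle.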
9) Known constraints
- If OpenAI key is missing in the running process, analysis falls back to heuristic mode.
- Node/npm must be available in PATH (or launched via absolute executable path on Windows).
- On Windows, real-agent subprocesses should force UTF-8 (PYTHONIOENCODING=utf-8, PYTHONUTF8=1) to avoid Unicode print errors in multi-cycle runs.
- The local trace service is instance-scoped under artifacts and intended for controlled experiments before official promotion.
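The UTF-8 and key constraints above can be handled when launching a child process, roughly as sketched here. The environment variable name OPENAI_API_KEY is an assumption (it is the name the official OpenAI SDKs read by default); the local service may read the key differently.

```python
import os
import subprocess
import sys
from typing import Dict, List, Mapping, Optional


def utf8_env(base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Copy an environment and force UTF-8 I/O for Python child processes."""
    env = dict(os.environ if base is None else base)
    env["PYTHONIOENCODING"] = "utf-8"
    env["PYTHONUTF8"] = "1"
    return env


def has_openai_key(env: Optional[Mapping[str, str]] = None) -> bool:
    """Report whether a key is visible to this process.

    Assumption: the key is exposed as OPENAI_API_KEY.
    """
    source = os.environ if env is None else env
    return bool(source.get("OPENAI_API_KEY"))


def run_child(args: List[str]) -> subprocess.CompletedProcess:
    """Run a subprocess with UTF-8 forced and output decoded as UTF-8."""
    return subprocess.run(
        args,
        env=utf8_env(),
        capture_output=True,
        text=True,
        encoding="utf-8",
        check=False,
    )


if __name__ == "__main__":
    if not has_openai_key():
        print("No key visible: expect heuristic fallback.")
    result = run_child([sys.executable, "-c", "print('trace ok')"])
    print(result.stdout.strip())
```

Checking for the key before the run makes the heuristic fallback an explicit, logged condition rather than a surprise in analysis_source.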