CognitiveState v1¶
Date: 2026-03-19
This document describes the CognitiveState v1 extension to the runtime execution model.
CognitiveState v1 adds structured cognitive blocks to ExecutionState without
breaking any existing behavior. All legacy fields (vars, outputs, events,
step_results, status, trace_id) remain unchanged and fully functional.
Motivation¶
The original ExecutionState supports flat key-value vars and outputs that
work well for simple linear skills. Multi-step reasoning skills need:
- an immutable reasoning context (why does this run exist?)
- structured working memory for intermediate cognitive artifacts
- typed result metadata beyond a flat output dict
- data lineage and execution trace for observability
- merge strategies when multiple steps write to the same cognitive slot
CognitiveState v1 addresses all five without duplicating or replacing legacy fields.
Architecture¶
ExecutionState now contains four cognitive blocks plus an extension point:
ExecutionState
├── inputs (dict) ── caller-provided, read-only
├── vars (dict) ── legacy intermediate values
├── outputs (dict) ── legacy final outputs
├── events (list) ── runtime event log
├── step_results (dict) ── per-step execution records
│
├── frame (FrameState) ── immutable reasoning context
├── working (WorkingState) ── mutable cognitive working memory
├── output (OutputState) ── structured result metadata
├── trace (TraceState) ── execution trace + metrics
└── extensions (dict) ── open namespace for plugins
Each block serves a distinct audience:
| Block | Audience | Mutability | Purpose |
|---|---|---|---|
| frame | steps (read) | frozen | Why does this run exist? Boundaries. |
| working | steps (read/write) | mutable | Intermediate cognitive processing. |
| output | steps (write), caller (read) | mutable | Semantic result metadata. |
| trace | engine (write), ops (read) | mutable | Observability, data lineage. |
| extensions | plugins (read/write) | mutable | Open namespace for future features. |
Data Structures¶
FrameState (frozen)¶
Immutable context established when the run is created. Read-only during execution.
@dataclass(frozen=True)
class FrameState:
goal: str | None = None
context: dict[str, Any] = field(default_factory=dict)
constraints: dict[str, Any] = field(default_factory=dict)
success_criteria: dict[str, Any] = field(default_factory=dict)
assumptions: tuple[str, ...] = ()
priority: str | None = None
Usage in step input mappings:
input:
objective: frame.goal
max_budget: frame.constraints.budget
WorkingState (mutable)¶
Working memory for multi-step cognitive processing. Dies with the run — NOT persistent memory. Contains 10 typed cognitive slots:
@dataclass
class WorkingState:
artifacts: dict[str, Any] # named intermediate products
entities: list[dict[str, Any]] # identified entities
options: list[dict[str, Any]] # candidate options
criteria: list[dict[str, Any]] # evaluation criteria
evidence: list[dict[str, Any]] # collected evidence
risks: list[dict[str, Any]] # identified risks
hypotheses: list[dict[str, Any]] # active hypotheses
uncertainties: list[dict[str, Any]] # known unknowns
intermediate_decisions: list[dict[str, Any]] # reasoning checkpoints
messages: list[dict[str, Any]] # conversation accumulator
Usage in step input/output mappings:
# Write to working memory
output:
entities: working.entities
draft: working.artifacts.first_draft
# Read from working memory
input:
prior_risks: working.risks
draft_text: working.artifacts.first_draft
OutputState (mutable)¶
Structured result metadata. Distinct from outputs (the legacy flat dict).
@dataclass
class OutputState:
result: Any = None
result_type: str | None = None
summary: str | None = None
status_reason: str | None = None
Usage in step output mappings:
output:
final_summary: output.summary
quality_label: output.result_type
TraceStep (frozen)¶
One step's trace entry with data lineage. Generated automatically by the engine.
@dataclass(frozen=True)
class TraceStep:
step_id: str
capability_id: str
status: str
started_at: datetime | None = None
ended_at: datetime | None = None
reads: tuple[str, ...] = () # references read during input mapping
writes: tuple[str, ...] = () # targets written during output mapping
latency_ms: int | None = None
TraceMetrics¶
Live aggregate execution metrics updated as steps execute.
@dataclass
class TraceMetrics:
step_count: int = 0
llm_calls: int = 0
tool_calls: int = 0
tokens_in: int = 0
tokens_out: int = 0
elapsed_ms: int = 0
TraceState¶
Container for execution trace, combining per-step entries with aggregated metrics.
@dataclass
class TraceState:
steps: list[TraceStep] = field(default_factory=list)
metrics: TraceMetrics = field(default_factory=TraceMetrics)
Namespaces¶
Reference Resolution (reading)¶
The ReferenceResolver supports 7 namespaces with access control:
| Namespace | Source | On missing | Path traversal |
|---|---|---|---|
inputs |
state.inputs[key] | None | dict only |
vars |
state.vars[key] | error | dict only |
outputs |
state.outputs[key] | error | dict only |
frame |
state.frame.attr | None | dataclass → dict → list |
working |
state.working.attr | error | dataclass → dict → list |
output |
state.output.attr | None | dataclass → dict → list |
extensions |
state.extensions[key] | None | dict → list |
Path traversal resolves nested values by walking through dataclass attributes, dict keys, and list/tuple indices:
# Dataclass attribute access
frame.goal → state.frame.goal
# Nested dict key access
frame.constraints.budget → state.frame.constraints["budget"]
# List index access
working.entities.0 → state.working.entities[0]
# Deep nested path
working.entities.0.name → state.working.entities[0]["name"]
Anything not matching a known namespace is treated as a literal string value.
Output Mapping (writing)¶
The OutputMapper supports 5 writable namespaces:
| Namespace | Target | Notes |
|---|---|---|
vars |
state.vars | legacy flat dict |
outputs |
state.outputs | legacy flat dict |
working |
state.working | auto-creates intermediate dicts |
output |
state.output | auto-creates intermediate dicts |
extensions |
state.extensions | auto-creates intermediate dicts |
Three namespaces are read-only and reject writes:
| Namespace | Reason |
|---|---|
inputs |
Caller-provided, immutable |
frame |
Frozen dataclass, immutable |
trace |
Engine-managed, not step-writable |
Merge Strategies¶
When multiple steps write to the same target, the merge strategy (declared in
step.config.merge_strategy) controls collision behavior:
| Strategy | Behavior |
|---|---|
overwrite |
Default. Raises error on duplicate target within same step mapping. |
append |
For list targets, extends via list concatenation. |
deep_merge |
For dict targets, recursively merges keys (overlay wins for non-dict values). |
replace |
Unconditionally overwrites, no duplicate error. |
Example in a skill step:
- id: collect_risks
uses: analysis.risk.extract
config:
merge_strategy: append
input:
text: vars.document_text
output:
risks: working.risks
Execution Metadata¶
CognitiveState v1 adds metadata fields to ExecutionState:
| Field | Type | Purpose |
|---|---|---|
state_version |
str | Always "1.0.0" for CognitiveState v1 |
skill_version |
str | None | Skill version from spec metadata |
iteration |
int | Run iteration counter (for future retry/loop support) |
current_step |
str | None | ID of the step currently executing |
parent_run_id |
str | None | Parent run ID for nested skill composition |
updated_at |
datetime | None | Timestamp of last state mutation |
Trace Enrichment¶
The ExecutionEngine automatically enriches trace data during execution:
- Per-step TraceStep: generated for every step (success or failure), recording:
- step_id, capability_id, status
- started_at, ended_at timestamps
- reads: references resolved during input mapping
- writes: targets written during output mapping
-
latency_ms: step wall-clock time
-
Live TraceMetrics: updated after each step completes:
- step_count incremented
- elapsed_ms accumulated
-
llm_calls, tokens_in, tokens_out extracted from step meta when available
-
StepResult enrichment: each
StepResultalso carriesreads,writes, andlatency_msfor per-step inspection.
Data Lineage¶
The trace records which references each step reads and which targets it writes. This enables:
- dependency analysis between steps
- impact analysis when a step fails
- audit of which data flows through which capabilities
Example trace output:
state.trace.steps[0].reads # ("inputs.text", "frame.goal")
state.trace.steps[0].writes # ("vars.summary", "working.artifacts.draft")
Backward Compatibility¶
CognitiveState v1 is fully backward-compatible:
- All new fields have defaults. Existing code that creates
ExecutionStatewithout the new fields continues to work. - Legacy
vars/outputs/eventsremain the primary dataflow namespaces. - The
inputs→vars→outputspipeline is unchanged. - Reference syntax
inputs.*,vars.*,outputs.*works exactly as before. - Skills that do not use cognitive namespaces are unaffected.
Migration Path¶
Skills can adopt CognitiveState v1 incrementally:
- No changes needed: existing skills continue working.
- Add frame: pass
FrameStateat run creation for reasoning context. - Use working memory: write to
working.*targets for intermediate cognitive state. - Use output metadata: write to
output.*for structured result description. - Leverage trace: inspect
state.tracefor execution analysis after completion.
Test Coverage¶
CognitiveState v1 is validated by two dedicated test suites:
runtime/test_cognitive_state_regression.py: 86 tests locking legacy pipeline behavior.runtime/test_cognitive_state_v1.py: 99 tests validating all new functionality.
Run both via:
python -m runtime.test_cognitive_state_regression
python -m runtime.test_cognitive_state_v1
Module Changes Summary¶
| Module | Change |
|---|---|
runtime/models.py |
Added FrameState, WorkingState, OutputState, TraceStep, TraceMetrics, TraceState. Extended ExecutionState and StepResult. |
runtime/execution_state.py |
create_execution_state accepts optional frame, skill_version, parent_run_id. |
runtime/reference_resolver.py |
Rewritten with path traversal, 7 namespaces, ACL (permissive vs strict). |
runtime/output_mapper.py |
Rewritten with 5 writable namespaces, 4 merge strategies, nested writing. |
runtime/execution_planner.py |
Validation extended for new writable/read-only namespaces. |
runtime/execution_engine.py |
TraceStep generation, TraceMetrics live update, reads/writes tracking. |