Consumer-Facing Neutral API

Date: 2026-03-11
Status: Implemented v1 gateway baseline
Scope: Single domain contract exposed through HTTP/OpenAPI and MCP bridge adapters

Purpose

This layer exposes runtime execution to external users without coupling clients to internal binding protocols.

The same domain operations are now available through:

HTTP/OpenAPI adapter
MCP tool bridge adapter

Both adapters call the same runtime stack, with gateway-mediated operations for skill discovery/list/attach and neutral API operations for execution.

Architecture

Core modules:

runtime/engine_factory.py
Shared runtime construction used by CLI and customer-facing adapters.
customer_facing/neutral_api.py
Protocol-neutral domain facade.
Operations: health, describe_skill, execute_skill, execute_capability, async runs, run status/cancel, checkpoints, resume.
gateway/core.py
Agent-facing gateway layer.
Operations: list_skills, discover, attach, diagnostics, reset_diagnostics_metrics.
Includes attach target validation and diagnostics persistence metadata.
customer_facing/http_openapi_server.py
HTTP adapter with v1 routes and OpenAPI spec endpoint.
Reuses runtime/openapi_error_contract.py for deterministic error mapping.
Optional API-key authentication (x-api-key) and in-memory per-client rate limiting.
customer_facing/mcp_tool_bridge.py
MCP-oriented tool adapter over the same neutral operations.
Includes stdio loop for lightweight bridge hosting.

v1 HTTP Routes

Base version: /v1

GET /v1/health
GET /v1/skills/{skill_id}/describe
GET /v1/skills/list
GET /v1/skills/governance
GET /v1/skills/diagnostics
POST /v1/skills/discover
POST /v1/skills/{skill_id}/attach
POST /v1/skills/{skill_id}/execute
POST /v1/capabilities/{capability_id}/execute
POST /v1/capabilities/{capability_id}/explain
POST /v1/skills/{skill_id}/execute/async
POST /v1/run_async (alias) and POST /run_async (alias)
GET /v1/runs
GET /v1/runs/{run_id}
POST /v1/runs/{run_id}/cancel
GET /v1/runs/{run_id}/checkpoints
POST /v1/runs/{run_id}/resume
POST /v1/runs/{run_id}/approve
POST /v1/runs/{run_id}/deny
POST /v1/runs/{run_id}/replay
POST /v1/runs/{run_id}/fork
GET /run_status/{run_id} and GET /v1/run_status/{run_id} (legacy aliases)
POST /run_cancel/{run_id} and POST /v1/run_cancel/{run_id} (legacy aliases)
GET /openapi.json
GET /v1/metrics
GET /v1/metrics/prometheus

Async run status model:

Canonical (/v1/runs/*): pending, running, waiting_for_human, waiting_for_signal, replaying, completed, failed, canceled.
Legacy aliases project canceled to failed for compatibility with older clients.
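
The projection applied by the legacy aliases can be sketched as a small mapping. This is a minimal illustration, not the runtime's actual code: `RunStatus` and `project_legacy_status` are hypothetical names, and it assumes every status other than canceled passes through unchanged.

```python
from enum import Enum


class RunStatus(str, Enum):
    """Canonical async run statuses listed for /v1/runs/*."""

    PENDING = "pending"
    RUNNING = "running"
    WAITING_FOR_HUMAN = "waiting_for_human"
    WAITING_FOR_SIGNAL = "waiting_for_signal"
    REPLAYING = "replaying"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELED = "canceled"


def project_legacy_status(status: str) -> str:
    """Return the status a legacy alias would report (hypothetical helper)."""
    canonical = RunStatus(status)  # raises ValueError for unknown statuses
    if canonical is RunStatus.CANCELED:
        # Older clients do not know "canceled", so it is reported as "failed".
        return RunStatus.FAILED.value
    # Assumption: all other canonical statuses are passed through unchanged.
    return canonical.value
```

Clients that need to tell a cancellation apart from a failure should read the canonical /v1/runs/{run_id} route rather than a legacy alias.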

Checkpoint and resume contract:

GET /v1/runs/{run_id}/checkpoints returns checkpoint list + checkpoint_head.
POST /v1/runs/{run_id}/resume accepts optional checkpoint_id.
Current slice behavior: resume mode is checkpoint_resume; the runtime restores the checkpointed state and continues execution with the remaining steps.
POST /v1/runs/{run_id}/approve accepts optional approver and notes, then resumes a run that is waiting for human approval.
POST /v1/runs/{run_id}/deny accepts optional approver and notes, then moves the run to the canonical canceled status.
POST /v1/runs/{run_id}/replay accepts optional checkpoint_id, creates a new replay run, and continues from the restored checkpoint state.
POST /v1/runs/{run_id}/fork accepts optional checkpoint_id, creates a new pending run, and preserves source linkage without re-executing.
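
A client-side sketch of the run actions above, using only the standard library. The route names and the optional fields (checkpoint_id, approver, notes) come from the contract; the helper name `build_run_action` and the choice of a JSON request body are assumptions made for illustration.

```python
import json
from typing import Dict, Optional
from urllib.parse import quote
from urllib.request import Request

# Optional body fields per action, as listed in the contract above.
_ACTION_FIELDS = {
    "resume": ("checkpoint_id",),
    "replay": ("checkpoint_id",),
    "fork": ("checkpoint_id",),
    "approve": ("approver", "notes"),
    "deny": ("approver", "notes"),
}


def build_run_action(base_url: str, run_id: str, action: str,
                     api_key: Optional[str] = None, **fields: str) -> Request:
    """Build (but do not send) a POST request for a run action."""
    if action not in _ACTION_FIELDS:
        raise ValueError(f"unsupported run action: {action}")
    unknown = set(fields) - set(_ACTION_FIELDS[action])
    if unknown:
        raise ValueError(f"{action} does not accept: {sorted(unknown)}")

    headers: Dict[str, str] = {"Content-Type": "application/json"}
    if api_key is not None:
        headers["x-api-key"] = api_key

    root = base_url.rstrip("/")
    safe_run_id = quote(run_id, safe="")
    url = f"{root}/v1/runs/{safe_run_id}/{action}"
    # Assumption: the optional fields travel in a JSON body.
    body = json.dumps(fields).encode("utf-8")
    return Request(url, data=body, headers=headers, method="POST")
```

Passing the returned object to `urllib.request.urlopen` would send it; keeping construction separate makes the request easy to inspect or test without a running server.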

Async launch idempotency:

POST /v1/run_async and POST /run_async accept optional idempotency_key (or x-idempotency-key header).
Repeating the same async launch with the same skill_id + idempotency_key returns the existing run instead of creating a duplicate.
Reusing the same idempotency_key with a different async request payload returns 409 idempotency_conflict.
Server-side TTL for idempotency keys is configured with AGENT_SKILLS_IDEMPOTENCY_TTL_SECONDS (default 86400). After expiry, the same key can create a new run.
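
The idempotency rules above can be modeled with a small in-memory store. This is a sketch under stated assumptions, not the server's implementation: `IdempotencyStore`, `IdempotencyConflict`, and `ttl_from_env` are hypothetical names, payloads are compared by a SHA-256 digest of their sorted JSON form, and run ids are simple counters. Only the environment variable name, its default, and the created/reused/conflict/expired behavior come from the text.

```python
import hashlib
import json
import os
import time
from typing import Any, Callable, Dict, Tuple


class IdempotencyConflict(Exception):
    """Stands in for the 409 idempotency_conflict response."""


def ttl_from_env() -> float:
    """Read the TTL from the documented variable, defaulting to 86400 seconds."""
    return float(os.environ.get("AGENT_SKILLS_IDEMPOTENCY_TTL_SECONDS", "86400"))


class IdempotencyStore:
    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # (skill_id, idempotency_key) -> (run_id, payload digest, expiry time)
        self._entries: Dict[Tuple[str, str], Tuple[str, str, float]] = {}
        self._counter = 0

    def launch(self, skill_id: str, key: str,
               payload: Dict[str, Any]) -> Tuple[str, bool]:
        """Return (run_id, created); created is False when a run is reused."""
        now = self._clock()
        encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
        digest = hashlib.sha256(encoded).hexdigest()
        slot = (skill_id, key)

        entry = self._entries.get(slot)
        if entry is not None and entry[2] <= now:
            entry = None  # expired: the key may create a new run

        if entry is not None:
            run_id, stored_digest, _ = entry
            if stored_digest != digest:
                raise IdempotencyConflict("idempotency_conflict")
            return run_id, False

        self._counter += 1
        run_id = f"run-{self._counter}"
        self._entries[slot] = (run_id, digest, now + self._ttl)
        return run_id, True
```

The injectable clock is a design choice for testability: expiry can be exercised without sleeping.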

Idempotency observability:

Runtime counters are available via GET /v1/metrics:
runtime.idempotency.created
runtime.idempotency.reused
runtime.idempotency.conflict
runtime.idempotency.expired
Prometheus exposition is available via GET /v1/metrics/prometheus with normalized names:
agent_skills_runtime_idempotency_created_total
agent_skills_runtime_idempotency_reused_total
agent_skills_runtime_idempotency_conflict_total
agent_skills_runtime_idempotency_expired_total
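
The name normalization implied by the two lists above (dots to underscores, an agent_skills prefix, a _total suffix) can be sketched as follows. The function names are hypothetical, and the rendering helper emits a minimal Prometheus text exposition for counters only; the real endpoint may include more metadata.

```python
from typing import Dict


def to_prometheus_name(counter: str, prefix: str = "agent_skills") -> str:
    """Map a dotted runtime counter name to its Prometheus counter name."""
    flattened = counter.replace(".", "_")
    return f"{prefix}_{flattened}_total"


def render_counters(counters: Dict[str, int]) -> str:
    """Render counters in Prometheus text exposition format."""
    lines = []
    for name in sorted(counters):
        metric = to_prometheus_name(name)
        lines.append(f"# TYPE {metric} counter")
        lines.append(f"{metric} {counters[name]}")
    return "\n".join(lines) + "\n"
```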

Security model (configurable):

GET /v1/health and GET /openapi.json can remain unauthenticated.
All execution/describe routes can require x-api-key.
Protected routes can enforce request rate limits with 429 responses.
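
The three rules above can be illustrated with a small request gate. This is a sketch, not the adapter's code: `SlidingWindowLimiter` and `authorize` are hypothetical names, the 401 status for a bad key is an assumption (the text only specifies 429), header names are assumed to be lowercased already, and clients are identified by their API key for rate limiting.

```python
import hmac
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Mapping, Optional

PUBLIC_ROUTES = {"/v1/health", "/openapi.json"}


class SlidingWindowLimiter:
    """In-memory per-client limiter: at most max_requests per window."""

    def __init__(self, max_requests: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        now = self._clock()
        hits = self._hits[client_id]
        # Drop timestamps that have left the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._max:
            return False
        hits.append(now)
        return True


def authorize(path: str, headers: Mapping[str, str], api_key: Optional[str],
              limiter: Optional[SlidingWindowLimiter]) -> int:
    """Return the HTTP status the gate would produce for this request."""
    if path in PUBLIC_ROUTES:
        return 200
    supplied = headers.get("x-api-key", "")
    if api_key is not None and not hmac.compare_digest(
            supplied.encode("utf-8"), api_key.encode("utf-8")):
        return 401  # assumption: the source does not name this status
    if limiter is not None and not limiter.allow(supplied or "anonymous"):
        return 429
    return 200
```

Because the limiter state lives in process memory, each server instance would count requests independently.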

OpenAPI spec file:

docs/specs/consumer_facing_v1_openapi.json

MCP Tool Surface

Exposed tools:

runtime.health
skill.describe
skill.list
skill.discover
skill.diagnostics
skill.metrics.reset
skill.attach
skill.execute
capability.execute
capability.explain
skill.governance.list

Bridge entrypoint:

tooling/run_customer_mcp_bridge.py

Error Contract

Both adapters map runtime exceptions through:

runtime/openapi_error_contract.py

Error payload shape:

{
  "error": {
    "code": "string",
    "message": "string",
    "type": "string"
  },
  "trace_id": "string"
}
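
For client and test code, the payload shape above can be built and checked with two small helpers. Both names are hypothetical and are not taken from runtime/openapi_error_contract.py; generating a trace id when none is supplied is also an assumption.

```python
import uuid
from typing import Any, Dict, Optional


def build_error_payload(code: str, message: str, error_type: str,
                        trace_id: Optional[str] = None) -> Dict[str, Any]:
    """Build a payload with the documented error shape."""
    return {
        "error": {"code": code, "message": message, "type": error_type},
        # Assumption: a fresh id is generated when the caller has none.
        "trace_id": trace_id or uuid.uuid4().hex,
    }


def is_error_payload(payload: Any) -> bool:
    """Check that a decoded response carries the documented fields as strings."""
    if not isinstance(payload, dict):
        return False
    error = payload.get("error")
    if not isinstance(error, dict) or not isinstance(payload.get("trace_id"), str):
        return False
    return all(isinstance(error.get(k), str) for k in ("code", "message", "type"))
```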

Runbook

Run HTTP/OpenAPI server

python tooling/run_customer_http_api.py --host 127.0.0.1 --port 8080

Run with API key + rate limit:

python tooling/run_customer_http_api.py --host 127.0.0.1 --port 8080 --api-key local-dev-key --rate-limit-requests 20 --rate-limit-window-seconds 60

Run MCP bridge (stdio)

python tooling/run_customer_mcp_bridge.py

Example stdio request line:

{"id":"1","method":"tools/call","params":{"name":"skill.execute","arguments":{"skill_id":"agent.plan-from-objective","inputs":{"objective":"Build a plan"}}}}

Example attach request line (generic sidecar attach):

{"id":"2","method":"tools/call","params":{"name":"skill.attach","arguments":{"skill_id":"agent.trace","target_type":"output","target_ref":"<existing-trace-or-output-ref>","include_trace":true,"inputs":{"goal":"Trace current orchestration","events":[],"trace_state":{},"trace_session_id":"session-1"}}}}
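
Request lines like the two above can be generated rather than hand-written. The sketch below assumes the bridge reads one compact JSON object per line, as the examples suggest; `tool_call_line` is a hypothetical helper name.

```python
import json
from typing import Any, Dict


def tool_call_line(request_id: str, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialize one tools/call request as a single compact JSON line."""
    request = {
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # Compact separators keep the request on one line with no extra spaces.
    return json.dumps(request, separators=(",", ":"))


if __name__ == "__main__":
    print(tool_call_line(
        "1",
        "skill.execute",
        {"skill_id": "agent.plan-from-objective",
         "inputs": {"objective": "Build a plan"}},
    ))
```

The printed line can be piped into the bridge process started with tooling/run_customer_mcp_bridge.py.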

Verify both adapters

python tooling/verify_customer_facing_neutral.py

Verify HTTP controls (auth + throttling)

python tooling/verify_customer_http_controls.py

Verify HTTP/MCP parity snapshot

python tooling/verify_customer_facing_parity_snapshot.py

Inspect idempotency telemetry

JSON metrics snapshot:

curl -s http://127.0.0.1:8080/v1/metrics

Prometheus counters only:

curl -s http://127.0.0.1:8080/v1/metrics/prometheus | grep idempotency

Operational interpretation:

High created with low reused is expected for unique client requests.
Rising reused confirms retries are being deduplicated successfully.
Rising conflict indicates clients are reusing keys with changed payloads and should be investigated.
Rising expired indicates key cleanup is active; if too high, review retry windows and AGENT_SKILLS_IDEMPOTENCY_TTL_SECONDS.

Design Invariants

Consumer-facing contract is protocol-neutral and stable.
Internal provider protocols (pythoncall/openapi/mcp/openrpc) remain behind binding resolution.
Adding or changing internal protocols must not require external API contract changes.
Trace propagation uses x-trace-id header or trace_id body field.
Skill ranking is heuristic; product agents should apply selection policy and not rely exclusively on top-1 score.
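
The trace propagation invariant can be sketched as a resolver that looks at both carriers. The precedence (header before body) and the fallback to a generated id are assumptions made for this example; the text above only states that either carrier may be used. `resolve_trace_id` is a hypothetical name.

```python
import uuid
from typing import Any, Mapping, Optional


def resolve_trace_id(headers: Mapping[str, str],
                     body: Optional[Mapping[str, Any]]) -> str:
    """Pick a trace id from the x-trace-id header or the trace_id body field."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    for name, value in headers.items():
        if name.lower() == "x-trace-id" and value:
            return value  # assumption: the header wins when both are present
    candidate = (body or {}).get("trace_id")
    if isinstance(candidate, str) and candidate:
        return candidate
    # Assumption: requests without a trace id get a freshly generated one.
    return uuid.uuid4().hex
```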

Product Agent Orchestration Pattern

Recommended policy for product-facing agents:

Discover candidates for the primary user intent.
Execute primary skill.
Attach optional sidecar skills (monitoring/control/reporting) when requested by user policy.
Return both business output and execution trace/control summary.

This pattern treats sidecar skills as a normal classified skill category (role=sidecar, invocation=attach|both) and avoids hard-coded behavior for any single skill id.
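
The four steps of the pattern can be sketched against a small client interface. Everything named here is hypothetical: the `SkillClient` protocol, the candidate fields skill_id and score, and the output_ref field used as the attach target are illustrative stand-ins, not the adapters' actual response shapes. The selection policy shown (an allow-list applied before ranking) is one example of not relying on the top-1 score alone.

```python
from typing import Any, Dict, List, Protocol, Sequence, Set


class SkillClient(Protocol):
    """Hypothetical client; the HTTP or MCP adapter could sit behind it."""

    def discover(self, intent: str) -> List[Dict[str, Any]]: ...

    def execute(self, skill_id: str, inputs: Dict[str, Any]) -> Dict[str, Any]: ...

    def attach(self, skill_id: str, target_ref: str) -> Dict[str, Any]: ...


def pick_primary(candidates: Sequence[Dict[str, Any]],
                 allowed: Set[str]) -> Dict[str, Any]:
    """Apply a selection policy: filter by allow-list, then take the best score."""
    eligible = [c for c in candidates if c["skill_id"] in allowed]
    if not eligible:
        raise LookupError("no allowed skill matches the intent")
    return max(eligible, key=lambda c: c["score"])


def orchestrate(client: SkillClient, intent: str, inputs: Dict[str, Any],
                allowed: Set[str], sidecar_ids: Sequence[str]) -> Dict[str, Any]:
    # 1. Discover candidates for the primary intent.
    primary = pick_primary(client.discover(intent), allowed)
    # 2. Execute the primary skill.
    result = client.execute(primary["skill_id"], inputs)
    # 3. Attach the sidecars requested by user policy to the primary output.
    sidecars = [client.attach(sid, result["output_ref"]) for sid in sidecar_ids]
    # 4. Return business output together with the sidecar summaries.
    return {"skill_id": primary["skill_id"], "output": result, "sidecars": sidecars}
```

Sidecar ids are passed in as data, which keeps the orchestration free of special cases for any single skill id.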

Next Steps

Replace in-memory rate limiting with distributed/shared limiter for multi-instance deployment.
Replace static API key with pluggable authn/authz provider.
Replace stdio MCP bridge with full MCP server transport integration.
Extend parity snapshots across more capabilities and representative error cases.