Refs & The Monad Stack

A ref is the fundamental unit of data exchange in swarm-mcp. Instead of passing raw text between combinators, every agent execution produces a ref — a lightweight pointer to results stored on disk, enriched with metadata that travels alongside the data.

Why Refs Instead of Raw Text?

Lazy evaluation. Agent output — which can be thousands of tokens — stays on disk until something explicitly needs it. Combinators like chain, reduce, and filter can inspect metadata, route, validate, and transform without deserializing large payloads through the MCP protocol.

Composability. A ref returned by run can be handed directly to guard, filter, reduce, or used as context in a chain stage. The downstream combinator decides when (and whether) to unwrap. This keeps pipeline definitions concise and avoids redundant data copies.

Safety. Every ref carries the metadata needed to reason about the result: who produced it, what it cost, whether it passed type validation, what security classification it carries, and whether it was encrypted. That metadata is never separated from the result because it is part of the ref.

Ref Anatomy

A ref is a flat Python dict (serialized to JSON at the MCP boundary). The core fields produced by every run call are:

{
  "agent_id": "agent-3f7a",
  "ref": "run-9b2c/agent-3f7a",
  "exit_code": 0,
  "duration_seconds": 14.3,
  "cost_usd": 0.0041,
  "model": "claude-opus-4-5",
  "output_dir": "/tmp/swarm-mcp/run-9b2c/agent-3f7a",
  "error": null
}

Field	Type	Description
`agent_id`	string	Unique identifier for this agent execution
`ref`	string	Canonical address: `run_id/agent_id` — used to resolve content from disk
`exit_code`	int	0 = success, non-zero = failure
`duration_seconds`	float	Wall-clock time for the agent run
`cost_usd`	float	Token cost for this execution
`model`	string	Claude model variant used
`output_dir`	string	Absolute path to the agent's output directory
`error`	string or null	Error message if `exit_code != 0`

The ref field (run_id/agent_id) is the stable address used by _resolve_ref() to load the agent's text output from disk without carrying that text in the ref dict itself.

What Lives on Disk

Each agent execution creates a directory at /tmp/swarm-mcp/{run_id}/{agent_id}/ containing five files:

`result.json`

The full serialized result from the Claude SDK, including the complete message history, tool calls, and metadata. This is the authoritative source of truth. _resolve_ref() reads this file when a combinator needs the agent's text output.

`stream.jsonl`

A newline-delimited JSON log of every event emitted by the Claude streaming API during the run. Useful for debugging token-by-token what the agent did and when.

`artifacts.jsonl`

Logs written by the PostToolUse hook. Every tool call the agent made (bash, computer, file read/write, MCP calls) is recorded here with its inputs and outputs. Use this to audit exactly what side effects the agent produced.

`output.md`

The agent's final text response, unwrapped from the SDK result structure and written as plain Markdown. Human-readable; useful for quick inspection without parsing JSON.

`prompt.txt`

The original prompt sent to the agent. Stored so that you can reconstruct exactly what the agent was asked, independent of the calling context.

Inspecting a run

After a run or par call, you can browse /tmp/swarm-mcp/{run_id}/ to inspect every agent's artifacts, stream, and output without any additional tool calls.

The Monad Stack

After a raw ref is produced by an agent execution, enrich_ref() applies a stack of monad-style stamps — each one adds a new layer of metadata without mutating the fields already present. You can think of each stamp as wrapping the ref in a new context that downstream combinators can inspect and enforce.

The stamps are applied in this order:

1. Provenance

stamp_provenance(ref, parent_refs, content_hash, timestamp)

Stamps where this result came from and what it contains.

Field added	Description
`parent_refs`	List of ref addresses that fed into this result (e.g., in a `chain` stage)
`content_hash`	SHA-256 of the output text — enables deduplication and integrity checks
`timestamp`	ISO-8601 wall-clock time the result was produced

2. Cost

stamp_cost(ref, step_cost, budget_tracker)

Stamps financial accounting for the run and its context.

Field added	Description
`step_cost`	Cost of this specific agent execution in USD
`spent_so_far`	Cumulative cost across the current run or pipeline
`remaining`	Budget remaining (`limit - spent_so_far`)
`limit`	The `max_budget` ceiling set by the caller

3. Time

stamp_deadline(ref, deadline)

Stamps time-awareness so downstream combinators can abort work that has already missed its window.

Field added	Description
`deadline`	Unix timestamp after which this result (or dependent work) is stale
`time_remaining`	Seconds between stamp time and the deadline

4. Validated

stamp_validated(ref, validated_as, verdict, validation_ref)

Stamps the result of type-gate validation (used by filter and retry).

Field added	Description
`validated_as`	The declared output type that was checked (e.g., `"json"`, `"python"`)
`validation_verdict`	`"VALID"` or `"INVALID"`
`validation_ref`	Ref address of the validator agent's own result, for auditability

5. Retried

stamp_retry(ref, attempt, max_retries, prior_errors)

Stamps retry state so the agent receiving a re-run knows its history.

Field added	Description
`attempt`	Current attempt number (1-indexed)
`max_retries`	Maximum attempts allowed
`prior_errors`	List of error messages from previous failed attempts

6. Classified

stamp_classification(ref, level, allowed_mcps, denied_mcps)

Stamps security classification so guard can enforce access control policies.

Field added	Description
`level`	Human-readable classification label (e.g., `"confidential"`)
`level_numeric`	Numeric equivalent for comparison operators
`allowed_mcps`	List of MCP server names this result may flow to
`denied_mcps`	List of MCP server names this result must not reach

7. Encrypted

stamp_encrypted(ref, key_id, algorithm)

Stamps encryption provenance so consumers know whether they need to decrypt before use.

Field added	Description
`key_id`	Identifier of the key used to encrypt the payload
`algorithm`	Encryption algorithm (e.g., `"AES-256-GCM"`)

A Fully Enriched Ref

After the full monad stack has been applied, a ref might look like this:

{
  "agent_id": "agent-3f7a",
  "ref": "run-9b2c/agent-3f7a",
  "exit_code": 0,
  "duration_seconds": 14.3,
  "cost_usd": 0.0041,
  "model": "claude-opus-4-5",
  "output_dir": "/tmp/swarm-mcp/run-9b2c/agent-3f7a",
  "error": null,

  "parent_refs": ["run-8a1b/agent-0c9d"],
  "content_hash": "e3b0c44298fc1c149afb4c8996fb924...",
  "timestamp": "2026-03-17T14:22:05Z",

  "step_cost": 0.0041,
  "spent_so_far": 0.0312,
  "remaining": 0.9688,
  "limit": 1.00,

  "deadline": 1742221200,
  "time_remaining": 3595.2,

  "validated_as": "json",
  "validation_verdict": "VALID",
  "validation_ref": "run-9b2c/validator-agent-7e2f",

  "attempt": 2,
  "max_retries": 3,
  "prior_errors": ["JSONDecodeError: Expecting value at line 1"],

  "level": "internal",
  "level_numeric": 2,
  "allowed_mcps": ["filesystem", "github"],
  "denied_mcps": ["web-search"],

  "key_id": "key-prod-2026-03",
  "algorithm": "AES-256-GCM"
}

Not every ref will have all layers — stamps are applied selectively based on what the calling combinator configured.

Unwrap vs Pass

Most of the time you should pass refs between combinators and let the pipeline stay lazy:

// chain: pass the ref from stage 1 as context into stage 2
// stage 2 receives the text automatically — you never unwrap manually
{
  "stages": [
    { "prompt": "Analyze this dataset and identify anomalies." },
    { "prompt": "Write a remediation plan for the anomalies above." }
  ]
}

Unwrap (call _resolve_ref() / read output.md) when:

You need to display final output to a user
You are passing text into a non-swarm-mcp system that does not understand refs
You are building a custom combinator that must operate on the actual content

Don't unwrap prematurely

Unwrapping inside a pipeline defeats lazy evaluation. If you find yourself extracting text from a ref only to pass it into another run call, use chain or reduce instead — they handle the unwrap internally and maintain full metadata lineage.

Mixed inputs

_extract_texts() handles mixed input arrays that contain plain strings, dicts with a .text field, and refs with a .ref field. reduce and map_reduce use this automatically, so you can freely mix literal text and refs in the same input list.