
The Missing Hand: Output Verification for Managed Agents

Anthropic's Managed Agents architecture solves reliable execution. But nobody verifies what the agent produced. Here's how to add a verification 'hand' to any agent pipeline.

Anthropic published Scaling Managed Agents — a deep technical dive into how they decoupled the brain (Claude), hands (tools and sandboxes), and session (durable event log) in their agent platform.

The architecture is elegant. Tools follow a universal interface: execute(name, input) → string. The session log captures every event durably. If a container dies, spin up a new one and replay from the log. Stateless, replaceable, horizontally scalable.
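
To make that contract concrete, here's a minimal sketch. The registry and dispatch function are illustrative, not Anthropic's implementation; only the execute(name, input) → string shape comes from the article.

python
from typing import Callable

# Illustrative registry: the dict and tool names are assumptions for
# this sketch; only the string-in, string-out contract is the article's.
TOOLS: dict[str, Callable[[str], str]] = {}

def execute(name: str, input: str) -> str:
    """Dispatch to the named hand. Tools hold no state, so a fresh
    container can replay these calls straight from the session log."""
    return TOOLS[name](input)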

But read the article carefully and you'll notice what's missing.

The Gap

The article solves “can the agent run reliably?” — container orchestration, failure recovery, credential isolation, context management. All critical infrastructure problems, all well-solved.

It doesn't address “is what the agent said true?”

These are different problems. An agent can run perfectly — every tool call succeeds, every container stays healthy, every event logs cleanly — and still produce output full of hallucinated claims. Infrastructure reliability and output accuracy are orthogonal.

This isn't a criticism of the architecture. It's an observation about what's not there yet. And it matters because the agents making decisions in production — financial analysis, research synthesis, compliance checks — are exactly the ones where a wrong claim has real consequences.

Verification as a Hand

The beauty of the Managed Agents model is that adding verification doesn't require changing the architecture. Verification is just another tool — another “hand” with the same interface:

python
import json

# The verification hand follows the same universal interface as every
# other tool: execute(name, input) → string. Here the string is a
# JSON report, parsed for inspection.
result = json.loads(execute("verify_output", agent_response))
# result:
# {
#   "trust_score": 0.87,
#   "claims": [
#     {
#       "text": "NVIDIA Q4 revenue was $22B",
#       "verdict": "supported",
#       "confidence": 0.94
#     },
#     {
#       "text": "beating estimates by 12%",
#       "verdict": "contradicted",
#       "confidence": 0.91,
#       "correction": "NVIDIA beat estimates by 8.6%, not 12%"
#     }
#   ],
#   "contradicted": 1
# }

The brain doesn't need to know how verification works. It calls the tool, gets back a trust score, and decides whether to proceed. The harness doesn't need modification. The session log records the verification receipt alongside the action — one timeline showing both what happened and whether it was accurate.
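
To make "decides whether to proceed" concrete, here's a minimal gating sketch. The 0.8 threshold and the block-on-contradiction policy are assumptions, not anything from the article:

python
# Policy sketch: the threshold and the block-on-contradiction rule are
# assumptions layered on the report shape shown above.
def should_proceed(report: dict, min_trust: float = 0.8) -> bool:
    """Proceed only if no claim was contradicted and overall trust is high."""
    return report["contradicted"] == 0 and report["trust_score"] >= min_trust

if not should_proceed(result):
    # Hand the corrections back to the brain for another pass.
    corrections = [c["correction"] for c in result["claims"]
                   if c["verdict"] == "contradicted"]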

Why Third-Party Verification Matters

There's a structural reason why the verification hand should be external to the agent platform.

If Claude verifies its own output, you have a circular dependency — the same system producing claims is judging them. It's the equivalent of a student grading their own exam. Research on LLM self-evaluation suggests that self-checking degrades under adversarial pressure and suffers from correlated failure modes: when the model is wrong, it's often wrong in the same direction on both generation and verification.

External verification uses different models, different evidence sources, and different evaluation logic. The verification layer has no shared context with the generator — no conversation history, no system prompt, no prior assumptions. Each claim is checked independently against source material.

This matters even more for platform providers. Anthropic, OpenAI, Google — none of them should be the sole arbiter of whether their own models are producing accurate output. Independent verification creates the separation of concerns that regulated industries require and that users should expect.

The Session Log as Audit Trail

The Managed Agents session is an append-only log of all events. This is already most of what you need for a verification audit trail. Each verification receipt includes (a sketched schema follows the list):

  • Claim hash — SHA-256 of the normalized claim text, for deduplication across systems
  • Per-claim verdicts — supported, contradicted, or unverifiable
  • Confidence scores — how certain the verification is
  • Evidence chains — what sources support or contradict each claim
  • Corrections — what the claim should say if contradicted
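
As a sketch, a receipt as a typed event payload might look like this. The field names mirror the list above; the schema itself is an assumption, not a published spec:

python
import hashlib
from dataclasses import dataclass

# Field names mirror the receipt list above; the schema itself is an
# assumption for illustration, not a published spec.
@dataclass(frozen=True)
class VerificationReceipt:
    claim_text: str
    verdict: str                   # "supported" | "contradicted" | "unverifiable"
    confidence: float              # how certain the verification is
    evidence: list[str]            # sources checked for or against the claim
    correction: str | None = None  # what the claim should say, if contradicted

    @property
    def claim_hash(self) -> str:
        """SHA-256 of the normalized claim text, for deduplication across systems."""
        normalized = " ".join(self.claim_text.lower().split())
        return hashlib.sha256(normalized.encode()).hexdigest()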

When these receipts are session events, you can answer questions that matter in production: “When did the agent make this claim? Was it verified? What evidence was checked? Did the agent proceed after verification or get blocked?”

For regulated use cases — finance, healthcare, legal — this is the difference between “the agent ran” and “the agent ran and we can prove its output was checked.”

How It Works Today

Two integration points, both available now:

As an OutputGuardrail (OpenAI Agents SDK)

python
from agents import Agent, Runner
from veroq_agentmesh import veroq_output_guardrail

agent = Agent(
    name="Financial Analyst",
    instructions="You analyze earnings reports.",
    output_guardrails=[veroq_output_guardrail],
)

# If the agent's output contains contradicted claims,
# OutputGuardrailTripwireTriggered is raised with full details
result = await Runner.run(agent, "Summarize NVIDIA's Q4 earnings")

pip install veroq-agentmesh — the guardrail trips if claims are contradicted or confidence drops below a threshold, and it fails open on API errors (it won't block your agent if the verification service is down).
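
Continuing the snippet above (reusing agent and Runner), catching the tripwire might look like this. The exception and its guardrail_result attribute come from the SDK; the verification details veroq attaches inside are an assumption:

python
import asyncio
from agents import OutputGuardrailTripwireTriggered

async def main() -> None:
    try:
        result = await Runner.run(agent, "Summarize NVIDIA's Q4 earnings")
        print(result.final_output)
    except OutputGuardrailTripwireTriggered as exc:
        # exc.guardrail_result carries the guardrail's output; the exact
        # verification payload veroq attaches is assumed, so just log it.
        print("Blocked: contradicted claims detected:", exc.guardrail_result)

asyncio.run(main())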

As a Custom Tool (Managed Agents)

python
agent = client.beta.agents.create(
    name="Verified Analyst",
    model="claude-sonnet-4-6",
    tools=[
        {"type": "agent_toolset_20260401"},
        {
            "type": "custom",
            "name": "verify_claims",
            "description": "Verify factual claims in text against "
                         "live evidence. Returns per-claim verdicts "
                         "with confidence scores and corrections.",
            "input_schema": {
                "type": "object",
                "properties": {
                    "text": {
                        "type": "string",
                        "description": "The text containing claims"
                    }
                },
                "required": ["text"]
            }
        }
    ]
)
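
If the custom tool is fulfilled client-side (an assumption about the beta's tool-call loop), your handler answers the verify_claims calls. A sketch under that assumption; verify_with_veroq is a placeholder, not a real client:

python
import json

def handle_tool_call(name: str, tool_input: dict) -> str:
    """Fulfill the custom tool: verify the text and return a JSON report
    string, which the session log records like any other tool result."""
    if name == "verify_claims":
        report = verify_with_veroq(tool_input["text"])  # placeholder verification call
        return json.dumps(report)
    raise ValueError(f"unhandled custom tool: {name}")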

Or connect the MCP server directly — 78 tools including verification, market data, SEC filings, and more:

bash
npx veroq-mcp

What This Changes

The Managed Agents architecture makes it trivial to add verification to any agent pipeline. The hard work — decoupling tools from the brain, making everything stateless, building the durable session log — is already done.

The missing piece is small: one more hand that checks whether what the agent said is actually true. The interface is the same. The session log captures it for free. The only question is whether you care enough about accuracy to add the call.

For agents that research, analyze, recommend, or decide — you probably should.