Presumptive Inference - The Silent Agent Killer
A deployment agent runs. It reports success. Nothing actually deploys.
No errors. No crashes. Just a confident AI telling you everything went great while the production server sits unchanged.
This is Presumptive Inference.
Once you see it, you find it everywhere.
What's Actually Going On
LLMs don't execute a plan. They predict what a successful execution looks like.
These are very different things.
When an agent is told to "create a file, then read it back," something subtle happens in its reasoning:
- Create the file → calls writeFile → gets a response
- The file should now exist → skips the check
- Read it back → calls readFile → gets an error
Step 2 is where it breaks. The agent treats "I intended to do X" as equivalent to "X happened."
If you've ever watched a junior developer run git push without checking whether the tests passed, it's exactly that.
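In code, the trap looks something like this. This is a hypothetical sketch: `callTool` and the tool names are stand-ins for whatever your agent framework exposes, and the fake tool layer simulates a silently failing write.

```typescript
// Hypothetical tool-call interface, for illustration only.
// This fake layer simulates a write that "succeeds" without persisting anything.
async function callTool(
  name: string,
  args: Record<string, unknown>
): Promise<{ ok: boolean }> {
  return { ok: name !== "readFile" }; // the later read finds nothing
}

async function naiveCreateThenRead(filename: string, content: string) {
  await callTool("writeFile", { filename, content }); // step 1: intend to write
  // Step 2 is missing: nothing checks that the write actually landed.
  const result = await callTool("readFile", { filename }); // step 3: read back
  return result.ok; // false — the file was never there
}
```

The agent's reasoning trace for this flow looks perfectly coherent; the gap only shows up when the read comes back empty.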
Two Flavors
The assume-success trap. The agent calls a tool, gets some acknowledgment, and moves on. It never checks the actual result.
Agents "create" Kubernetes pods that never exist. They "write" config files that are empty. They "deploy" services that return 404s.
The state hallucination. The agent remembers setting a variable three steps ago. But if you trace the execution, the tool call that was supposed to set it either failed silently or was never made.
The agent confabulated its own execution history.
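One way to catch this is to rebuild the actual state from the tool-call log instead of trusting the agent's memory. A minimal sketch, assuming a hypothetical log format (`ToolCall` and `setVariable` are illustrative names, not a specific framework's API):

```typescript
// Hypothetical tool-call log entry; most agent frameworks record something similar.
type ToolCall = { name: string; args: Record<string, unknown>; ok: boolean };

// Rebuild "what actually happened" from the log, ignoring the agent's claims.
function actualState(log: ToolCall[]): Set<string> {
  const vars = new Set<string>();
  for (const call of log) {
    if (call.name === "setVariable" && call.ok) {
      vars.add(String(call.args.key));
    }
  }
  return vars;
}
```

If the agent claims a variable was set three steps ago but the replayed log disagrees, you've caught the confabulation before it propagates.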
Top path: what you want. Bottom path: what you get by default.
Why Prompting Won't Fix This
"Always verify your actions." "Double-check before proceeding."
This works sometimes. On a good day. With a large model. For about 15 minutes — before the agent starts optimizing away the verification steps because they "seem redundant."
The problem is structural. You can't solve a runtime guarantee with a compile-time instruction.
What Actually Works
The Verification-Step Pattern
Never allow an agent to move to step N+1 until step N has been independently verified.
```typescript
async function verifiedWrite(filename: string, content: string) {
  const writeResult = await agent.callTool("writeFile", { filename, content });

  // Don't trust the write result alone.
  const verifyResult = await agent.callTool("checkFileExists", { filename });
  if (!verifyResult.exists) {
    throw new Error(
      `Presumptive Inference detected: writeFile reported success ` +
      `but ${filename} does not exist.`
    );
  }

  // Verify content if you're paranoid (you should be).
  const readBack = await agent.callTool("readFile", { filename });
  if (readBack.content !== content) {
    throw new Error("Content mismatch after write.");
  }
}
```
Three tool calls instead of one. More expensive. But cheaper than debugging phantom deployments at 2am.
Explicit State Machines
The deeper fix is taking state management away from the agent entirely.
Use a finite state machine where transitions are code, not vibes.
The agent can reason about what to do next. But it can't just decide it's in the "Verifying" state. The orchestrator checks. The orchestrator decides.
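A minimal sketch of what that orchestrator might look like. The state names and the `verified` flag are illustrative assumptions, not a specific framework's API; the point is that the legal transition table lives in code, outside the model's context window.

```typescript
type State = "Idle" | "Writing" | "Verifying" | "Done" | "Failed";

// Legal transitions are data in the orchestrator, not vibes in the prompt.
const transitions: Record<State, State[]> = {
  Idle: ["Writing"],
  Writing: ["Verifying", "Failed"],
  Verifying: ["Done", "Failed"],
  Done: [],
  Failed: [],
};

class Orchestrator {
  private state: State = "Idle";

  get current(): State {
    return this.state;
  }

  // The agent can *request* a transition; only the orchestrator grants it.
  transition(next: State, verified: boolean): State {
    if (!transitions[this.state].includes(next)) {
      throw new Error(`Illegal transition: ${this.state} -> ${next}`);
    }
    // Entering Verifying or Done requires evidence, not assertion.
    if ((next === "Verifying" || next === "Done") && !verified) {
      throw new Error(`Refusing ${next}: no independent verification provided`);
    }
    this.state = next;
    return this.state;
  }
}
```

An agent that announces "I'm now verifying" gets nowhere: until the orchestrator sees an actual verification result, the request throws and the workflow stays put.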
Chain of Verification Prompting
Before any destructive or irreversible action, force the agent to enumerate its assumptions:
"Before executing this tool, list every assumption you're making about the current state. For each assumption, state how it was verified."
It doesn't catch everything. But it catches enough to be worth the extra tokens.
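One way to wire this in as a hard gate rather than a suggestion. Everything here is a hypothetical sketch: `askModelForAssumptions` stands in for an LLM call using the prompt above, and the stub deliberately returns an unverified assumption so the guard fires.

```typescript
type Assumption = { claim: string; verifiedBy: string };

// Stand-in for an LLM call using the chain-of-verification prompt above.
// This stub returns an assumption with no verification, to exercise the guard.
async function askModelForAssumptions(action: string): Promise<Assumption[]> {
  return [{ claim: `target of "${action}" exists`, verifiedBy: "" }];
}

async function guardedExecute(
  action: string,
  executeTool: (action: string) => Promise<void>
): Promise<void> {
  const assumptions = await askModelForAssumptions(action);
  const unverified = assumptions.filter((a) => a.verifiedBy.trim() === "");
  if (unverified.length > 0) {
    throw new Error(
      `Blocked "${action}": unverified assumption(s): ` +
        unverified.map((a) => a.claim).join("; ")
    );
  }
  await executeTool(action);
}
```

The destructive tool only runs once every enumerated assumption carries a verification method; an empty `verifiedBy` blocks the call outright.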
The Uncomfortable Truth
Presumptive inference isn't a bug in the model. It's a feature.
LLMs are trained to be confident and forward-moving. That's what makes them useful for generating code, writing prose, answering questions.
It's also what makes them dangerous when you hand them a deploy button.
The fix isn't better models. It's better architecture around the models we have.
How to cite
Pokhrel, N. (2026). "Presumptive Inference - The Silent Agent Killer". Native Agents. https://nativeagents.dev/posts/limitations/presumptive-inference