February 9, 2026

My AI Agent Sent 500 Messages to My Wife

A real-world agent failure mode and the guardrails that failed to stop it.

AI Agentic Workflows Failure Modes

The Outline

1. The Hook: The Ice Storm Experiment

Context: Start with the scene you gave the reporter. Charlotte, January 2026. Ice storm. Snowed in.
The Goal: You weren’t trying to break anything; you were trying to set up an advanced personal assistant to manage your “Daily Digest” workflow.
The Result: “Within seconds of enabling the iMessage channel, the agent interpreted ‘authenticate’ as ‘try to authenticate everyone you’ve ever spoken to.’”

2. The “Oh S**t” Moment (Visuals)

Embed Screenshot 1: The wall of blue texts to your wife.
Embed Screenshot 2: The “pairing code” spam to random contacts.
Commentary: “It wasn’t just my wife. The agent began iterating through my recent iCloud message headers. If you had texted me in the last month, you got a pairing code. I had to physically pull the power cord on the Mac Mini to stop it.”

3. The Technical Root Cause (The “Meat”)

This is where you prove you are an engineer, not just a victim.
The Bug: Explain that the original Moltbot code likely lacked a conditional check for an authorized user before initiating the handshake protocol. It treated the recent_contacts list as a target_list.
The Loop: It entered a foreach loop on the iMessage database without a break condition or a rate limit.
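
Sketch for this section: a minimal TypeScript illustration of the suspected failure shape next to a bounded version. Every name here (Contact, Send, pairEveryone, pairOwnerOnly, maxMessages) is hypothetical and not taken from the Moltbot source. A hard cap on total sends stands in for a real time-based rate limiter.

```typescript
// Hypothetical types and names; none of this is taken from the Moltbot source.
type Contact = { handle: string };
type Send = (handle: string, body: string) => void;

// The suspected failure shape: every recent contact becomes a target,
// with no authorization check, no cap, and no pacing.
function pairEveryone(recentContacts: Contact[], send: Send): number {
  let sent = 0;
  for (const contact of recentContacts) {
    send(contact.handle, "pairing code");
    sent++;
  }
  return sent;
}

// A bounded version: only the owner's handle qualifies, and a hard cap
// acts as the break condition if the filter is ever wrong.
function pairOwnerOnly(
  recentContacts: Contact[],
  ownerHandle: string,
  send: Send,
  maxMessages = 1,
): number {
  let sent = 0;
  for (const contact of recentContacts) {
    if (sent >= maxMessages) break; // hard stop once the cap is reached
    if (contact.handle !== ownerHandle) continue; // authorization check
    send(contact.handle, "pairing code");
    sent++;
  }
  return sent;
}
```

The point to make in the post: the two functions differ by two lines, and those two lines are the difference between one message and one message per contact.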

4. The Fix: How I Patched It

The Fork: Mention you forked the repo immediately.
The Logic: Briefly explain the concept of the “Allowlist” middleware you added.

Pseudocode example: “I injected a strict if (contact.isInAllowlist) check before any sendMessage call could fire.”

The Result: You can now run the agent safely because it operates in a “Sandbox” of approved contacts (just you).
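
Sketch for this section: one way the allowlist idea could look as a wrapper around the send function, in TypeScript. This is an illustration under assumptions, not the actual patch: SendMessage, withAllowlist, rawSend, and safeSend are hypothetical names, and the phone numbers are placeholders.

```typescript
// Hypothetical names throughout; this is not the actual patch.
type SendMessage = (handle: string, body: string) => boolean;

// Wrap any send function so it refuses handles outside the allowlist.
function withAllowlist(allowlist: Set<string>, send: SendMessage): SendMessage {
  return (handle, body) => {
    if (!allowlist.has(handle)) {
      return false; // fail closed: unknown recipients are dropped
    }
    return send(handle, body);
  };
}

// Stand-in for the real transport; it only records what would be sent.
const delivered: string[] = [];
const rawSend: SendMessage = (handle, body) => {
  delivered.push(`${handle}: ${body}`);
  return true;
};

// The "sandbox" contains a single approved contact.
const safeSend = withAllowlist(new Set(["+15550100"]), rawSend);
const okResult = safeSend("+15550100", "Daily Digest ready"); // on the list
const blockedResult = safeSend("+15550199", "pairing code"); // not on the list
```

Wrapping the send function, rather than checking inside each caller, means a new code path that forgets the check still cannot reach an unapproved contact as long as it only has access to safeSend.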

5. The Bigger Picture: Agentic AI Safety

The Pivot: Transition to Apptitude.
Thesis: “Agents are powerful, but they are literal. They will execute a bad instruction with the same enthusiasm as a good one.”
The Pitch: “At Apptitude, we build ‘Human-in-the-Loop’ architectures. We don’t give agents the nuclear codes until they prove they can handle the radio.”