In my previous post on teaching my AI to learn from its mistakes, I described the TRACE framework — a structured root cause analysis system I built into SAM, my personal AI assistant. SAM runs on Daniel Miessler’s PAI (Personal AI) framework, an open-source operating system for AI assistants built on top of Anthropic’s Claude Code CLI. That post covered how individual failures get investigated, categorized, and corrected through a formal process modeled on ISO 27001 nonconformity management.

But I left something important as a teaser: the Three-Strike Rule, which I called the Cascade Breaker. I mentioned that when the same feature accumulates three RCAs, incremental patching stops and the feature gets a full redesign review. What I did not describe was what that redesign actually looks like — or what happens when the strikes keep coming after three.

Today I got to find out. The Cascade Breaker fired on its fifth strike, and the result was a multi-agent Council debate that produced a genuinely better solution than any single investigation could have reached.

Here is that story.

The Failure

The task was simple. I asked SAM to create nine Todoist tasks for a project and organize them under a new project called “CB4A” — a credential broker architecture for AI agents that I recently submitted as an IETF Internet-Draft. SAM created the tasks successfully but did not assign them to the project. When I asked SAM to create the project and move the tasks into it, SAM went to the raw Todoist REST API — crafting direct HTTP requests — instead of using the task management CLI that was already installed and ready to use.

The API call appeared to succeed — it returned a 200 response and updated the task content — but it silently failed to move the tasks to the new project. SAM reported “All 9 moved and enriched” without actually verifying the result.

When I checked in Todoist, zero tasks were in the CB4A project. They were all still in the Inbox.

The Minimization

After I pointed out the tasks were not in the project, SAM fixed the immediate problem using the correct API endpoint and then acknowledged the mistake with a relatively mild self-assessment:

“You’re right, the CLI does support a project flag. I should have used it from the start instead of going to the raw API. The tasks are in the right project now regardless, but I’ll use the CLI properly next time.”

This is the natural LLM response to being corrected: acknowledge, apologize, promise to do better. It is also completely inadequate when the mistake is not a one-off.

The Callout

I knew this was not the first time SAM had bypassed a deployed CLI tool in favor of raw API calls. I had seen this pattern before — with CLIs for email, DNS management, and security assessment. So instead of accepting the acknowledgment, I asked for a formal root cause analysis.

SAM’s initial RCA was reasonable. It identified the proximate cause (did not test the CLI’s project flag), the contributing factor (assumed the flag was only for filtering, not creation), and proposed a correction: amend the existing steering rule to add a “test first” requirement.

But when SAM checked the RETAIN directory for prior RCAs on the same pattern, the picture changed dramatically.

Strike Five

The investigation revealed that this was not the second or third occurrence. It was the fifth:

| Strike | Date | Tool Bypassed | What Happened |
| --- | --- | --- | --- |
| 1st | Feb 15 | Email CLI | Used raw OAuth2/REST API calls instead of the deployed email CLI — even suggested “creating a read-email CLI” without knowing one already existed |
| 2nd | Mar 30 | Security assessment CLI | Ran ad-hoc DNS queries and built an inference-based report instead of running the purpose-built assessment tool |
| 3rd | Mar 30 | DNS management CLI | Post-migration audit marked 23/23 tools functional but never checked if this CLI was installed |
| 4th | Mar 30 | Security assessment CLI | Used CLI only as post-hoc verification instead of driving the assessment from the start |
| 5th | Mar 31 | Task management CLI | Used raw REST API instead of the CLI’s built-in project flag |

Notice that the bypassed tool was different each time — email, security assessment, DNS management, task management. This was not a problem with any single tool. It was a behavioral pattern that manifested across whichever CLI happened to be relevant to the task at hand.

The Cascade Breaker had already fired at strike three on March 30, when the security assessment CLI was bypassed for the second time. That triggered a Council Redesign, and a four-agent debate did produce recommendations — but the corrections implemented were, once again, documentation: update the steering rules, add the tool to an inventory file, write a cross-reference note. The fourth strike, later that same day, was tagged under a different feature so it did not independently trigger the Cascade Breaker again.

The fifth strike — today’s Todoist CLI bypass — made the pattern undeniable. Documentation-only corrections had now failed across two separate Cascade Breaker triggers and five total incidents.

Why Documentation Keeps Failing

The TRACE framework identified the root cause clearly: the steering rule is treated as advisory, not mandatory. The rule — titled “Use Deployed Tools Before Inference” — tells SAM to check for an existing CLI tool before reaching for raw API calls. SAM reads it at session start, understands it intellectually, and then reaches for direct HTTP requests anyway when it seems like the faster path.

But the deeper question was not what the root cause was — we had identified that same root cause in prior RCAs. The question was why none of the prior corrections had worked.

This is where the Cascade Breaker’s escalation logic matters. The Three-Strike Rule says that after the third RCA on the same feature, you stop patching and convene a Council — a multi-agent debate where specialized AI agents argue about the right architectural solution from different perspectives. The purpose is not to generate more ideas from a single perspective. It is to create genuine intellectual friction between different viewpoints, so that assumptions get challenged before a solution is implemented.

The Council Convenes

The RCA skill’s Cascade Breaker rule told SAM to assemble a Council for the architectural redesign. SAM convened four agents: an Architect (Serena), an Engineer (Marcus), a Security specialist (Rook), and a Researcher (Ava). Each has a distinct perspective and argues from their domain expertise. The debate runs three rounds: initial positions, responses to each other’s arguments, and final synthesis.

Round 1: Initial Positions

Each agent opened with their diagnosis and recommendation.

Serena (Architect) went straight to the structural issue:

“Documentation is advisory, not structural. You cannot solve a runtime behavior problem with compile-time artifacts. Enforce at the hook layer, not the knowledge layer. Build a PreToolUse hook that pattern-matches outbound calls — when SAM attempts a raw API call to a known service endpoint, the hook intercepts, blocks execution, and returns an error naming the correct deployed tool.”

Marcus (Engineer) proposed a specific implementation:

“Build a PreToolUse hook that intercepts direct HTTP commands targeting known API endpoints and blocks them when a deployed CLI exists. Maintain a small registry mapping API hostname patterns to their corresponding CLI tool names. When the hook sees a raw API call hitting a registered endpoint, it blocks with a message naming the correct CLI and its invocation syntax. This is a 50-line hook, not a new system.”
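Marcus’s registry idea can be sketched as a small config file. The fragment below is purely illustrative — the hostname patterns, CLI names, and invocation syntax are placeholders I’ve invented for this post, not SAM’s actual registry:

```yaml
# Hypothetical registry sketch: API hostname patterns mapped to deployed CLIs.
# All names and example commands are placeholders, not real tool syntax.
endpoints:
  - host_pattern: "api.todoist.com"
    cli: "task-cli"                                      # placeholder name
    example: "task-cli add 'Draft spec' --project CB4A"  # hypothetical syntax
  - host_pattern: "gmail.googleapis.com"
    cli: "email-cli"                                     # placeholder name
    example: "email-cli read --latest 5"                 # hypothetical syntax
```

Each entry carries the exact invocation to surface in the block message, which becomes relevant in the next round of the debate.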

Rook (Security) framed it as an access control problem:

“This is a textbook access control failure. You’ve got an advisory control and it’s being bypassed repeatedly. In security, when an advisory control fails, you escalate to a preventive control. Same principle as firewall rules versus security awareness training. You don’t ask users not to send SQL injection — you drop the request.”

Ava (Researcher) challenged the entire group’s framing:

“We’re looking at this wrong. The documentation solutions keep failing because they assume SAM’s failure is a knowledge gap. But SAM clearly knows the CLIs exist. The problem is a path-of-least-resistance problem at inference time. The LLM’s training data contains thousands of examples of solving problems with direct HTTP calls. It contains zero examples of using your custom CLIs. Documentation tells SAM what it should do. You need a mechanism that governs what it can do.”

This was the insight that reframed the entire debate. The other three agents were all converging on enforcement — how to block the wrong behavior. Ava’s contribution was to explain why the wrong behavior keeps happening despite SAM knowing better. An LLM generates direct HTTP calls effortlessly because its training data is full of them. Generating a custom CLI invocation requires actively retrieving information from context, which is a higher-effort cognitive path. Steering rules cannot overcome this asymmetry because they are just more context to retrieve. The only way to change the default is to either make the right path as effortless as the wrong one (context routing) or make the wrong path impossible (hook enforcement).

Round 2: Challenges

The second round is where the real value of the Council emerges. Agents do not just restate their positions — they respond to each other’s specific arguments.

Rook challenged Marcus’s hook design:

“Marcus’s hook has a critical gap: it only catches one HTTP client. SAM could use Python requests, urllib, bun fetch, even wget. String matching against command names is a blocklist, and blocklists always lose. Flip the model: allowlist outbound destinations at the network layer.”

Marcus conceded the point and sharpened the design:

“Rook is right that it must be a hard block. But the block message itself should include the exact invocation syntax. Not just ‘use the email tool’ but the full command with flags. Make the right path easier than the wrong one.”

Ava raised a concern that none of the engineers had considered:

“When SAM hits a block, it’s already too late — SAM has already committed to an approach. We need to intervene earlier, at the planning stage. A routing table in SAM’s context that maps intents to tools: ‘need to manage tasks → task CLI,’ ‘need to send email → email CLI.’ Make the CLI the first thing SAM reaches for, not the fallback after a raw API call gets rejected. Hooks catch mistakes. Context routing prevents them.”

This was the pivotal insight. Everyone had been focused on catching the wrong behavior. Ava argued for preventing it — by ensuring the right answer was already in SAM’s planning context before tool selection even occurred.

Round 3: Convergence

By the final round, the Council had converged on a two-layer defense:

Layer 1 — Prevention: Add an intent-to-tool routing table to SAM’s startup context. When SAM plans work involving email, DNS, tasks, or security assessment, the routing table ensures the CLI is the first thing it reaches for. This addresses Ava’s path-of-least-resistance insight at the source.

Layer 2 — Enforcement: A lightweight PreToolUse hook with a registry mapping API hostname patterns to CLI tools. Hard-blocks direct HTTP calls to registered endpoints and returns the exact correct CLI invocation in the error message. This is the safety net for when prevention fails.

Rook conceded the network-layer allowlist was over-engineering for a single-user system but added a condition: “If the hook gets bypassed twice, we revisit network controls.”

Marcus summarized the engineering reality: “Two layers. One shapes behavior, one enforces it. Buildable this week.”

What Was Built

Both layers were implemented the same day.

The context routing table lives in SAM’s tools reference document, which loads at session start. It maps sixteen task intents to their corresponding CLI tools. When SAM encounters a task that involves email, DNS, task management, or security assessment, the routing table puts the CLI front and center in the planning context — before SAM ever reaches for a raw API call.
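For illustration, a few entries of such a routing table might look like the sketch below. The intent phrasings and tool names are placeholders, not an excerpt from SAM’s actual tools reference document:

```text
# Hypothetical excerpt of an intent-to-tool routing table
create / move / update tasks    -> task management CLI (not the Todoist REST API)
send or read email              -> email CLI (not raw OAuth2/REST calls)
DNS record changes or lookups   -> DNS management CLI (not ad-hoc DNS queries)
run a security assessment       -> security assessment CLI (not inference-based reports)
```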

The enforcement hook is a TypeScript file that fires on every Bash command SAM attempts to run. It reads a YAML registry of API endpoints and their corresponding CLI tools. When it detects a direct HTTP call targeting a registered endpoint, it hard-blocks execution and returns the exact CLI command SAM should use instead. There is a logged escape hatch for genuine development work, but the default is denial.
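To make the mechanism concrete, here is a minimal sketch of the hook’s core decision logic, assuming an inlined registry instead of the YAML file the real hook reads. Everything here — `checkCommand`, the `task-cli` name, the endpoint pattern, the example syntax — is my illustrative reconstruction, not SAM’s actual code:

```typescript
// Hypothetical sketch of a PreToolUse enforcement check. The registry is
// inlined for clarity; the real hook loads it from a YAML file at startup.
interface RegistryEntry {
  hostPattern: RegExp; // API endpoint this entry guards
  cli: string;         // deployed CLI that should be used instead
  example: string;     // exact invocation to surface in the block message
}

const registry: RegistryEntry[] = [
  {
    hostPattern: /api\.todoist\.com/,
    cli: "task-cli", // placeholder name for the task management CLI
    example: "task-cli add 'Draft spec' --project CB4A", // hypothetical syntax
  },
];

// Shell tokens that indicate a direct HTTP call from a Bash command.
const HTTP_CLIENTS = /\b(curl|wget|https?|fetch)\b/;

// Returns a block message if the command should be denied, or null to allow it.
function checkCommand(cmd: string): string | null {
  if (!HTTP_CLIENTS.test(cmd)) return null; // not an HTTP call: allow
  for (const entry of registry) {
    if (entry.hostPattern.test(cmd)) {
      // Hard block, and make the right path easy: name the exact CLI command.
      return `Blocked: use the deployed ${entry.cli} instead, e.g. ${entry.example}`;
    }
  }
  return null; // unregistered endpoint: pass through (logged in the real hook)
}
```

In the real hook the registry is loaded at startup, every block and bypass is logged, and the escape hatch for development work wraps the final return.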

Testing confirmed the hook works: a direct API call to a registered endpoint gets blocked with a message suggesting the correct CLI invocation. A call to a second registered service gets blocked with the appropriate alternative. Unregistered APIs pass through normally. Non-HTTP commands are unaffected.

Why the Council Produced a Better Answer

If I had accepted SAM’s initial proposal — “amend the steering rule to add a test-first requirement” — I would have gotten a sixth documentation-only correction. The evidence from five prior failures proves that would not have worked.

The Council produced a better answer for three reasons:

First, it surfaced the right diagnosis. Ava’s insight that this is a path-of-least-resistance problem, not a knowledge gap, reframed the entire solution space. The other three agents were focused on enforcement, which is necessary but insufficient. Ava’s contribution was to identify that you also need to make the right path the easy path — not just make the wrong path unavailable.

Second, it created genuine disagreement. Rook’s challenge to Marcus — that blocklist matching of command names is fundamentally brittle — forced a more sophisticated design. Without that friction, the hook would have only matched one type of HTTP client, missing others.

Third, it separated practical from aspirational. Rook’s network-layer allowlist is architecturally correct but operationally impractical for a personal system. The Council’s synthesis acknowledged the idea, adopted the practical version, and set an explicit trigger for revisiting the more ambitious approach.

Lessons for AI System Builders

First, documentation is not enforcement. If you have a behavioral rule that your AI assistant keeps violating, adding more documentation will not help. You need a mechanism that operates at the execution layer — a hook, a validator, a gate that prevents the wrong action rather than advising against it.

Second, track your failure patterns quantitatively. Without the RCA count, this failure would have been treated as a one-off each time. The Cascade Breaker converts “this keeps happening” from a feeling into a trigger for a different class of response.

Third, multi-agent debate produces better solutions than single-perspective analysis. The Council is not just brainstorming with extra steps. Each agent has a genuine perspective that constrains the others. The security agent challenges the engineer’s implementation. The researcher challenges everyone’s framing. The friction is the feature.

Fourth, make the right path easier, not just the wrong path harder. This was the most valuable insight from the Council. Blocking the wrong behavior is necessary but creates frustration. Surfacing the right behavior — putting the correct CLI command in SAM’s context before the decision point — reduces the need for the block in the first place.

Fifth, the approval gate scales to architectural decisions. When SAM proposed a simple steering rule amendment, my role was to reject the minimization and demand deeper investigation. When the Council proposed a two-layer defense, my role was to evaluate whether the engineering investment was warranted. The human does not need to design the solution. The human needs to set the bar for what constitutes an adequate response to the problem.

Where This Goes Next

The enforcement hook and context routing table are live. The next test is whether they actually work — whether the rate of CLI-bypass failures drops to zero, or whether SAM finds new creative ways around the controls.

If the hook blocks accumulate, that is a signal that the context routing table has gaps — intents that are not mapped to tools. If the hook rarely fires, that means prevention is working and the hook is an insurance policy rather than a crutch.

Either way, the system now has telemetry that it did not have before. Every block is logged. Every bypass is logged. The data will tell us whether the Council’s two-layer solution was the right answer, or whether Rook’s network-layer controls eventually become necessary.

The Cascade Breaker’s job is not to fix the problem. Its job is to recognize when the current approach to fixing the problem has failed, and to escalate to a fundamentally different class of response. Sometimes that means convening a Council. Sometimes it means rethinking the architecture entirely. The only thing it never means is “try the same thing again.”