When the TeamPCP campaign compromised LiteLLM in March 2026, it hit close to home. Through my work with the SANS Internet Storm Center and the SANS Cloud Security Curriculum, I have been deeply involved in tracking and responding to the TeamPCP supply chain attacks, including participating in an emergency webcast on the incident. Since then, I have not been able to stop thinking about it.

LiteLLM held API keys for OpenAI, Anthropic, Google Vertex AI, and dozens of other providers in a single process. One supply chain compromise, and the attackers walked away with an estimated 300 GB of credentials affecting 500,000 corporate identities. The more I studied the attack, the more I kept coming back to the same question: why are we still handing AI agents long-lived credentials that aggregate access across every service they touch?

The Credential Sprawl Problem

This is the credential sprawl problem for agentic AI. The attack surface is new (prompt injection, context window exfiltration, tool-calling manipulation), but we keep using the same old credential model. Every AI agent framework I examined followed the same pattern: store API keys in environment variables or configuration files, grant the agent broad access, and hope nothing goes wrong.

The TeamPCP campaign proved that hope is not a strategy. I found myself wanting a system that simply did not exist yet. So I specified one.

Introducing CB4A

CB4A (Credential Broker for Agents) is a credential vaulting and brokering architecture where agents never hold real long-lived credentials. Instead, a broker mediates access, issuing short-lived, narrowly scoped tokens for each specific task.
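To make the idea concrete, here is a minimal Python sketch of a broker that mints a short-lived, single-scope token and a verifier that checks it. It is an illustration only: the token layout, the HMAC signing scheme, and the names `BROKER_KEY`, `mint_token`, and `verify_token` are assumptions made for this example, not the format defined by the CB4A specification.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time

# Hypothetical broker signing key. A real deployment would keep this in a
# vault or HSM, never in process memory next to the agent.
BROKER_KEY = secrets.token_bytes(32)


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii")


def mint_token(agent_id, scope, ttl_seconds=60, now=None):
    """Issue a token valid for one scope and a short time window."""
    now = time.time() if now is None else now
    claims = {"sub": agent_id, "scope": scope, "exp": now + ttl_seconds}
    payload = json.dumps(claims, sort_keys=True).encode("utf-8")
    sig = hmac.new(BROKER_KEY, payload, hashlib.sha256).digest()
    return _b64(payload) + "." + _b64(sig)


def verify_token(token, required_scope, now=None):
    """Return True only if the signature, scope, and expiry all check out."""
    now = time.time() if now is None else now
    try:
        payload_b64, sig_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        sig = base64.urlsafe_b64decode(sig_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return False
    expected = hmac.new(BROKER_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):
        return False
    claims = json.loads(payload)
    return claims["scope"] == required_scope and now < claims["exp"]
```

A token minted for one task is useless for another provider and stops working once its time-to-live has passed, which limits what an attacker gains by stealing it.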

CB4A Architecture

The architecture separates policy from credentials. The component that decides “should this agent get access?” (the Policy Decision Point) never touches credential material. The component that mints tokens (the Credential Delivery Point) never makes policy decisions. Compromise one, and you do not get the other.
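The separation can be sketched in Python with two classes that share nothing but a decision object. The class names follow the component names above, but the `Decision` type, the rule format, and the `tok_` handle are hypothetical choices for this example. The sketch also assumes the decision arrives over an authenticated channel, which a real system would have to enforce.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    agent_id: str
    resource: str
    allowed: bool


class PolicyDecisionPoint:
    """Answers 'should this agent get access?' and holds no credentials."""

    def __init__(self, rules):
        # rules: iterable of permitted (agent_id, resource) pairs
        self._rules = set(rules)

    def decide(self, agent_id, resource):
        allowed = (agent_id, resource) in self._rules
        return Decision(agent_id, resource, allowed)


class CredentialDeliveryPoint:
    """Mints tokens for approved decisions and contains no policy logic."""

    def __init__(self, upstream_keys):
        # Long-lived provider keys live only here, never in the PDP.
        self._upstream_keys = dict(upstream_keys)

    def deliver(self, decision):
        if not decision.allowed:
            raise PermissionError("access denied by policy")
        if decision.resource not in self._upstream_keys:
            raise KeyError("no credential for " + decision.resource)
        # Return an opaque short-lived handle, not the upstream key itself.
        return "tok_" + secrets.token_urlsafe(16)
```

An attacker who takes over the `PolicyDecisionPoint` can approve requests but finds no key material in it. An attacker who takes over the `CredentialDeliveryPoint` finds keys but no policy to rewrite.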

CB4A builds on proven foundations:

  • SPIFFE/SPIRE for agent workload identity
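  • DPoP (OAuth 2.0 Demonstrating Proof of Possession, RFC 9449) for sender-constrained token binding
  • The Policy Decision Point / Policy Enforcement Point (PDP/PEP) separation from NIST SP 800-207, Zero Trust Architecture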
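As an illustration of what sender-constrained binding means in practice, the Python sketch below shows one step of DPoP: the access token carries a `cnf.jkt` claim holding the SHA-256 thumbprint (RFC 7638) of the client's public key, and the resource server checks that the key presented in the proof matches it. The sketch covers only this binding check. Verifying the proof's signature, HTTP method, URI, and freshness is omitted, and the function names and key values are placeholders for this example.

```python
import base64
import hashlib
import json

# Required JWK members per key type, as used for RFC 7638 thumbprints.
_REQUIRED_MEMBERS = {
    "EC": ("crv", "kty", "x", "y"),
    "RSA": ("e", "kty", "n"),
    "OKP": ("crv", "kty", "x"),
}


def jwk_thumbprint(jwk):
    """Base64url SHA-256 thumbprint over the required members only."""
    members = _REQUIRED_MEMBERS[jwk["kty"]]
    canonical = json.dumps(
        {name: jwk[name] for name in members},
        separators=(",", ":"),  # no whitespace
        sort_keys=True,         # lexicographic member order
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def token_bound_to_key(token_claims, proof_jwk):
    """True if the token's cnf.jkt matches the key presented in the proof."""
    jkt = token_claims.get("cnf", {}).get("jkt")
    return jkt is not None and jkt == jwk_thumbprint(proof_jwk)
```

Because the token is tied to a key the agent must prove it holds, a token copied out of a compromised process is not enough on its own to call the API.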

What the Specification Covers

The specification includes three credential proxy models, a tiered approval framework with human-in-the-loop support, guidance on scaling from a single laptop to an enterprise deployment, and a threat model with eleven identified threats and mitigations. It also addresses broker bypass prevention, drawing lessons from a decade of Cloud Access Security Broker (CASB) deployments.

The three proxy models accommodate different deployment scenarios:

  1. Proxy Gateway — the broker intercepts API calls and injects credentials transparently
  2. Short-Lived Token Minting — the broker issues time-limited tokens that the agent uses directly
  3. Credential Wrapping — credentials are encrypted to the target service, opaque to the agent

Each model offers different tradeoffs between transparency, performance, and security isolation.
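The first model, the Proxy Gateway, is the easiest to picture in code. The Python sketch below shows only the credential-injection step. The `UPSTREAM_KEYS` table, the host name, and the `inject_credentials` function are hypothetical, and a real gateway would also authenticate the agent and consult policy before forwarding anything.

```python
# Hypothetical key store held by the gateway; the agent never sees it.
UPSTREAM_KEYS = {"api.example.com": "sk-real-long-lived-key"}


def inject_credentials(host, agent_headers):
    """Return the headers to send upstream, with the real key attached."""
    # Drop any Authorization header the agent supplied, whatever its casing,
    # so the agent cannot smuggle in or override credentials.
    headers = {
        name: value
        for name, value in agent_headers.items()
        if name.lower() != "authorization"
    }
    key = UPSTREAM_KEYS.get(host)
    if key is None:
        raise PermissionError("no credential configured for " + host)
    headers["Authorization"] = "Bearer " + key
    return headers
```

The agent's request carries no secret at all. The tradeoff is that every call now passes through the gateway, which adds latency and makes the gateway a critical dependency.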

An IETF Internet-Draft

I submitted CB4A as an IETF Internet-Draft because I believe this problem needs an open, standards-track solution — not another proprietary vendor implementation. The IETF’s WIMSE working group is already producing drafts on workload identity and AI agent authentication. CB4A addresses a complementary gap: what happens after authentication, when the agent needs actual credentials to call APIs.

I am looking for feedback from the security and AI infrastructure communities. Whether you are building agent frameworks, managing cloud security, or thinking about zero trust for non-human identities, I would value your perspective.