The Gap Between Policy and Proof in AI Agent Compliance

Two audit firms independently identified the same gap: enterprises can’t prove AI governance was applied at the moment of execution. Here’s what the evidence actually needs to be.

Two audit firms answered the same question last week without knowing about each other. Both said the same thing.

The question was simple. Are enterprises starting to ask for per-action evidence of AI agent governance in their audits?

The first firm issues over 3,000 SOC 2 reports annually. They said demand signals around AI governance are increasing, though the space is still evolving. In SOC 2 today, AI governance gets embedded within existing controls like access, change management, and monitoring. Nobody calls it out directly. But frameworks like NIST AI RMF and ISO 42001 are gaining traction, and those put more emphasis on demonstrating oversight, traceability, and repeatability of AI-driven actions.

The gap they identified was specific. Most companies still struggle to prove that controls were actually executed as intended in real time.

The second firm has completed over 2,000 audits across a wide range of environments. They looked at Sanna before responding, then described it as turning sampled evidence into deterministic evidence and making audits continuous rather than point-in-time. They mapped it to SOC 2 control categories that hadn't come up in the conversation, including access control, change management, and system operations.

Neither firm received a pitch deck. They received a three-paragraph explanation of what Sanna does and a question. Both identified the same problem, and both described how they’d use the product to solve it.

Where compliance stands today

SOC 2 audits evaluate governance through policy documents, control design, and sampled evidence. An auditor reviews your policies, checks that controls exist, and samples evidence that those controls operated over time. For controls with binary answers, this works well, and compliance automation platforms are good at collecting that kind of evidence automatically.

AI agent behavior doesn’t fit that model. An agent makes hundreds or thousands of decisions per day, and every one of those decisions crosses a trust boundary with potential regulatory exposure. Sampling a handful and reviewing them against a policy document doesn’t tell you whether governance was applied to the rest.

The frameworks are catching up. The NIST AI Risk Management Framework explicitly calls for organizations to demonstrate oversight, traceability, and repeatability of AI-driven actions through its Govern, Map, Measure, and Manage functions. ISO/IEC 42001, the first international standard for AI management systems, establishes a certifiable framework for responsible AI governance across the full system lifecycle. The EU AI Act, which entered into force in August 2024, requires demonstrable control over AI system behavior with auditable evidence for high-risk applications. All three are moving toward continuous, verifiable evidence of AI governance, but none of them specify what that evidence should look like at the individual action level.

What the evidence actually needs to be

Policy documents prove you have rules, and logs prove something happened. But neither one proves that governance was applied at the moment of execution.

The evidence that auditors and regulators will eventually require has to do three things. First, prove that a specific approved policy was in force when the action occurred. Second, prove that the action was evaluated against that policy before execution. Third, allow any third party to verify the proof without trusting the system that produced it.

Sanna produces exactly this. A Governance Receipt is a cryptographically signed artifact generated at the moment of enforcement. It binds the constitution hash, the action, the enforcement decision, and a timestamp into a single Ed25519-signed record. Any third party can verify it with the public key. No trust in the platform required.

The enforcement and the evidence are the same operation at the same chokepoint. The action cannot proceed without generating the receipt. There is no mode where an agent acts without producing proof of governance.
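To make the receipt concrete, here is a minimal sketch of how fields like these could be bound into a single signed, tamper-evident record. This is illustrative, not Sanna's actual format: the field names are assumptions, and HMAC-SHA256 stands in for the Ed25519 signature so the example runs on the Python standard library alone (a real implementation would sign with Ed25519, e.g. via PyNaCl).

```python
import hashlib
import hmac
import json

def make_receipt(constitution_hash: str, action: str, decision: str,
                 timestamp: str, signing_key: bytes) -> dict:
    """Bind policy, action, decision, and time into one signed record.

    Illustrative only: field names are assumptions, and HMAC-SHA256
    stands in for an Ed25519 signature to keep this stdlib-only.
    """
    body = {
        "constitution_hash": constitution_hash,
        "action": action,
        "decision": decision,
        "timestamp": timestamp,
    }
    # Canonical serialization (sorted keys, no whitespace) so the same
    # receipt always produces the same bytes to sign and verify.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":")).encode()
    signature = hmac.new(signing_key, canonical, hashlib.sha256).hexdigest()
    return {**body, "signature": signature}

def verify_receipt(receipt: dict, key: bytes) -> bool:
    """Recompute the signature over the body and compare."""
    body = {k: v for k, v in receipt.items() if k != "signature"}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":")).encode()
    expected = hmac.new(key, canonical, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, receipt["signature"])

key = b"demo-signing-key"
receipt = make_receipt(
    constitution_hash=hashlib.sha256(b"policy v3").hexdigest(),
    action="send_refund(order=1042, amount=49.99)",
    decision="allow",
    timestamp="2025-01-15T09:30:00Z",
    signing_key=key,
)
assert verify_receipt(receipt, key)

# Tampering with any bound field invalidates the signature.
tampered = {**receipt, "decision": "deny"}
assert not verify_receipt(tampered, key)
```

One difference matters for the article's point: HMAC is symmetric, so the verifier above must hold the secret key, whereas Ed25519 lets any third party verify with only the public key. That asymmetry is what makes a real Governance Receipt independently verifiable without trusting the platform that produced it.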

The gap is already visible

Both audit firms arrived at the same conclusion from different starting points. The people who sit across the table from enterprise buyers every day are already watching this problem take shape.

If you’re deploying AI agents in a regulated environment, your auditor is eventually going to ask how you govern agent behavior. Policy documents and sampled log entries won’t hold up when every agent action carries potential liability.

Sanna was built to answer that question. Constitution enforcement gives you real-time control over what your agents can do. Governance Receipts give you independently verifiable proof that the control was applied, on every action, without exception.

If that’s a problem you’re facing, or one you can see coming, get in touch.

Build with verifiable governance

Sanna is open-source trust infrastructure for the agentic economy. We're working with design partners to integrate governance receipts into production agent deployments.

Become a design partner