observation · Governance · Institutional Risk

The Intent Problem in AI Threat Assessment

June 4, 2026 · From: From Benchmark to Threat Model: Using Capability Trajectories to Assess Catastrophic AI Risk Windows

Traditional threat modeling weighs capability against intent. AI breaks this framework because intent is fundamentally unmappable: a model's behavioral dispositions are stochastic, context-dependent, and potentially deceptive, as the sleeper agents empirical results demonstrate. When intent is ambiguous, the correct intelligence community (IC) response is to anchor threat assessment entirely on capability signals. This has direct implications for how AI oversight bodies should structure their evaluation mandates.