As artificial intelligence agents increasingly take on roles involving financial transactions, concerns are mounting about the financial repercussions for humans when these systems malfunction. Researchers suggest that existing AI safety measures fall short in addressing these risks and propose new insurance-like strategies as a solution.
In a recent publication, researchers from Microsoft, Google DeepMind, and Columbia University, together with the startups Virtuals Protocol and t54.ai, introduced the Agentic Risk Standard. The framework is designed to compensate users when an AI agent fails to execute a task, fails to deliver a service, or causes a financial loss.
The paper argues that technical safeguards offer only probabilistic reliability, which is not enough in high-stakes scenarios where users expect guaranteed outcomes. Most current AI research focuses on improving model behavior: reducing bias, making systems less susceptible to manipulation, and enhancing the interpretability of their decisions.
However, the authors point out that such model-level improvements cannot fully eliminate product-level risk, because agent behavior is inherently stochastic: even a well-aligned model fails some fraction of the time. To bridge the gap between model reliability and user assurances, they propose a framework rooted in risk management principles.
The Agentic Risk Standard attaches financial protections to tasks executed by AI agents. In low-risk scenarios where users pay only a service fee, the payment is held in escrow until task completion is verified. For high-stakes activities such as trading or currency exchange that involve upfront payments, an underwriter steps in to assess the risk, require collateral from the service provider, and compensate the user in the event of a covered failure.
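The paper does not publish reference code, but a minimal Python sketch can make the two settlement paths concrete. Everything here is an assumption of this example, not the framework's specification: the names (Task, settle_low_risk, settle_high_stakes), the payout rules, and the one-to-one collateral sizing.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Outcome(Enum):
    SUCCESS = auto()
    COVERED_FAILURE = auto()


@dataclass
class Task:
    service_fee: float     # fee the user pays the agent operator
    upfront_value: float   # funds the agent handles upfront (0 for low-risk tasks)


def settle_low_risk(task: Task, outcome: Outcome) -> dict:
    """Escrow path: the service fee is held until completion is verified."""
    if outcome is Outcome.SUCCESS:
        return {"provider_receives": task.service_fee, "user_refund": 0.0}
    # On failure, the escrowed fee is simply returned to the user.
    return {"provider_receives": 0.0, "user_refund": task.service_fee}


def settle_high_stakes(task: Task, outcome: Outcome,
                       collateral_ratio: float = 1.0) -> dict:
    """Underwritten path: the provider posts collateral sized to the value
    at risk; a covered failure pays the user out of that collateral."""
    collateral = collateral_ratio * task.upfront_value
    if outcome is Outcome.SUCCESS:
        return {"provider_receives": task.service_fee,
                "collateral_returned": collateral, "user_payout": 0.0}
    # Covered failure: the user is made whole from the posted collateral.
    return {"provider_receives": 0.0,
            "collateral_returned": max(collateral - task.upfront_value, 0.0),
            "user_payout": min(collateral, task.upfront_value)}


if __name__ == "__main__":
    print(settle_low_risk(Task(service_fee=5.0, upfront_value=0.0),
                          Outcome.COVERED_FAILURE))
    print(settle_high_stakes(Task(service_fee=20.0, upfront_value=1_000.0),
                             Outcome.COVERED_FAILURE))
```

The key design difference between the two paths is who bears the loss: in the escrow path the user only ever risks the fee, while in the underwritten path the provider's collateral absorbs the loss of the funds at risk.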
The framework does not cover non-financial harms such as hallucinations, defamation, or psychological damage. The researchers tested the system in simulations comprising 5,000 trials, but acknowledged a limitation of these tests: the failure rates they assume do not reflect how often agents actually fail.
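To illustrate what such a simulation might look like, here is a minimal Monte Carlo sketch that draws 5,000 trials (the trial count reported in the paper) against an assumed failure rate and tallies what an underwriter would pay out. Every parameter (FAILURE_PROB, UPFRONT_VALUE, COLLATERAL_RATIO) is a placeholder rather than an empirical figure, which is precisely the limitation the authors flag.

```python
import random

TRIALS = 5_000            # trial count matching the paper's simulations
FAILURE_PROB = 0.02       # placeholder failure rate -- not an empirical figure
UPFRONT_VALUE = 1_000.0   # assumed value at risk per task
COLLATERAL_RATIO = 1.0    # assumed collateral posted per unit of value at risk

random.seed(0)
payouts = 0.0
failures = 0
for _ in range(TRIALS):
    if random.random() < FAILURE_PROB:   # a covered failure occurs
        failures += 1
        # The user is compensated from collateral, capped at the value at risk.
        payouts += min(COLLATERAL_RATIO * UPFRONT_VALUE, UPFRONT_VALUE)

print(f"failures: {failures}/{TRIALS}")
print(f"total payouts: {payouts:,.0f}")
print(f"expected payout per task: {payouts / TRIALS:.2f}")
```

Under these assumed numbers, the expected payout per task is roughly FAILURE_PROB times the value at risk, which is the quantity an underwriter would need to price into premiums and collateral requirements.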
They emphasized that further work is needed on risk models for the various failure modes, empirical studies that measure failure frequencies under realistic conditions, and underwriting and collateral strategies that remain robust to detection errors and strategic behavior.