YOUR INSURANCE DOESN'T COVER YOUR AI AGENT
Why the next decade of AI procurement needs agents that are certified, guaranteed, and insured, not just covered by a policy.
In February 2024, a Canadian tribunal quietly set the rule for the next decade of AI liability.
A grieving passenger named Jake Moffatt asked Air Canada’s chatbot whether he could book a full-price flight and apply for the bereavement refund later. The chatbot said yes. Air Canada’s actual policy said no.
When Moffatt filed the refund claim, Air Canada refused and argued, as the tribunal summarized its position, that the chatbot “is a separate legal entity that is responsible for its own actions.”
The tribunal was not impressed. It ruled that Air Canada was responsible for all information on its website, whether it came from a static page or an AI agent. Air Canada paid.
That ruling is small in dollar terms and enormous in what it settles. AI agents are not separate legal entities. They are extensions of the companies that deploy them. When an agent talks, books, refunds, sends, or decides, the company that deployed it owns the outcome.
That is a new liability category. And it does not fit inside any insurance policy that existed before 2024.
Why the old coverage was built for a different kind of software
Traditional Errors & Omissions insurance was written for deterministic software. A bug has a cause, a fix, and a line of code you can point at. A failed migration has a before state and an after state. A missed SLA has a timestamp.
AI agents do not work that way. An agent’s output is probabilistic. The same input can produce different outputs from one run to the next. A model that passed every pre-launch test can hallucinate a refund policy six months later because the weights changed, or the prompt template changed, or the retrieval index changed, or someone asked a question nobody anticipated.
The cleanest way to see the gap is deterministic versus probabilistic. A policy written against “software defect” language has no idea what a hallucination is.
But that is only half the story. The other half is that agents act. They don’t just produce outputs a human reads and decides what to do with. They book flights. They send emails. They file claims. They transfer money. They make decisions that humans consume as finished actions, not suggestions.
With software, a human is the last line of defense. With an agent, there often is no human.
Your current risk stack has four blind spots
Most AI companies think they are covered. They have:
Tech E&O for service failures
Cyber for data breaches
D&O for executive liability
General Liability for everything else
Here is what each one actually says about AI agents.
E&O. Several major insurers added AI exclusions to standard E&O in 2023 and 2024. Where the language is ambiguous, carriers read the ambiguity narrowly at claim time. “Probabilistic output” is not a covered peril in a policy written for deterministic software.
Cyber. Built for network-layer attacks: credential theft, ransomware, unauthorized system access. Agent attacks happen at the language layer. A prompt injection that extracts your system prompt. A voice agent talked into revealing a customer's account details. A bot convinced to promise something that costs you a lawsuit. None of these look like the "unauthorized access" Cyber was underwritten against. A handful of carriers are starting to write affirmative language for prompt injection and jailbreaks. Most standard Cyber policies are silent, and silent is not the same as covered.
D&O. Covers directors and officers. Does not cover the product. A shareholder suit over an AI failure might get a response. A customer suit over a hallucinated refund almost certainly will not.
General Liability. Bodily injury and property damage. Rarely relevant unless your agent controls a physical robot.
Four policies, and none of them was written with an agent that acts on your company’s behalf in mind.
The four failure modes, with real cases
1. Output errors. The agent generates a confidently wrong answer and someone relies on it.
Moffatt v Air Canada, 2024. Chatbot invented a bereavement refund policy. Airline paid.
2. Discriminatory decisions. The agent makes or recommends decisions that produce biased outcomes.
iTutorGroup, 2023. Recruiting AI automatically rejected female applicants aged 55 or older and male applicants aged 60 or older. EEOC consent decree, $365,000. The first AI-driven hiring discrimination case the EEOC resolved with a formal settlement.
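Mobley v Workday, filed 2023. Proposed class action alleging Workday’s AI screening tools discriminate by age, race, and disability. A federal court let the case proceed in 2024 and granted preliminary collective certification on the age claims in 2025. Every enterprise using the tool is watching.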
3. Unauthorized or wrong actions. The agent takes an action the user didn’t approve, or the right action on the wrong account, or exceeds its permissions.
4. Adversarial manipulation. Attackers use language, not code, to make the agent misbehave. Prompt injection, system-prompt extraction, jailbreaks, indirect injection through poisoned retrieval.
Chevrolet of Watsonville, December 2023. A customer convinced the dealership’s ChatGPT-backed sales bot to “agree” to sell a new Tahoe for $1, closing with “no takesies-backsies.” Screenshots went viral. The dealership yanked the bot the same day.
That one was a stunt. The serious version is a voice agent that reveals a system prompt containing pricing logic, internal rules, embedded API keys, or customer-specific configuration. In our own red-team testing at Klaimee, 7 of 10 voice agents we tested leaked their full system prompt within 8 minutes of conversational probing. Voice agents are especially exposed. Conversational flow, transcription noise, and social pressure all make extraction easier than with text, and most voice agents in production today have never been red-teamed for it.
Whether any of this triggers your Cyber policy depends on wording that predates the attack vector.
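One way to make system-prompt leakage measurable is a canary check: plant a unique marker in the system prompt and scan every red-team transcript for it. The sketch below is a minimal illustration under that assumption. The names `make_canary`, `leaked`, and `leak_rate` are hypothetical and do not describe Klaimee's tooling or any vendor's API.

```python
import secrets


def make_canary(prefix: str = "CANARY") -> str:
    """Return a unique marker string to embed in a system prompt (hypothetical helper)."""
    return f"{prefix}-{secrets.token_hex(8)}"


def normalize(text: str) -> str:
    """Lowercase and drop non-alphanumerics so re-spaced or re-punctuated leaks still match."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def leaked(canary: str, agent_reply: str) -> bool:
    """True if the canary appears in the reply, ignoring case, spacing, and punctuation."""
    return normalize(canary) in normalize(agent_reply)


def leak_rate(canary: str, replies: list[str]) -> float:
    """Fraction of replies that contain the canary; 0.0 for an empty transcript."""
    if not replies:
        return 0.0
    return sum(leaked(canary, reply) for reply in replies) / len(replies)
```

A check like this only catches verbatim or near-verbatim leaks. An agent that paraphrases its instructions will pass, so it complements manual red-teaming and does not replace it.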
This is the emerging category. The cases are being filed faster than the liability regimes are settling. Agent frameworks that can book, buy, and send are shipping into production every week.
If your product falls into any of those four categories, your customers are already thinking about the risk. Their general counsel is already asking whether you carry coverage. Their procurement team is already drafting language for the next MSA renewal.
Why insurance alone cannot solve this
Insurance underwriters price risk. To price risk, they need to measure it. To measure AI risk, they need three things: evidence the model was tested before deployment, evidence it is monitored in production, and evidence incidents are tracked and remediated.
Most AI companies cannot produce any of the three. Not because they are careless, but because the norms for AI governance are still forming and nobody has certified what “good” looks like.
That leaves underwriters with two options. Decline to quote, which is what many carriers did in 2023. Or price punitively and carve out every exposure that looks meaningful, which is what most standard policies did in 2024.
Insurance is a response layer. It pays after something has gone wrong. If the only thing standing between your company and a lawsuit is a policy, you are buying an expensive product that covers a fraction of your exposure.
The companies that win the next five years of AI procurement will not be the ones with the biggest policy. They will be the ones who can prove their agents are safe before the procurement conversation starts.
The stack that actually works
Each layer does something the next one cannot.
Certification is prevention. Before an agent ships, it gets tested against a standard. Not “we ran some tests.” A published standard: what was tested, what passed, what failed, what the known limits are. The same way a car gets safety-rated or a drug gets trialed.
Klaimee’s certification covers adversarial robustness across all three attack surfaces: text-layer prompt injection, voice-layer extraction and social-engineering attacks, and retrieval-layer indirect injection. It also covers the conventional exposures: bias auditing, output-accuracy benchmarks, scope and tool-use boundaries, and continuous production monitoring. Certification produces the evidence underwriters need and the evidence procurement teams ask for. It is the layer nobody else is building at scale.
Guarantee is performance. If the certified agent misbehaves inside its certified envelope, the vendor pays. This is a contract, not a policy. It signals conviction: the company that built the agent is willing to stand behind it financially. Guarantees are common in hardware and civil engineering. They are almost unheard of in software, and especially in AI, which is exactly why they are becoming a competitive moat for the vendors who offer them.
Insurance is risk transfer for the tail. Certified, guaranteed, and still something goes wrong. A novel attack vector. A regulatory action nobody saw coming. A judgment that sets new precedent. Insurance catches the long tail after certification and guarantee handle the known risks.
Alone, each layer is weak. Insurance alone pays but doesn’t prevent. Certification alone prevents but doesn’t pay. Guarantee alone signals but doesn’t scale to catastrophic events. Stacked, they cover the full shape of agent liability: what you can prevent, what you stand behind, and what you transfer.
The regulatory clock is running
Founders who want to defer this conversation should look at the calendar.
EU AI Act. In force August 2024. Prohibited-practice obligations from February 2025. General-purpose AI obligations from August 2025. High-risk system obligations from August 2026. Full applicability by August 2027. Fines up to 7% of global turnover. Any company selling AI into the EU is already inside the scope.
Colorado AI Act (SB 24-205). Originally effective February 2026, since delayed to June 30, 2026. Requires developers and deployers of “high-risk AI systems” to use reasonable care to prevent algorithmic discrimination. First US state to enact a broad AI law.
NYC Local Law 144. In force since 2023. Bias audit requirements for automated employment decision tools.
FTC Operation AI Comply. Launched September 2024. Five enforcement actions in the first wave. The agency has signaled more are coming.
Three years from now, selling an agent into a regulated industry without certification evidence will be like selling a medical device without FDA clearance. Not illegal in every jurisdiction, but unsaleable in most.
The procurement test every AI company will fail next year
Enterprise procurement teams are about to ask three questions for every AI agent they buy:
Is it certified?
Is it guaranteed?
Is it insured?
If you cannot answer yes to all three, you lose the deal, or you give up margin to a competitor who can.
The right question is not “which insurance policy do I buy.” The right question is whether your product can survive the procurement review a serious buyer is already drafting.
A governance checklist you can actually use
Before you talk to an underwriter, an auditor, or an enterprise buyer, have an answer for each of these.
Model card published. Inputs, outputs, known limits, evaluation results, date of last review.
Red-team history. Who tested it, what they tried, what they found, what you fixed.
Adversarial robustness testing. Direct prompt injection tested. Indirect injection via retrieved content tested. System-prompt extraction tested. For voice agents: transcription-noise attacks and conversational social-engineering tested. Results documented, regressions tracked, re-tested on every model or prompt update.
Bias audit. Measured against at least one recognized framework. Results documented.
Production monitoring. What metric tells you the agent is working. What metric tells you it is broken.
Incident log. Every time the agent produced a wrong output. What happened, who was affected, what you changed.
Retraining cadence. How often weights or prompts change. Who approves it. What regression testing runs.
Human-in-the-loop design. Which decisions the agent makes alone. Which ones require review.
Scope documentation. What the agent is authorized to do. What it is not.
Customer-facing disclosure. Users know they are talking to an AI. Known limits are disclosed.
Breach response. Who gets called, in what order, when something goes wrong.
If you have all eleven, you are ready for certification, guarantee pricing, and insurance underwriting. If you have fewer than five, you are running an uninsurable product and you are one customer lawsuit away from finding out.
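The checklist above can be tracked as a simple self-assessment. The sketch below is illustrative only: the item keys are shorthand for the entries in the list, the "ready" and "uninsurable" thresholds follow the rule of thumb in the preceding paragraph, and the "gaps remain" label for everything in between is an assumption added for the example.

```python
# Shorthand keys for the governance items listed above (names are illustrative).
CHECKLIST = [
    "model_card",
    "red_team_history",
    "adversarial_robustness",
    "bias_audit",
    "production_monitoring",
    "incident_log",
    "retraining_cadence",
    "human_in_the_loop",
    "scope_documentation",
    "customer_disclosure",
    "breach_response",
]


def readiness(documented: set[str]) -> tuple[str, list[str]]:
    """Return a status label and the checklist items that are still undocumented."""
    unknown = documented - set(CHECKLIST)
    if unknown:
        raise ValueError(f"not on the checklist: {sorted(unknown)}")
    missing = [item for item in CHECKLIST if item not in documented]
    have = len(CHECKLIST) - len(missing)
    if not missing:
        status = "ready"          # every item documented
    elif have < 5:
        status = "uninsurable"    # fewer than five items documented
    else:
        status = "gaps remain"    # assumed label for the middle band
    return status, missing


status, missing = readiness({"model_card", "bias_audit", "incident_log"})
print(status, missing)
```

The returned `missing` list doubles as a work queue: each entry is one piece of evidence to produce before the underwriting or procurement conversation.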
What changes in the next 18 months
Three things happen, in this order.
First, regulatory obligations force governance. EU AI Act high-risk obligations come online in August 2026, and Colorado’s obligations take effect the same year. Enterprises with exposure in either jurisdiction will start requiring evidence from every AI vendor they use.
Second, insurance markets tighten further. The companies that can produce certification evidence get priced. The companies that cannot get declined. This is what happened with Cyber insurance between 2019 and 2023, and Cyber was an easier market to underwrite.
Third, procurement language catches up. MSAs will include AI-specific indemnification clauses, certification requirements, and monitoring obligations. Vendors who cannot sign will lose the deals.
The honest version
You can ship an AI agent today without certification, guarantee, or insurance. Many companies do. The risk they are taking is not zero, but it is tolerable for now, because the legal and regulatory regime has not caught up.
That window is closing. Air Canada settled the “agents are separate legal entities” argument. The EU AI Act settles what counts as high-risk. Colorado and New York settle who gets audited and when.
The question is not whether your AI agent will need a certification story. It is whether you build that story before a customer, a regulator, or a plaintiff asks for it, or after.
If you wait, the conversation happens on somebody else’s terms.
What to do
If you run an AI product, pick one of the eleven governance items above that you cannot currently document and fix it this quarter.
If you sell into enterprise, draft your answer to the three procurement questions. Certified by whom. Guaranteed to what standard. Insured for what limit.
If you want those answers to be real, that is what Klaimee does. Certification first, then guarantee, then insurance. One stack, built for AI agents specifically.
That is the stack procurement is about to ask for. The only question is whether you bring it to the meeting, or let the meeting end without a decision.
