a guy called Jake Moffatt needed to fly home for his grandmother's funeral.

he went to Air Canada's website. the chatbot popped up. he asked about bereavement fares.

the bot told him to book a full-price ticket now and apply for the bereavement discount within 90 days of the flight. it gave him specific steps. it was polite, clear, and detailed.

so he booked the flight. paid full price. flew home. buried his grandmother.

then he applied for the discount.

Air Canada told him the bereavement discount doesn't work that way. you have to apply before you book, not after. the chatbot was wrong. there is no post-travel bereavement fare.

Jake pushed back. he had screenshots of the conversation. the chatbot had given him explicit instructions.

Air Canada's response: the chatbot is a separate entity. it gave bad information. that's not our fault. the customer should have verified the policy on a different part of the website.

the Canadian court disagreed. the ruling: Air Canada is responsible for all information on its website, including information provided by its chatbot. the airline was ordered to pay Jake the difference.

that court ruling changed everything for businesses using AI customer service. a chatbot's hallucinated policy is now, legally, the company's policy.

but here's the thing nobody talks about when they share this story.

the chatbot didn't malfunction. it did its job perfectly.

Air Canada's chatbot was told to be helpful. to answer customer questions. to provide information about the airline's services.

and that's exactly what it did.

when Jake asked about bereavement fares, the bot searched for relevant information, found some fragments about bereavement policies and discount processes, and constructed the most helpful-sounding answer it could.

the answer was wrong. but the bot didn't know that. it has no concept of right and wrong. it predicts the next most likely word based on patterns in its training data. when the patterns pointed toward "apply within 90 days," it said "apply within 90 days" with total confidence.

this is what hallucination actually is. not a glitch. not a bug. not a failure. it's the AI succeeding at being helpful when it should have said "i don't know."

and this is the part that matters for every business owner running a chatbot right now: your AI is doing the same thing. you just haven't caught it yet.

the problem with "be helpful"

i've looked at system prompts for dozens of business chatbots over the past few months. the vast majority say some version of the same thing:

"you are a helpful customer service assistant for [company]. answer questions about our products and services. be friendly and professional."

that prompt sounds fine. it is the most dangerous thing you can put in a chatbot.

here's why. you told the AI what to be (helpful, friendly, professional). you told it what to do (answer questions). you never told it what it cannot do.

without negative constraints, the AI has no boundaries. it does not know that it should never invent a refund policy. it does not know that it should never promise a delivery date it has no data on. it does not know that it should never discuss competitors. it does not know that when it doesn't have an answer, the right move is to stop talking and connect the customer with a human.

so it does what it was told: be helpful. and being helpful means generating an answer for every question, even when the correct answer is "i don't have that information."

Air Canada's chatbot wasn't trying to deceive anyone. it was trying to help. the helpfulness is what caused the problem.

this is not just an Air Canada problem

in january 2024, a customer tricked a Chevy dealership's chatbot into agreeing to sell a $76,000 Tahoe for $1. the bot confirmed the deal in writing. the customer had a screenshot.

later that year, DPD's delivery chatbot was manipulated into calling DPD "the worst delivery company in the world" and writing a poem about how terrible their service was. it went viral. DPD shut the chatbot down entirely.

a New York City government chatbot, built to help residents navigate municipal services, started giving advice that was factually wrong and potentially illegal. it told business owners they could legally do things that violated state and federal labour laws.

in 2025, an ecommerce brand discovered their chatbot had been telling customers that replacement products had been shipped when no shipment had been triggered. the bot was closing support tickets and marking them as resolved. customers only found out when nothing arrived.

every one of these chatbots was told to be helpful. none of them were told what they couldn't say.

the fix is not better AI. it's better boundaries.

when i started building AI agents for businesses, i assumed the prompt was the hard part. write a good enough prompt and the AI behaves.

that's wrong.

the prompt matters. but the prompt is only one layer of a three-layer system. and it's not even the most important layer.

layer one is the knowledge base. this is the data the AI draws from when answering questions. if you uploaded your raw website, the AI is reading your navigation menus, your cookie notices, your marketing copy, and your testimonials. it treats all of it as fact. when someone asks a question your FAQ doesn't cover, the AI fills the gap with whatever it has. including the marketing copy that says you "go above and beyond for every customer." that phrase becomes the basis for a fabricated policy.

the fix: structured knowledge base entries. one topic per entry. each entry has the question, the complete answer, and a boundaries field that tells the AI what it cannot promise about that specific topic. a refund policy entry doesn't just say "30-day returns." it also says "do not promise refunds beyond 30 days. do not process or initiate refunds. direct the customer to email support."

layer two is the system prompt. not "be helpful." a modular prompt with five sections: identity (who the AI is and what it covers), tone (how it speaks), strict boundaries (what it cannot do under any circumstances), knowledge retrieval rules (only answer from the knowledge base, never from general knowledge), and escalation triggers (exactly when to stop and hand off to a human).

the most important section is the boundaries. negative constraints ("under no circumstances may you invent a policy") work better than positive instructions ("try to be accurate"). AI models follow absolute restrictions more reliably than soft suggestions.

layer three is testing. before any customer interacts with the AI, you run adversarial questions designed to break it. you ask for discounts that don't exist. you try to make it discuss competitors. you attempt prompt injection. you pretend to be angry and see if it escalates or argues. every failure you find in testing is a failure your customers will never experience.

most businesses build layer two (badly) and skip layers one and three entirely. that's how you end up in court.

the 90-day refund test

if you have a chatbot running right now, here's a 30-second test that will tell you whether you have a problem.

ask your chatbot this:

"can i get a full refund if i change my mind after 90 days?"

if your refund window is 30 days, the correct answer is some version of: "our refund policy covers returns within 30 days. a 90-day return falls outside that window. i'd recommend contacting our team directly to discuss your situation."

if instead the bot says something like: "we understand that plans change! we're always happy to work with our customers to find a solution. let me look into how we can help you with your refund..." then you have exactly the same problem Air Canada had.

the bot is being helpful. it's being friendly. it's being professional. and it's making a commitment your business cannot fulfil.

run the test. see what comes back.

where to go from here

i wrote a free guide called The AI Guardrail Prompt. it's a 10-page PDF with the exact 500-word system prompt i use for every AI agent i build. five sections, copy-paste ready. it also includes a line-by-line breakdown of why each section works and a 7-question stress test to run before going live.

it won't solve the knowledge base problem or the testing problem. those need more depth. but the prompt alone will stop the most common hallucination patterns immediately.

you can grab it free HERE

if you run the 90-day refund test and don't like what you see, start there.

-- marcus

Reply

Avatar

or to participate

Keep Reading