AI Chatbots Invented a Fake Disease — And Millions Believed It

Published on June 1, 2026 at 4:18 PM


Category: AI & Health Misinformation

THE EXPERIMENT

A researcher wanted to know what happens when you ask an AI chatbot about a disease that does not exist. So they invented one: bixonimania. The word has no medical meaning. It appears in no clinical literature, no diagnostic manual, and no research database. It was typed into four of the most widely used AI assistants (ChatGPT, Gemini, Microsoft Copilot, and Perplexity), and all four responded as if it were a real, documented medical condition.

ChatGPT described symptoms and suggested treatments. Gemini offered a clinical-sounding definition. Copilot cited what appeared to be medical sources — sources that, on closer inspection, did not exist. Perplexity generated links to research papers that had never been written. Not one of the four systems said "I don't recognise this term" or "I cannot find any clinical evidence for this condition." They all answered with the same confident, authoritative tone they use when discussing diabetes or hypertension.

This is the bixonimania hoax: a single made-up word that exposed a structural flaw running through every major AI health platform.

WHY THIS IS A SERIOUS PROBLEM

The experiment might sound like a clever trick, but its implications are straightforward and alarming. Millions of people now turn to AI chatbots as their first point of contact for health information. A 2026 survey by the Pew Research Center found that over 40% of adults in the UK and the US had used an AI assistant to look up a symptom or medication within the previous six months. For many, particularly those without easy access to a GP or those seeking information outside surgery hours, the chatbot has become a de facto first opinion.

When that first opinion is fabricated, the consequences can be severe. A person who reads a convincing AI-generated description of a fake condition may delay seeking actual medical care. They may self-diagnose with something that does not exist while missing the signs of something that does. They may share the information with others, compounding the spread of misinformation through their social network. And because the response came from a technology most people associate with intelligence and accuracy, they are unlikely to question it.

The bixonimania experiment is a demonstration of what researchers call "hallucination" in large language models, the tendency of AI systems to generate plausible-sounding but entirely fabricated information. In low-stakes contexts, a hallucinated fact about a historical date or a film plot is a minor inconvenience. In a medical context, it can cause real harm.

THE DATA BEHIND THE DANGER

The bixonimania test is not a one-off curiosity. It reflects a pattern documented at scale. A 2026 study published in BMJ Open examined thousands of AI-generated health responses across multiple platforms and found that 50% contained inaccuracies. That figure includes responses that were partially correct but included significant factual errors, responses that omitted critical safety information, and entirely wrong responses. The study found particular problems in areas of pharmacology (dosage information, drug interactions, and contraindications) and in mental health guidance, where AI systems repeatedly provided advice that contradicted established clinical guidelines.

ECRI, the independent health safety organisation, named AI chatbot misuse as the number one health technology hazard of 2026. Their report noted that patient-facing AI tools are being deployed and adopted far faster than the safety frameworks required to govern them. Unlike a pharmaceutical product, which must pass through years of clinical trials before reaching a patient, an AI chatbot can be updated overnight, changing its behaviour without any regulatory review. Unlike a doctor or pharmacist, it carries no liability for the advice it gives.

The combination of high confidence, zero accountability, and a 50% inaccuracy rate is, in the assessment of ECRI, a patient safety crisis in the making.

WHAT THE AI COMPANIES SAY

All four companies whose products were tested have published guidance acknowledging that their tools should not be used as a substitute for professional medical advice. ChatGPT includes a standard disclaimer at the bottom of health-related responses. Gemini sometimes adds a suggestion to consult a doctor. These disclaimers are easy to overlook, written in small text after a confident, detailed response that reads nothing like a caveat.

The deeper issue is that disclaimers do not address the structural problem. When a system confidently describes the symptoms, causes, and treatments of a disease that does not exist, the problem is not insufficient warning text. The problem is that the system cannot distinguish between knowledge and fabrication and has no mechanism to flag its own uncertainty in a way that users will act on.

Some researchers have proposed requiring AI health tools to cite verified medical databases and refuse to respond to queries that return no results from those sources. Others have called for mandatory accuracy audits before deployment. The regulatory conversation is beginning, but progress is slow relative to the pace at which these tools are being adopted.
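The first of those proposals, answering only when a verified source returns a result, can be sketched in a few lines. This is a minimal illustration, not any vendor's actual design: the VERIFIED_TERMS dictionary, the Answer type, and the answer_health_query function are all hypothetical names, and the in-memory dictionary merely stands in for a curated medical database.

```python
from dataclasses import dataclass

# Hypothetical stand-in for a verified medical database. A real system
# would query a curated clinical source rather than an in-memory dict.
VERIFIED_TERMS = {
    "diabetes": "Placeholder for a vetted clinical summary of diabetes.",
    "hypertension": "Placeholder for a vetted clinical summary of hypertension.",
}


@dataclass
class Answer:
    term: str
    verified: bool
    text: str


def answer_health_query(term: str) -> Answer:
    """Answer only when the term is found in the verified source."""
    key = term.strip().lower()
    entry = VERIFIED_TERMS.get(key)
    if entry is None:
        # No match in the verified source: refuse instead of generating text.
        return Answer(
            term=term,
            verified=False,
            text=f"I cannot find any clinical evidence for '{term}'.",
        )
    return Answer(term=term, verified=True, text=entry)


if __name__ == "__main__":
    for query in ("hypertension", "bixonimania"):
        result = answer_health_query(query)
        print(f"{query}: verified={result.verified} | {result.text}")
```

Under this assumed design, a query for "bixonimania" produces a refusal rather than a description, because the lookup fails before any text is generated. The hard part in practice is the database itself, which this sketch does not attempt to model.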

HOW TO PROTECT YOURSELF

Until the regulatory and technical frameworks catch up, the responsibility largely falls on individuals to use AI health tools critically. A few practical principles can help.

Never use an AI chatbot as your only source for a health concern. Use it as a starting point to generate questions for your doctor, not as a final answer. The chatbot cannot examine you, cannot access your medical history, and cannot take responsibility for what it tells you.

Ask the AI to cite its sources. If it cites a specific study or guideline, search for that study independently. If you cannot verify that it exists, treat the entire response with caution. The Copilot response to "bixonimania" cited sources that could not be found, a pattern consistent with other documented hallucinations in health contexts.

Pay attention to the confidence level of the response. AI systems that answer obscure or non-existent queries with the same tone as well-documented medical facts should be treated with greater scepticism than systems that explicitly flag uncertainty. Calibrated uncertainty is a sign of a better-designed system.

Use fact-checking tools designed for this purpose. FactCheckerPro is built to flag AI-generated claims and cross-reference them against verified sources. When you encounter a health claim online, whether from an AI, a social media post, or any other source, checking it before acting on it takes seconds and can make a significant difference.

CONCLUSION

A researcher typed a made-up word into four AI chatbots and received four confident, detailed, medically framed responses about a disease that does not exist. That is not a technical glitch or an edge case. It is a systematic failure of the most widely used health information tools in the world, operating at a time when people are turning to those tools in greater numbers than ever before.

The bixonimania hoax should be understood as a warning, not a punchline. The fact that it can be replicated, that any reader could type a nonsense medical term into any major chatbot and receive a convincing response, tells us that the problem is neither rare nor random. It is structural, and it requires a structural response: better design, proper regulation, and a public that knows to verify before it trusts.

Verify before you act. That is what FactCheckerPro is here to help you do.

Sources: BMJ Open (2026), ECRI Top 10 Health Technology Hazards 2026, original researcher test documentation.
