Study Shows AI Chatbots Can Blindly Repeat Incorrect Medical Details

- Advertisement -

New Delhi: Amid increasing presence of Artificial Intelligence tools in healthcare, a new study warned that AI chatbots are highly vulnerable to repeating and elaborating on false medical information. Researchers at the Icahn School of Medicine at Mount Sinai, US, revealed a critical need for stronger safeguards before such tools can be trusted in health care.

The team also demonstrated that a simple built-in warning prompt can meaningfully reduce that risk, offering a practical path forward as the technology rapidly evolves. “What we saw across the board is that AI chatbots can be easily misled by false medical details, whether those errors are intentional or accidental,” said lead author Mahmud Omar, from the varsity.

“They not only repeated the misinformation but often expanded on it, offering confident explanations for non-existent conditions. The encouraging part is that a simple, one-line warning added to the prompt cut those hallucinations dramatically, showing that small safeguards can make a big difference,” Omar added.

For the study, detailed in the journal Communications Medicine, the team created fictional patient scenarios, each containing one fabricated medical term such as a made-up disease, symptom, or test, and submitted them to leading large language models.

In the first round, the chatbots reviewed the scenarios with no extra guidance provided. In the second round, the researchers added a one-line caution to the prompt, reminding the AI that the information provided might be inaccurate.

Without that warning, the chatbots routinely elaborated on the fake medical detail, confidently generating explanations about conditions or treatments that do not exist. But, with the added prompt, those errors were reduced significantly.

The team plans to apply the same approach to real, de-identified patient records and test more advanced safety prompts and retrieval tools.

They hope their “fake-term” method can serve as a simple yet powerful tool for hospitals, tech developers, and regulators to stress-test AI systems before clinical use.