A recent study conducted by a team from the Cole Eye Institute at the Cleveland Clinic Foundation highlights the importance of thoroughly vetting and fact-checking medical research content generated by artificial intelligence (AI) chatbots. The researchers, led by Hong-Uyen Hua, MD, examined the quality of ophthalmic scientific abstracts and references produced by different versions of an AI chatbot.

The study compared two versions of the chatbot and evaluated their ability to generate abstracts and references for clinical research questions in ophthalmology. The abstracts were assessed using modified DISCERN criteria and performance evaluation scores. Additionally, two AI output detectors were used to evaluate the abstracts. The study also calculated the hallucination rate, defined as the proportion of chatbot-generated references that could not be verified.

The study found that the quality of the abstracts generated by the two versions of the chatbot was comparable. The mean modified AI-DISCERN scores for the abstracts were 35.9 and 38.1 out of a maximum score of 50 for the earlier and updated versions, respectively. The two versions differed sharply, however, in the fake scores assigned by the AI output detectors: abstracts from the updated version received a mean fake score of 10.8%, compared with 65.4% for the earlier version. In other words, the detectors were far less likely to flag the updated version's abstracts as AI-generated.

Both versions of the chatbot had a similar hallucination rate, with around 30% of the generated references being nonverifiable on average. This indicates that while the abstracts were of decent quality, the references produced alongside them carried a real risk of factual errors or outright fabrication.
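To make the arithmetic behind a hallucination rate concrete, here is a minimal Python sketch. It assumes that each generated reference has already been checked by hand and marked as verifiable or not. The function name `hallucination_rate` and the example list are hypothetical illustrations, not the study's code or data.

```python
def hallucination_rate(verified_flags):
    """Return the fraction of references that could not be verified.

    verified_flags: list of booleans, True if a reference was verified.
    """
    if not verified_flags:
        raise ValueError("at least one reference is required")
    unverifiable = sum(1 for ok in verified_flags if not ok)
    return unverifiable / len(verified_flags)


# Hypothetical example (not study data): 10 references, 3 unverifiable.
example_flags = [True, True, False, True, True, False, True, True, True, False]
rate = hallucination_rate(example_flags)
print(f"Hallucination rate: {rate:.0%}")
```

The rate is only as reliable as the manual verification step that produces the flags, which is the part of the process the researchers urge readers not to skip.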

The researchers emphasized the need for caution when using AI-generated medical content for health education or academic purposes. They warned that AI chatbots may produce references that are not credible or accurate. Clinicians and researchers should carefully vet and fact-check any content generated by AI before relying on it for medical purposes.

In conclusion, while AI chatbots have the potential to generate ideas and references for medical research, it is crucial to verify their accuracy and reliability. Further research and improvements in AI detectors are needed to ensure the integrity of AI-generated medical content.