Date: 2023-08-31

A new study published in JAMA Oncology raises concerns about the accuracy of OpenAI’s ChatGPT in providing medical advice for cancer treatment. Researchers from Mass General Brigham, Sloan Kettering, and Boston Children’s Hospital tested ChatGPT by asking it for recommendations on cancer treatments based on 104 different prompts.

The results were underwhelming, with ChatGPT scoring only 61.9% when evaluated by four board-certified oncologists using five criteria. The study found that nearly 13% of the responses from the chatbot were “hallucinated,” meaning they sounded factual but were completely inaccurate or unrelated to the prompt.

Dr. Harvey Castro, an emergency medicine physician and AI expert, expressed concern about the potential harm caused by misinformation provided by ChatGPT. For example, a patient with advanced lung cancer might receive a recommendation for a treatment not recognized by the National Comprehensive Cancer Network (NCCN) guidelines, leading to delays in receiving appropriate care.

The study’s lead author, Danielle Bitterman, stated that while ChatGPT excels at mimicking human language, it lacks the training to reliably provide factually correct information. Bitterman noted the challenge of distinguishing correct from incorrect information in health advice.

The researchers also acknowledged some limitations, such as evaluating only one large language model (LLM) at a single point in time. However, ChatGPT 3.5, the version used in the study, is publicly available and widely accessible. The researchers did not extensively investigate prompt engineering, which might have improved the results.

Dr. Castro emphasized the need to use AI chatbots as a supplement, not a replacement, for professional medical advice. He called for caution and adherence to established guidelines and clinical expertise when making treatment recommendations.

While ChatGPT and similar large language models have potential for synthesizing information in accessible language, Bitterman stressed the importance of careful evaluation and optimization for the clinical domain.

Future research will focus on assessing the long-term impact and generalizability of AI chatbots in cancer treatment. Additionally, studies will explore the performance of chatbots in providing suggestions for different types of cancer and medical conditions.

Overall, the study highlights the necessity of ensuring accurate and appropriate recommendations when using AI chatbots for cancer treatment. It is crucial to remain cautious and prioritize professional medical advice to prevent potential harm to patients.

