Date: 2023-08-13

According to a study conducted by Purdue University, OpenAI’s chatbot, ChatGPT, generates incorrect answers to software programming questions more than half of the time. The researchers analyzed ChatGPT’s responses to 517 questions from Stack Overflow, focusing on factors such as correctness, consistency, comprehensiveness, and conciseness. The study also involved linguistic and sentiment analysis, as well as feedback from volunteer participants.

The findings revealed that 52 percent of ChatGPT’s answers were incorrect and 77 percent were overly verbose. Despite these shortcomings, participants still preferred ChatGPT’s answers 39.34 percent of the time because of their comprehensive, well-articulated language style. Surprisingly, 77 percent of those preferred answers were found to be wrong.

The study found that users could identify errors in ChatGPT’s answers only when the errors were obvious. When errors were more nuanced or could be verified only by consulting external resources, users often failed to notice them or underestimated their severity. Remarkably, even when an answer contained a glaring mistake, two out of twelve participants still marked it as their preferred response. The researchers attributed this tendency to ChatGPT’s pleasant and authoritative communication style.

The researchers observed that polite language, textbook-like answers, comprehensiveness, and affiliation in responses can make completely incorrect answers appear correct to users. The study underscores the importance of critically evaluating AI-generated responses, especially in technical domains such as software programming where accuracy is crucial.

While the study sheds light on ChatGPT’s limitations, it also serves as a reminder that users should exercise caution and verify information from multiple sources when relying on AI chatbots for accurate answers to programming queries.