Date: 2023-08-08

A recent study by researchers from Purdue University has revealed that OpenAI’s chatbot, ChatGPT, often provides incorrect answers to software programming questions. The research team analyzed the responses of ChatGPT to 517 questions from Stack Overflow, focusing on factors such as correctness, consistency, comprehensiveness, and conciseness.

The study found that 52% of ChatGPT’s answers were incorrect, and an overwhelming 77% were overly verbose. Surprisingly, despite these inaccuracies, participants still preferred ChatGPT’s answers 39.34% of the time. This preference was attributed to the chatbot’s comprehensive and well-articulated language style, even though a significant portion of the preferred answers were also wrong.

Participants were able to identify errors in ChatGPT’s responses only when the errors were obvious. The researchers found that errors that weren’t easily verifiable or that required external resources often went unnoticed or were underestimated. Elements such as polite language, textbook-style answers, comprehensiveness, and affiliation played a role in making incorrect answers appear correct to participants.

The study also observed that ChatGPT’s responses contained more language suggesting accomplishment or achievement than Stack Overflow posts did. The chatbot frequently used phrases like “of course I can help you” or “this will certainly fix it.” Additionally, ChatGPT made more conceptual errors, often due to its inability to understand the underlying context of the questions.

The researchers recommended several improvements for Stack Overflow. They suggested implementing effective methods to detect toxicity and negative sentiments in comments and answers. They also emphasized the need for better discoverability of answers and specific guidelines for structuring answers in a step-by-step, detail-oriented manner.

While the study revealed the limitations of ChatGPT, it also highlighted that 60% of respondents still found human-authored answers on Stack Overflow to be more correct, concise, and useful. Although there have been reports of declining usage on Stack Overflow, the platform’s new ownership disputed the reported extent of the decline, saying it is smaller than SimilarWeb’s figures indicate.