OpenAI’s ChatGPT Fools a Third of Users Despite 52% Wrong Answers, Purdue Study Finds


OpenAI’s ChatGPT has emerged as a popular tool for various uses in the rapidly evolving artificial intelligence landscape. 

However, a recently published Purdue University study sheds light on a critical element of ChatGPT’s performance that deserves attention: its accuracy in answering software engineering questions. 

The study, titled “Who Answers It Better? An In-Depth Analysis of ChatGPT and Stack Overflow Answers to Software Engineering Questions,” delves deep into the quality and usability of ChatGPT’s responses, uncovering some intriguing and, at times, problematic findings.

Exposing ChatGPT With Programmer Questions

The Purdue team meticulously examined ChatGPT’s answers to 517 questions sourced from Stack Overflow, a well-known Q&A platform for programmers. 

The assessment spanned various criteria, including correctness, consistency, comprehensiveness, and conciseness. The results were both enlightening and concerning. 

ChatGPT answered approximately 52% of software engineering questions incorrectly, raising significant questions about its accuracy and reliability as a programming resource.

The study unveiled another interesting aspect of ChatGPT’s behavior: verbosity. A staggering 77% of ChatGPT’s responses were deemed excessively wordy, potentially impacting the clarity and efficiency of its solutions. 

Despite the inaccuracies and verbosity, study participants still preferred ChatGPT’s responses 39.34% of the time. According to the study, this preference stems from ChatGPT’s comprehensive and well-articulated language style.

Moreover, the research highlighted a distinctive trait of ChatGPT’s approach – a propensity for conceptual errors. The model seems to struggle to grasp the underlying context of questions, leading to a higher frequency of errors stemming from a lack of conceptual understanding.

Even when an answer contained glaring inaccuracies, participants in the study often marked the response as preferred, indicating the influence of ChatGPT’s polite, authoritative style.

The authors also point to ChatGPT’s limitations in reasoning. The model often provides solutions or code snippets without clearly understanding their implications, which hints at the challenge of incorporating reasoning into language models like ChatGPT.

A Closer Look

As News18 reports, the Purdue study also delved into the linguistic and sentiment aspects of ChatGPT’s responses. 

Surprisingly, the model’s answers exhibited more formal language, analytic thinking, and positive sentiments compared to responses from Stack Overflow. 

This inclination towards positivity might contribute to user trust in ChatGPT’s answers, even when they contain inaccuracies.

What This Study Holds

The implications of this study extend beyond the confines of ChatGPT’s performance. The observed decline in usage of traditional platforms like Stack Overflow suggests that ChatGPT’s popularity is altering the landscape of seeking programming assistance online. 

In response to these findings, the researchers offer valuable recommendations. Platforms like Stack Overflow could benefit from enhancing the detection of negative sentiments and toxicity in answers and providing more precise guidelines for structuring answers effectively. 

The study emphasizes that while ChatGPT can be useful, users should be aware of the potential risks associated with seemingly accurate answers.

ⓒ 2023 TECHTIMES.com All rights reserved. Do not reproduce without permission.
