TechStory

Shocking: OpenAI Researchers Find That Even The Best AI Is “Unable To Solve The Majority”

by Reshab Agarwal
February 24, 2025
in AI, News

Despite rapid advancements in artificial intelligence, OpenAI researchers find that even the best AI is “unable to solve the majority” of complex coding tasks. CEO Sam Altman, however, remains optimistic, predicting that AI will surpass entry-level programmers by the end of the year.

A recent OpenAI study reveals that even cutting-edge models struggle with most coding challenges. The study, based on a new benchmark called SWE-Lancer, evaluated AI performance on over 1,400 software engineering tasks sourced from Upwork.

AI Models Tested on Real-World Coding Problems

OpenAI assessed three large language models (LLMs): its own o1 reasoning model, its GPT-4o, and Anthropic’s Claude 3.5 Sonnet. The models tackled individual coding tasks such as bug fixes as well as broader software management assignments. They had no internet access, so they could not look up existing solutions online.

The AI models attempted tasks worth hundreds of thousands of dollars on Upwork but could only address surface-level software issues. They struggled to detect deeper bugs or identify root causes, producing incomplete or incorrect solutions. While AI worked faster than human coders, it lacked contextual understanding, leading to unreliable outcomes.

Claude 3.5 Outperforms, But Still Fails Majority of Tests

Among the tested models, Claude 3.5 Sonnet outperformed OpenAI’s o1 and GPT-4o in earnings. However, most of its answers were still incorrect. Researchers concluded that AI models need significantly higher reliability before they can handle real-world coding tasks independently.

The study highlights AI’s ability to execute simple, isolated coding assignments but reinforces that human engineers remain superior in tackling complex software challenges.

Microsoft CEO Criticizes AI Hype

Microsoft CEO Satya Nadella has voiced skepticism about the exaggerated claims surrounding AI’s capabilities. In a recent interview, he dismissed self-declared artificial general intelligence (AGI) milestones as “nonsensical benchmark hacking.”

Nadella emphasized the need to focus on AI’s real-world economic impact rather than pursuing theoretical AGI achievements. He argued that AI should deliver productivity growth on an industrial scale before it is compared to the Industrial Revolution.

Despite his cautious stance, Microsoft remains a major player in AI investment. The company has poured some $12 billion into OpenAI, has said it plans to spend about $80 billion on AI data centers, and is a technology partner in the ambitious $500 billion Stargate project announced by U.S. President Donald Trump.

AI Faces Technical and Economic Hurdles

One of the biggest challenges AI faces in coding is contextual understanding. More broadly, the AI industry faces numerous obstacles, from persistent “hallucinations” in AI responses to cybersecurity risks. Despite massive investments, AI-driven productivity growth has yet to materialize.

Chinese AI startup DeepSeek recently challenged industry leaders by introducing a low-cost, high-efficiency reasoning model called R1. Its release triggered a major stock selloff that wiped roughly $1 trillion off the market value of AI-related companies.

As tech giants continue to invest heavily in AI, skepticism remains about whether these models can genuinely transform industries. Nadella’s remarks signal a push for a more practical approach, urging companies to prioritize real economic value over ambitious AI claims.

Another key concern is AI’s economic impact. Despite significant investments in AI technology, its practical benefits remain limited. AI-driven automation was expected to revolutionize software engineering, but the reality is different. AI lacks reliability and cannot work independently on complex projects, making human oversight necessary. OpenAI researchers have concluded that AI still requires higher accuracy and contextual awareness before it can replace human coders.

 

Reshab Agarwal

Reshab is a tech-enthusiast who likes to write about all things crypto. He is a Bitcoin bull and believes in a decentralized future of finance. Follow him on Twitter for more!

© 2025 Techstory.in
