• Send Us A Tip
  • Calling all Tech Writers
  • Advertise
Sunday, July 5, 2026
  • Login
TechStory
  • News
  • Crypto
  • Gadgets
  • Memes
  • Gaming
  • Cars
  • AI
  • Startups
  • Markets
  • How to
No Result
View All Result
  • News
  • Crypto
  • Gadgets
  • Memes
  • Gaming
  • Cars
  • AI
  • Startups
  • Markets
  • How to
No Result
View All Result
TechStory
No Result
View All Result
Home Future Tech AI

OpenAI Just Pulled a Theranos with o3: Transparency Questions Emerge

by Reshab Agarwal
January 20, 2025
in AI, News
Reading Time: 3 mins read
0
Tech Giant OpenAI is Projected to Turn a Profit in 2029, Reports Show
TwitterWhatsappLinkedin

OpenAI just pulled a Theranos with o3 by claiming record-breaking performance on the FrontierMath benchmark while having access to much of the test data. OpenAI’s recent claims of record-breaking performance on the EpochAI FrontierMath benchmark have sparked significant criticism. Allegations have emerged that OpenAI had access to a substantial portion of the test data, raising concerns about the validity of the results. Critics have drawn parallels between this controversy and the infamous Theranos scandal.

You might also like

Project Aion Discovered Leaked Microsoft Experiment Reveals Web-Based Agentic OS Built Around Copilot

The AI Industrial Drone Wisconsin Homeowners Sue Microsoft Over Data Center Noise

UK Culture Secretary Lisa Nandy Quits X, Calls Platform a Threat to Healthy Public Debate

EpochAI’s associate director, Tamay Besiroglu, acknowledged that OpenAI’s involvement in developing the benchmark was not disclosed upfront. Besiroglu admitted, “We made a mistake in not being more transparent about OpenAI’s involvement.” He also revealed that a contractual agreement restricted disclosure of OpenAI’s role until the launch of the o3 model.

According to Besiroglu, OpenAI accessed a large portion of the FrontierMath dataset. However, an unseen hold-out set was reportedly used to verify the model’s capabilities. Despite this, six mathematicians who contributed to the benchmark claimed they were unaware of OpenAI’s exclusive access to the data. Some expressed doubts about participating if they knew about the arrangement.

Record-Breaking Results Under Fire

In December 2024, OpenAI announced that its o3 model achieved a groundbreaking 25% accuracy on the FrontierMath benchmark, compared to the previous high score of 2% by other models. The benchmark involves solving highly complex mathematical problems.

Besiroglu had earlier stated that the benchmark data was private and not used for training. However, a footnote in the updated FrontierMath research paper revealed OpenAI’s support in creating the benchmark.

Experts like Gary Marcus claim that OpenAI just pulled a Theranos with o3 by making bold claims without independent validation of its results. AI experts, including Gary Marcus, have questioned OpenAI’s claims, citing a lack of external validation. Mikhail Samin, from the AI Governance and Safety Institute, criticized OpenAI’s history of secretive practices.

Disputes Over ARC-AGI Benchmark

OpenAI also claimed that the o3 model achieved nearly 90% accuracy on the ARC-AGI benchmark, outperforming human capabilities. However, François Chollet, creator of the ARC-AGI benchmark, dismissed these claims, stating that the model still struggles with simpler tasks in the benchmark.

Marcus echoed concerns about the lack of independent evaluations of o3’s robustness across diverse problem sets.

Launch of o3-Mini and Future Plans

Amid the controversy, OpenAI CEO Sam Altman announced the upcoming release of the o3-mini model, a smaller and more efficient version of the o3 model. The launch is scheduled to take place within two weeks on both the API and ChatGPT platforms.

The o3-mini is designed to provide advanced capabilities while being more resource-efficient. It features adjustable reasoning modes, allowing users to optimize performance for specific tasks.

Altman also hinted at future developments, including a potential merger of OpenAI’s GPT and o-series models by 2025. Expressing confidence in OpenAI’s progress, Altman said the company is well-positioned to be the first to achieve artificial general intelligence (AGI).

Developers and Customization Updates

OpenAI has introduced updates to its Realtime API, enabling developers to create voice applications using multi-agent flows in under 20 minutes. Additionally, ChatGPT’s custom instructions have been enhanced to allow users to specify how the AI interacts, including desired traits and response styles.

Tamay Besiroglu admitted that OpenAI just pulled a Theranos with o3 by restricting the disclosure of its role until after the model’s release. Tamay Besiroglu’s admission that OpenAI’s involvement was hidden due to contractual restrictions further fuels concerns. This lack of openness not only damages the credibility of the benchmark but also harms trust in OpenAI’s research.

Tweet57SendShare16
Previous Post

BP to Lay Off 4,700 Employees and 3,000 Contractors to Cut Costs and Rebuild Investor Confidence

Next Post

Instagram Launches Edits: A Game-Changing App for Video Creators

Reshab Agarwal

Reshab is a tech-enthusiast who likes to write about all things crypto. He is a Bitcoin bull and believes in a decentralized future of finance. Follow him on Twitter for more!

Recommended For You

Project Aion Discovered Leaked Microsoft Experiment Reveals Web-Based Agentic OS Built Around Copilot

by Anochie Esther
July 5, 2026
0
agentic AI operating system

The multi-billion-dollar corporate push toward generative artificial intelligence is moving past standalone companion widgets and plunging straight into the core architecture of desktop computing. For years, major operating...

Read more

The AI Industrial Drone Wisconsin Homeowners Sue Microsoft Over Data Center Noise

by Anochie Esther
July 5, 2026
0
data center noise complaints

The massive, cross-country expansion of artificial intelligence infrastructure is fast colliding with local community standards and basic residential property rights. Across the United States, tech titans are racing...

Read more

UK Culture Secretary Lisa Nandy Quits X, Calls Platform a Threat to Healthy Public Debate

by Ishaan Negi
July 5, 2026
0
UK Culture Secretary Lisa Nandy Quits X, Calls Platform a Threat to Healthy Public Debate

The debate over social media's role in modern society has taken another dramatic turn. UK Culture Secretary Lisa Nandy has announced that she is leaving X (formerly Twitter),...

Read more
Next Post
G-suite comes back Online after a long outage in many countries and other tech news

Instagram Launches Edits: A Game-Changing App for Video Creators

Please login to join discussion

Techstory

Tech and Business News from around the world. Follow along for latest in the world of Tech, AI, Crypto, EVs, Business Personalities and more.
reach us at info@techstory.in

Advertise With Us

Reach out at - info@techstory.in

Aviator Game India 2026

BROWSE BY TAG

#Crypto #howto 2024 acquisition AI amazon Apple Artificial Intelligence bitcoin Business China cryptocurrency e-commerce electric vehicles Elon Musk Ethereum facebook funding Gaming Google India Instagram Investment ios iPhone IPO Market Markets Meta Microsoft News OpenAI samsung Social Media SpaceX startup startups tech technology Tesla TikTok trend trending twitter US

© 2025 Techstory.in

No Result
View All Result
  • News
  • Crypto
  • Gadgets
  • Memes
  • Gaming
  • Cars
  • AI
  • Startups
  • Markets
  • How to

© 2025 Techstory.in

Welcome Back!

Login to your account below

Forgotten Password?

Retrieve your password

Please enter your username or email address to reset your password.

Log In
Are you sure want to unlock this post?
Unlock left : 0
Are you sure want to cancel subscription?