Tech companies are racing to put AI chatbots everywhere: in search engines, on websites, and throughout our daily digital routines. But a troubling new analysis reveals these tools are far from ready for prime time when it comes to accurately reporting news.
A broad investigation by the BBC and 22 other public media organizations in 18 countries found that about 45 percent of responses by AI chatbots, based on news articles provided to them, contain major errors.
The findings raise serious questions about whether such tools should be trusted as sources of information, especially with companies like OpenAI, Google, and Microsoft pushing users to rely on them for summarizing news and analyzing information.
The research tested popular chatbots, including ChatGPT, Google’s Gemini, Microsoft’s Copilot, and Perplexity, with questions about specific news stories. What the researchers found was unsettling: almost half of all responses included problems ranging from factual inaccuracies and fabricated quotes to outdated information that should have been corrected.
Perhaps most troubling, the chatbots frequently struggled with basic sourcing: providing links that didn’t match the sources they claimed to be citing. Even when they got the sources right, the AI tools often couldn’t tell the difference between opinion pieces and straight news reporting, or distinguish satirical content from genuine journalism.
It’s not just a matter of getting minor facts wrong. The study found chatbots failing on fundamental current events. ChatGPT, Copilot, and Gemini all incorrectly identified Pope Francis as the current pope, even though Leo XIV had succeeded him.
AI Chatbots Still Struggle with Basic Facts Despite Access to Current Data
In one particularly bizarre error, Copilot correctly reported Francis’s date of death while simultaneously describing him as still being pope. ChatGPT also gave outdated answers when asked to name Germany’s current chancellor and NATO’s secretary-general.
Performance varied widely between platforms. Google’s Gemini performed significantly worse than the others: serious sourcing errors showed up in 72 percent of its responses. ChatGPT, Copilot, and Perplexity all fared better, though none came close to being reliable.

These issues were consistent across languages and geographies, indicating that the problems are fundamental to how these systems work, rather than isolated bugs that could easily be fixed.
There is a silver lining: things are improving, at least for some chatbots. Compared to a similar BBC study from February, the portion of responses with serious errors dropped from 51 percent to 37 percent. That’s progress, but still means more than one in three AI-generated responses contains significant mistakes.
The errors that persist despite this improvement are noteworthy because they undermine one of the explanations AI companies have previously given. OpenAI used to blame ChatGPT’s mistakes on the model being trained only on data through September 2021 and lacking internet access.
Now that ChatGPT and other tools can access current information online, those excuses no longer apply. The fact that errors continue suggests the problem may be baked into how these systems fundamentally work.
The Dangerous Dynamic of AI Misinformation and News Credibility
What makes these findings particularly alarming is how much trust people are placing in AI-generated answers. The study found more than a third of British adults trust AI to accurately summarize news, with that figure jumping to nearly half among adults under 35.
Even more concerning: 42 percent of people would either blame both the AI and the original source, or trust the original news organization less, if an AI misrepresented a news outlet’s content. This sets up a dangerous dynamic in which news outlets could see their credibility damaged not by their own reporting, but by AI tools that misrepresent their work.
As tech platforms continue promoting chatbots as go-to information sources, these accuracy problems pose very real risks. Users increasingly rely on AI summaries rather than reading original articles, meaning errors can spread without people ever encountering the correct information.
For news organizations already fighting to maintain public trust, having AI tools consistently misrepresent their reporting adds another challenge to an already difficult landscape.
Until these accuracy issues are sorted out, users of AI news summaries should stay skeptical and check key information directly against the original sources.




