AI News Assistants Fall Short on Accuracy and Sourcing

A study by the European Broadcasting Union in collaboration with the BBC has found that leading artificial-intelligence news assistants deliver flawed responses in nearly half of the cases evaluated. The analysis assessed over 3,000 replies from major tools including ChatGPT, Copilot, Gemini and Perplexity, across 14 languages and involving 22 public-service media organisations from 18 countries.

According to the report, 45 per cent of all answers contained at least one major issue, while a broader 81 per cent exhibited some form of problem, including inaccurate attribution, outdated facts or the blending of opinion and fact. Sourcing emerged as a critical weakness: about a third of responses were judged to have serious attribution errors. Gemini, developed by Google LLC, had the highest proportion of sourcing faults at around 72 per cent, significantly above other platforms. The study also reported that 20 per cent of responses contained major factual inaccuracies, such as wrong dates, fictional quotes or misidentification of public figures.

The growing reliance on AI assistants for news consumption adds to the urgency of the findings. The Reuters Institute’s Digital News Report for 2025 shows that about 7 per cent of all online news users now rely on AI assistants as a primary news source, rising to 15 per cent among the under-25 age group. The EBU warns that the combination of high usage and high error rates poses a serious risk to public trust in news and could affect democratic engagement.

Researchers involved in the study point to specific patterns of failure. AI assistants struggled particularly with complex, dynamic stories involving multiple actors, subtle timeline shifts or evolving contexts. They performed better on straightforward factual queries, such as “How many countries have hosted the FIFA World Cup?”, but fared poorly when asked questions such as “What are the implications of the new tax regulation passed in Country X?” In these more complicated cases the assistants often failed to distinguish fact from opinion, attributed statements incorrectly, or left out key context. The study notes that these weaknesses persist across languages and jurisdictions, making the issue systemic rather than isolated.

Industry responses range from acknowledgement to caution. Google said it welcomes feedback on Gemini and is working to improve its accuracy and sourcing. OpenAI and Microsoft both acknowledge “hallucinations”, in which large language models generate plausible but incorrect information, as a known problem and say they are actively working to mitigate it. News-industry executives say they view the findings as a call to action. Jean Philip De Tender, Media Director at the EBU, said: “When people don’t know what to trust, they end up trusting nothing at all, and that can deter democratic participation.”

Within the news ecosystem, the implications are broad. Publishers must now confront the question of how their content may be used or misused by AI intermediaries, which could harm their reputations if the AI-generated versions of their content are flawed. Some news organisations are re-evaluating commercial licensing models and seeking greater transparency about how AI tools source and integrate their material. At the same time, regulators are taking renewed interest: the findings could accelerate calls for oversight mechanisms governing the use of generative AI in journalism.


