FACTS benchmark shows that even top AI models struggle with the truth

THE DECODER | Thursday, December 11, 2025 at 4:37:14 PM
  • Google DeepMind has introduced a new benchmark called FACTS, designed to assess the reliability of AI models more thoroughly. The results indicate that even leading models like Gemini 3 Pro and GPT-5.1 exhibit significant shortcomings in factual accuracy, highlighting the challenges these technologies face in delivering truthful information.
  • This development is crucial for Google as it seeks to enhance the credibility and performance of its AI offerings. The FACTS benchmark aims to address the growing concerns regarding the reliability of AI outputs, which is essential for maintaining user trust and advancing AI applications in various sectors.
  • The introduction of the FACTS benchmark underscores a broader industry trend towards prioritizing factual accuracy in AI systems. As companies like Google and OpenAI compete to improve their models, the persistent issues of hallucination and misinformation remain critical challenges, prompting ongoing discussions about the ethical implications and responsibilities of AI developers.
— via World Pulse Now AI Editorial System

Continue Reading
Google launched its deepest AI research agent yet — on the same day OpenAI dropped GPT-5.2
Positive | Artificial Intelligence
Google has launched its most advanced AI research agent to date, based on the Gemini 3 Pro model, allowing developers to integrate this tool into their applications. This release coincided with OpenAI's introduction of GPT-5.2, marking a significant moment in the competitive AI landscape.
Google opens updated Deep Research Agent to developers with new API
Positive | Artificial Intelligence
Google has launched an updated version of its Deep Research Agent, making it accessible to developers for the first time through a new API. This release is part of Google's ongoing efforts to enhance its AI capabilities and improve complex web search functionalities with an open-source benchmark.
AI in space requires new cooling tech and cheap rockets
Neutral | Artificial Intelligence
The increasing energy demands of modern AI models are prompting tech companies to explore space-based solutions, necessitating advancements in cooling technologies and affordable rocket launches. This shift reflects a long-term vision among industry leaders to harness the unique advantages of space for AI applications.
The 70% factuality ceiling: why Google’s new ‘FACTS’ benchmark is a wake-up call for enterprise AI
Neutral | Artificial Intelligence
Google has introduced a new benchmark called 'FACTS' aimed at measuring the factual accuracy of generative AI models, addressing a critical gap in existing benchmarks that focus primarily on task completion rather than the truthfulness of the information generated. This initiative is particularly significant for industries where accuracy is essential, such as legal, finance, and medical sectors.
Deepseek reportedly using thousands of smuggled Nvidia chips for AI training
Negative | Artificial Intelligence
Deepseek, a Chinese AI startup, is reportedly using thousands of smuggled Nvidia chips for training its next major AI model, raising concerns about compliance with international trade regulations. This information comes from sources cited by The Information, highlighting the lengths to which the company is going to enhance its technological capabilities.
Aviation startup Boom pivots to gas turbines to feed AI’s power hunger
Neutral | Artificial Intelligence
US aviation startup Boom Supersonic is shifting its focus from developing a supersonic passenger jet to entering the energy sector by creating gas turbines to meet the growing power demands of artificial intelligence (AI). This pivot aims to capitalize on the increasing energy needs driven by AI advancements.
Google Rolls Out AI Plus Subscription in India at ₹399 Per Month
Positive | Artificial Intelligence
Google has launched its AI Plus subscription service in India, priced at ₹399 per month, as part of its ongoing efforts to enhance user engagement with its AI technologies. This initiative aligns with the company's strategy to expand its AI offerings and reach a broader audience in the Indian market.