Gemini 3 Pro tops new AI reliability benchmark, but hallucination rates remain high

THE DECODER • Wednesday, November 19, 2025 at 3:57:04 PM

Negative • Artificial Intelligence

Artificial Analysis has released a benchmark indicating that only four out of 40 large language models, including Google's Gemini 3 Pro, achieved positive reliability scores, raising concerns about AI accuracy.
Gemini 3 Pro's result matters to Google, which is seeking leadership in AI amid growing scrutiny of how reliable model outputs are.
The findings reflect an ongoing debate in the AI community over how to balance innovation with reliability, as companies expand model capabilities while hallucinations persist.

— via World Pulse Now AI Editorial System

Continue Reading

Engadget • a day ago

28 advocacy groups call on Apple and Google to ban Grok, X over nonconsensual deepfakes

Negative • Artificial Intelligence

Twenty-eight advocacy groups have urged Apple and Google to ban the platforms Grok and X due to their involvement in nonconsensual deepfake content, highlighting growing concerns over digital privacy and consent.

via Engadget

TechCrunch • a day ago

Google’s Trends Explore page gets new Gemini capabilities

Positive • Artificial Intelligence

Google has upgraded its Trends Explore page, integrating Gemini capabilities to enhance the analysis of search interest and allow users to identify and compare relevant trends more effectively. This significant update aims to improve user engagement and data insights.

via TechCrunch

THE DECODER • a day ago

Google taps its massive data advantage with new Gemini feature

Positive • Artificial Intelligence

Google has introduced a new feature called 'Personal Intelligence' for its Gemini AI, which integrates data from Gmail, Google Photos, and YouTube to enhance user interactions. This feature aims to make the AI assistant more responsive and personalized by leveraging Google's extensive data resources.

via THE DECODER

The Guardian — Artificial Intelligence • a day ago

Musk claims he was unaware of Grok generating explicit images of minors

Negative • Artificial Intelligence

Elon Musk stated he was unaware of any explicit images of minors generated by Grok, an AI tool developed by his company xAI, amidst increasing global scrutiny over the tool's capacity to produce nonconsensual sexual images. Musk's comments were made in response to growing concerns from lawmakers and advocacy groups, urging major tech companies to remove Grok from their app stores.

via The Guardian — Artificial Intelligence

Engadget • a day ago

Gemini can now pull context from the rest of your Google apps, if you let it

Neutral • Artificial Intelligence

Google has announced that its AI model, Gemini, can now pull context from other Google applications, enhancing its functionality and user experience. This capability allows Gemini to provide more personalized and relevant responses by integrating data from services like Gmail and Calendar, contingent on user consent.

via Engadget

Bloomberg Technology • a day ago

Google Gemini Can Proactively Analyze Users’ Gmail, Photos, Searches

Positive • Artificial Intelligence

Alphabet Inc.'s Google has announced that its Gemini artificial intelligence assistant can now proactively analyze users' data across various platforms, including Gmail, Search, Photos, and YouTube, enhancing personalization for its consumer-facing AI product.

via Bloomberg Technology

TechCrunch • a day ago

Gemini’s new beta feature provides proactive responses based on your photos, emails, and more

Neutral • Artificial Intelligence

Google has launched a new beta feature for its AI model, Gemini, called 'Personal Intelligence,' which allows the AI to proactively respond to users by analyzing their emails, photos, and other data, contingent on user consent. This feature is currently off by default, giving users control over their data integration with Gemini.

via TechCrunch

THE DECODER • a day ago

New Apple-Google deal pushes ChatGPT to the sidelines on iPhone

Negative • Artificial Intelligence

Apple's recent partnership with Google has led to the integration of Google's AI technologies into iPhones, effectively sidelining ChatGPT as a secondary option for users. This strategic move indicates a shift in Apple's AI strategy, prioritizing Google's offerings over those from OpenAI.

via THE DECODER
