Zoom says its "federated AI" model, combining its SLM with open- and closed-source models, got 48.1% on Humanity's Last Exam vs. 45.8% for Gemini 3 Pro w/ tools (Xuedong Huang/Zoom)

Techmeme • Friday, December 12, 2025 at 10:00:58 PM

Positive · Artificial Intelligence

Zoom says its federated AI model scored 48.1% on Humanity's Last Exam, 2.3 percentage points ahead of the 45.8% reported for Google's Gemini 3 Pro with tools. The model combines Zoom's own small language model (SLM) with open- and closed-source models.
The result positions Zoom as a competitive player in AI and could strengthen its reputation and market share in AI products. Humanity's Last Exam is a benchmark of expert-level questions posed to AI systems, so the score reflects performance on difficult reasoning tasks rather than on educational assessments.
The announcement arrives while even top models such as Gemini 3 Pro face scrutiny over factual accuracy and hallucination rates. That scrutiny reflects a broader industry concern about how well AI handles complex tasks, and it has prompted calls for better benchmarks and evaluation methods.

— via World Pulse Now AI Editorial System

Continue Reading

arXiv — cs.CV • 2 days ago

Effective Online Exam Proctoring by Combining Lightweight Face Detection and Deep Recognition

Positive · Artificial Intelligence

A new online exam proctoring system named iExam combines lightweight face detection with deep recognition to protect the integrity of online assessments conducted via platforms like Zoom. The system addresses the challenge of monitoring multiple student video feeds in real time and includes post-exam analysis to detect abnormal behaviors such as face disappearance and identity substitution.

TechCrunch • 2 days ago

Google launched its deepest AI research agent yet — on the same day OpenAI dropped GPT-5.2

Positive · Artificial Intelligence

Google has launched its most advanced AI research agent to date, based on the Gemini 3 Pro model, allowing developers to integrate this tool into their applications. This release coincided with OpenAI's introduction of GPT-5.2, marking a significant moment in the competitive AI landscape.

THE DECODER • 2 days ago

FACTS benchmark shows that even top AI models struggle with the truth

Negative · Artificial Intelligence

Google DeepMind has introduced a new benchmark called FACTS, designed to assess the reliability of AI models more thoroughly. The results indicate that even leading models like Gemini 3 Pro and GPT-5.1 exhibit significant shortcomings in factual accuracy, highlighting the challenges these technologies face in delivering truthful information.

VentureBeat — AI • 3 days ago

The 70% factuality ceiling: why Google’s new ‘FACTS’ benchmark is a wake-up call for enterprise AI

Neutral · Artificial Intelligence

Google has introduced a new benchmark called 'FACTS' that measures the factual accuracy of generative AI models, addressing a gap in existing benchmarks that focus primarily on task completion rather than the truthfulness of the information generated. The initiative matters most in sectors where accuracy is essential, such as law, finance, and medicine.
