Zoom says its "federated AI" model, combining its SLM with open- and closed-source models, got 48.1% on Humanity's Last Exam vs. 45.8% for Gemini 3 Pro w/ tools (Xuedong Huang/Zoom)

Techmeme | Friday, December 12, 2025 at 10:00:58 PM
  • Zoom announced that its federated AI model scored 48.1% on the Humanity's Last Exam benchmark, ahead of the 45.8% scored by Google's Gemini 3 Pro with tools. The model combines Zoom's small language model (SLM) with both open- and closed-source models.
  • The result positions Zoom as a competitive player in the AI landscape and could strengthen its reputation and potential market share in AI technologies.
  • The result arrives amid ongoing concern about AI reliability: even top models like Gemini 3 Pro face scrutiny over factual accuracy and hallucination rates. That concern extends to how well AI handles complex tasks and has prompted discussion of better benchmarks and evaluation methods.
— via World Pulse Now AI Editorial System

Continue Reading
Effective Online Exam Proctoring by Combining Lightweight Face Detection and Deep Recognition
Positive | Artificial Intelligence
A new online exam proctoring system named iExam has been developed, which integrates lightweight face detection and deep recognition technologies to enhance the integrity of online assessments conducted via platforms like Zoom. This system addresses the challenges of monitoring multiple student video feeds in real time and includes features for post-exam analysis to detect abnormal behaviors such as face disappearance and identity substitution.
Google launched its deepest AI research agent yet — on the same day OpenAI dropped GPT-5.2
Positive | Artificial Intelligence
Google has launched its most advanced AI research agent to date, based on the Gemini 3 Pro model, allowing developers to integrate this tool into their applications. This release coincided with OpenAI's introduction of GPT-5.2, marking a significant moment in the competitive AI landscape.
FACTS benchmark shows that even top AI models struggle with the truth
Negative | Artificial Intelligence
Google DeepMind has introduced a new benchmark called FACTS, designed to assess the reliability of AI models more thoroughly. The results indicate that even leading models like Gemini 3 Pro and GPT-5.1 exhibit significant shortcomings in factual accuracy, highlighting the challenges these technologies face in delivering truthful information.
The 70% factuality ceiling: why Google’s new ‘FACTS’ benchmark is a wake-up call for enterprise AI
Neutral | Artificial Intelligence
Google has introduced a new benchmark called 'FACTS' that measures the factual accuracy of generative AI models, addressing a gap in existing benchmarks, which focus primarily on task completion rather than the truthfulness of the information generated. The initiative is particularly significant for sectors where accuracy is essential, such as the legal, financial, and medical sectors.
