Spot The Ball: A Benchmark for Visual Social Inference

arXiv, cs.CV | Thursday, November 20, 2025 at 5:00:00 AM
  • 'Spot The Ball' is a new benchmark for assessing visual social inference in AI. It uses sports images in which the ball is missing, and the task is to locate it. The evaluation compared human accuracy with that of four leading vision-language models (VLMs) and found that humans significantly outperform the models.
  • The result underscores how difficult it remains for AI models to interpret nuanced visual information, and suggests that further advances in training are needed to improve their understanding of social cues and their performance in real
— via World Pulse Now AI Editorial System
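
The summary above does not say how a guess is scored, so the following is only a minimal sketch of one plausible scheme: a guess counts as correct if it lands within a fixed pixel radius of the true ball position, and accuracy is the fraction of correct guesses. The function names, the 20-pixel tolerance, and the toy coordinates are all assumptions made for illustration; they are not taken from the benchmark and are not its results.

```python
import math


def is_hit(guess, truth, tolerance):
    """Return True if the guess lies within `tolerance` pixels of the true position."""
    return math.dist(guess, truth) <= tolerance


def accuracy(guesses, truths, tolerance=20.0):
    """Fraction of guesses that fall within `tolerance` pixels of the truth.

    The tolerance radius is a hypothetical choice, not the benchmark's metric.
    """
    if len(guesses) != len(truths):
        raise ValueError("guesses and truths must have the same length")
    if not truths:
        raise ValueError("at least one image is required")
    hits = sum(is_hit(g, t, tolerance) for g, t in zip(guesses, truths))
    return hits / len(truths)


# Made-up (x, y) pixel coordinates for three images, purely illustrative.
truths = [(100, 100), (250, 80), (40, 300)]
guesser_a = [(105, 98), (240, 90), (200, 300)]
guesser_b = [(300, 300), (255, 85), (400, 10)]

if __name__ == "__main__":
    print(f"guesser A accuracy: {accuracy(guesser_a, truths):.2f}")
    print(f"guesser B accuracy: {accuracy(guesser_b, truths):.2f}")
```

Running the same scoring function over human guesses and model guesses on the same images would give directly comparable accuracy figures, which is the kind of comparison the summary describes.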

