Thinking in 360°: Humanoid Visual Search in the Wild

arXiv cs.CV, Thursday, November 27, 2025 at 5:00:00 AM
  • The paper proposes humanoid visual search: an agent rotates its head to search for objects efficiently in immersive 360-degree environments, addressing the limitations of visual search methods that operate on static images. The work is paired with a new benchmark, H* Bench, which focuses on complex real-world scenarios that require advanced visual-spatial reasoning.
  • The work aims to replicate human-like visual search in artificial agents, which could benefit applications such as robotics, urban navigation, and augmented reality, where understanding dynamic environments is crucial.
  • Humanoid visual search fits with ongoing advances in multimodal models, which increasingly integrate visual and linguistic data to improve interaction and reasoning. It reflects a broader push toward AI systems that can understand and navigate complex environments, and underscores the role of embodied cognition in AI development.
— via World Pulse Now AI Editorial System

