VisReason: A Large-Scale Dataset for Visual Chain-of-Thought Reasoning

arXiv — cs.LG · Tuesday, November 25, 2025 at 5:00:00 AM
  • A new dataset named VisReason has been introduced to strengthen visual Chain-of-Thought (CoT) reasoning in multimodal large language models (MLLMs). It comprises 489,000 annotated examples spanning four domains, each paired with multi-round, human-like rationales that walk an MLLM through its visual reasoning steps (a hypothetical record layout is sketched after this summary). A curated subset, VisReason-Pro, provides 165,000 examples with expert-level annotations.
  • VisReason is significant because existing visual-CoT resources are typically small or confined to a single domain. A large-scale, multi-domain dataset of this kind is expected to improve both the interpretability and the reasoning performance of MLLMs on visual tasks.
  • This initiative reflects a broader trend in AI research focused on enhancing reasoning capabilities in multimodal models. As frameworks like ReVeL and EvoLMM emerge, aiming to improve question-answering and reasoning without heavy reliance on human-annotated data, the introduction of VisReason aligns with ongoing efforts to create more robust and autonomous AI systems capable of complex visual reasoning.
— via World Pulse Now AI Editorial System
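
To make the dataset description above concrete, here is a minimal, purely illustrative sketch of how a multi-round visual-CoT record of the kind VisReason describes might be represented and flattened into a training target. All field names (`image_path`, `rationale_rounds`, `expert_annotated`, etc.) and the helper function are assumptions for illustration only; the summary does not specify the actual VisReason schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReasoningRound:
    """One round of a multi-round rationale: the intermediate thought
    drawn from the image at that step."""
    step: int
    thought: str          # human-like rationale text for this round
    grounding: str = ""   # optional note on the image evidence used


@dataclass
class VisualCoTExample:
    """Hypothetical record layout for a visual chain-of-thought example."""
    example_id: str
    domain: str                         # one of the dataset's four domains
    image_path: str                     # path or URL to the associated image
    question: str
    rationale_rounds: List[ReasoningRound] = field(default_factory=list)
    final_answer: str = ""
    expert_annotated: bool = False      # True for a Pro-style expert subset


def format_for_training(ex: VisualCoTExample) -> str:
    """Flatten the multi-round rationale into one supervision string, so an
    MLLM is trained to emit each reasoning step before the final answer."""
    steps = "\n".join(f"Step {r.step}: {r.thought}" for r in ex.rationale_rounds)
    return f"Question: {ex.question}\n{steps}\nAnswer: {ex.final_answer}"


# Toy usage with made-up content (not drawn from the dataset):
example = VisualCoTExample(
    example_id="demo-0001",
    domain="charts",
    image_path="images/demo-0001.png",
    question="Which bar is tallest?",
    rationale_rounds=[
        ReasoningRound(step=1, thought="Identify the bars and read their heights."),
        ReasoningRound(step=2, thought="Compare heights; the third bar is highest."),
    ],
    final_answer="The third bar.",
)
print(format_for_training(example))
```

The only point of the sketch is that each example couples an image with an ordered sequence of rationale steps rather than a single answer string; a real loader would read whatever record format the released dataset uses.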
