Navigating Gigapixel Pathology Images with Large Multimodal Models

arXiv — cs.CV · Wednesday, November 26, 2025 at 5:00:00 AM
  • A new framework called Gigapixel Image Agent for Navigating Tissue (GIANT) has been introduced to improve how large multimodal models (LMMs) interpret gigapixel pathology images. The framework lets an LMM navigate a whole-slide image (WSI) iteratively, zooming toward regions of interest much as a pathologist would, which strengthens its reasoning in medical image evaluation (a minimal sketch of such a loop appears after this summary). Accompanying GIANT is MultiPathQA, a benchmark of 934 WSI-level questions spanning five clinically relevant tasks.
  • The development of GIANT and MultiPathQA is significant because it addresses a limitation of earlier studies, which evaluated models on low-resolution images and may therefore have underestimated their performance. By enabling LMMs to operate more like pathologists, the approach could improve diagnostic accuracy and support clinical decision-making in pathology.
  • This advancement reflects a broader trend in medical AI, where multimodal models are increasingly being evaluated across a range of imaging tasks. As in related studies, the integration of large language models into medical imaging underscores ongoing efforts to strengthen diagnostic tools and improve patient outcomes through advanced AI.
— via World Pulse Now AI Editorial System
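
To make the iterative-navigation idea concrete, here is a minimal sketch of such a loop, assuming the slide pyramid is read with the openslide-python library and with a hypothetical `ask_lmm` function standing in for any multimodal model API. This is not the authors' implementation of GIANT, only an illustration of the zoom-or-answer pattern the summary describes.

```python
import json
import openslide


def ask_lmm(image, prompt):
    """Hypothetical stand-in for a multimodal model call. Assumed to return a
    JSON string such as {"action": "zoom", "x": 0.4, "y": 0.6} or
    {"action": "answer", "text": "..."}."""
    raise NotImplementedError("plug in your LMM of choice here")


def navigate_wsi(slide_path, question, max_steps=8, view=1024):
    """Iteratively show the model a fixed-size view of the slide and let it
    zoom toward a region of interest until it commits to an answer."""
    slide = openslide.OpenSlide(slide_path)
    level = slide.level_count - 1                         # coarsest pyramid level
    cx, cy = (d // 2 for d in slide.level_dimensions[0])  # center, level-0 pixels

    for _ in range(max_steps):
        scale = slide.level_downsamples[level]
        half = int(view * scale / 2)
        # read_region takes the top-left corner in level-0 coordinates
        tile = slide.read_region((cx - half, cy - half), level, (view, view))
        reply = json.loads(ask_lmm(
            tile.convert("RGB"),
            f"{question}\nEither answer now or pick a point to zoom toward."))
        if reply["action"] == "answer":
            return reply["text"]
        # Re-center on the chosen point (given as fractions of the current view)
        cx += int((reply["x"] - 0.5) * view * scale)
        cy += int((reply["y"] - 0.5) * view * scale)
        level = max(level - 1, 0)                         # one pyramid level deeper
    return None
```

The key design point is that the model never ingests the full gigapixel image: each step sends only a small fixed-size tile, and the WSI format's pyramid levels provide the coarse-to-fine zoom essentially for free.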


Continue Reading
Physicist Steve Hsu publishes research built around a core idea generated by GPT-5
Neutral · Artificial Intelligence
Physicist Steve Hsu has published a research paper based on an idea generated by GPT-5. He highlights the potential of AI in scientific inquiry while cautioning about its reliability, likening the model to a 'brilliant but unreliable genius.'
AI denial is becoming an enterprise risk: Why dismissing “slop” obscures real capability gains
Negative · Artificial Intelligence
The recent release of GPT-5 by OpenAI has sparked a negative shift in public sentiment towards AI, with many users criticizing the model for its perceived flaws rather than recognizing its capabilities. This backlash has led to claims that AI progress is stagnating, with some commentators labeling the technology as 'AI slop'.
OpenAI is training models to 'confess' when they lie - what it means for future AI
Neutral · Artificial Intelligence
OpenAI has developed a version of GPT-5 that can admit to its own errors, a significant step in addressing concerns about AI honesty and transparency. This new capability, referred to as 'confessions', aims to enhance the reliability of AI systems by encouraging them to self-report misbehavior. However, experts caution that this is not a comprehensive solution to the broader safety issues surrounding AI technology.
ViRectify: A Challenging Benchmark for Video Reasoning Correction with Multimodal Large Language Models
Positive · Artificial Intelligence
The introduction of ViRectify presents a new benchmark aimed at evaluating the error correction capabilities of multimodal large language models (MLLMs) in complex video reasoning tasks. This benchmark addresses the existing gap in systematic evaluation, providing a dataset of over 30,000 instances across various domains such as dynamic perception and scientific reasoning.
6 Fingers, 1 Kidney: Natural Adversarial Medical Images Reveal Critical Weaknesses of Vision-Language Models
Neutral · Artificial Intelligence
A new benchmark called AdversarialAnatomyBench has been introduced to evaluate vision-language models (VLMs) against naturally occurring rare anatomical variants, revealing significant performance drops in state-of-the-art models like GPT-5 and Gemini 2.5 Pro when faced with atypical anatomy. The accuracy decreased from 74% on typical anatomy to just 29% on atypical cases.
Object Counting with GPT-4o and GPT-5: A Comparative Study
Positive · Artificial Intelligence
A comparative study has been conducted on the object counting capabilities of two multi-modal large language models, GPT-4o and GPT-5, focusing on their performance in zero-shot scenarios using only textual prompts. The evaluation was carried out on the FSC-147 and CARPK datasets, revealing that both models achieved results comparable to state-of-the-art methods, with some instances exceeding them.
A Definition of AGI
Neutral · Artificial Intelligence
A recent paper has introduced a quantifiable framework for defining Artificial General Intelligence (AGI), proposing that AGI should match the cognitive versatility of a well-educated adult. This framework is based on the Cattell-Horn-Carroll theory and evaluates AI systems across ten cognitive domains, revealing significant gaps in current AI models, particularly in long-term memory storage.
Anthropic vs. OpenAI red teaming methods reveal different security priorities for enterprise AI
Neutral · Artificial Intelligence
Anthropic and OpenAI have recently showcased their respective AI models, Claude Opus 4.5 and GPT-5, highlighting their distinct approaches to security validation through system cards and red-team exercises. Anthropic's extensive 153-page system card contrasts with OpenAI's 60-page version, revealing differing methodologies in assessing AI robustness and security metrics.