Co-Me: Confidence-Guided Token Merging for Visual Geometric Transformers

arXiv — cs.CV
Wednesday, November 19, 2025 at 5:00:00 AM
  • The introduction of Confidence
  • This development is crucial as it enables practical applications of visual geometric transformers in real
— via World Pulse Now AI Editorial System

Continue Reading
Emergent Outlier View Rejection in Visual Geometry Grounded Transformers
A recent study has revealed that feed-forward 3D reconstruction models, such as VGGT, can inherently distinguish noisy images, which traditionally hinder reliable 3D reconstruction from in-the-wild image collections. This discovery highlights a specific layer within the model that exhibits outlier-suppressing behavior, enabling effective noise filtering without explicit mechanisms for outlier rejection.
AVGGT: Rethinking Global Attention for Accelerating VGGT
A recent study titled 'AVGGT: Rethinking Global Attention for Accelerating VGGT' investigates the global attention mechanisms in models like VGGT and π3, revealing their roles in multi-view 3D performance. The authors propose a two-step acceleration scheme to enhance efficiency by modifying early global layers and subsampling global attention. This approach aims to reduce computational costs while maintaining performance.