Systematic Literature Review on Vehicular Collaborative Perception - A Computer Vision Perspective

arXiv cs.CV • Wednesday, November 12, 2025 at 5:00:00 AM

This systematic literature review on vehicular collaborative perception, published on November 12, 2025, addresses the need for reliable perception in autonomous vehicles. Despite advances in artificial intelligence and sensor technologies, single-vehicle systems still struggle with visual occlusions and limited long-range detection. Collaborative perception, enabled by vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication, is a promising way around these limitations. The review follows the PRISMA 2020 guidelines and synthesizes findings from 106 peer-reviewed articles, identifying common trends and research gaps. It frames collaborative perception from a computer vision perspective and lays a foundation for future research aimed at improving the effectiveness and safety of autonomous vehicles.

— via World Pulse Now AI Editorial System

Recommended Readings

arXiv cs.LG • 18 hours ago

X-VMamba: Explainable Vision Mamba

Positive • Artificial Intelligence

The X-VMamba framework introduces a controllability-based interpretability approach for State Space Models (SSMs), particularly the Mamba architecture. This framework aims to enhance understanding of how Vision SSMs process spatial information, addressing the challenges posed by the lack of transparent mechanisms in existing models. Two methods are proposed: a Jacobian-based method for general SSM architectures and a Gramian-based approach for diagonal SSMs, both designed to measure the influence of input sequences on internal state dynamics efficiently.
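To make the Gramian idea concrete, here is a minimal sketch of the textbook finite-horizon controllability Gramian for a diagonal linear state-space model x[k+1] = diag(a) x[k] + b u[k]. It is not taken from the X-VMamba paper: the function names, the single-input form, and the reading of the Gramian's diagonal as a per-channel influence score are all assumptions made for illustration.

```python
from typing import List


def diagonal_gramian(a: List[float], b: List[float], steps: int) -> List[List[float]]:
    """Finite-horizon controllability Gramian of x[k+1] = diag(a) x[k] + b u[k].

    W = sum_{k=0}^{steps-1} A^k b b^T (A^T)^k. For diagonal A this reduces to
    W[i][j] = b[i] * b[j] * sum_{k} (a[i] * a[j])^k, a geometric series.
    """
    n = len(a)
    gram = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            r = a[i] * a[j]
            # Closed form of the geometric series; when r == 1 every term is 1.
            series = float(steps) if r == 1.0 else (1.0 - r ** steps) / (1.0 - r)
            gram[i][j] = b[i] * b[j] * series
    return gram


def state_influence(a: List[float], b: List[float], steps: int) -> List[float]:
    """Diagonal of the Gramian (hypothetical influence score per state channel)."""
    gram = diagonal_gramian(a, b, steps)
    return [gram[i][i] for i in range(len(a))]
```

Because the transition matrix is diagonal, each entry has a closed form and no matrix powers are needed, which is one plausible reason a Gramian-based measure would be cheap for diagonal SSMs.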

arXiv cs.CV • 2 days ago

Dynamic Gaussian Scene Reconstruction from Unsynchronized Videos

Positive • Artificial Intelligence

This paper addresses multi-view video reconstruction from unsynchronized cameras, a problem relevant to computer vision, film production, virtual reality, and motion analysis. Temporal misalignment between video streams can degrade reconstruction quality. The authors propose a coarse-to-fine alignment module that estimates and compensates for the time shifts between cameras.
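As a rough illustration of coarse-to-fine time-shift estimation, the sketch below aligns two one-dimensional per-frame signals (for example, a motion-energy value per frame from each camera). It is a hand-written stand-in, not the paper's alignment module: the signal representation, the mean-squared-difference cost, and the names alignment_cost and estimate_lag are all assumptions.

```python
from typing import Sequence


def alignment_cost(ref: Sequence[float], other: Sequence[float], lag: int) -> float:
    """Mean squared difference between ref[t] and other[t + lag] over the overlap."""
    total, count = 0.0, 0
    for t in range(len(ref)):
        s = t + lag
        if 0 <= s < len(other):
            total += (ref[t] - other[s]) ** 2
            count += 1
    return total / count if count else float("inf")


def estimate_lag(ref: Sequence[float], other: Sequence[float],
                 max_lag: int, coarse_step: int = 4) -> int:
    """Return the lag (in frames) such that other[t + lag] best matches ref[t]."""
    # Coarse pass: evaluate only every coarse_step-th candidate lag.
    coarse = min(range(-max_lag, max_lag + 1, coarse_step),
                 key=lambda lag: alignment_cost(ref, other, lag))
    # Fine pass: evaluate every lag in a window around the coarse winner.
    lo = max(-max_lag, coarse - coarse_step)
    hi = min(max_lag, coarse + coarse_step)
    return min(range(lo, hi + 1),
               key=lambda lag: alignment_cost(ref, other, lag))


# Usage: the second camera sees the same triangular pulse three frames later.
ref_signal = [max(0, 5 - abs(t - 10)) for t in range(30)]
late_signal = [max(0, 5 - abs(t - 13)) for t in range(30)]
estimated = estimate_lag(ref_signal, late_signal, max_lag=8)
```

A positive result means the second stream lags the reference. The coarse pass can miss the true shift when the signal varies faster than coarse_step frames, so the step size is a trade-off between speed and reliability.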

arXiv cs.CV • 2 days ago

Fractured Glass, Failing Cameras: Simulating Physics-Based Adversarial Samples for Autonomous Driving Systems

Neutral • Artificial Intelligence

On-board cameras are central to the perception systems of autonomous vehicles, and this study examines what happens when they fail physically. It shows that glass failures can cause detection-based neural network models to malfunction. Using real-world experiments and simulations, the researchers created perturbed scenarios that mimic the effects of glass breakage, underscoring the need for robust safety measures in autonomous driving systems.

arXiv cs.CV • 2 days ago

TEyeD: Over 20 million real-world eye images with Pupil, Eyelid, and Iris 2D and 3D Segmentations, 2D and 3D Landmarks, 3D Eyeball, Gaze Vector, and Eye Movement Types

Positive • Artificial Intelligence

TEyeD is the world's largest unified public dataset of eye images, featuring over 20 million images collected using seven different head-mounted eye trackers, including devices integrated into virtual and augmented reality systems. The dataset encompasses a variety of activities, such as car rides and sports, and includes detailed annotations like 2D and 3D landmarks, semantic segmentation, and gaze vectors. This resource aims to enhance research in computer vision, eye tracking, and gaze estimation.

arXiv cs.LG • 2 days ago

Large-scale modality-invariant foundation models for brain MRI analysis: Application to lesion segmentation

Neutral • Artificial Intelligence

The article presents large-scale modality-invariant foundation models for brain MRI analysis. The models use self-supervised learning to exploit extensive unlabeled MRI data and improve performance on neuroimaging tasks such as lesion segmentation for stroke and epilepsy. The study highlights that modality-specific features remain important even when cross-modality alignment succeeds. The model's code and checkpoints are publicly available.

arXiv cs.CV • 2 days ago

D-GAP: Improving Out-of-Domain Robustness via Dataset-Agnostic and Gradient-Guided Augmentation in Amplitude and Pixel Spaces

Positive • Artificial Intelligence

The article presents D-GAP (Dataset-agnostic and Gradient-guided augmentation in Amplitude and Pixel spaces), an augmentation method for improving out-of-domain (OOD) robustness in computer vision. Traditional augmentations often fail under varying image conditions, whereas D-GAP applies targeted augmentations in both amplitude and pixel spaces. The method addresses the learning bias of neural networks toward domain-specific frequency components, leading to improved performance across diverse datasets.