LiSTAR: Ray-Centric World Models for 4D LiDAR Sequences in Autonomous Driving

arXiv - cs.CV • Friday, November 21, 2025 at 5:00:00 AM

Positive • Artificial Intelligence

LiSTAR is a generative world model for synthesizing 4D LiDAR data, that is, LiDAR point-cloud sequences over time, for use in autonomous driving simulation. It targets two difficulties specific to LiDAR: the sensor's spherical scanning geometry and the temporal sparsity of its measurements.
The model improves the fidelity and controllability of synthetic LiDAR data, both of which matter when that data is used to train autonomous driving systems. That makes it relevant to the simulation-based development of autonomous vehicles.
LiSTAR fits a broader effort to improve data generation for autonomous driving, with the goal of better situational awareness and object detection. Across the models now emerging, the shared aims are coping with difficult environmental conditions and making perception systems more robust.
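
As background on the spherical geometry mentioned above: a scanning LiDAR records each return as a range along a ray with a given azimuth and elevation, so points are naturally described in spherical rather than Cartesian coordinates. The sketch below shows only that standard coordinate conversion. It is not LiSTAR's representation, and the function names are hypothetical.

```python
import math


def cartesian_to_spherical(x, y, z):
    """Convert a sensor-frame point to (range, azimuth, elevation) in radians."""
    rng = math.sqrt(x * x + y * y + z * z)
    azimuth = math.atan2(y, x)                    # angle around the vertical axis
    elevation = math.atan2(z, math.hypot(x, y))   # angle above the horizontal plane
    return rng, azimuth, elevation


def spherical_to_cartesian(rng, azimuth, elevation):
    """Inverse conversion: a ray direction plus a range gives back the 3D point."""
    horizontal = rng * math.cos(elevation)
    return (
        horizontal * math.cos(azimuth),
        horizontal * math.sin(azimuth),
        rng * math.sin(elevation),
    )
```

In this view, each (azimuth, elevation) pair identifies one sensor ray, and the range is the single value measured along it.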

— via World Pulse Now AI Editorial System

Continue Reading

arXiv - cs.CV • a day ago

CleverDistiller: Simple and Spatially Consistent Cross-modal Distillation

Positive • Artificial Intelligence

CleverDistiller is a self-supervised, cross-modal knowledge distillation framework that enhances the transfer of features from 2D vision models to 3D LiDAR models. It simplifies the distillation process by using a direct feature similarity loss and a multi-layer perceptron projection head, allowing for better learning of complex semantic dependencies. This approach aims to improve the performance of 3D models in various applications, including autonomous driving.
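
The two ingredients named above, an MLP projection head and a direct feature similarity loss, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not CleverDistiller's implementation: cosine similarity is assumed as the similarity measure (the summary does not say which one is used), the head is a two-layer MLP with ReLU, and all function names are hypothetical.

```python
import math


def relu(values):
    return [max(0.0, v) for v in values]


def linear(weights, bias, values):
    """Dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, values)) + b
            for row, b in zip(weights, bias)]


def mlp_project(features, w1, b1, w2, b2):
    """Two-layer MLP head mapping 3D (student) features into the 2D (teacher) feature space."""
    return linear(w2, b2, relu(linear(w1, b1, features)))


def cosine_similarity_loss(student, teacher, eps=1e-12):
    """Assumed loss: 1 - cosine similarity, which is 0 when the directions match."""
    dot = sum(s * t for s, t in zip(student, teacher))
    norm_s = math.sqrt(sum(s * s for s in student))
    norm_t = math.sqrt(sum(t * t for t in teacher))
    return 1.0 - dot / max(norm_s * norm_t, eps)
```

In training, the loss would be computed between the projected feature of a LiDAR point and the 2D feature at the pixel that point projects to, then averaged over all matched pairs.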

arXiv - cs.CV • 2 days ago

Cheating Stereo Matching in Full-scale: Physical Adversarial Attack against Binocular Depth Estimation in Autonomous Driving

Neutral • Artificial Intelligence

The paper presents a physical adversarial attack on the stereo matching models used in autonomous driving. Unlike earlier attacks that rely on 2D patches, it uses a 3D physical adversarial example (PAE) with a global camouflage texture, which keeps the attack visually consistent across viewpoints. It also introduces a 3D stereo matching rendering module that aligns the PAE with real-world positions in binocular vision, accounting for the disparity between the two cameras.
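
For readers unfamiliar with stereo disparity: in a rectified stereo pair, the horizontal pixel offset of a point between the left and right images is inversely proportional to its depth. The sketch below shows only that textbook relation, assuming an ideal rectified pinhole camera pair. It is not the paper's rendering module, and the function and parameter names are hypothetical.

```python
def disparity_from_depth(focal_px, baseline_m, depth_m):
    """Disparity in pixels for a point at depth_m metres: d = f * B / Z."""
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_px * baseline_m / depth_m


def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Inverse relation used by stereo matching: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px
```

Because depth is recovered from disparity this way, an object that shifts the matched disparity also shifts the estimated depth, which is what makes stereo matching a target for attack.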

arXiv - cs.CV • 2 days ago

Gaussian Mapping for Evolving Scenes

Positive • Artificial Intelligence

The paper 'Gaussian Mapping for Evolving Scenes' introduces a dynamic scene-adaptation mechanism that enhances 3D Gaussian Splatting (3DGS) for applications in computer vision, augmented reality, robotics, and autonomous driving. This mechanism continuously updates 3DGS to reflect changes in evolving scenes, addressing the limitations of existing methods that primarily focus on static scenes. A novel keyframe management system is proposed to discard outdated observations while retaining essential information.

arXiv - cs.CV • 2 days ago

Learning Depth from Past Selves: Self-Evolution Contrast for Robust Depth Estimation

Positive • Artificial Intelligence

Self-supervised depth estimation has become crucial in fields like autonomous driving and robotics. However, existing methods struggle in adverse weather conditions, leading to performance degradation. To tackle this, a new framework called SEC-Depth is proposed, which utilizes intermediate training parameters to create evolving latency models. This approach aims to enhance depth estimation robustness under challenging conditions through a self-evolution contrastive learning scheme.

arXiv - cs.CV • 2 days ago

Scriboora: Rethinking Human Pose Forecasting

Positive • Artificial Intelligence

The paper titled 'Scriboora: Rethinking Human Pose Forecasting' evaluates various algorithms for predicting human poses based on past observations. It highlights reproducibility issues and introduces a unified training and evaluation pipeline. The study demonstrates that recent speech models can be adapted to enhance pose forecasting performance, and assesses model robustness using noisy joint coordinates to better reflect real-world applications.
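
A robustness check with noisy joint coordinates might look like the sketch below. It assumes zero-mean Gaussian noise added independently to each coordinate and a mean per-joint Euclidean error as the metric. The summary specifies neither, so both choices, along with the function names, are illustrative assumptions.

```python
import math
import random


def add_joint_noise(pose, sigma, seed=0):
    """Return a copy of pose (a list of (x, y, z) joints) with Gaussian noise added."""
    rng = random.Random(seed)  # seeded so that experiments are repeatable
    return [tuple(c + rng.gauss(0.0, sigma) for c in joint) for joint in pose]


def mean_per_joint_error(pose_a, pose_b):
    """Mean Euclidean distance between corresponding joints of two poses."""
    distances = [math.dist(a, b) for a, b in zip(pose_a, pose_b)]
    return sum(distances) / len(distances)
```

A forecaster would then be run on both the clean and the perturbed input poses, and the gap between the two forecast errors indicates how sensitive it is to noisy pose estimates.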
