MambaTrack3D: A State Space Model Framework for LiDAR-Based Object Tracking under High Temporal Variation

arXiv, cs.CV | Thursday, November 20, 2025 at 5:00:00 AM
  • MambaTrack3D is a state space model framework for 3D object tracking in LiDAR point clouds, aimed at the challenges that high temporal variation poses in outdoor environments. It is designed to improve tracking efficiency and spatial modeling.
  • MambaTrack3D promises to improve the accuracy and efficiency of object tracking systems, which are crucial for autonomous driving and robotics.
  • MambaTrack3D arrives alongside ongoing advances in depth estimation and LiDAR technology, underscoring the importance of accurate spatial perception for AI applications in complex environments.
— via World Pulse Now AI Editorial System


Recommended Readings
FQ-PETR: Fully Quantized Position Embedding Transformation for Multi-View 3D Object Detection
Positive | Artificial Intelligence
The paper presents FQ-PETR, a fully quantized framework for multi-view 3D object detection, addressing challenges in deploying PETR models due to high computational costs and memory requirements. The proposed method introduces innovations such as Quantization-Friendly LiDAR-ray Position Embedding to enhance performance without significant accuracy loss, despite the inherent difficulties in quantizing non-linear operators.
PAVE: An End-to-End Dataset for Production Autonomous Vehicle Evaluation
Positive | Artificial Intelligence
The PAVE dataset represents a significant advancement in the evaluation of production autonomous vehicles (AVs). Unlike existing datasets that rely on human-driven data, PAVE is the first end-to-end benchmark collected entirely through autonomous driving in real-world conditions. It includes over 100 hours of data segmented into 32,727 key frames, featuring synchronized camera images and high-precision GNSS/IMU data, aimed at enhancing the safety evaluation of AVs.
Learning from Mistakes: Loss-Aware Memory Enhanced Continual Learning for LiDAR Place Recognition
Positive | Artificial Intelligence
LiDAR place recognition is essential for SLAM, robot navigation, and autonomous driving. Current methods often face catastrophic forgetting when adapting to new environments. To combat this, a new framework called KDF+ has been proposed, which incorporates a loss-aware sampling strategy and a rehearsal enhancement mechanism to improve continual learning in LiDAR place recognition.
Class-Aware PillarMix: Can Mixed Sample Data Augmentation Enhance 3D Object Detection with Radar Point Clouds?
Positive | Artificial Intelligence
The paper discusses the application of mixed sample data augmentation (MSDA) techniques to enhance 3D object detection using radar point clouds. While MSDA has been effective for LiDAR data, its adaptation for radar point clouds presents unique challenges, including irregular angular distribution and point sparsity. The authors propose a new method called Class-Aware PillarMix (CAPMix) that utilizes MixUp at the pillar level, guided by class labels, to address these challenges.
CompTrack: Information Bottleneck-Guided Low-Rank Dynamic Token Compression for Point Cloud Tracking
Positive | Artificial Intelligence
CompTrack is a novel framework designed for 3D single object tracking in LiDAR point clouds, addressing challenges posed by spatial and informational redundancy. By utilizing a Spatial Foreground Predictor to filter background noise and an Information Bottleneck-guided Dynamic Token Compression module to enhance efficiency, CompTrack aims to improve the accuracy and performance of existing tracking systems in autonomous driving applications.
V2VLoc: Robust GNSS-Free Collaborative Perception via LiDAR Localization
Positive | Artificial Intelligence
The article presents a new framework for GNSS-free collaborative perception using LiDAR localization, addressing the challenges faced in GNSS-denied environments. Traditional localization methods often struggle in these settings, hindering effective collaboration among multi-agent systems. The proposed solution includes a lightweight Pose Generator with Confidence (PGC) for estimating poses and confidence, alongside the Pose-Aware Spatio-Temporal Alignment Transformer (PASTAT) for spatial alignment. A new simulation dataset, V2VLoc, is introduced, which supports LiDAR localization and collabor…
CARScenes: Semantic VLM Dataset for Safe Autonomous Driving
Positive | Artificial Intelligence
CAR-Scenes is a frame-level dataset designed for autonomous driving, facilitating the training and evaluation of vision-language models (VLMs) for scene-level understanding. The dataset comprises 5,192 annotated images from sources like Argoverse, Cityscapes, KITTI, and nuScenes, utilizing a comprehensive 28-key category/sub-category knowledge base. The annotations are generated through a GPT-4o-assisted pipeline with human verification, providing detailed attributes and supporting semantic retrieval and risk-aware scenario mining.
Towards Sharper Object Boundaries in Self-Supervised Depth Estimation
Positive | Artificial Intelligence
Accurate monocular depth estimation is essential for understanding 3D scenes, yet current methods often produce blurred depth at object boundaries, leading to erroneous 3D points. This study introduces a self-supervised approach that models per-pixel depth as a mixture distribution, allowing for sharp depth discontinuities without fine-grained supervision. The method integrates variance-aware loss functions and uncertainty propagation, achieving up to 35% higher boundary sharpness and improved point cloud quality on KITTI and VKITTIv2 datasets.