RTS-Mono: A Real-Time Self-Supervised Monocular Depth Estimation Method for Real-World Deployment

arXiv — cs.CV, Wednesday, November 19, 2025 at 5:00:00 AM
  • RTS
  • The development of RTS
— via World Pulse Now AI Editorial System

Recommended Readings
PAVE: An End-to-End Dataset for Production Autonomous Vehicle Evaluation
Positive | Artificial Intelligence
PAVE is the first end-to-end benchmark dataset for evaluating autonomous vehicles (AVs) that was collected entirely through autonomous driving in real-world conditions. It includes over 100 hours of naturalistic data from various production AV models, segmented into 32,727 key frames with synchronized camera images and high-precision GNSS/IMU data. The dataset is intended to improve understanding of AV behavior and safety and to inform future work on autonomous driving technology.
Towards Sharper Object Boundaries in Self-Supervised Depth Estimation
Positive | Artificial Intelligence
Accurate monocular depth estimation is essential for understanding 3D scenes, yet current methods often produce blurred depth at object boundaries, leading to erroneous 3D points. This study introduces a self-supervised approach that models per-pixel depth as a mixture distribution, allowing for sharp depth discontinuities without fine-grained supervision. The method integrates variance-aware loss functions and uncertainty propagation, achieving up to 35% higher boundary sharpness and improved point cloud quality on KITTI and VKITTIv2 datasets.
CARScenes: Semantic VLM Dataset for Safe Autonomous Driving
Positive | Artificial Intelligence
CAR-Scenes is a frame-level dataset designed for autonomous driving, facilitating the training and evaluation of vision-language models (VLMs) for scene-level understanding. The dataset comprises 5,192 annotated images from sources like Argoverse, Cityscapes, KITTI, and nuScenes, utilizing a comprehensive 28-key category/sub-category knowledge base. The annotations are generated through a GPT-4o-assisted pipeline with human verification, providing detailed attributes and supporting semantic retrieval and risk-aware scenario mining.