arXiv:2511.08224v1 Announce Type: new 
Abstract: We introduce 2Dto3D-SR, a versatile framework for real-time single-view 3D super-resolution that eliminates the need for high-resolution RGB guidance. Our framework encodes 3D data from a single viewpoint into a structured 2D representation, enabling the direct application of existing 2D image super-resolution architectures. We utilize the Projected Normalized Coordinate Code (PNCC) to represent 3D geometry from a visible surface as a regular image, thereby circumventing the complexities of 3D point-based or RGB-guided methods. This design supports lightweight and fast models adaptable to various deployment environments. We evaluate 2Dto3D-SR with two implementations: one using Swin Transformers for high accuracy, and another using Vision Mamba for high efficiency. Experiments show the Swin Transformer model achieves state-of-the-art accuracy on standard benchmarks, while the Vision Mamba model delivers competitive results at real-time speeds. This establishes our geometry-guided pipeline as a surprisingly simple yet viable and practical solution for real-world scenarios, especially where high-resolution RGB data is inaccessible.

The 2Dto3D-SR framework performs real-time single-view 3D super-resolution without requiring high-resolution RGB guidance. It encodes 3D data from a single viewpoint into a structured 2D representation using the Projected Normalized Coordinate Code (PNCC), so existing 2D image super-resolution architectures can be applied directly and the models stay lightweight. Of the two implementations evaluated, the Swin Transformer model achieves state-of-the-art accuracy on standard benchmarks, while the Vision Mamba model delivers competitive results at real-time speeds. This makes the approach a practical option where high-resolution RGB data is inaccessible.
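
To make the encoding idea concrete, the following is a minimal sketch of how visible 3D points might be stored as a regular 3-channel image, upsampled in 2D, and mapped back to 3D. It is not the authors' implementation. It assumes the points are already arranged on the pixel grid of the viewpoint, that normalization is a per-axis min-max scaling to [0, 1], and that pixels without a surface are marked `None`; the abstract does not specify any of these details. The function names `encode_pncc`, `decode_pncc`, and `upsample_nearest` are hypothetical, and nearest-neighbour upsampling merely stands in for the learned 2D super-resolution network (Swin Transformer or Vision Mamba).

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]
Grid = List[List[Optional[Point]]]  # H x W; None marks pixels with no surface


def encode_pncc(grid: Grid):
    """Map each visible 3D point to a 3-channel pixel in [0, 1].

    Assumption: per-axis min-max normalization. Returns the image plus the
    per-axis offset and extent needed to decode it again.
    """
    pts = [p for row in grid for p in row if p is not None]
    if not pts:
        raise ValueError("grid contains no visible points")
    lo = [min(p[a] for p in pts) for a in range(3)]
    hi = [max(p[a] for p in pts) for a in range(3)]
    # Guard against a zero extent when all points share a coordinate.
    span = [(hi[a] - lo[a]) or 1.0 for a in range(3)]
    image = [
        [
            None if p is None else tuple((p[a] - lo[a]) / span[a] for a in range(3))
            for p in row
        ]
        for row in grid
    ]
    return image, lo, span


def decode_pncc(image, lo, span) -> Grid:
    """Invert encode_pncc: turn normalized pixels back into 3D points."""
    return [
        [
            None if px is None else tuple(px[a] * span[a] + lo[a] for a in range(3))
            for px in row
        ]
        for row in image
    ]


def upsample_nearest(image, factor: int):
    """Placeholder for the 2D super-resolution model (nearest neighbour)."""
    return [
        [px for px in row for _ in range(factor)]
        for row in image
        for _ in range(factor)
    ]


# Toy 2 x 2 input: three visible points and one empty pixel.
low_res = [
    [(0.0, 0.0, 2.0), (1.0, 0.0, 2.5)],
    [(0.0, 1.0, 3.0), None],
]
image, lo, span = encode_pncc(low_res)
high_res = decode_pncc(upsample_nearest(image, 2), lo, span)
```

Because the geometry lives in an ordinary image between `encode_pncc` and `decode_pncc`, the upsampling step can be swapped for any 2D super-resolution model without touching the 3D side of the pipeline.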

2D Representation for Unguided Single-View 3D Super-Resolution in Real-Time
