FLEG: Feed-Forward Language Embedded Gaussian Splatting from Any Views
- FLEG is a new feed-forward network that reconstructs language-embedded 3D Gaussians from arbitrary input views. Previous methods relied on fixed input views and lacked sufficient 3D training data. FLEG instead performs 2D-to-3D lifting from uncalibrated multi-view images and requires no 3D annotations.
- FLEG matters because it can use large-scale video data to strengthen semantic embedding, which improves the efficiency and accuracy of 3D representations across applications.
- The work reflects a broader trend in AI and computer vision: methods increasingly rely on unstructured data and self-supervised learning to improve model performance, as related work on multi-view camera calibration and 3D human pose estimation also shows.
— via World Pulse Now AI Editorial System
