WeatherDiffusion: Controllable Weather Editing in Intrinsic Space

arXiv — cs.CV · Thursday, November 27, 2025, 5:00:00 AM
  • WeatherDiffusion has been introduced as a diffusion-based framework for controllable weather editing in intrinsic space. An inverse renderer estimates material properties and scene geometry from the input image, and weather conditions in the generated image are manipulated through an intrinsic map-aware attention mechanism combined with CLIP-space interpolation.
  • This development is significant because it allows greater precision in visual content creation, particularly in fields such as autonomous driving and environmental simulation, where accurate weather representation is crucial for safety and realism.
  • WeatherDiffusion aligns with ongoing efforts to improve image generation under varying weather conditions, reflecting a broader trend in AI research toward enhancing situational awareness and operational efficiency in complex environments.
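The CLIP-space interpolation mentioned above is, in general form, an interpolation between the text embeddings of two weather descriptions, with the blended embedding conditioning the diffusion model. A minimal sketch of that idea, using spherical interpolation (a common choice for unit-norm embeddings) and random vectors standing in for actual CLIP text embeddings of "clear day" and "heavy rain" — the paper's exact scheme is not reproduced here:

```python
import torch

def slerp(a: torch.Tensor, b: torch.Tensor, t: float) -> torch.Tensor:
    """Spherical interpolation between two embeddings, normalized to the unit sphere."""
    a = a / a.norm()
    b = b / b.norm()
    # Angle between the two unit vectors (clamped for numerical safety).
    omega = torch.acos((a * b).sum().clamp(-1 + 1e-7, 1 - 1e-7))
    return (torch.sin((1 - t) * omega) * a + torch.sin(t * omega) * b) / torch.sin(omega)

# Hypothetical stand-ins for CLIP text embeddings of two weather prompts.
torch.manual_seed(0)
e_clear = torch.randn(512)  # "clear day"
e_rain = torch.randn(512)   # "heavy rain"

# Sweeping t from 0 to 1 yields conditioning vectors for intermediate weather.
conds = [slerp(e_clear, e_rain, t) for t in (0.0, 0.25, 0.5, 0.75, 1.0)]
```

Each interpolated vector stays on the unit sphere, so intermediate conditions remain in-distribution for a model trained on normalized CLIP embeddings.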
— via World Pulse Now AI Editorial System


Continue Reading
AnchorOPT: Towards Optimizing Dynamic Anchors for Adaptive Prompt Learning
Positive · Artificial Intelligence
The recent introduction of AnchorOPT marks a significant advancement in prompt learning methodologies, particularly for CLIP models. This framework enhances the adaptability of anchor tokens by allowing them to learn dynamically from task-specific data and optimizing their positional relationships with soft tokens based on the training context.
Cheating Stereo Matching in Full-scale: Physical Adversarial Attack against Binocular Depth Estimation in Autonomous Driving
Neutral · Artificial Intelligence
A novel physical adversarial attack has been developed targeting stereo matching models used in autonomous driving, marking a significant advancement in understanding the vulnerabilities of these systems. This method employs a 3D physical adversarial example (PAE) with a global camouflage texture, enhancing its effectiveness across various viewpoints of stereo cameras.
DeLightMono: Enhancing Self-Supervised Monocular Depth Estimation in Endoscopy by Decoupling Uneven Illumination
Positive · Artificial Intelligence
A new framework called DeLight-Mono has been introduced to enhance self-supervised monocular depth estimation in endoscopy by addressing the challenges posed by uneven illumination in endoscopic images. This innovative approach utilizes an illumination-reflectance-depth model and auxiliary networks to improve depth estimation accuracy, particularly in low-light conditions.
Face, Whole-Person, and Object Classification in a Unified Space Via The Interleaved Multi-Domain Identity Curriculum
Positive · Artificial Intelligence
A new study introduces the Interleaved Multi-Domain Identity Curriculum (IMIC), enabling models to perform object recognition, face recognition from varying image qualities, and person recognition in a unified embedding space without significant catastrophic forgetting. This approach was tested on foundation models DINOv3, CLIP, and EVA-02, demonstrating comparable performance to domain experts across all tasks.
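Interleaving examples from several domains (faces, whole persons, objects) in a fixed rotation is the generic mechanism behind such multi-domain curricula; mixing domains within the training stream is what mitigates catastrophic forgetting relative to sequential training. A minimal, illustrative sketch of round-robin interleaving — the actual IMIC schedule and losses are not reproduced here:

```python
def roundrobin(*iterables):
    """Yield one item from each iterable in turn, skipping exhausted ones.

    Example: roundrobin([f1, f2], [p1], [o1, o2, o3])
    yields f1, p1, o1, f2, o2, o3.
    """
    iterators = [iter(it) for it in iterables]
    while iterators:
        for it in list(iterators):  # copy: we may remove during iteration
            try:
                yield next(it)
            except StopIteration:
                iterators.remove(it)

# Hypothetical per-domain example streams feeding one training loop.
faces = ["face_1", "face_2"]
persons = ["person_1"]
objects = ["obj_1", "obj_2", "obj_3"]
stream = list(roundrobin(faces, persons, objects))
```

In practice the per-domain iterables would be (possibly infinite) data loaders rather than lists.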
4DWorldBench: A Comprehensive Evaluation Framework for 3D/4D World Generation Models
Positive · Artificial Intelligence
The introduction of 4DWorldBench marks a significant advancement in the evaluation of 3D/4D World Generation Models, which are crucial for developing realistic and dynamic environments for applications like virtual reality and autonomous driving. This framework assesses models based on perceptual quality, physical realism, and 4D consistency, addressing the need for a unified benchmark in a rapidly evolving field.
Reasoning-VLA: A Fast and General Vision-Language-Action Reasoning Model for Autonomous Driving
Positive · Artificial Intelligence
A new model named Reasoning-VLA has been introduced, enhancing Vision-Language-Action (VLA) capabilities for autonomous driving. This model aims to improve decision-making efficiency and generalization across diverse driving scenarios by utilizing learnable action queries and a standardized dataset format for training.
stable-pretraining-v1: Foundation Model Research Made Simple
Positive · Artificial Intelligence
The stable-pretraining library has been introduced as a modular and performance-optimized tool for foundation model research, built on PyTorch, Lightning, Hugging Face, and TorchMetrics. This library aims to simplify self-supervised learning (SSL) by providing essential utilities and enhancing the visibility of training dynamics through comprehensive logging.
Concept-Aware Batch Sampling Improves Language-Image Pretraining
Positive · Artificial Intelligence
A recent study introduces Concept-Aware Batch Sampling (CABS), a novel framework designed to enhance language-image pretraining by utilizing a dynamic, concept-based approach to data curation. This method builds on DataConcept, a dataset of 128 million annotated image-text pairs, allowing for more adaptive and efficient training processes in vision-language models.
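Concept annotations such as DataConcept's could drive batch construction in many ways; one generic, illustrative form (not necessarily CABS's actual algorithm) groups image-text pairs by concept and round-robins across groups, so that each batch covers diverse concepts instead of being drawn uniformly at random:

```python
import random
from collections import defaultdict

def concept_aware_batches(samples, batch_size, seed=0):
    """Illustrative sketch: build batches that spread items across concepts.

    `samples` is a list of (item, concept) pairs; the concept label is a
    stand-in for the per-pair annotations described for DataConcept.
    """
    rng = random.Random(seed)
    by_concept = defaultdict(list)
    for item, concept in samples:
        by_concept[concept].append(item)
    pools = list(by_concept.values())
    for pool in pools:
        rng.shuffle(pool)  # randomize within each concept
    batches, batch = [], []
    # Round-robin over concept pools so each batch mixes many concepts.
    while any(pools):
        for pool in pools:
            if pool:
                batch.append(pool.pop())
                if len(batch) == batch_size:
                    batches.append(batch)
                    batch = []
    if batch:
        batches.append(batch)
    return batches
```

For example, 12 samples spanning 4 concepts with `batch_size=4` yield 3 batches, each containing one item per concept.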