WeatherDiffusion: Controllable Weather Editing in Intrinsic Space

arXiv — cs.CV · Thursday, November 27, 2025, 5:00:00 AM
  • WeatherDiffusion has been introduced as a diffusion-based framework for controllable weather editing in intrinsic space. An inverse renderer estimates material properties and scene geometry from the input image, and weather conditions in the generated image are manipulated through an intrinsic map-aware attention mechanism combined with CLIP-space interpolation.
  • This development is significant because it allows greater precision in visual content creation, particularly in fields such as autonomous driving and environmental simulation, where accurate weather representation is crucial for safety and realism.
  • WeatherDiffusion aligns with ongoing efforts to improve image generation under varying weather conditions, reflecting a broader trend in AI research toward enhancing situational awareness and operational efficiency in complex environments.
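The CLIP-space interpolation mentioned above is, in general form, an interpolation between the text embeddings of two weather descriptions, with the blended embedding conditioning the diffusion model. A minimal sketch of that idea, using spherical interpolation (a common choice for unit-norm embeddings) and random vectors standing in for actual CLIP text embeddings of "clear day" and "heavy rain" — the paper's exact scheme is not reproduced here:

```python
import torch

def slerp(a: torch.Tensor, b: torch.Tensor, t: float) -> torch.Tensor:
    """Spherical interpolation between two embeddings, normalized to the unit sphere."""
    a = a / a.norm()
    b = b / b.norm()
    # Angle between the two unit vectors (clamped for numerical safety).
    omega = torch.acos((a * b).sum().clamp(-1 + 1e-7, 1 - 1e-7))
    return (torch.sin((1 - t) * omega) * a + torch.sin(t * omega) * b) / torch.sin(omega)

# Hypothetical stand-ins for CLIP text embeddings of two weather prompts.
torch.manual_seed(0)
e_clear = torch.randn(512)  # "clear day"
e_rain = torch.randn(512)   # "heavy rain"

# Sweeping t from 0 to 1 yields conditioning vectors for intermediate weather.
conds = [slerp(e_clear, e_rain, t) for t in (0.0, 0.25, 0.5, 0.75, 1.0)]
```

Each interpolated vector stays on the unit sphere, so intermediate conditions remain in-distribution for a model trained on normalized CLIP embeddings.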
— via World Pulse Now AI Editorial System


Continue Reading
AnchorOPT: Towards Optimizing Dynamic Anchors for Adaptive Prompt Learning
Positive · Artificial Intelligence
The recent introduction of AnchorOPT marks a significant advancement in prompt learning methodologies, particularly for CLIP models. This framework enhances the adaptability of anchor tokens by allowing them to learn dynamically from task-specific data and optimizing their positional relationships with soft tokens based on the training context.
Cheating Stereo Matching in Full-scale: Physical Adversarial Attack against Binocular Depth Estimation in Autonomous Driving
Neutral · Artificial Intelligence
A novel physical adversarial attack has been developed targeting stereo matching models used in autonomous driving, marking a significant advancement in understanding the vulnerabilities of these systems. This method employs a 3D physical adversarial example (PAE) with a global camouflage texture, enhancing its effectiveness across various viewpoints of stereo cameras.
DeLightMono: Enhancing Self-Supervised Monocular Depth Estimation in Endoscopy by Decoupling Uneven Illumination
Positive · Artificial Intelligence
A new framework called DeLight-Mono has been introduced to enhance self-supervised monocular depth estimation in endoscopy by addressing the challenges posed by uneven illumination in endoscopic images. This innovative approach utilizes an illumination-reflectance-depth model and auxiliary networks to improve depth estimation accuracy, particularly in low-light conditions.
Face, Whole-Person, and Object Classification in a Unified Space Via The Interleaved Multi-Domain Identity Curriculum
Positive · Artificial Intelligence
A new study introduces the Interleaved Multi-Domain Identity Curriculum (IMIC), enabling models to perform object recognition, face recognition from varying image qualities, and person recognition in a unified embedding space without significant catastrophic forgetting. This approach was tested on foundation models DINOv3, CLIP, and EVA-02, demonstrating comparable performance to domain experts across all tasks.
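Interleaving examples from several domains (faces, whole persons, objects) in a fixed rotation is the generic mechanism behind such multi-domain curricula; mixing domains within the training stream is what mitigates catastrophic forgetting relative to sequential training. A minimal, illustrative sketch of round-robin interleaving — the actual IMIC schedule and losses are not reproduced here:

```python
def roundrobin(*iterables):
    """Yield one item from each iterable in turn, skipping exhausted ones.

    Example: roundrobin([f1, f2], [p1], [o1, o2, o3])
    yields f1, p1, o1, f2, o2, o3.
    """
    iterators = [iter(it) for it in iterables]
    while iterators:
        for it in list(iterators):  # copy: we may remove during iteration
            try:
                yield next(it)
            except StopIteration:
                iterators.remove(it)

# Hypothetical per-domain example streams feeding one training loop.
faces = ["face_1", "face_2"]
persons = ["person_1"]
objects = ["obj_1", "obj_2", "obj_3"]
stream = list(roundrobin(faces, persons, objects))
```

In practice the per-domain iterables would be (possibly infinite) data loaders rather than lists.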
4DWorldBench: A Comprehensive Evaluation Framework for 3D/4D World Generation Models
Positive · Artificial Intelligence
The introduction of 4DWorldBench marks a significant advancement in the evaluation of 3D/4D World Generation Models, which are crucial for developing realistic and dynamic environments for applications like virtual reality and autonomous driving. This framework assesses models based on perceptual quality, physical realism, and 4D consistency, addressing the need for a unified benchmark in a rapidly evolving field.
Reasoning-VLA: A Fast and General Vision-Language-Action Reasoning Model for Autonomous Driving
Positive · Artificial Intelligence
A new model named Reasoning-VLA has been introduced, enhancing Vision-Language-Action (VLA) capabilities for autonomous driving. This model aims to improve decision-making efficiency and generalization across diverse driving scenarios by utilizing learnable action queries and a standardized dataset format for training.
stable-pretraining-v1: Foundation Model Research Made Simple
Positive · Artificial Intelligence
The stable-pretraining library has been introduced as a modular and performance-optimized tool for foundation model research, built on PyTorch, Lightning, Hugging Face, and TorchMetrics. This library aims to simplify self-supervised learning (SSL) by providing essential utilities and enhancing the visibility of training dynamics through comprehensive logging.
Concept-Aware Batch Sampling Improves Language-Image Pretraining
Positive · Artificial Intelligence
A recent study introduces Concept-Aware Batch Sampling (CABS), a novel framework designed to enhance language-image pretraining by utilizing a dynamic, concept-based approach to data curation. This method builds on DataConcept, a dataset of 128 million annotated image-text pairs, allowing for more adaptive and efficient training processes in vision-language models.
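Concept annotations such as DataConcept's could drive batch construction in many ways; one generic, illustrative form (not necessarily CABS's actual algorithm) groups image-text pairs by concept and round-robins across groups, so that each batch covers diverse concepts instead of being drawn uniformly at random:

```python
import random
from collections import defaultdict

def concept_aware_batches(samples, batch_size, seed=0):
    """Illustrative sketch: build batches that spread items across concepts.

    `samples` is a list of (item, concept) pairs; the concept label is a
    stand-in for the per-pair annotations described for DataConcept.
    """
    rng = random.Random(seed)
    by_concept = defaultdict(list)
    for item, concept in samples:
        by_concept[concept].append(item)
    pools = list(by_concept.values())
    for pool in pools:
        rng.shuffle(pool)  # randomize within each concept
    batches, batch = [], []
    # Round-robin over concept pools so each batch mixes many concepts.
    while any(pools):
        for pool in pools:
            if pool:
                batch.append(pool.pop())
                if len(batch) == batch_size:
                    batches.append(batch)
                    batch = []
    if batch:
        batches.append(batch)
    return batches
```

For example, 12 samples spanning 4 concepts with `batch_size=4` yield 3 batches, each containing one item per concept.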