InvFusion: Bridging Supervised and Zero-shot Diffusion for Inverse Problems

arXiv cs.CV • Thursday, November 20, 2025 at 5:00:00 AM

Positive • Artificial Intelligence

InvFusion introduces a method that merges supervised and zero-shot diffusion approaches for inverse problems.
This development is crucial as it allows for better adaptation to various degradation scenarios during testing, potentially improving outcomes in fields reliant on high…
The advancement reflects a broader trend in AI research towards creating models that not only perform well under ideal conditions but also adapt effectively to real…

— via World Pulse Now AI Editorial System

Recommended Readings

arXiv cs.CV • 9 hours ago

ANTS: Adaptive Negative Textual Space Shaping for OOD Detection via Test-Time MLLM Understanding and Reasoning

Positive • Artificial Intelligence

The paper presents ANTS, a method for improving Out-of-Distribution (OOD) detection by shaping an Adaptive Negative Textual Space. Using multimodal large language models (MLLMs), the approach generates expressive negative sentences that characterize OOD distributions. It targets the limitations of existing techniques, particularly in near-OOD detection, by caching images likely to be OOD samples and prompting MLLMs for detailed descriptions of them.

arXiv cs.CV • 9 hours ago

Learning to Expand Images for Efficient Visual Autoregressive Modeling

Positive • Artificial Intelligence

The paper introduces Expanding Autoregressive Representation (EAR), a new paradigm for visual generation that mimics the human visual system's center-outward perception. This method improves efficiency by unfolding image tokens in a spiral order, allowing for parallel decoding and preserving spatial continuity. Additionally, a length-adaptive decoding strategy is proposed to enhance flexibility and speed, ultimately reducing computational costs and improving generation quality.
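As a rough illustration of what a center-outward spiral ordering of image tokens could look like, here is a minimal sketch. It is not the paper's implementation: the function name `spiral_order`, the clockwise direction, and the choice of starting cell for even-sized grids are all assumptions made for this example.

```python
def spiral_order(height, width):
    """Return (row, col) grid positions in a center-outward spiral.

    Hypothetical helper for illustration only; the actual EAR ordering
    may differ in direction, starting cell, or tie-breaking.
    """
    if height <= 0 or width <= 0:
        raise ValueError("height and width must be positive")
    # Start at the center cell (upper-left of center for even sizes).
    row, col = (height - 1) // 2, (width - 1) // 2
    order = [(row, col)]
    # Directions: right, down, left, up (clockwise in image coordinates).
    directions = [(0, 1), (1, 0), (0, -1), (-1, 0)]
    step, turn = 1, 0
    total = height * width
    while len(order) < total:
        for _ in range(2):  # each step length is used for two consecutive turns
            d_row, d_col = directions[turn % 4]
            for _ in range(step):
                row, col = row + d_row, col + d_col
                # Positions outside the grid are walked over but not emitted.
                if 0 <= row < height and 0 <= col < width:
                    order.append((row, col))
            turn += 1
        step += 1
    return order
```

Calling `spiral_order(3, 3)` starts at the middle token `(1, 1)` and ends at a corner, so tokens near the image center are generated first.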

arXiv cs.CV • a day ago

FGM-HD: Boosting Generation Diversity of Fractal Generative Models through Hausdorff Dimension Induction

Positive • Artificial Intelligence

The article discusses a novel approach to enhancing the diversity of outputs in Fractal Generative Models (FGMs) while maintaining high visual quality. By incorporating the Hausdorff Dimension (HD), a concept from fractal geometry that quantifies structural complexity, the authors propose a learnable HD estimation method that predicts HD from image embeddings. This method aims to improve the diversity of generated images, addressing challenges such as image quality degradation and limited diversity enhancement in FGMs.
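For readers unfamiliar with fractal dimensions, the sketch below estimates the box-counting dimension of a binary image, a classical quantity that upper-bounds the Hausdorff dimension and coincides with it for many self-similar sets. This is a swapped-in, non-learned technique shown for intuition only; it is not the learnable HD estimator the summary describes, and the function name and default box sizes are assumptions.

```python
import math

def box_counting_dimension(mask, box_sizes=(1, 2, 4, 8)):
    """Estimate the box-counting dimension of a non-empty binary 2D mask.

    `mask` is a list of rows of truthy/falsy values. Illustrative only:
    this is the classical box-counting estimate, not a learned predictor.
    """
    log_inv_sizes, log_counts = [], []
    for size in box_sizes:
        # Collect the boxes of side `size` that contain at least one set pixel.
        occupied = {
            (r // size, c // size)
            for r, row in enumerate(mask)
            for c, value in enumerate(row)
            if value
        }
        log_inv_sizes.append(math.log(1.0 / size))
        log_counts.append(math.log(len(occupied)))
    # Least-squares slope of log N(s) against log(1/s).
    n = len(box_sizes)
    mean_x = sum(log_inv_sizes) / n
    mean_y = sum(log_counts) / n
    num = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log_inv_sizes, log_counts))
    den = sum((x - mean_x) ** 2 for x in log_inv_sizes)
    return num / den
```

On a fully filled square the estimate is 2, and on a single straight row of pixels it is 1; images with more intricate structure fall in between, which is the sense in which the dimension quantifies structural complexity.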

arXiv cs.CV • a day ago

MAVias: Mitigate any Visual Bias

Positive • Artificial Intelligence

MAVias is an innovative approach aimed at mitigating biases in computer vision models, which is crucial for enhancing the trustworthiness of artificial intelligence systems. Traditional bias mitigation techniques often address a limited range of predefined biases, which restricts their effectiveness in diverse visual datasets that may contain multiple, unknown biases. MAVias utilizes foundation models to identify spurious associations between visual attributes and target classes, capturing a broad spectrum of visual features and translating them into language-coded potential biases for further…

arXiv cs.CV • a day ago

Diffusion As Self-Distillation: End-to-End Latent Diffusion In One Model

Positive • Artificial Intelligence

Standard Latent Diffusion Models use a complex architecture with separate encoder, decoder, and diffusion network components that are trained in multiple stages. This modular design is computationally inefficient and leads to suboptimal performance. The authors aim to unify these components into a single, end-to-end trainable network. They find that naive joint training is unstable because of 'latent collapse' and introduce Diffusion as Self-Distillation (DSD), a framework that addresses this problem.

arXiv cs.LG • a day ago

FoilDiff: A Hybrid Transformer Backbone for Diffusion-based Modelling of 2D Airfoil Flow Fields

Positive • Artificial Intelligence

The accurate prediction of flow fields around airfoils is essential for aerodynamic design and optimization. While Computational Fluid Dynamics (CFD) models are effective, they are computationally expensive. This has led to the development of surrogate models using deep learning architectures, including Convolutional Neural Networks (CNNs) and Diffusion Models (DMs). The proposed model, FoilDiff, utilizes a hybrid-backbone denoising network that combines convolutional feature extraction with transformer-based global attention, enhancing adaptability and accuracy in flow structure representatio…

arXiv cs.CV • a day ago

Benchmarking Deep Learning-Based Object Detection Models on Feature Deficient Astrophotography Imagery Dataset

Neutral • Artificial Intelligence

The study benchmarks various deep learning-based object detection models using the MobilTelesco dataset, which features sparse astrophotography images. Traditional datasets like ImageNet and COCO focus on everyday objects, lacking the unique challenges presented by feature-deficient conditions. The research highlights the difficulties these models face when applied to non-commercial domains, emphasizing the need for specialized datasets in astrophotography.

arXiv cs.LG • 2 days ago

MeanFlow Transformers with Representation Autoencoders

Positive • Artificial Intelligence

MeanFlow (MF) is a generative model inspired by diffusion processes, designed for efficient few-step generation by learning direct transitions from noise to data. It is commonly utilized as a latent MF, employing the pre-trained Stable Diffusion variational autoencoder (SD-VAE) for high-dimensional data modeling. However, MF training is computationally intensive and often unstable. This study introduces an efficient training and sampling scheme for MF in the latent space of a Representation Autoencoder (RAE), addressing issues like gradient explosion during training.
