Enhancing Diffusion Model Guidance through Calibration and Regularization

arXiv — cs.LG · Wednesday, November 12, 2025 at 5:00:00 AM
The paper titled 'Enhancing Diffusion Model Guidance through Calibration and Regularization' presents significant advancements in classifier-guided diffusion models, which are essential for conditional image generation. The authors identify a critical problem: overconfident predictions during early denoising steps that lead to ineffective guidance gradients. To combat this, they propose a differentiable calibration objective based on the Smooth Expected Calibration Error, which enhances classifier calibration with minimal fine-tuning. Additionally, they introduce innovative sampling guidance methods that do not require retraining existing classifiers. These methods include tilted sampling with batch-level reweighting and adaptive entropy-regularized sampling, which help maintain diversity in generated images. The experiments conducted on the ImageNet 128x128 dataset demonstrate that their divergence-regularized guidance achieves an impressive FID of 2.13 using a ResNet-101 classifier, …
— via World Pulse Now AI Editorial System
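The core diagnosis above is that a saturated, overconfident classifier yields near-zero guidance gradients early in denoising. As an illustrative sketch only (the paper's actual Smooth ECE calibration and divergence-regularized guidance differ in detail), one simple way a sampler could react to classifier confidence is to scale the guidance strength by the normalized entropy of the classifier's prediction; all names here are hypothetical:

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def adaptive_guidance_scale(logits, base_scale=1.0):
    # Normalized prediction entropy in [0, 1]: 1 for a uniform
    # (maximally uncertain) prediction, near 0 when the classifier
    # is (over)confident.
    p = softmax(logits)
    h = -np.sum(p * np.log(p + 1e-12)) / np.log(len(p))
    return base_scale * h
```

Under this toy scheme an overconfident prediction contributes little guidance, while an uncertain one is guided at close to full strength; the paper instead attacks the root cause by calibrating the classifier itself.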


Recommended Readings
ERMoE: Eigen-Reparameterized Mixture-of-Experts for Stable Routing and Interpretable Specialization
Positive · Artificial Intelligence
The article introduces ERMoE, a new Mixture-of-Experts (MoE) architecture designed to enhance model capacity by addressing challenges in routing and expert specialization. ERMoE reparameterizes experts in an orthonormal eigenbasis and utilizes an 'Eigenbasis Score' for routing, which stabilizes expert utilization and improves interpretability. This approach aims to overcome issues of misalignment and load imbalances that have hindered previous MoE architectures.
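The routing idea can be made concrete with a small sketch. Assuming (hypothetically; ERMoE's learned eigenbases and exact score are defined in the paper) that each expert owns an orthonormal basis and a token is scored by how much of its energy the expert's leading directions capture:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_experts, k = 8, 4, 2

# Each expert owns an orthonormal basis (here random, via QR);
# ERMoE's eigenbases are learned, not sampled like this.
bases = [np.linalg.qr(rng.standard_normal((d, d)))[0] for _ in range(n_experts)]

def eigenbasis_scores(x, bases, k):
    # Score each expert by the fraction of the token's energy that its
    # top-k basis directions capture (a guess at the "Eigenbasis Score").
    energy = float(x @ x)
    return np.array([float(np.sum((Q[:, :k].T @ x) ** 2)) / energy for Q in bases])

x = rng.standard_normal(d)
scores = eigenbasis_scores(x, bases, k)
route_to = int(np.argmax(scores))  # token is routed to this expert
```

Because each score is a normalized projection energy rather than an unconstrained learned logit, it is bounded in [0, 1], which hints at why such routing can be more stable and interpretable than standard softmax gating.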
Unleashing the Potential of Large Language Models for Text-to-Image Generation through Autoregressive Representation Alignment
Positive · Artificial Intelligence
The article introduces Autoregressive Representation Alignment (ARRA), a novel training framework designed to enhance text-to-image generation in autoregressive large language models (LLMs) without altering their architecture. ARRA achieves this by aligning the hidden states of LLMs with visual representations from external models through a global visual alignment loss and a hybrid token. Experimental results demonstrate that ARRA significantly reduces the Fréchet Inception Distance (FID) for models like LlamaGen, indicating improved coherence in generated images.
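The alignment component can be illustrated with a minimal sketch. Assuming a pooled LLM hidden state and an embedding from an external visual encoder, a cosine-distance objective pulls the two together (ARRA's actual loss and its hybrid-token mechanism are richer than this; the function name is hypothetical):

```python
import numpy as np

def global_alignment_loss(hidden_state, visual_embedding):
    # Cosine distance between a pooled LLM hidden state and an external
    # visual encoder's embedding: 0 when aligned, up to 2 when opposed.
    h = hidden_state / np.linalg.norm(hidden_state)
    v = visual_embedding / np.linalg.norm(visual_embedding)
    return 1.0 - float(h @ v)
```

Crucially, a loss of this shape touches only the model's hidden states, which is how ARRA can inject visual knowledge without changing the LLM's architecture.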
Enhanced Structured Lasso Pruning with Class-wise Information
Positive · Artificial Intelligence
The paper titled 'Enhanced Structured Lasso Pruning with Class-wise Information' discusses advancements in neural network pruning methods. Traditional pruning techniques often overlook class-wise information, leading to potential loss of statistical data. This study introduces two new pruning schemes, sparse graph-structured lasso pruning with Information Bottleneck (sGLP-IB) and sparse tree-guided lasso pruning with Information Bottleneck (sTLP-IB), aimed at preserving statistical information while reducing model complexity.
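The structured-lasso backbone of these schemes is standard group lasso: penalizing the L2 norm of each output channel drives whole channels to zero so they can be pruned. The sketch below shows that penalty; the per-channel weighting is a hypothetical stand-in for where class-wise information could enter (sGLP-IB/sTLP-IB use graph- and tree-structured variants with an Information Bottleneck, which this does not reproduce):

```python
import numpy as np

def structured_lasso_penalty(weight, channel_weights=None):
    # Group-lasso penalty over output channels of a conv/linear layer:
    # sum of per-channel L2 norms. Zeroed channels can be pruned away.
    flat = weight.reshape(weight.shape[0], -1)
    norms = np.sqrt((flat ** 2).sum(axis=1))
    if channel_weights is None:
        channel_weights = np.ones_like(norms)
    return float(np.sum(channel_weights * norms))
```

Unlike plain L1 on individual weights, the group norm removes entire channels, which is what yields actual speedups on hardware.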
UHKD: A Unified Framework for Heterogeneous Knowledge Distillation via Frequency-Domain Representations
Positive · Artificial Intelligence
Unified Heterogeneous Knowledge Distillation (UHKD) is a proposed framework that enhances knowledge distillation (KD) by utilizing intermediate features in the frequency domain. This approach addresses the limitations of traditional KD methods, which are primarily designed for homogeneous models and struggle in heterogeneous environments. UHKD aims to improve model compression while maintaining accuracy, making it a significant advancement in the field of artificial intelligence.
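A minimal sketch of the frequency-domain idea, under the assumption that student and teacher feature maps are compared via their 2-D FFT magnitudes (UHKD's actual objective likely adds alignment modules and other terms):

```python
import numpy as np

def frequency_kd_loss(student_feat, teacher_feat):
    # Match intermediate feature maps in the frequency domain: FFT
    # magnitudes give a representation less tied to the spatial layout
    # of any one architecture, which is the appeal for heterogeneous
    # student/teacher pairs.
    fs = np.abs(np.fft.fft2(student_feat))
    ft = np.abs(np.fft.fft2(teacher_feat))
    return float(np.mean((fs - ft) ** 2))
```

Using magnitudes discards phase, so spatially shifted but otherwise similar features incur little penalty; that tolerance is one plausible reason a frequency-domain loss transfers better across heterogeneous models.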
Flow matching-based generative models for MIMO channel estimation
Positive · Artificial Intelligence
The article presents a novel flow matching (FM)-based generative model for multiple-input multiple-output (MIMO) channel estimation. This approach addresses the slow sampling speed challenge associated with diffusion model (DM)-based schemes by formulating the channel estimation problem within the FM framework. The proposed method shows potential for superior channel estimation accuracy and significantly reduced sampling overhead compared to existing DM-based methods.
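The speed advantage comes from the structure of the flow matching objective itself. In the common linear-interpolation formulation (a generic FM sketch; the paper's conditioning on received pilots and its network are not modeled here), the regression target is a constant velocity along a straight path from noise to the true channel, so sampling needs only a few ODE steps rather than a long diffusion chain:

```python
import numpy as np

def flow_matching_target(h_true, noise, t):
    # Linear-interpolation flow matching: the training point x_t sits on
    # the straight path from noise to the true channel h_true, and the
    # model regresses the constant velocity (h_true - noise) at x_t.
    x_t = (1.0 - t) * noise + t * h_true
    v_target = h_true - noise
    return x_t, v_target
```

At inference, integrating the learned velocity field from t = 0 to t = 1 traces the same straight path, which is why coarse step counts already give accurate channel estimates.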
RiverScope: High-Resolution River Masking Dataset
Positive · Artificial Intelligence
RiverScope is a newly developed high-resolution dataset aimed at improving the monitoring of rivers and surface water dynamics, which are crucial for understanding Earth's climate system. The dataset includes 1,145 high-resolution images covering 2,577 square kilometers, with expert-labeled river and surface water masks. This initiative addresses the challenges of monitoring narrow or sediment-rich rivers that are often inadequately represented in low-resolution satellite data.