From Structure to Detail: Hierarchical Distillation for Efficient Diffusion Model

arXiv — cs.CV • Thursday, November 13, 2025 at 5:00:00 AM

The paper on Hierarchical Distillation (HD) addresses inference latency in diffusion models, which has limited their use in real-time applications. Existing distillation methods fall into two families, each with a trade-off: trajectory-based methods preserve global structure but lose high-frequency detail, while distribution-based methods reach higher fidelity but are prone to problems such as mode collapse. HD combines the two. Trajectory distillation first produces a structural sketch, which then serves as the initialization for a distribution-based refinement stage. The framework also introduces an Adaptive Weighted Discriminator to improve adversarial training. The authors report state-of-the-art results across several tasks, most notably on ImageNet.

— via World Pulse Now AI Editorial System

Recommended Readings

arXiv — cs.LG • 4 hours ago

SCALEX: Scalable Concept and Latent Exploration for Diffusion Models

Positive • Artificial Intelligence

SCALEX is a newly introduced framework designed for scalable and automated exploration of latent spaces in diffusion models. It addresses the issue of social biases, such as gender and racial stereotypes, that are often encoded in image generation models. By utilizing natural language prompts, SCALEX enables zero-shot interpretation, allowing for systematic comparisons across various concepts and facilitating the discovery of internal model associations without the need for retraining or labeling.

via arXiv — cs.LG

arXiv — cs.LG • a day ago

Optimizing Input of Denoising Score Matching is Biased Towards Higher Score Norm

Neutral • Artificial Intelligence

The paper 'Optimizing Input of Denoising Score Matching is Biased Towards Higher Score Norm' examines what happens when denoising score matching is used to optimize the input of a diffusion model. It shows that this optimization breaks the equivalence between denoising score matching and exact score matching, introducing a bias toward higher score norms. The study also reports a similar bias when data distributions are optimized with pre-trained diffusion models, and notes that these biases affect applications such as MAR, PerCo, and DreamFusion.

via arXiv — cs.LG
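
For the denoising score matching entry above, the standard identity relating the two objectives gives one way to see where such a bias can come from. The sketch below is illustrative only: the symbol c for the optimized input, the conditional data distribution p_0(x | c), and the Gaussian noising kernel q are assumptions of this note, and the paper may formalize the setting differently.

```latex
% Assumed setup (illustrative, not taken from the paper):
%   x_0 \sim p_0(\cdot \mid c), \quad x_t \sim q(\cdot \mid x_0) = \mathcal{N}(x_0, \sigma^2 I),
%   p_t(\cdot \mid c) is the resulting marginal of x_t, and d is the data dimension.
\begin{align}
J_{\mathrm{ESM}}(\theta, c)
  &= \tfrac{1}{2}\, \mathbb{E}_{x_t \sim p_t(\cdot \mid c)}
     \bigl\| s_\theta(x_t, c) - \nabla_{x_t} \log p_t(x_t \mid c) \bigr\|^2 \\
J_{\mathrm{DSM}}(\theta, c)
  &= \tfrac{1}{2}\, \mathbb{E}_{x_0 \sim p_0(\cdot \mid c),\; x_t \sim q(\cdot \mid x_0)}
     \bigl\| s_\theta(x_t, c) - \nabla_{x_t} \log q(x_t \mid x_0) \bigr\|^2 \\
J_{\mathrm{DSM}}(\theta, c)
  &= J_{\mathrm{ESM}}(\theta, c) + C(c) \\
C(c)
  &= \frac{d}{2\sigma^2}
     - \tfrac{1}{2}\, \mathbb{E}_{x_t \sim p_t(\cdot \mid c)}
       \bigl\| \nabla_{x_t} \log p_t(x_t \mid c) \bigr\|^2
\end{align}
```

Because C(c) does not depend on θ, the two objectives share the same minimizer in θ when c is held fixed. Under these assumptions C(c) does depend on c, through the negative expected squared score norm, so lowering the denoising objective with respect to c can be achieved partly by raising the score norm rather than by improving the match to the true score.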

arXiv — cs.CV • 2 days ago

Enhanced Structured Lasso Pruning with Class-wise Information

Positive • Artificial Intelligence

The paper 'Enhanced Structured Lasso Pruning with Class-wise Information' addresses neural network pruning. Traditional pruning techniques often overlook class-wise information, which can lead to a loss of statistical information. The study introduces two pruning schemes, sparse graph-structured lasso pruning with Information Bottleneck (sGLP-IB) and sparse tree-guided lasso pruning with Information Bottleneck (sTLP-IB), which aim to preserve statistical information while reducing model complexity.

via arXiv — cs.CV
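
As background for the structured lasso pruning entry above, here is a minimal sketch of the generic group-lasso shrinkage step that structured pruning methods commonly build on. It is not the paper's sGLP-IB or sTLP-IB scheme: it uses no class-wise information, graph or tree structure, or Information Bottleneck term. The function names, the toy layer, and the threshold `lam` are all hypothetical.

```python
import math


def group_soft_threshold(filters, lam):
    """Proximal step for a group-lasso penalty, one group per filter.

    filters: list of filters, each a flat list of float weights.
    Each filter's L2 norm is shrunk by lam; a filter whose norm is
    at most lam is set to zero as a whole, which prunes it.
    """
    shrunk = []
    for w in filters:
        norm = math.sqrt(sum(x * x for x in w))
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        shrunk.append([scale * x for x in w])
    return shrunk


def surviving_filters(filters):
    """Indices of filters that still have at least one non-zero weight."""
    return [i for i, w in enumerate(filters) if any(x != 0.0 for x in w)]


# Hypothetical toy layer: three filters with four weights each.
layer = [
    [0.9, -1.2, 0.4, 0.7],       # large norm, kept
    [0.05, -0.02, 0.01, 0.03],   # small norm, pruned as a whole
    [0.3, 0.3, -0.3, 0.3],       # norm 0.6, kept but shrunk
]
pruned = group_soft_threshold(layer, lam=0.5)
kept = surviving_filters(pruned)
```

The point of the group penalty is that weights are removed a whole filter at a time, so the pruned layer is genuinely smaller rather than merely sparse.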

arXiv — cs.CV • 2 days ago

Rethinking Target Label Conditioning in Adversarial Attacks: A 2D Tensor-Guided Generative Approach

Neutral • Artificial Intelligence

The article discusses advancements in multi-target adversarial attacks, highlighting the limitations of current generative methods that use one-dimensional tensors for target label encoding. It emphasizes the importance of both the quality and quantity of semantic features in enhancing the transferability of these attacks. A new framework, 2D Tensor-Guided Adversarial Fusion (TGAF), is proposed to improve the encoding process by leveraging diffusion models, ensuring that generated noise retains complete semantic information.

via arXiv — cs.CV

arXiv — cs.CV • 2 days ago

UHKD: A Unified Framework for Heterogeneous Knowledge Distillation via Frequency-Domain Representations

Positive • Artificial Intelligence

Unified Heterogeneous Knowledge Distillation (UHKD) is a proposed framework that performs knowledge distillation (KD) on intermediate features in the frequency domain. It targets a limitation of traditional KD methods, which are designed primarily for homogeneous models and struggle in heterogeneous settings. UHKD aims to improve model compression while maintaining accuracy.

via arXiv — cs.CV

arXiv — cs.CV • 2 days ago

Toward Generalized Detection of Synthetic Media: Limitations, Challenges, and the Path to Multimodal Solutions

Neutral • Artificial Intelligence

Artificial intelligence (AI) in media has seen rapid advancements over the past decade, particularly with the introduction of Generative Adversarial Networks (GANs) and diffusion models, which have enhanced photorealistic image generation. However, these developments have also led to challenges in distinguishing between real and synthetic content, as evidenced by the rise of deepfakes. Many detection models utilizing deep learning methods like Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) have been created, but they often struggle with generalization and multimodal data.

via arXiv — cs.CV

arXiv — cs.CV • 2 days ago

ERMoE: Eigen-Reparameterized Mixture-of-Experts for Stable Routing and Interpretable Specialization

Positive • Artificial Intelligence

The article introduces ERMoE, a new Mixture-of-Experts (MoE) architecture designed to enhance model capacity by addressing challenges in routing and expert specialization. ERMoE reparameterizes experts in an orthonormal eigenbasis and utilizes an 'Eigenbasis Score' for routing, which stabilizes expert utilization and improves interpretability. This approach aims to overcome issues of misalignment and load imbalances that have hindered previous MoE architectures.

via arXiv — cs.CV

arXiv — cs.CV • 2 days ago

RiverScope: High-Resolution River Masking Dataset

Positive • Artificial Intelligence

RiverScope is a newly developed high-resolution dataset aimed at improving the monitoring of rivers and surface water dynamics, which are crucial for understanding Earth's climate system. The dataset includes 1,145 high-resolution images covering 2,577 square kilometers, with expert-labeled river and surface water masks. This initiative addresses the challenges of monitoring narrow or sediment-rich rivers that are often inadequately represented in low-resolution satellite data.

via arXiv — cs.CV