LTD: Low Temperature Distillation for Gradient Masking-free Adversarial Training

arXiv — cs.CV•Thursday, November 27, 2025 at 5:00:00 AM

Positive•Artificial Intelligence

A new method called Low Temperature Distillation (LTD) has been introduced to strengthen adversarial training of neural networks. It targets a weakness of one-hot labels in image classification: a one-hot vector assigns all probability to a single class, which is imprecise for ambiguous images. LTD uses a lower temperature in the teacher model while keeping the student model's temperature fixed, which refines the label representation the student learns from and improves robustness against adversarial attacks.
This development is significant because it addresses the gradient masking problem, in which a defense obscures the gradients an attacker relies on and so appears more robust than it is, and which has limited the effectiveness of earlier adversarial training methods. By refining label representations, LTD aims to make neural networks more reliable in real-world applications, particularly on datasets where ambiguous examples are common.
The introduction of LTD fits with wider efforts in the AI community to improve model robustness, alongside related challenges such as effective unlearning methods and the trade-off between perceptual quality and data likelihood. Together these themes reflect growing attention to how data is represented and how that representation affects model performance.
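
The role of temperature in distillation can be illustrated with a minimal sketch. It is not the paper's implementation: the logits, the temperature values (0.5 and 5.0 for the teacher, 1.0 for the student), the function names, and the use of cross-entropy against the teacher's soft label are all assumptions chosen for illustration. The sketch only shows that dividing logits by a lower temperature yields a sharper soft label, while still leaving some probability on the runner-up class instead of collapsing to a one-hot vector.

```python
import math


def log_softmax_with_temperature(logits, temperature):
    """Return log(softmax(logits / temperature)) as a list of floats."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    log_total = math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - peak - log_total for s in scaled]


def softmax_with_temperature(logits, temperature):
    """Return softmax(logits / temperature) as a list of probabilities."""
    return [math.exp(v) for v in log_softmax_with_temperature(logits, temperature)]


def distillation_loss(teacher_logits, student_logits, teacher_temp, student_temp):
    """Cross-entropy of the student's prediction against the teacher's soft label.

    The two temperatures are independent arguments, so the student's can stay
    fixed while the teacher's is lowered (an assumed reading of the summary).
    """
    soft_label = softmax_with_temperature(teacher_logits, teacher_temp)
    student_log_probs = log_softmax_with_temperature(student_logits, student_temp)
    return -sum(p * lq for p, lq in zip(soft_label, student_log_probs))


# Hypothetical teacher logits for an ambiguous image: classes 0 and 1 are close.
teacher_logits = [4.0, 3.0, 0.5]
student_logits = [2.0, 1.5, 0.0]

low_temp_label = softmax_with_temperature(teacher_logits, 0.5)
high_temp_label = softmax_with_temperature(teacher_logits, 5.0)

print("teacher soft label, T=0.5:", [round(p, 3) for p in low_temp_label])
print("teacher soft label, T=5.0:", [round(p, 3) for p in high_temp_label])
print("loss with low-temperature teacher:",
      round(distillation_loss(teacher_logits, student_logits, 0.5, 1.0), 4))
```

In a real training loop the same quantities would be computed on batched tensors inside an adversarial training step; the pure-Python version here is only meant to make the temperature arithmetic easy to follow.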

— via World Pulse Now AI Editorial System

Continue Reading

arXiv — cs.CV•17 hours ago

Which Layer Causes Distribution Deviation? Entropy-Guided Adaptive Pruning for Diffusion and Flow Models

Positive•Artificial Intelligence

A new framework called EntPruner has been introduced to address parameter redundancy in large-scale vision generative models, specifically diffusion and flow models. This framework employs an entropy-guided automatic progressive pruning strategy, which assesses the importance of model blocks based on Conditional Entropy Deviation (CED) to optimize performance across various downstream tasks.

arXiv — cs.CV•17 hours ago

Filter Like You Test: Data-Driven Data Filtering for CLIP Pretraining

Positive•Artificial Intelligence

The introduction of Filter Like You Test (FLYT) presents a novel algorithm for curating large-scale vision-language datasets, enhancing the selection of pretraining examples by learning the usefulness of each data point through gradient signals from downstream tasks. This is complemented by Mixing-FLYT (M-FLYT) and Soft Cap Sampling (SCS), which improve dataset filtering and accuracy.

arXiv — cs.CV•17 hours ago

Dynamic Epsilon Scheduling: A Multi-Factor Adaptive Perturbation Budget for Adversarial Training

Positive•Artificial Intelligence

A novel framework called Dynamic Epsilon Scheduling (DES) has been proposed to enhance adversarial training for deep neural networks. This approach adapts the adversarial perturbation budget based on instance-specific characteristics, integrating factors such as distance to decision boundaries, prediction confidence, and model uncertainty. This advancement addresses the limitations of fixed perturbation budgets in existing methods.

arXiv — cs.CV•17 hours ago

From Diffusion to One-Step Generation: A Comparative Study of Flow-Based Models with Application to Image Inpainting

Positive•Artificial Intelligence

A comprehensive study has been conducted comparing three generative modeling paradigms: Denoising Diffusion Probabilistic Models (DDPM), Conditional Flow Matching (CFM), and MeanFlow, focusing on their application in image inpainting. The study highlights that CFM significantly outperforms DDPM in terms of efficiency and quality, achieving a notable FID score of 24.15 with only 50 steps, while MeanFlow allows for single-step generation, reducing inference time by 50 times.

arXiv — cs.CV•17 hours ago

Mechanisms of Non-Monotonic Scaling in Vision Transformers

Neutral•Artificial Intelligence

A recent study on Vision Transformers (ViTs) reveals a non-monotonic scaling behavior, where deeper models like ViT-L may underperform compared to shallower variants such as ViT-S and ViT-B. This research identifies a three-phase pattern—Cliff-Plateau-Climb—indicating how representation quality evolves with depth, particularly noting the diminishing role of the [CLS] token in favor of patch tokens for better performance.

arXiv — cs.LG•2 days ago

SG-OIF: A Stability-Guided Online Influence Framework for Reliable Vision Data

Positive•Artificial Intelligence

The Stability-Guided Online Influence Framework (SG-OIF) has been introduced to enhance the reliability of vision data in deep learning models, addressing challenges such as the computational expense of influence function implementations and the instability of training dynamics. This framework aims to provide real-time control over algorithmic stability, facilitating more accurate identification of critical training examples.

arXiv — cs.LG•2 days ago

DP-MicroAdam: Private and Frugal Algorithm for Training and Fine-tuning

Positive•Artificial Intelligence

The introduction of DP-MicroAdam marks a significant advancement in the realm of adaptive optimizers for differentially private training, demonstrating superior performance and convergence rates compared to traditional methods like DP-SGD. This new algorithm is designed to be memory-efficient and sparsity-aware, addressing the challenges of extensive compute and hyperparameter tuning typically associated with differential privacy.

arXiv — stat.ML•2 days ago

ModHiFi: Identifying High Fidelity predictive components for Model Modification

Positive•Artificial Intelligence

A recent study titled 'ModHiFi: Identifying High Fidelity predictive components for Model Modification' explores methods to modify open weight models without access to training data or loss functions. The research focuses on identifying critical components that influence predictive performance using only distributional access, such as synthetic data.
