Variational Supervised Contrastive Learning

arXiv — cs.LG • Monday, December 8, 2025 at 5:00:00 AM

Positive • Artificial Intelligence

Variational Supervised Contrastive Learning (VarCon) reformulates supervised contrastive learning as variational inference over latent class variables. The reformulation targets limitations in embedding distribution and generalization, and aims to improve class-aware matching while controlling intra-class dispersion in the embedding space.
VarCon addresses two known weaknesses of contrastive learning: a tendency to push semantically related instances apart, and a reliance on large numbers of in-batch negatives. Both can limit model performance across datasets such as CIFAR-10 and ImageNet.
The work reflects a broader trend in artificial intelligence research toward improving model robustness and generalization. Ongoing work on dataset distillation, adversarial training, and label-noise handling points to a collective effort to make machine learning techniques more effective and reliable in real-world applications.
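
For context, the reliance on in-batch negatives mentioned above can be seen in the conventional supervised contrastive (SupCon) objective. The following is a minimal pure-Python sketch of that baseline loss, not of VarCon itself; the function names `normalize` and `supcon_loss` and the default `temperature` value are illustrative assumptions, not taken from the paper.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def supcon_loss(embeddings, labels, temperature=0.1):
    """Baseline supervised contrastive loss over one batch (illustrative only).

    For each anchor i, every other same-label sample is a positive, and
    every other sample in the batch appears in the softmax denominator.
    """
    z = [normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor with no positives contributes nothing
        # Temperature-scaled cosine similarity to every other batch sample.
        logits = {
            a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
            for a in range(n)
            if a != i
        }
        # Numerically stable log-sum-exp for the denominator.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in logits.values()))
        # Average negative log-probability over this anchor's positives.
        total += -sum(logits[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0
```

Because the denominator runs over every other sample in the batch, the quality of the signal depends on how many negatives the batch happens to contain, which is the dependence the summary above refers to.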

— via World Pulse Now AI Editorial System

Continue Reading

arXiv — cs.CV • 2 days ago

The Inductive Bottleneck: Data-Driven Emergence of Representational Sparsity in Vision Transformers

Neutral • Artificial Intelligence

Recent research has identified an 'Inductive Bottleneck' in Vision Transformers (ViTs), where these models exhibit a U-shaped entropy profile, compressing information in middle layers before expanding it for final classification. This phenomenon is linked to the semantic abstraction required by specific tasks and is not merely an architectural flaw but a data-dependent adaptation observed across various datasets such as UC Merced, Tiny ImageNet, and CIFAR-100.

arXiv — cs.CV • 2 days ago

Thicker and Quicker: A Jumbo Token for Fast Plain Vision Transformers

Positive • Artificial Intelligence

A new approach to Vision Transformers (ViTs) has been introduced, featuring a Jumbo token that enhances processing speed by reducing patch token width while increasing global token width. This innovation aims to address the slow performance of ViTs without compromising their generality or accuracy, making them more practical for various applications.

arXiv — cs.CV • 2 days ago

PrunedCaps: A Case For Primary Capsules Discrimination

Positive • Artificial Intelligence

A recent study has introduced a pruned version of Capsule Networks (CapsNets), demonstrating that it can operate up to 9.90 times faster than traditional architectures by eliminating 95% of Primary Capsules while maintaining accuracy across various datasets, including MNIST and CIFAR-10.

arXiv — cs.CV • 2 days ago

Adaptive Dataset Quantization: A New Direction for Dataset Pruning

Positive • Artificial Intelligence

A new paper introduces an innovative dataset quantization method aimed at reducing storage and communication costs for large-scale datasets on resource-constrained edge devices. This approach focuses on compressing individual samples by minimizing intra-sample redundancy while retaining essential features, marking a shift from traditional inter-sample redundancy methods.

arXiv — cs.CV • 2 days ago

CLUENet: Cluster Attention Makes Neural Networks Have Eyes

Positive • Artificial Intelligence

The CLUster attEntion Network (CLUENet) has been introduced as a novel deep architecture aimed at enhancing visual semantic understanding by addressing the limitations of existing convolutional and attention-based models, particularly their rigid receptive fields and complex architectures. This innovation incorporates global soft aggregation, hard assignment, and improved cluster pooling strategies to enhance local modeling and interpretability.

arXiv — cs.LG • 2 days ago

Arc Gradient Descent: A Mathematically Derived Reformulation of Gradient Descent with Phase-Aware, User-Controlled Step Dynamics

Positive • Artificial Intelligence

The paper introduces Arc Gradient Descent (ArcGD), a new optimizer that reformulates traditional gradient descent to incorporate phase-aware, user-controlled step dynamics. In the reported evaluation, ArcGD outperforms the Adam optimizer on a non-convex benchmark and a real-world ML dataset, particularly in challenging scenarios such as the Rosenbrock function and CIFAR-10 image classification.

arXiv — cs.CV • 2 days ago

Causal Interpretability for Adversarial Robustness: A Hybrid Generative Classification Approach

Neutral • Artificial Intelligence

A new study presents a hybrid generative classification approach aimed at enhancing adversarial robustness in deep learning models. The proposed deep ensemble model integrates a pre-trained discriminative network for feature extraction with a generative classification network, achieving high accuracy and robustness against adversarial attacks without the need for adversarial training. Extensive experiments on CIFAR-10 and CIFAR-100 validate its effectiveness.

arXiv — cs.CV • 2 days ago

Structured Initialization for Vision Transformers

Positive • Artificial Intelligence

A new study proposes a structured initialization method for Vision Transformers (ViTs), aiming to integrate the strong inductive biases of Convolutional Neural Networks (CNNs) without altering the architecture. This approach is designed to enhance performance on small datasets while maintaining scalability as data increases.
