NeuCLIP: Efficient Large-Scale CLIP Training with Neural Normalizer Optimization

arXiv cs.LG • Wednesday, November 12, 2025 at 5:00:00 AM

Positive | Artificial Intelligence

On November 12, 2025, the paper 'NeuCLIP: Efficient Large-Scale CLIP Training with Neural Normalizer Optimization' was submitted to arXiv. It targets a long-standing obstacle in training Contrastive Language-Image Pre-training (CLIP) models: accurately estimating the normalization term in the contrastive loss. Conventional methods estimate this term from large batches, which demands substantial computational resources. NeuCLIP reformulates the contrastive loss as a minimization problem and transforms it through variational analysis. This allows more accurate normalizer estimates and addresses the optimization errors that arise with smaller batches. An alternating optimization algorithm then trains the CLIP model together with an auxiliary network, improving the overall efficiency of training. This development is crucial as it open…

— via World Pulse Now AI Editorial System
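To make the "minimization problem" idea concrete, here is a minimal sketch of one standard identity that turns a log-normalizer into a minimization over an auxiliary variable: log(z) = min over u of z * exp(-u) + u - 1, attained at u = log(z). This illustrates the general technique only. The summary does not give NeuCLIP's exact objective, so the function names, the toy similarity values, and the temperature below are all assumptions.

```python
import math


def variational_objective(z: float, u: float) -> float:
    """Upper bound on log(z); tight exactly when u == log(z)."""
    return z * math.exp(-u) + u - 1.0


def fit_log_normalizer(z: float, steps: int = 200, lr: float = 0.5) -> float:
    """Minimize the bound over u with plain gradient descent (illustrative only)."""
    u = 0.0
    for _ in range(steps):
        grad = 1.0 - z * math.exp(-u)  # derivative of the bound with respect to u
        u -= lr * grad
    return u


# Hypothetical image-text similarities for one anchor, and an assumed temperature.
similarities = [0.9, 0.1, -0.3, 0.4]
temperature = 0.5

# Normalization term of the contrastive loss for this anchor (mean of exponentials).
z = sum(math.exp(s / temperature) for s in similarities) / len(similarities)

u_star = fit_log_normalizer(z)
print(u_star, math.log(z))
```

In this toy setting `u` is a single scalar fitted by gradient descent. The summary describes an auxiliary network trained alongside the CLIP model, which would presumably play the role of predicting such a variable per sample; that mapping is an interpretation, not something the summary confirms.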

Recommended Readings

Preserving Cross-Modal Consistency for CLIP-based Class-Incremental Learning

arXiv cs.CV • 2 days ago

Positive | Artificial Intelligence

The paper titled 'Preserving Cross-Modal Consistency for CLIP-based Class-Incremental Learning' addresses the challenges of class-incremental learning (CIL) in vision-language models like CLIP. It introduces a two-stage framework called DMC, which separates the adaptation of the vision encoder from the optimization of textual soft prompts. This approach aims to mitigate classifier bias and maintain cross-modal alignment, enhancing the model's ability to learn new categories without forgetting previously acquired knowledge.

CLIPPan: Adapting CLIP as A Supervisor for Unsupervised Pansharpening

arXiv cs.CV • 2 days ago

Positive | Artificial Intelligence

The article presents CLIPPan, an unsupervised pansharpening framework that uses CLIP, a vision-language model, as a supervisor. This approach addresses a key weakness of supervised pansharpening methods: the domain gap between simulated low-resolution training data and real-world high-resolution scenarios. The framework is designed to improve the understanding of the pansharpening process and enhance the model's ability to recognize various image types, ultimately setting a new state of the art in unsupervised full-resolution pans…

Neural Network-Powered Finger-Drawn Biometric Authentication

arXiv cs.LG • 2 days ago

Positive | Artificial Intelligence

A recent study published on arXiv investigates the use of neural networks for biometric authentication through finger-drawn digits on touchscreen devices. The research involved twenty participants who contributed a total of 2,000 finger-drawn digits. Two CNN architectures were evaluated, achieving approximately 89% authentication accuracy, while autoencoder approaches reached about 75% accuracy. The findings suggest that this method offers a secure and user-friendly biometric solution that can be integrated with existing authentication systems.

NP-LoRA: Null Space Projection Unifies Subject and Style in LoRA Fusion

arXiv cs.CV • 2 days ago

Positive | Artificial Intelligence

The article introduces NP-LoRA, a novel framework for Low-Rank Adaptation (LoRA) fusion that addresses the issue of interference in existing methods. Traditional weight-based merging often leads to one LoRA dominating another, resulting in degraded fidelity. NP-LoRA utilizes a projection-based approach to maintain subspace separation, thereby enhancing the quality of fusion by preventing structural interference among principal directions.

SplineSplat: 3D Ray Tracing for Higher-Quality Tomography

arXiv cs.CV • 2 days ago

Positive | Artificial Intelligence

The article presents a new method for computing tomographic projections of a 3D volume using a linear combination of shifted B-splines. This method employs a ray-tracing algorithm to calculate 3D line integrals with various projection geometries. A neural network is integrated into the algorithm to efficiently compute the contributions of the basis functions, resulting in higher reconstruction quality compared to traditional voxel-based methods.

Bi-Level Contextual Bandits for Individualized Resource Allocation under Delayed Feedback

arXiv cs.LG • 2 days ago

Positive | Artificial Intelligence

The article discusses a novel bi-level contextual bandit framework aimed at individualized resource allocation in high-stakes domains such as education, employment, and healthcare. This framework addresses the challenges of delayed feedback, hidden heterogeneity, and ethical constraints, which are often overlooked in traditional learning-based allocation methods. The proposed model optimizes budget allocations at the subgroup level while identifying responsive individuals using a neural network trained on observational data.
