Attention Via Convolutional Nearest Neighbors

arXiv — cs.CV · Wednesday, November 19, 2025 at 5:00:00 AM
  • The Convolutional Nearest Neighbors (ConvNN) framework is a notable step toward integrating Convolutional Neural Networks (CNNs) and Transformers, showing that both architectures can be unified under a single neighbor-selection approach: convolution can be read as aggregating neighbors chosen by spatial proximity, attention as aggregating neighbors chosen by feature similarity (see the sketch below). This shared view enables a more systematic exploration of the two architectures' capabilities and interactions.
  • ConvNN matters because it gives researchers and practitioners a versatile tool for improving model performance across tasks, particularly in computer vision, by bridging the gap between CNNs and Transformers.
  • The framework reflects a broader trend in artificial intelligence research toward blending architectures. Hybrid models such as ConvNN align with ongoing efforts to improve model efficiency and effectiveness on tasks like image classification, as seen in related studies on enhancing CNNs and Transformers.
— via World Pulse Now AI Editorial System
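
To make the neighbor-selection view concrete, the toy below reduces both operations to the same recipe: pick k neighbors per token, then take a softmax-weighted sum. This is an illustrative sketch under assumed details, not the ConvNN implementation; the function name and aggregation scheme are invented for exposition.

```python
# Minimal sketch (not the authors' code): one "neighbor selection" layer
# that behaves convolution-like or attention-like depending on how
# neighbors are ranked.
import torch
import torch.nn.functional as F

def neighbor_aggregate(x, k, mode="feature"):
    """x: (N, D) token features. Aggregate each token over its k neighbors."""
    if mode == "feature":   # attention-like: neighbors by feature similarity
        sim = x @ x.t()                               # (N, N) similarity
    else:                   # convolution-like: neighbors by position
        pos = torch.arange(x.size(0), dtype=torch.float32)
        sim = -(pos[:, None] - pos[None, :]).abs()    # closer = higher score
    scores, idx = sim.topk(k, dim=-1)     # select k neighbors per token
    weights = F.softmax(scores, dim=-1)   # soft weights over the selection
    neighbors = x[idx]                    # (N, k, D) gathered features
    return (weights.unsqueeze(-1) * neighbors).sum(dim=1)

x = torch.randn(16, 32)                                  # 16 tokens, 32 dims
out_attn = neighbor_aggregate(x, k=4, mode="feature")    # attention-flavoured
out_conv = neighbor_aggregate(x, k=4, mode="spatial")    # convolution-flavoured
print(out_attn.shape, out_conv.shape)                    # (16, 32) twice
```

In this reading, a sliding convolution window is a fixed spatial top-k (with learned rather than softmax weights in a real CNN), while full self-attention is the feature-similarity case with k equal to the number of tokens.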

Recommended Readings
WARP-LUTs - Walsh-Assisted Relaxation for Probabilistic Look Up Tables
Positive · Artificial Intelligence
WARP-LUTs, or Walsh-Assisted Relaxation for Probabilistic Look-Up Tables, is a novel gradient-based method introduced to enhance machine learning efficiency. This approach focuses on learning combinations of logic gates with fewer trainable parameters, addressing the high computational costs associated with training models like Differentiable Logic Gate Networks (DLGNs). WARP-LUTs aim to improve accuracy, resource usage, and latency, making them a significant advancement in the field of AI.
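
For intuition about the kind of model WARP-LUTs targets, the toy below shows a generic differentiable relaxation of a two-input logic gate in the spirit of DLGNs, the baseline the method improves on. The Walsh-assisted relaxation itself is not reproduced here, and all names are illustrative.

```python
# Illustrative sketch of a differentiable logic-gate unit (DLGN-style);
# NOT the Walsh-assisted method described in the paper.
import torch
import torch.nn as nn

# Real-valued relaxations of a few two-input gates on inputs in [0, 1].
GATES = [
    lambda a, b: a * b,              # AND
    lambda a, b: a + b - a * b,      # OR
    lambda a, b: a + b - 2 * a * b,  # XOR
    lambda a, b: 1 - a * b,          # NAND
]

class SoftGate(nn.Module):
    """Learns a categorical distribution over candidate gates via softmax."""
    def __init__(self):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(len(GATES)))

    def forward(self, a, b):
        probs = torch.softmax(self.logits, dim=0)     # gate mixing weights
        outs = torch.stack([g(a, b) for g in GATES])  # (num_gates, ...)
        return (probs.view(-1, *[1] * a.dim()) * outs).sum(dim=0)

gate = SoftGate()
a, b = torch.rand(8), torch.rand(8)
y = gate(a, b)      # differentiable soft mixture of gate outputs
y.sum().backward()  # gradients flow into the gate-choice logits
print(gate.logits.grad)
```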
H-CNN-ViT: A Hierarchical Gated Attention Multi-Branch Model for Bladder Cancer Recurrence Prediction
Positive · Artificial Intelligence
Bladder cancer is a prevalent malignancy with a high recurrence rate of up to 78%, necessitating precise post-operative monitoring. Multi-sequence contrast-enhanced MRI is commonly utilized for recurrence detection, but interpreting these scans is challenging due to post-surgical changes. This study introduces a curated multi-sequence, multi-modal MRI dataset designed for bladder cancer recurrence prediction and proposes H-CNN-ViT, a new model aimed at enhancing prediction accuracy in this critical area.
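
The multi-branch gating idea can be sketched generically. The snippet below is an assumed, simplified structure rather than the paper's H-CNN-ViT: a learned sigmoid gate fuses pooled CNN and transformer features before a recurrence-prediction head.

```python
# Minimal sketch (assumed structure, not the paper's code): gated fusion of
# a CNN branch and a ViT-style branch into a single recurrence logit.
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Learns a per-feature gate weighing CNN vs. transformer features."""
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())
        self.head = nn.Linear(dim, 1)  # binary recurrence prediction

    def forward(self, f_cnn, f_vit):
        g = self.gate(torch.cat([f_cnn, f_vit], dim=-1))  # (B, dim) in [0, 1]
        fused = g * f_cnn + (1 - g) * f_vit               # convex combination
        return self.head(fused)                           # (B, 1) logit

model = GatedFusion(dim=256)
f_cnn = torch.randn(4, 256)        # pooled CNN features for 4 MRI studies
f_vit = torch.randn(4, 256)        # pooled transformer features
print(model(f_cnn, f_vit).shape)   # torch.Size([4, 1])
```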
Observational Auditing of Label Privacy
Positive · Artificial Intelligence
The article discusses a new framework for differential privacy auditing in machine learning systems. Traditional methods require altering training datasets, which can be resource-intensive. The proposed observational auditing framework utilizes the randomness of data distributions to evaluate privacy without modifying the original dataset. This approach extends privacy auditing to protected attributes, including labels, addressing significant gaps in existing techniques. Experiments conducted on Criteo and CIFAR-10 datasets validate its effectiveness.
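
As background for how such audits quantify privacy, the sketch below shows the standard hypothesis-testing bound used in differential-privacy auditing generally; whether the paper uses exactly this estimator is an assumption, and the example rates are invented.

```python
# Hedged sketch of the generic hypothesis-testing view behind DP auditing
# (not the paper's observational framework): an eps-DP mechanism forces any
# distinguishing attack to satisfy TPR <= exp(eps) * FPR, so an observed
# attack yields the lower bound eps >= log(TPR / FPR). Confidence intervals
# on the rates are omitted for brevity.
import math

def empirical_epsilon_lower_bound(tpr: float, fpr: float) -> float:
    """Lower-bound epsilon from an attack's true/false positive rates."""
    if tpr <= 0 or fpr <= 0:
        raise ValueError("rates must be positive to take the log-ratio")
    return max(0.0, math.log(tpr / fpr))

# Example: an attacker predicting sensitive labels from model outputs
# succeeds 60% of the time on true targets and 30% on controls.
print(empirical_epsilon_lower_bound(0.6, 0.3))  # ~0.693
```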
MI-to-Mid Distilled Compression (M2M-DC): A Hybrid Information-Guided Block Pruning with Progressive Inner Slicing Approach to Model Compression
Positive · Artificial Intelligence
MI-to-Mid Distilled Compression (M2M-DC) is a novel compression framework that combines information-guided block pruning with progressive inner slicing and staged knowledge distillation. The method ranks residual blocks based on a mutual information signal, removing the least informative units. It alternates short knowledge distillation phases with channel slicing to maintain computational efficiency while preserving model accuracy. The approach has demonstrated promising results on CIFAR-100, achieving high accuracy with significantly reduced parameters.
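
A schematic of information-guided block pruning, with heavy assumptions: the variance-based score below is only a stand-in for the paper's mutual-information signal, and the knowledge-distillation and inner-slicing stages are omitted.

```python
# Illustrative sketch: rank residual blocks by an informativeness proxy on a
# calibration batch, then replace the weakest with identity (i.e. prune).
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(),
                                  nn.Linear(dim, dim))
    def forward(self, x):
        return x + self.body(x)

blocks = nn.ModuleList([ResBlock(64) for _ in range(6)])

# Score each block while forwarding the calibration batch sequentially.
h = torch.randn(128, 64)
scores = []
with torch.no_grad():
    for blk in blocks:
        scores.append(blk.body(h).var().item())  # proxy for MI signal
        h = blk(h)

# Prune the two lowest-scoring blocks by swapping in identity modules.
for i in sorted(range(len(scores)), key=scores.__getitem__)[:2]:
    blocks[i] = nn.Identity()
print([type(b).__name__ for b in blocks])
```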
Benchmark on Drug Target Interaction Modeling from a Drug Structure Perspective
Positive · Artificial Intelligence
The article discusses advancements in predicting drug-target interactions, a critical aspect of drug discovery and design. Recent methods built on deep learning, particularly graph neural networks (GNNs) and Transformers, have shown remarkable performance by effectively extracting structural information. However, benchmarking practices for these methods vary significantly, which hampers fair comparison and slows algorithmic progress. The authors conducted a comprehensive survey and benchmark that integrates various structure-learning algorithms for improved modeling.
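
For readers new to the task, here is a minimal two-tower formulation of drug-target interaction scoring. It is a generic illustration, not one of the benchmarked methods; the single message-passing step stands in for a full GNN encoder.

```python
# Toy two-tower DTI sketch: a one-step message-passing encoder for the drug
# graph and a mean-pooled embedding encoder for the protein sequence.
import torch
import torch.nn as nn

class TinyDTI(nn.Module):
    def __init__(self, atom_feats, n_residues, dim):
        super().__init__()
        self.atom_proj = nn.Linear(atom_feats, dim)
        self.res_embed = nn.Embedding(n_residues, dim)

    def forward(self, atom_x, adj, protein):
        # One round of neighbor averaging over the molecular graph.
        h = torch.relu(self.atom_proj(atom_x))        # (n_atoms, dim)
        deg = adj.sum(-1, keepdim=True).clamp(min=1)
        h = (adj @ h) / deg                           # message passing
        drug = h.mean(dim=0)                          # pooled drug vector
        target = self.res_embed(protein).mean(dim=0)  # pooled protein vector
        return (drug * target).sum()                  # interaction score

model = TinyDTI(atom_feats=8, n_residues=21, dim=32)
atom_x = torch.randn(5, 8)               # 5 atoms, 8 features each
adj = (torch.rand(5, 5) > 0.5).float()   # toy adjacency matrix
protein = torch.randint(0, 21, (50,))    # 50-residue toy sequence
print(model(atom_x, adj, protein))       # scalar affinity logit
```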
Can You Learn to See Without Images? Procedural Warm-Up for Vision Transformers
Positive · Artificial Intelligence
This study explores a novel approach to enhance vision transformers (ViTs) by pretraining them on procedurally-generated data that lacks visual or semantic content. Utilizing simple algorithms, the research aims to instill generic biases in ViTs, allowing them to internalize abstract computational priors. The findings indicate that this warm-up phase, followed by standard image-based training, significantly boosts data efficiency, convergence speed, and overall performance, with notable improvements observed on ImageNet-1k.
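
The warm-up recipe can be sketched as follows. The procedural generator here (sine gratings labelled by frequency bucket) is an invented stand-in for the paper's actual procedural data, and a small MLP stands in for the ViT.

```python
# Hedged sketch: pretrain on procedurally generated, semantics-free inputs
# before standard image-based training.
import torch
import torch.nn as nn

def procedural_batch(n, size=32, n_classes=4):
    xs = torch.linspace(0, 3.14159, size)
    grid = xs[None, :] + xs[:, None]               # (size, size) ramp
    labels = torch.randint(0, n_classes, (n,))
    freqs = (labels + 1).float()                   # label == frequency bucket
    imgs = torch.sin(freqs[:, None, None] * grid)  # (n, size, size)
    return imgs.unsqueeze(1), labels               # add channel dim

model = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 128), nn.ReLU(),
                      nn.Linear(128, 4))           # stand-in for a ViT
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for _ in range(100):                               # warm-up phase
    imgs, labels = procedural_batch(64)
    loss = nn.functional.cross_entropy(model(imgs), labels)
    opt.zero_grad(); loss.backward(); opt.step()
# ...then continue with standard image training on the warmed-up weights.
```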
DeepDefense: Layer-Wise Gradient-Feature Alignment for Building Robust Neural Networks
Positive · Artificial Intelligence
Deep neural networks are susceptible to adversarial perturbations that can lead to incorrect predictions. The paper introduces DeepDefense, a defense framework utilizing Gradient-Feature Alignment (GFA) regularization across multiple layers to mitigate this vulnerability. By aligning input gradients with internal feature representations, DeepDefense creates a smoother loss landscape, reducing sensitivity to adversarial noise. The method shows significant robustness improvements against various attacks, particularly on the CIFAR-10 dataset.
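
One plausible reading of Gradient-Feature Alignment, sketched with assumptions: each layer's post-activation features are encouraged, via cosine similarity, to align with the loss gradient taken with respect to those same features. The paper's exact formulation may differ, and the trade-off weight is arbitrary.

```python
# Hedged sketch of a layer-wise gradient-feature alignment penalty.
import torch
import torch.nn as nn
import torch.nn.functional as F

net = nn.Sequential(nn.Linear(10, 32), nn.ReLU(),
                    nn.Linear(32, 32), nn.ReLU(),
                    nn.Linear(32, 2))
x, y = torch.randn(16, 10), torch.randint(0, 2, (16,))

# Forward pass, collecting post-activation features layer by layer.
feats, h = [], x
for layer in net:
    h = layer(h)
    if isinstance(layer, nn.ReLU):
        feats.append(h)
task_loss = F.cross_entropy(h, y)

# Gradients of the loss w.r.t. each feature map, kept differentiable.
grads = torch.autograd.grad(task_loss, feats, create_graph=True)
align = sum((1 - F.cosine_similarity(f, g, dim=1)).mean()
            for f, g in zip(feats, grads))

loss = task_loss + 0.1 * align  # 0.1 is an arbitrary trade-off weight
loss.backward()
```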
ARC Is a Vision Problem!
Positive · Artificial Intelligence
The Abstraction and Reasoning Corpus (ARC) aims to advance research in abstract reasoning, a key component of human intelligence. Traditional methods approach ARC as a language problem, often utilizing large language models or recurrent reasoning models. This study proposes a vision-centric approach, treating ARC as an image-to-image translation task. By using a 'canvas' for input representation, standard vision architectures like Vision Transformers (ViT) can be applied, allowing the model to generalize to new tasks through test-time training.
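
A minimal sketch of the canvas representation, with assumed details: a variable-size ARC grid of colour indices is painted into a fixed-size canvas filled with a padding colour, then one-hot encoded so a standard vision model can consume it as an image.

```python
# Illustrative canvas encoding for ARC grids (details assumed, not the
# paper's code).
import torch

CANVAS, PAD = 32, 10  # canvas side length; 10 = non-ARC padding colour

def to_canvas(grid: torch.Tensor) -> torch.Tensor:
    """grid: (h, w) ints in 0..9 -> (CANVAS, CANVAS) with PAD elsewhere."""
    canvas = torch.full((CANVAS, CANVAS), PAD, dtype=torch.long)
    h, w = grid.shape
    canvas[:h, :w] = grid
    return canvas

task_input = torch.randint(0, 10, (5, 7))  # a toy 5x7 ARC grid
x = to_canvas(task_input)
# One-hot over the 11 colours gives an image-like (11, CANVAS, CANVAS)
# tensor; a ViT or any image-to-image model is then trained to emit the
# output grid painted onto the same canvas.
img = torch.nn.functional.one_hot(x, 11).permute(2, 0, 1).float()
print(img.shape)  # torch.Size([11, 32, 32])
```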