LUNA: Linear Universal Neural Attention with Generalization Guarantees

arXiv — stat.ML · Wednesday, December 10, 2025 at 5:00:00 AM
  • A new linear attention mechanism named LUNA has been introduced, addressing the computational bottleneck of traditional softmax attention, whose cost grows quadratically with sequence length. LUNA achieves linear cost while matching or exceeding the accuracy of quadratic attention by learning a kernel feature map tailored to the data and task (a minimal linear-attention sketch follows this summary).
  • This advancement is significant because it enables more efficient processing of long sequences across applications, potentially speeding up models such as BERT and ViT-B/16 without sacrificing accuracy.
  • The development of LUNA reflects a broader trend in artificial intelligence towards optimizing attention mechanisms, as seen in various frameworks that integrate multi-modal data and enhance capabilities in areas such as financial sentiment analysis and time series forecasting.
— via World Pulse Now AI Editorial System
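
To make the idea above concrete, the sketch below shows the generic kernelized linear-attention pattern: replace the softmax kernel with a learned feature map φ so attention can be computed as φ(Q)(φ(K)ᵀV) at cost linear in sequence length. The MLP feature map, the module name `LearnedKernelLinearAttention`, and all dimensions are illustrative assumptions; the summary does not specify LUNA's actual feature map.

```python
import torch
import torch.nn as nn

class LearnedKernelLinearAttention(nn.Module):
    """Illustrative linear attention with a learned kernel feature map.

    Softmax attention computes softmax(QK^T)V at O(n^2) cost. If the kernel is
    approximated by phi(q) . phi(k) for a learned map phi, attention becomes
    phi(Q) @ (phi(K)^T @ V), which is linear in the sequence length n.
    The MLP feature map below is an assumption, not LUNA's actual map.
    """

    def __init__(self, d_model: int, d_feature: int = 64):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        # Learned feature map phi; softplus keeps features positive so the
        # normalizer below stays well behaved.
        self.phi = nn.Sequential(nn.Linear(d_model, d_feature), nn.Softplus())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n, d_model)
        q = self.phi(self.q_proj(x))              # (batch, n, d_feature)
        k = self.phi(self.k_proj(x))              # (batch, n, d_feature)
        v = self.v_proj(x)                        # (batch, n, d_model)
        kv = torch.einsum("bnf,bnd->bfd", k, v)   # summarize keys/values once: O(n)
        z = 1.0 / (torch.einsum("bnf,bf->bn", q, k.sum(dim=1)) + 1e-6)
        return torch.einsum("bnf,bfd,bn->bnd", q, kv, z)

# Usage: attention over a length-1024 sequence at linear cost.
attn = LearnedKernelLinearAttention(d_model=128)
out = attn(torch.randn(2, 1024, 128))             # (2, 1024, 128)
```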


Continue Reading
Using Text-Based Life Trajectories from Swedish Register Data to Predict Residential Mobility with Pretrained Transformers
Positive · Artificial Intelligence
A recent study has transformed extensive Swedish register data into textual life trajectories to predict residential mobility, drawing on data from 6.9 million individuals between 2001 and 2013. By converting demographic attributes and life-course changes into semantically rich texts, the research applies several NLP architectures, including LSTM and BERT, to improve prediction accuracy for residential moves from 2013 to 2017.
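
As a rough illustration of the text-serialization step described above, a register record can be rendered as a short sentence and fed to a standard pretrained tokenizer and classifier. The field names and sentence template below are assumptions for illustration, not the study's actual schema.

```python
from transformers import AutoTokenizer

def life_event_to_text(event: dict) -> str:
    """Render one register record as a plain-language sentence (illustrative template)."""
    return (f"In {event['year']}, a {event['age']}-year-old {event['occupation']} "
            f"living in {event['municipality']} {event['event']}.")

trajectory = [
    {"year": 2004, "age": 27, "occupation": "nurse", "municipality": "Uppsala",
     "event": "started a new job"},
    {"year": 2007, "age": 30, "occupation": "nurse", "municipality": "Uppsala",
     "event": "had a first child"},
]

# Concatenate the per-year sentences into one trajectory text and tokenize it
# for a BERT-style classifier that predicts a later residential move.
text = " ".join(life_event_to_text(e) for e in trajectory)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
```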
Language Models for Controllable DNA Sequence Design
Positive · Artificial Intelligence
Researchers have introduced ATGC-Gen, an Automated Transformer Generator designed for controllable DNA sequence design, which generates sequences based on specific biological properties. This model utilizes cross-modal encoding and can operate under various transformer architectures, enhancing its flexibility in training and generation tasks, particularly in promoter and enhancer sequence design.
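
To illustrate the general idea of property-conditioned sequence generation described above, the sketch below prepends a control token for the desired property to a tiny decoder-only transformer and samples bases autoregressively. The vocabulary, control tokens, and model sizes are assumptions; ATGC-Gen's actual cross-modal encoding is not detailed in the summary.

```python
import torch
import torch.nn as nn

# Illustrative vocabulary: the four bases plus control tokens marking the
# desired property (promoter vs. enhancer). Token ids are assumptions.
VOCAB = {"A": 0, "C": 1, "G": 2, "T": 3, "<promoter>": 4, "<enhancer>": 5, "<bos>": 6}

class TinyDNAGenerator(nn.Module):
    """Minimal decoder-only transformer for property-conditioned DNA generation."""

    def __init__(self, vocab_size=len(VOCAB), d_model=64, n_layers=2, n_heads=4, max_len=512):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, dim_feedforward=128, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, ids):
        n = ids.size(1)
        h = self.tok(ids) + self.pos(torch.arange(n, device=ids.device))
        causal = torch.triu(torch.full((n, n), float("-inf"), device=ids.device), diagonal=1)
        return self.head(self.blocks(h, mask=causal))

@torch.no_grad()
def generate(model, property_token: str, length: int = 50) -> str:
    """Autoregressively sample bases conditioned on a property control prefix."""
    ids = torch.tensor([[VOCAB["<bos>"], VOCAB[property_token]]])
    for _ in range(length):
        logits = model(ids)[:, -1, :4]            # restrict sampling to A/C/G/T
        next_id = torch.multinomial(torch.softmax(logits, dim=-1), 1)
        ids = torch.cat([ids, next_id], dim=1)
    bases = "ACGT"
    return "".join(bases[i] for i in ids[0, 2:].tolist())

model = TinyDNAGenerator()            # untrained here; output is random until trained
print(generate(model, "<promoter>"))  # a 50-bp sequence conditioned on the <promoter> token
```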
SPOOF: Simple Pixel Operations for Out-of-Distribution Fooling
Neutral · Artificial Intelligence
A new study titled 'SPOOF: Simple Pixel Operations for Out-of-Distribution Fooling' reveals that deep neural networks (DNNs) continue to assign overconfident predictions to inputs that do not resemble natural images. The research revisits fooling images and confirms that modern architectures, particularly the transformer-based ViT-B/16, can be driven to confident misclassifications with fewer queries than convolution-based models require.
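
As a schematic of the fooling-image setting described above, the sketch below runs a generic query-based random search against a pretrained ViT-B/16: starting from noise, it keeps single-pixel edits that raise the model's confidence in an arbitrary class. This is a simplified stand-in for the paper's procedure, not SPOOF's actual algorithm; the query budget and target class id are assumptions.

```python
import torch
from torchvision.models import vit_b_16, ViT_B_16_Weights

# Pretrained ViT-B/16 classifier and its standard preprocessing.
weights = ViT_B_16_Weights.DEFAULT
model = vit_b_16(weights=weights).eval()
preprocess = weights.transforms()

target_class = 207                              # arbitrary ImageNet class id (assumption)
img = torch.rand(3, 224, 224)                   # non-natural starting image

@torch.no_grad()
def target_confidence(x: torch.Tensor) -> float:
    logits = model(preprocess(x).unsqueeze(0))
    return torch.softmax(logits, dim=-1)[0, target_class].item()

best = target_confidence(img)
for _ in range(2000):                           # each evaluation costs one model query
    cand = img.clone()
    c = torch.randint(3, (1,))
    y, x = torch.randint(224, (1,)), torch.randint(224, (1,))
    cand[c, y, x] = torch.rand(1)               # a simple pixel operation
    score = target_confidence(cand)
    if score > best:                            # greedy: keep edits that raise confidence
        img, best = cand, score

print(f"confidence on an image with no natural structure: {best:.3f}")
```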
Less Is More for Multi-Step Logical Reasoning of LLM Generalisation Under Rule Removal, Paraphrasing, and Compression
Neutral · Artificial Intelligence
Recent research has introduced a controlled evaluation framework to assess the generalization capabilities of large language models (LLMs) like BERT, Qwen2, and LLaMA under various logical perturbations, including rule deletion and contradictory evidence. The findings indicate that these models maintain high accuracy despite structural changes in reasoning tasks.
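
The perturbations named above (rule deletion, paraphrasing) can be pictured with a toy rule-chain task like the one below. The rules, templates, and prompt format are illustrative assumptions, not the paper's actual evaluation framework.

```python
import random

# A small forward-chaining task: rules, facts, and a question whose answer
# depends on the full chain of rules.
rules = [
    "If an animal is a dog, then it is a mammal.",
    "If an animal is a mammal, then it is warm-blooded.",
    "If an animal is warm-blooded, then it regulates its body temperature.",
]
facts = ["Rex is a dog."]
question = "Does Rex regulate its body temperature?"

def remove_rule(rules, idx):
    """Rule-deletion perturbation: drop one step of the chain."""
    return rules[:idx] + rules[idx + 1:]

def paraphrase(rule):
    """Paraphrasing perturbation (a trivial template swap for illustration)."""
    return rule.replace("If an animal is", "Whenever an animal happens to be")

def build_prompt(rules, facts, question):
    return "Rules:\n" + "\n".join(rules) + "\nFacts:\n" + "\n".join(facts) + f"\nQuestion: {question}"

perturbed = [paraphrase(r) for r in remove_rule(rules, idx=random.randrange(len(rules)))]
print(build_prompt(perturbed, facts, question))   # prompt sent to the LLM under evaluation
```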
LUNA: LUT-Based Neural Architecture for Fast and Low-Cost Qubit Readout
Positive · Artificial Intelligence
A new architecture named LUNA has been proposed to enhance qubit readout in quantum computing, utilizing a combination of low-cost integrator-based preprocessing and Look-Up Table (LUT) based neural networks. This approach aims to improve the accuracy and speed of qubit readout, which is essential for quantum error correction and low-latency decoding processes.
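
To illustrate the LUT-based readout idea described above, the sketch below integrates a raw readout trace down to two quadrature values (I, Q), quantizes them, and replaces the classifier's runtime inference with a precomputed lookup table. The bit width, the grid, and the stand-in classifier are assumptions for illustration only, not the proposed architecture.

```python
import numpy as np

BITS = 6                                   # quantization width per quadrature (assumption)
LEVELS = 2 ** BITS

def integrate(signal_i, signal_q):
    """Integrator-based preprocessing: reduce a time trace to a single (I, Q) pair."""
    return signal_i.mean(), signal_q.mean()

def classifier(i, q):
    """Stand-in for a trained readout network (here, a fixed linear boundary)."""
    return 1 if 0.8 * i + 0.6 * q > 0.0 else 0

# Precompute the lookup table over every quantized (I, Q) pair, so runtime
# readout is a single memory access instead of a neural-network evaluation.
grid = np.linspace(-1.0, 1.0, LEVELS)
lut = np.array([[classifier(i, q) for q in grid] for i in grid], dtype=np.uint8)

def quantize(x):
    return int(np.clip((x + 1.0) / 2.0 * (LEVELS - 1), 0, LEVELS - 1))

def readout(signal_i, signal_q):
    i, q = integrate(signal_i, signal_q)
    return lut[quantize(i), quantize(q)]   # O(1) table lookup at readout time

# Usage: a noisy trace whose integrated (I, Q) falls in the excited-state region.
rng = np.random.default_rng(0)
trace_i = 0.4 + 0.1 * rng.standard_normal(1024)
trace_q = 0.2 + 0.1 * rng.standard_normal(1024)
print("qubit state:", readout(trace_i, trace_q))
```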