S$^2$M-Former: Spiking Symmetric Mixing Branchformer for Brain Auditory Attention Detection

arXiv — cs.LG · Wednesday, November 12, 2025 at 5:00:00 AM
The S$^2$M-Former marks a significant advance in auditory attention detection (AAD), a task central to developing neuro-steered hearing devices. Its spiking symmetric architecture, with parallel spatial and frequency branches, strengthens the complementary learning of EEG features. Lightweight 1D token sequences yield a 14.7× reduction in parameters, and the brain-inspired spiking design cuts energy consumption by 5.8× compared with recent artificial neural network (ANN) methods. Comprehensive experiments on three AAD benchmarks show that S$^2$M-Former matches state-of-the-art decoding accuracy while being far more energy-efficient, a notable step for applying EEG decoding in complex auditory environments.
— via World Pulse Now AI Editorial System
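For readers who want a concrete picture, the sketch below illustrates the general idea of a symmetric dual-branch spiking mixing block operating on 1D token sequences: one branch mixes spatial (channel) tokens, the other frequency-band tokens, and both emit binary spikes before a symmetric fusion. It is a minimal PyTorch sketch under assumed shapes and module names (SpikingBranch, SymmetricMixingBlock are hypothetical), not the authors' implementation.

```python
# Minimal sketch of a symmetric dual-branch spiking mixing block, assuming the
# paper's high-level design of parallel spatial and frequency branches over 1D
# token sequences. Names, shapes, and fusion are illustrative, not the authors' code.
import torch
import torch.nn as nn


class SpikeFn(torch.autograd.Function):
    """Heaviside spike with a rectangular surrogate gradient."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return (x > 0).float()

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # Pass gradients only near the firing threshold.
        return grad_out * (x.abs() < 0.5).float()


class SpikingBranch(nn.Module):
    """One branch: token and channel mixing followed by spiking activation."""

    def __init__(self, n_tokens, d_model):
        super().__init__()
        self.mix_tokens = nn.Linear(n_tokens, n_tokens)   # mixes across tokens
        self.mix_channels = nn.Linear(d_model, d_model)   # mixes across features

    def forward(self, x):  # x: (batch, n_tokens, d_model)
        x = self.mix_tokens(x.transpose(1, 2)).transpose(1, 2)
        x = self.mix_channels(x)
        return SpikeFn.apply(x)  # binary spikes enable event-driven, low-energy compute


class SymmetricMixingBlock(nn.Module):
    """Parallel spatial and frequency branches fused symmetrically."""

    def __init__(self, n_spatial, n_freq, d_model):
        super().__init__()
        self.spatial = SpikingBranch(n_spatial, d_model)
        self.frequency = SpikingBranch(n_freq, d_model)
        self.fuse = nn.Linear(2 * d_model, d_model)

    def forward(self, x_spatial, x_freq):
        s = self.spatial(x_spatial).mean(dim=1)    # pool spatial tokens
        f = self.frequency(x_freq).mean(dim=1)     # pool frequency tokens
        return self.fuse(torch.cat([s, f], dim=-1))
```

Feeding spatial tokens of shape (batch, channels, d_model) and frequency tokens of shape (batch, bands, d_model) yields a fused embedding that a small classifier could map to the attended-speaker label.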


Recommended Readings
STAMP: Spatial-Temporal Adapter with Multi-Head Pooling
Positive · Artificial Intelligence
The article introduces STAMP, a Spatial-Temporal Adapter with Multi-Head Pooling, designed to adapt general time series foundation models (TSFMs) to electroencephalography (EEG) data. STAMP takes the univariate embeddings produced by a general TSFM and models the spatial-temporal characteristics of EEG on top of them. The study demonstrates that STAMP achieves performance comparable to state-of-the-art EEG-specific foundation models (EEGFMs) across eight benchmark classification datasets.
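As a rough illustration of the named components, the following PyTorch sketch mixes per-channel TSFM embeddings across space and time and aggregates them with multi-head attention pooling; shapes, module names, and the pooling formulation are assumptions, not STAMP's actual design.

```python
# Hedged sketch of an adapter with multi-head attention pooling over per-channel
# TSFM embeddings, assuming embeddings of shape (batch, channels, time, dim).
import torch
import torch.nn as nn


class MultiHeadPooling(nn.Module):
    """Learned attention pooling with several independent heads."""

    def __init__(self, dim, n_heads=4):
        super().__init__()
        self.score = nn.Linear(dim, n_heads)          # one attention score per head
        self.out = nn.Linear(n_heads * dim, dim)

    def forward(self, x):                             # x: (batch, tokens, dim)
        w = self.score(x).softmax(dim=1)              # (batch, tokens, heads)
        pooled = torch.einsum("bth,btd->bhd", w, x)   # head-wise weighted sums
        return self.out(pooled.flatten(1))            # (batch, dim)


class SpatialTemporalAdapter(nn.Module):
    """Mixes frozen TSFM embeddings across channels and time, then pools."""

    def __init__(self, dim, n_channels):
        super().__init__()
        self.spatial = nn.Linear(n_channels, n_channels)
        self.temporal = nn.GRU(dim, dim, batch_first=True)
        self.pool = MultiHeadPooling(dim)

    def forward(self, emb):                           # emb: (batch, channels, time, dim)
        b, c, t, d = emb.shape
        emb = self.spatial(emb.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)
        emb, _ = self.temporal(emb.reshape(b * c, t, d))
        return self.pool(emb.reshape(b, c * t, d))
```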
On bounds for norms of reparameterized ReLU artificial neural network parameters: sums of fractional powers of the Lipschitz norm control the network parameter vector
Neutral · Artificial Intelligence
A recent study published on arXiv discusses the bounds for norms of reparameterized ReLU artificial neural network (ANN) parameters. It establishes that the Lipschitz norm of the realization function of a feedforward fully-connected ReLU ANN can be bounded from above by sums of powers of the ANN parameter vector norm. The study also reveals that for shallow ANNs, the converse inequality holds true, and the upper bound is valid only when using the Lipschitz norm, not for Hölder or Sobolev-Slobodeckij norms.
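Schematically, and with constants and exponents used here only as placeholders rather than the paper's precise values, the two directions described above can be read as

$$\|\mathcal{R}_\theta\|_{\mathrm{Lip}} \;\le\; c \sum_{k=1}^{K} \|\theta\|^{q_k}, \qquad\text{and, for shallow ANNs,}\qquad \inf_{\vartheta:\,\mathcal{R}_\vartheta=\mathcal{R}_\theta} \|\vartheta\| \;\le\; C \sum_{k=1}^{K} \|\mathcal{R}_\theta\|_{\mathrm{Lip}}^{p_k},$$

where $\mathcal{R}_\theta$ denotes the realization function of the network with parameter vector $\theta$, the infimum runs over reparameterizations realizing the same function, and the exponents $p_k$ are fractional; per the summary, the statement is specific to the Lipschitz norm and does not extend to Hölder or Sobolev-Slobodeckij norms.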
CAT-Net: A Cross-Attention Tone Network for Cross-Subject EEG-EMG Fusion Tone Decoding
Positive · Artificial Intelligence
The study presents CAT-Net, a novel cross-subject multimodal brain-computer interface (BCI) decoding framework that integrates electroencephalography (EEG) and electromyography (EMG) signals to classify four Mandarin tones. This approach addresses the challenges of tonal variations in Mandarin, which can alter meanings despite identical phonemes. The framework demonstrates strong performance, achieving classification accuracies of 87.83% for audible speech and 88.08% for silent speech across 4800 EEG and 4800 EMG trials with 10 participants.
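The fusion idea can be pictured with a short PyTorch sketch in which EEG tokens attend to EMG tokens and vice versa before a four-way tone classifier; dimensions, head counts, and the classifier head are illustrative assumptions rather than CAT-Net's reported configuration.

```python
# Hedged sketch of cross-attention fusion between EEG and EMG token streams,
# assuming both modalities are already embedded to a shared dimension.
import torch
import torch.nn as nn


class CrossAttentionFusion(nn.Module):
    """EEG tokens attend to EMG tokens and vice versa, then classify 4 tones."""

    def __init__(self, dim=128, n_heads=4, n_tones=4):
        super().__init__()
        self.eeg_to_emg = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.emg_to_eeg = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.classifier = nn.Linear(2 * dim, n_tones)

    def forward(self, eeg, emg):       # (batch, tokens, dim) each
        eeg_ctx, _ = self.eeg_to_emg(query=eeg, key=emg, value=emg)
        emg_ctx, _ = self.emg_to_eeg(query=emg, key=eeg, value=eeg)
        fused = torch.cat([eeg_ctx.mean(dim=1), emg_ctx.mean(dim=1)], dim=-1)
        return self.classifier(fused)  # logits over the four Mandarin tones
```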
EMOD: A Unified EEG Emotion Representation Framework Leveraging V-A Guided Contrastive Learning
Positive · Artificial Intelligence
The article discusses EMOD, a new framework for emotion recognition from EEG signals, which addresses the limitations of existing deep learning models. These models often struggle with generalization across different datasets due to varying annotation schemes and data formats. EMOD utilizes Valence-Arousal (V-A) Guided Contrastive Learning to create transferable representations from heterogeneous datasets, projecting emotion labels into a unified V-A space and employing a soft-weighted supervised contrastive loss to enhance performance.
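A minimal sketch of what a soft-weighted supervised contrastive loss in a unified valence-arousal space could look like is given below; the Gaussian weighting kernel, temperature, and bandwidth are assumed for illustration and are not taken from EMOD.

```python
# Hedged sketch of a soft-weighted supervised contrastive loss in valence-arousal
# space: pair weights decay with the distance between (valence, arousal) labels.
import torch
import torch.nn.functional as F


def soft_weighted_supcon(z, va, temperature=0.1, sigma=0.5):
    """z: (N, d) embeddings; va: (N, 2) valence-arousal labels in a unified space."""
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / temperature                        # (N, N) cosine logits
    label_dist = torch.cdist(va, va)                     # distances in V-A space
    w = torch.exp(-label_dist.pow(2) / (2 * sigma**2))   # soft positive weights
    eye = torch.eye(len(z), dtype=torch.bool, device=z.device)
    w = w.masked_fill(eye, 0.0)                          # exclude self-pairs
    log_prob = sim - torch.logsumexp(
        sim.masked_fill(eye, float("-inf")), dim=1, keepdim=True
    )
    # Weighted average of log-probabilities over soft positives for each anchor.
    return -(w * log_prob).sum(dim=1).div(w.sum(dim=1).clamp_min(1e-8)).mean()
```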
Shrinking the Teacher: An Adaptive Teaching Paradigm for Asymmetric EEG-Vision Alignment
Positive · Artificial Intelligence
The article discusses a new adaptive teaching paradigm aimed at improving the decoding of visual features from EEG signals. It highlights the inherent asymmetry between visual and brain modalities, characterized by a Fidelity Gap and a Semantic Gap. The proposed method allows the visual modality to adjust its knowledge structure to better align with the EEG modality, achieving a top-1 accuracy of 60.2%, which is a 9.8% improvement over previous methods.
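Purely as an illustration of asymmetric alignment, and not the paper's actual method, the sketch below compresses visual teacher features through a small adapter into the EEG student's embedding space and applies a cosine alignment loss; the dimensions and the loss choice are assumptions.

```python
# Illustrative sketch of asymmetric teacher-student alignment: a small adapter
# "shrinks" visual (teacher) features toward the EEG (student) space before an
# alignment loss. All dimensions and the cosine objective are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ShrunkTeacherAlignment(nn.Module):
    def __init__(self, vision_dim=768, eeg_dim=128):
        super().__init__()
        # Only the teacher side is adapted, reflecting the asymmetry between
        # rich visual features and noisier EEG representations.
        self.adapter = nn.Sequential(
            nn.Linear(vision_dim, eeg_dim), nn.GELU(), nn.Linear(eeg_dim, eeg_dim)
        )

    def forward(self, vision_feat, eeg_feat):
        teacher = F.normalize(self.adapter(vision_feat), dim=-1)
        student = F.normalize(eeg_feat, dim=-1)
        return 1.0 - (teacher * student).sum(dim=-1).mean()  # cosine alignment loss
```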