RS-CA-HSICT: A Residual and Spatial Channel Augmented CNN Transformer Framework for Monkeypox Detection

arXiv — cs.LG • Thursday, November 20, 2025 at 5:00:00 AM

Positive • Artificial Intelligence

The RS
This development matters because it applies deep learning to a public health problem: detecting emerging infectious diseases such as monkeypox. More reliable detection can support better monitoring and response strategies.
The combination of CNN and Transformer models reflects a broader trend in artificial intelligence, where hybrid architectures are increasingly used to tackle complex problems in domains such as healthcare and robotics.

— via World Pulse Now AI Editorial System

Recommended Readings

arXiv — cs.LG • 7 hours ago

BrainRotViT: Transformer-ResNet Hybrid for Explainable Modeling of Brain Aging from 3D sMRI

Positive • Artificial Intelligence

The BrainRotViT model combines Vision Transformer and ResNet architectures to improve brain age estimation from structural MRI scans. This hybrid approach addresses limitations of traditional methods, such as manual feature engineering and overfitting, by leveraging both global context and local refinement. The model is trained on auxiliary tasks to enhance feature extraction, ultimately providing a more accurate estimation of brain age, which is crucial for understanding aging and neurodegenerative conditions.

arXiv — cs.CV • 7 hours ago

When CNNs Outperform Transformers and Mambas: Revisiting Deep Architectures for Dental Caries Segmentation

Positive • Artificial Intelligence

This study benchmarks convolutional neural networks (CNNs), vision transformers, and state-space Mamba architectures for automated dental caries segmentation on panoramic radiographs. Using the DC1000 dataset, the research finds that the CNN-based DoubleU-Net outperformed the other architectures, achieving the highest Dice coefficient, mean IoU (mIoU), and precision, which highlights the effectiveness of simpler models in this domain.
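For readers unfamiliar with the segmentation metrics named above, the following is a minimal sketch of how the Dice coefficient and mIoU are commonly computed. It is not taken from the paper: the function names `dice_and_iou` and `mean_iou` are hypothetical, masks are assumed to be flat sequences of integer pixel labels, and classes absent from both masks are skipped when averaging, which is one common convention among several.

```python
def dice_and_iou(pred, target):
    """Dice and IoU for two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - intersection
    if union == 0:
        # Both masks are empty: treated here as perfect agreement (a convention).
        return 1.0, 1.0
    dice = 2 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


def mean_iou(pred, target, num_classes):
    """Average per-class IoU over classes present in either mask."""
    scores = []
    for c in range(num_classes):
        p = [int(x == c) for x in pred]
        t = [int(x == c) for x in target]
        if not any(p) and not any(t):
            continue  # class absent from both masks: skipped (an assumption)
        scores.append(dice_and_iou(p, t)[1])
    return sum(scores) / len(scores) if scores else float("nan")


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]   # toy 4-pixel prediction
    reference = [1, 0, 0, 0]   # toy 4-pixel ground truth
    print(dice_and_iou(predicted, reference))
    print(mean_iou(predicted, reference, num_classes=2))
```

Dice weights the overlap twice relative to the two mask areas, so for the same prediction it is never lower than IoU; the two can rank models differently only when averaged across images or classes.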

arXiv — cs.CV • 7 hours ago

A Multimodal Transformer Approach for UAV Detection and Aerial Object Recognition Using Radar, Audio, and Video Data

Positive • Artificial Intelligence

This research presents a novel multimodal Transformer model for unmanned aerial vehicle (UAV) detection and aerial object recognition, integrating radar, RGB video, infrared video, and audio data. The model utilizes self-attention mechanisms to create comprehensive representations for classification, achieving high performance metrics, including 0.9812 accuracy and 0.9954 specificity on an independent test set.
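As a reminder of what the two reported figures measure, here is a minimal sketch of accuracy and specificity for a binary classifier. It is illustrative only: the function name `accuracy_and_specificity` and the 0/1 label encoding (0 = negative, 1 = positive) are assumptions, and the paper's own evaluation may be multi-class or computed differently.

```python
def accuracy_and_specificity(y_true, y_pred):
    """Accuracy and specificity for binary labels (0 = negative, 1 = positive)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    tp = tn = fp = fn = 0
    for truth, guess in zip(y_true, y_pred):
        if truth == 1 and guess == 1:
            tp += 1
        elif truth == 0 and guess == 0:
            tn += 1
        elif truth == 0 and guess == 1:
            fp += 1
        else:
            fn += 1
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Specificity is the true-negative rate; undefined without any negatives.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, specificity


if __name__ == "__main__":
    truth = [1, 1, 0, 0, 0, 0]
    guess = [1, 0, 0, 0, 0, 1]
    print(accuracy_and_specificity(truth, guess))
```

Specificity depends only on the negative examples, so it can stay high even when many positives are missed; that is why it is usually reported alongside accuracy or recall rather than on its own.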

arXiv — cs.CV • 7 hours ago

A Hybrid CNN-ViT-GNN Framework with GAN-Based Augmentation for Intelligent Weed Detection in Precision Agriculture

Positive • Artificial Intelligence

The paper presents a hybrid deep learning framework for weed detection in precision agriculture, combining Convolutional Neural Networks (CNNs), Vision Transformers (ViTs), and Graph Neural Networks (GNNs). This approach improves robustness across varying field conditions and uses a Generative Adversarial Network (GAN) for data augmentation, achieving an accuracy of 99.33% on benchmark datasets. The model's architecture supports comprehensive feature representation, which is crucial for sustainable crop management.

arXiv — cs.CV • 7 hours ago

H-CNN-ViT: A Hierarchical Gated Attention Multi-Branch Model for Bladder Cancer Recurrence Prediction

Positive • Artificial Intelligence

Bladder cancer, with a recurrence rate of up to 78%, poses significant challenges for post-operative monitoring. Traditional multi-sequence contrast-enhanced MRI scans are often difficult to interpret because of changes caused by surgery. This study introduces H-CNN-ViT, a new AI model that predicts bladder cancer recurrence from a curated multi-sequence MRI dataset, with the aim of improving diagnostic accuracy and patient management.

arXiv — cs.LG • a day ago

Blurred Encoding for Trajectory Representation Learning

Positive • Artificial Intelligence

The article presents a novel approach to trajectory representation learning (TRL) through a method called BLUrred Encoding (BLUE). This technique addresses the limitations of existing TRL methods that often lose fine-grained spatial-temporal details by grouping GPS points into larger segments. BLUE creates hierarchical patches of varying sizes, allowing for the preservation of detailed travel semantics while capturing overall travel patterns. The model employs an encoder-decoder structure with a pyramid design to enhance the representation of trajectories.

arXiv — cs.LG • a day ago

Self-Attention as Distributional Projection: A Unified Interpretation of Transformer Architecture

Neutral • Artificial Intelligence

This paper presents a mathematical interpretation of self-attention by connecting it to distributional semantics principles. It demonstrates that self-attention arises from projecting corpus-level co-occurrence statistics into sequence context. The authors show how the query-key-value mechanism serves as an asymmetric extension for modeling directional relationships, with positional encodings and multi-head attention as structured refinements. The analysis indicates that the Transformer architecture's algebraic form is derived from these projection principles.
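For context, the query-key-value mechanism mentioned above is usually written in the standard scaled dot-product form shown below. This is the textbook formulation, not necessarily the paper's own notation; the symbols X (input token matrix), W_Q, W_K, W_V (learned projection matrices), and d_k (key dimension) are the conventional ones and are assumptions here.

```latex
\[
\mathrm{Attention}(Q, K, V)
  = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V,
\qquad
Q = X W_Q,\quad K = X W_K,\quad V = X W_V
\]
\[
s_{ij} = \frac{(x_i W_Q)(x_j W_K)^{\top}}{\sqrt{d_k}}
       = \frac{x_i \, W_Q W_K^{\top} \, x_j^{\top}}{\sqrt{d_k}}
\]
```

Because W_Q and W_K are separate matrices, the score s_ij generally differs from s_ji (they coincide for all inputs only when W_Q W_K^T is symmetric), which is one way to read the summary's description of the mechanism as asymmetric and directional.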

arXiv — cs.LG • a day ago

DeepDefense: Layer-Wise Gradient-Feature Alignment for Building Robust Neural Networks

Positive • Artificial Intelligence

Deep neural networks are susceptible to adversarial perturbations that can lead to incorrect predictions. The paper introduces DeepDefense, a defense framework utilizing Gradient-Feature Alignment (GFA) regularization across multiple layers to mitigate this vulnerability. By aligning input gradients with internal feature representations, DeepDefense creates a smoother loss landscape, reducing sensitivity to adversarial noise. The method shows significant robustness improvements against various attacks, particularly on the CIFAR-10 dataset.
