LungX: A Hybrid EfficientNet-Vision Transformer Architecture with Multi-Scale Attention for Accurate Pneumonia Detection

arXiv — cs.CV · Tuesday, November 25, 2025, 5:00 AM
  • LungX, a new hybrid architecture combining EfficientNet and Vision Transformer, has been introduced for pneumonia detection, achieving 86.5% accuracy and a 0.943 AUC on a dataset of 20,000 chest X-rays. This matters because timely diagnosis of pneumonia is vital for reducing the mortality associated with the disease.
  • LungX represents a significant advance in AI diagnostic tools, with clinical deployment targeted at 88% accuracy. It could transform pneumonia detection in healthcare settings, offering more reliable and interpretable results through its multi-scale attention mechanisms.
  • The integration of multi-scale features and attention mechanisms in LungX aligns with ongoing trends in AI healthcare applications, where models are increasingly designed to provide explainable results. This reflects a broader movement towards improving diagnostic accuracy and interpretability in medical imaging, as seen in other studies utilizing Vision Transformers and deep learning frameworks for various conditions.
— via World Pulse Now AI Editorial System
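The article reports two evaluation metrics, 86.5% accuracy and 0.943 AUC. LungX's actual evaluation code is not public; the sketch below only illustrates, on invented predictions, how these two metrics are computed in general: accuracy by thresholding scores, and AUC via the rank-sum (Mann-Whitney) formulation.

```python
# Illustrative only: all labels and scores below are made up for demonstration;
# this is not LungX's evaluation pipeline.

def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases whose thresholded score matches the true label."""
    correct = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return correct / len(labels)

def auc(labels, scores):
    """Area under the ROC curve: the probability that a randomly chosen
    positive case scores higher than a randomly chosen negative case,
    counting ties as half a win (Mann-Whitney U formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical scores for six chest X-rays (label 1 = pneumonia).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.92, 0.81, 0.40, 0.35, 0.60, 0.10]

print(f"accuracy = {accuracy(labels, scores):.3f}")  # 4/6 correct -> 0.667
print(f"AUC      = {auc(labels, scores):.3f}")       # 8/9 pairs   -> 0.889
```

Note that the two metrics can diverge: AUC is threshold-free and measures ranking quality, which is why papers typically report both.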


Continue Reading
An Under-Explored Application for Explainable Multimodal Misogyny Detection in code-mixed Hindi-English
Positive · Artificial Intelligence
A new study has introduced a multimodal and explainable web application designed to detect misogyny in code-mixed Hindi and English, utilizing advanced artificial intelligence models like XLM-RoBERTa. This application aims to enhance the interpretability of hate speech detection, which is crucial in the context of increasing online misogyny.
Knowledge-based learning in Text-RAG and Image-RAG
Neutral · Artificial Intelligence
A recent study analyzed a multi-modal approach combining the Vision Transformer-based EVA-ViT image encoder with the LLaMA and ChatGPT large language models (LLMs) to address hallucination and improve disease detection in chest X-ray images. Using the NIH Chest X-ray dataset, the research compared image-based and text-based retrieval-augmented generation (RAG), finding that text-based RAG effectively mitigates hallucinations while image-based RAG improves prediction confidence.
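The core of any RAG variant is the retrieval step: rank a corpus against a query and prepend the top hits to the LLM prompt so the model answers from grounded evidence. The sketch below is hypothetical and greatly simplified: the report snippets and query are invented, and a real system would use a learned text encoder over NIH Chest X-ray report text rather than the bag-of-words cosine similarity shown here.

```python
# Minimal sketch of text-based retrieval for RAG. Bag-of-words cosine
# similarity stands in for a learned embedding model; snippets are invented.
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, corpus: list, k: int = 1) -> list:
    """Return the k passages most similar to the query; in a RAG pipeline
    these would be prepended to the LLM prompt to curb hallucination."""
    qv = Counter(query.lower().split())
    ranked = sorted(corpus,
                    key=lambda d: cosine(qv, Counter(d.lower().split())),
                    reverse=True)
    return ranked[:k]

# Invented radiology-report snippets standing in for the retrieval corpus.
corpus = [
    "bilateral infiltrates consistent with pneumonia",
    "no acute cardiopulmonary abnormality",
    "small left pleural effusion noted",
]
print(retrieve("findings suggestive of pneumonia infiltrates", corpus))
# -> ['bilateral infiltrates consistent with pneumonia']
```

Swapping the similarity function from text-over-text to image-over-image embeddings is essentially the difference between the text-based and image-based RAG variants the study compares.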
Temporal-Enhanced Interpretable Multi-Modal Prognosis and Risk Stratification Framework for Diabetic Retinopathy (TIMM-ProRS)
Positive · Artificial Intelligence
A novel deep learning framework named TIMM-ProRS has been introduced to enhance the prognosis and risk stratification of diabetic retinopathy (DR), a condition that threatens the vision of millions worldwide. This framework integrates Vision Transformer, Convolutional Neural Network, and Graph Neural Network technologies, utilizing both retinal images and temporal biomarkers to achieve a high accuracy rate of 97.8% across multiple datasets.
