DReX: Pure Vision Fusion of Self-Supervised and Convolutional Representations for Image Complexity Prediction

arXiv cs.CV • Monday, November 24, 2025 at 5:00:00 AM

Positive • Artificial Intelligence

DReX is a new vision-only model that predicts image complexity by fusing self-supervised and convolutional representations through a learnable attention mechanism. It combines multi-scale hierarchical features from ResNet-50 with semantically rich representations from DINOv3 ViT-S/16 and achieves state-of-the-art performance on the IC9600 benchmark.
DReX matters because visual complexity prediction is a fundamental problem with applications across computer vision, including image compression, retrieval, and classification. It also contributes to the understanding of how humans perceive image complexity.
The work reflects a broader trend in artificial intelligence in which models pair established architectures with attention mechanisms to improve performance. The use of ResNet-50 in DReX aligns with ongoing computer vision research on geometric and numerical concepts and on attention mechanisms in medical imaging and other domains.
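The attention-based fusion described above can be illustrated with a minimal sketch. This is not the DReX implementation: it assumes each backbone output (for example, several ResNet-50 stages and one DINOv3 embedding) has already been projected to a common dimension, and it uses a single learnable query vector, a softmax over sources, and a sigmoid regression head. The names `attention_fuse`, `predict_complexity`, `query`, `head_w`, and `head_b` are hypothetical, and the feature values are stand-ins rather than real model outputs.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(features, query):
    """Fuse feature vectors with attention weights from a learnable query.

    features: list of vectors, one per source (assumed already projected
              to the same dimension as `query`).
    query:    learnable vector (hypothetical; trained in a real model).
    Returns the fused vector and the per-source attention weights.
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(f, query) / scale for f in features])
    fused = [
        sum(w * f[i] for w, f in zip(weights, features))
        for i in range(len(query))
    ]
    return fused, weights


def predict_complexity(fused, head_w, head_b):
    """Map the fused vector to a score in (0, 1) with a sigmoid head."""
    return 1.0 / (1.0 + math.exp(-(dot(fused, head_w) + head_b)))


# Stand-in features: three "ResNet stage" vectors and one "ViT" vector.
features = [
    [0.2, 0.1, 0.0, 0.5],
    [0.4, 0.3, 0.1, 0.2],
    [0.1, 0.9, 0.3, 0.0],
    [0.7, 0.2, 0.6, 0.4],
]
query = [0.5, -0.2, 0.1, 0.3]    # hypothetical learned parameters
head_w = [0.8, -0.4, 0.6, 0.1]   # hypothetical learned parameters
head_b = -0.2

fused, weights = attention_fuse(features, query)
print("attention weights:", weights)
print("predicted complexity:", predict_complexity(fused, head_w, head_b))
```

In a trained model the query and head parameters would be learned jointly with the projection layers. The softmax lets the model decide, per image, how much each representation contributes to the final score.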

— via World Pulse Now AI Editorial System

Continue Reading

arXiv cs.CV • 19 hours ago

Balanced Few-Shot Episodic Learning for Accurate Retinal Disease Diagnosis

Positive • Artificial Intelligence

A new study introduces a balanced few-shot episodic learning framework aimed at improving the accuracy of automated retinal disease diagnosis, particularly for conditions like diabetic retinopathy and macular degeneration. This method utilizes the Retinal Fundus Multi-Disease Image Dataset (RFMiD) and addresses the challenge of imbalanced datasets in conventional deep learning approaches.

arXiv cs.CV • 19 hours ago

XAI-Driven Skin Disease Classification: Leveraging GANs to Augment ResNet-50 Performance

Positive • Artificial Intelligence

A new study has introduced a Computer-Aided Diagnosis (CAD) system that uses Deep Convolutional Generative Adversarial Networks (DCGANs) to augment the training data for a fine-tuned ResNet-50 classifier, achieving 92.50% accuracy in classifying seven skin disease categories. Two Explainable AI techniques, LIME and SHAP, make the predictions more transparent by grounding them in clinically relevant features.

arXiv cs.CV • 3 days ago

Tackling Tuberculosis: A Comparative Dive into Machine Learning for Tuberculosis Detection

Positive • Artificial Intelligence

A recent study investigated two machine learning models, ResNet-50 and SqueezeNet, for diagnosing tuberculosis (TB) from chest X-ray images. The research used a dataset of 4,200 X-rays from Kaggle and highlighted the limitations of traditional diagnostic methods in resource-limited settings. SqueezeNet achieved a reported loss of 32%, with accuracy metrics that underscore the potential of deep learning in TB detection.

arXiv cs.CV • 3 days ago

Comparing Baseline and Day-1 Diffusion MRI Using Multimodal Deep Embeddings for Stroke Outcome Prediction

Positive • Artificial Intelligence

A study compared baseline and 24-hour diffusion MRI for predicting three-month outcomes after acute ischemic stroke (AIS) in 74 patients. The research combined three-dimensional ResNet-50 embeddings with clinical data, achieving an AUC of 0.923 with the 24-hour MRI versus 0.86 with the baseline MRI. Incorporating lesion-volume features improved model stability and interpretability.