AVM: Towards Structure-Preserving Neural Response Modeling in the Visual Cortex Across Stimuli and Individuals

arXiv — cs.CV · Monday, December 22, 2025, 5:00 AM
  • The Adaptive Visual Model (AVM) has been introduced as a structure-preserving framework for modeling neural responses in the visual cortex, addressing limitations in deep learning models that struggle to separate stable visual encoding from condition-specific adaptations. AVM utilizes a frozen Vision Transformer-based encoder and modular subnetworks to adapt to variations in stimuli and individual identities.
  • The approach matters because it improves generalization of neural response modeling across different stimuli and subjects, with potential applications in both neuroscience and artificial intelligence. By keeping the core representation fixed while allowing condition-aware adaptations, AVM could yield more accurate predictions of neural responses.
  • The introduction of AVM aligns with ongoing advancements in Vision Transformer architectures, which are increasingly being utilized across various domains, including robotics and medical imaging. The ability to effectively model neural responses may contribute to broader discussions on the integration of AI with cognitive neuroscience, as well as the exploration of explainable AI methods that mimic human-like processing.
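The separation described above can be illustrated with a minimal sketch: a shared encoder whose weights are never updated, plus small per-subject readout modules that are the only condition-specific parts. All dimensions, names, and the toy linear/tanh encoder here are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen shared encoder: a stand-in for a pretrained ViT backbone.
# Its weights are fixed, so every subject sees the same representation.
W_enc = rng.standard_normal((128, 64))  # 128-dim stimulus -> 64-dim features

def encode(stimulus):
    """Shared, structure-preserving representation (never updated)."""
    return np.tanh(stimulus @ W_enc)

class SubjectAdapter:
    """Condition-specific readout: the only trainable part in this sketch."""
    def __init__(self, feat_dim=64, n_neurons=32):
        self.W = rng.standard_normal((feat_dim, n_neurons)) * 0.1

    def predict(self, features):
        return features @ self.W  # predicted neural responses

# One adapter per subject; the encoder is reused unchanged.
adapters = {"subject_A": SubjectAdapter(), "subject_B": SubjectAdapter()}

stimulus = rng.standard_normal(128)
shared = encode(stimulus)                       # identical for every subject
resp_A = adapters["subject_A"].predict(shared)  # subject-specific readout
resp_B = adapters["subject_B"].predict(shared)
```

Because only the adapters differ, the two subjects receive the same stimulus representation but produce different predicted responses, mirroring the stable-encoding-plus-adaptation split the summary describes.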
— via World Pulse Now AI Editorial System

Continue Reading
Knowledge-based learning in Text-RAG and Image-RAG
Neutral · Artificial Intelligence
A recent study evaluated a multi-modal approach combining the Vision Transformer (EVA-ViT) image encoder with LLaMA and ChatGPT large language models (LLMs) to address hallucination and improve disease detection in chest X-ray images. Using the NIH Chest X-ray dataset, the researchers compared image-based and text-based retrieval-augmented generation (RAG), finding that text-based RAG effectively mitigates hallucinations while image-based RAG improves prediction confidence.
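Text-based RAG, as compared in the study above, grounds a language model's answer in retrieved reference text. A minimal sketch follows; the corpus, the word-overlap scoring, and the prompt format are illustrative assumptions, not the paper's actual pipeline.

```python
import re

def tokens(text):
    """Lowercase word tokens, punctuation stripped."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, corpus, k=2):
    """Rank reference texts by naive word overlap with the query."""
    q = tokens(query)
    scored = sorted(corpus, key=lambda doc: -len(q & tokens(doc)))
    return scored[:k]

def build_prompt(query, corpus):
    """Prepend retrieved references so the LLM answers from evidence,
    which is how text-based RAG curbs hallucination."""
    context = "\n".join(retrieve(query, corpus))
    return f"Reference findings:\n{context}\n\nQuestion: {query}"

# Toy radiology-report corpus (illustrative).
corpus = [
    "Cardiomegaly presents as an enlarged cardiac silhouette on chest X-ray.",
    "Pleural effusion shows blunting of the costophrenic angle.",
    "Pneumothorax appears as absent lung markings at the periphery.",
]
prompt = build_prompt("Does this chest X-ray show pleural effusion?", corpus)
```

A real system would swap the overlap scorer for dense embeddings and send the prompt to an LLM; the grounding step itself is the same shape.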
Temporal-Enhanced Interpretable Multi-Modal Prognosis and Risk Stratification Framework for Diabetic Retinopathy (TIMM-ProRS)
Positive · Artificial Intelligence
A novel deep learning framework named TIMM-ProRS has been introduced to enhance the prognosis and risk stratification of diabetic retinopathy (DR), a condition that threatens the vision of millions worldwide. This framework integrates Vision Transformer, Convolutional Neural Network, and Graph Neural Network technologies, utilizing both retinal images and temporal biomarkers to achieve a high accuracy rate of 97.8% across multiple datasets.
