arXiv:2511.11286v1 Announce Type: new 
Abstract: Out-of-domain (OOD) robustness is challenging to achieve in real-world computer vision applications, where shifts in image background, style, and acquisition instruments often degrade model performance. Generic augmentations show inconsistent gains under such shifts, whereas dataset-specific augmentations require expert knowledge and prior analysis. Moreover, prior studies show that neural networks adapt poorly to domain shifts because they exhibit a learning bias toward domain-specific frequency components. Perturbing frequency values can mitigate such bias but overlooks pixel-level details, leading to suboptimal performance. To address these problems, we propose D-GAP (Dataset-agnostic and Gradient-guided augmentation in Amplitude and Pixel spaces), improving OOD robustness by introducing targeted augmentation in both the amplitude space (frequency space) and pixel space. Unlike conventional handcrafted augmentations, D-GAP computes sensitivity maps in the frequency space from task gradients, which reflect how strongly the model responds to different frequency components, and uses the maps to adaptively interpolate amplitudes between source and target samples. This way, D-GAP reduces the learning bias in frequency space, while a complementary pixel-space blending procedure restores fine spatial details. Extensive experiments on four real-world datasets and three domain-adaptation benchmarks show that D-GAP consistently outperforms both generic and dataset-specific augmentations, improving average OOD performance by +5.3% on real-world datasets and +1.8% on benchmark datasets.
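
The amplitude interpolation and pixel blending described in the abstract can be illustrated with a toy 1D sketch. This is not the authors' implementation. The function names, the use of 1D signals instead of images, the rule that higher sensitivity pulls more amplitude from the target sample, and the fixed blending weight `alpha` are all assumptions made for illustration. The sensitivity map is passed in directly rather than computed from task gradients.

```python
import cmath
import math

def dft(signal):
    """Naive discrete Fourier transform of a real 1D signal."""
    n = len(signal)
    return [sum(signal[k] * cmath.exp(-2j * math.pi * f * k / n) for k in range(n))
            for f in range(n)]

def idft(spectrum):
    """Naive inverse DFT; only the real part of each sample is returned."""
    n = len(spectrum)
    return [sum(spectrum[f] * cmath.exp(2j * math.pi * f * k / n) for f in range(n)).real / n
            for k in range(n)]

def mix_amplitudes(source, target, sensitivity):
    """Interpolate amplitude spectra per frequency and keep the source phase.

    `sensitivity` holds one non-negative weight per frequency bin (hypothetical
    input). It should be symmetric (w[f] == w[n - f]) so the result stays real.
    """
    src_spec, tgt_spec = dft(source), dft(target)
    peak = max(sensitivity) or 1.0  # avoid dividing by zero for an all-zero map
    mixed = []
    for s, t, w in zip(src_spec, tgt_spec, sensitivity):
        lam = w / peak  # assumed mapping: weight in [0, 1]
        amplitude = (1.0 - lam) * abs(s) + lam * abs(t)
        mixed.append(cmath.rect(amplitude, cmath.phase(s)))
    return idft(mixed)

def pixel_blend(source, augmented, alpha):
    """Blend the augmented signal back with the source in sample (pixel) space."""
    return [alpha * a + (1.0 - alpha) * s for s, a in zip(source, augmented)]
```

With an all-zero sensitivity map the sketch returns the source unchanged, and with `alpha = 0` the pixel blend does the same, so both knobs degrade gracefully to "no augmentation".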

The article presents D-GAP (Dataset-agnostic and Gradient-guided augmentation in Amplitude and Pixel spaces), a novel approach aimed at enhancing out-of-domain (OOD) robustness in computer vision applications. Traditional augmentations often fail under varying image conditions, while D-GAP introduces targeted augmentations in both amplitude and pixel spaces. This method addresses the learning bias of neural networks towards domain-specific frequency components, leading to improved performance across diverse datasets.

D-GAP: Improving Out-of-Domain Robustness via Dataset-Agnostic and Gradient-Guided Augmentation in Amplitude and Pixel Spaces

arXiv:2601.07982v1 Announce Type: new 
Abstract: We develop a new statistical ideal observer model that performs holistic visual search (or gist) processing in part by placing thresholds on minimum extractable image features. In this model, the ideal observer reduces the number of free parameters, thereby simplifying the system. This novel framework has applications in medical image perception (for optimizing imaging systems and algorithms), computer vision, performance benchmarking, and feature selection and evaluation. Other applications include target detection and recognition in defense and security, as well as the evaluation of sensors and detectors.
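
To make the idea of a likelihood ratio on thresholded features concrete, here is a minimal sketch under assumptions that do not come from the abstract: features are independent Gaussians with a known standard deviation `sigma`, the signal-absent mean is zero, the signal-present mean is `signal`, and a feature whose magnitude falls below the threshold `t` is censored (only the fact that it fell below `t` is observed). The function names are hypothetical, and the paper's actual model may differ.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_below_threshold(mean, sigma, t):
    """P(|x| < t) for x drawn from a Gaussian with the given mean and sigma."""
    return normal_cdf((t - mean) / sigma) - normal_cdf((-t - mean) / sigma)

def log_likelihood_ratio(x, signal, sigma, t):
    """Log-likelihood ratio (signal present vs. absent) for thresholded features."""
    llr = 0.0
    for xi, si in zip(x, signal):
        if abs(xi) >= t:
            # Feature is observed: ratio of Gaussian densities with means si and 0.
            llr += (si * xi - 0.5 * si * si) / (sigma * sigma)
        else:
            # Feature is censored: ratio of the probabilities of falling below t.
            llr += math.log(prob_below_threshold(si, sigma, t)
                            / prob_below_threshold(0.0, sigma, t))
    return llr
```

With `t = 0` nothing is censored and the statistic reduces to the standard Gaussian log-likelihood ratio. The decision rule would compare the returned value against a criterion chosen for the task.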

A new statistical ideal observer model has been developed to enhance holistic visual search processing by establishing thresholds on minimum extractable image features. This model aims to streamline the system by reducing free parameters, with applications in medical image perception, computer vision, and defense/security.

Likelihood ratio for a binary Bayesian classifier under a noise-exclusion model

arXiv:2601.07976v1 Announce Type: cross 
Abstract: This study advances task-based image quality assessment by developing an anthropomorphic thresholded visual-search model observer. The model is an ideal observer for thresholded data inspired by the human visual system, allowing selective processing of high-salience features to improve discrimination performance. By filtering out irrelevant variability, the model enhances diagnostic accuracy and computational efficiency.
  The observer employs a two-stage framework: candidate selection and decision-making. Using thresholded data during candidate selection refines regions of interest, while stage-specific feature processing optimizes performance. Simulations were conducted to evaluate the effects of thresholding on feature maps, candidate localization, and multi-feature scenarios. Results demonstrate that thresholding improves observer performance by excluding low-salience features, particularly in noisy environments. Intermediate thresholds often outperform no thresholding, indicating that retaining only relevant features is more effective than keeping all features.
  Additionally, the model can be trained effectively with fewer images while maintaining alignment with human performance. These findings suggest that the proposed framework can predict human visual search performance in clinically realistic tasks and provide solutions for model observer training with limited resources. The approach also has applications in other areas where human visual search and detection tasks are modeled, such as computer vision, machine learning, and defense and security image analysis.
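
The two-stage structure described above (candidate selection on thresholded data, then decision-making) can be sketched as follows. This is an illustrative outline, not the study's observer: the function names, the use of plain nested lists as feature maps, the max-score decision rule, and the `criterion` parameter are assumptions.

```python
def select_candidates(search_map, threshold):
    """Stage 1: keep only locations whose salience meets the threshold."""
    return [(r, c)
            for r, row in enumerate(search_map)
            for c, value in enumerate(row)
            if value >= threshold]

def decide(decision_map, candidates, criterion):
    """Stage 2: score the surviving candidates and make a present/absent call.

    `decision_map` may hold a different feature than the one used for search,
    which loosely mirrors stage-specific feature processing.
    """
    if not candidates:
        return None, False
    best = max(candidates, key=lambda rc: decision_map[rc[0]][rc[1]])
    score = decision_map[best[0]][best[1]]
    return best, score >= criterion

# Example: a 2x2 map where one location is clearly salient.
feature_map = [[0.1, 0.2],
               [0.9, 0.3]]
candidates = select_candidates(feature_map, threshold=0.25)
location, present = decide(feature_map, candidates, criterion=0.5)
```

Raising the threshold shrinks the candidate list before any decision is made, which is the mechanism by which low-salience features are excluded in this sketch.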

A recent study has introduced an anthropomorphic thresholded visual-search model observer, enhancing task-based image quality assessment by mimicking the human visual system. This model selectively processes high-salience features, improving discrimination performance and diagnostic accuracy while filtering out irrelevant variability.

Application of Ideal Observer for Thresholded Data in Search Task
