Diffusion Classifiers Understand Compositionality, but Conditions Apply

arXiv — cs.CV • Tuesday, November 4, 2025 at 5:00:00 AM

Understanding visual scenes is a key aspect of human intelligence. Traditional discriminative models have made strides in computer vision, but they often fall short in grasping compositionality, that is, how objects, attributes, and relations combine within a scene. Generative text-to-image diffusion models, by contrast, have shown remarkable capabilities in synthesizing complex scenes, which suggests they may hold a deeper compositional understanding. This opens new avenues for zero-shot diffusion classifiers, which reuse a generative diffusion model for classification without task-specific training, although, as the title notes, conditions apply.
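
For readers unfamiliar with the idea, here is a minimal sketch of how a zero-shot diffusion classifier can score candidate labels: noise the input, ask a conditional denoiser to predict that noise under each label, and pick the label with the lowest prediction error. Everything here is an illustrative assumption rather than the paper's setup: `toy_eps_model` is a hypothetical stand-in for a real text-conditioned denoiser, the two-dimensional "images" and the prototype vectors are invented, and the noise levels are arbitrary.

```python
import math
import random

def add_noise(x0, eps, alpha_bar):
    """Forward diffusion: x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [a * x + b * e for x, e in zip(x0, eps)]

def toy_eps_model(x_t, alpha_bar, prototype):
    """Hypothetical conditional denoiser (assumption, not a real model).

    It pretends every sample of a class equals that class's prototype, so the
    noise it infers is (x_t - sqrt(alpha_bar) * prototype) / sqrt(1 - alpha_bar).
    """
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(x - a * p) / b for x, p in zip(x_t, prototype)]

def diffusion_classify(x0, prototypes, alpha_bars, seed=0):
    """Return the label whose conditional noise prediction has the lowest error."""
    rng = random.Random(seed)
    errors = {label: 0.0 for label in prototypes}
    for alpha_bar in alpha_bars:
        # Draw one noise sample per noise level and share it across all labels.
        eps = [rng.gauss(0.0, 1.0) for _ in x0]
        x_t = add_noise(x0, eps, alpha_bar)
        for label, proto in prototypes.items():
            eps_hat = toy_eps_model(x_t, alpha_bar, proto)
            errors[label] += sum((e - h) ** 2 for e, h in zip(eps, eps_hat))
    return min(errors, key=errors.get), errors

# Invented 2-D "images": each caption maps to a prototype vector.
prototypes = {"red cube": [1.0, 0.0], "blue sphere": [0.0, 1.0]}
label, errors = diffusion_classify([0.9, 0.2], prototypes, alpha_bars=[0.9, 0.5, 0.1])
print(label)  # red cube
```

Sharing the same noise sample across labels keeps the comparison fair; in this toy setting the error for each label reduces to a scaled squared distance between the input and that label's prototype, so the nearest prototype wins.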

— via World Pulse Now AI Editorial System

Continue Reading

arXiv — cs.CV • 2 days ago

Automatic Uncertainty-Aware Synthetic Data Bootstrapping for Historical Map Segmentation

Positive • Artificial Intelligence

The automated analysis of historical maps has significantly improved due to advancements in deep learning, particularly in computer vision. However, the scarcity of annotated training data for specific historical map corpora poses a challenge. To address this, a method for generating synthetic historical maps by transferring the cartographic style of original maps onto vector data has been proposed, enabling the creation of an unlimited number of training samples for machine learning tasks.

arXiv — cs.LG • 2 days ago

Unified all-atom molecule generation with neural fields

Positive • Artificial Intelligence

FuncBind is a new framework designed for structure-based drug design that utilizes neural fields to generate target-conditioned, all-atom molecules. This approach allows for a unified model capable of handling diverse atomic systems, including small and large molecules, and non-canonical amino acids. FuncBind demonstrates competitive performance in generating various molecular structures, including small molecules and macrocyclic peptides, conditioned on target structures.

arXiv — cs.CL • 2 days ago

TS-PEFT: Token-Selective Parameter-Efficient Fine-Tuning with Learnable Threshold Gating

Positive • Artificial Intelligence

The paper introduces Token-Selective Parameter-Efficient Fine-Tuning (TS-PEFT), a novel approach in natural language processing and computer vision that selectively applies modifications to a subset of position indices. This method challenges the traditional Parameter-Efficient Fine-Tuning (PEFT) approach, which indiscriminately modifies all indices. Experimental results indicate that the targeted application of TS-PEFT can enhance performance on downstream tasks, suggesting a shift towards more efficient fine-tuning strategies.

arXiv — stat.ML • 2 days ago

Enhancing Visual Feature Attribution via Weighted Integrated Gradients

Positive • Artificial Intelligence

The paper introduces Weighted Integrated Gradients (WG), an advanced method for feature attribution in explainable AI, particularly in computer vision. WG addresses the limitations of Integrated Gradients (IG) by adaptively selecting and weighting baseline images, improving attribution reliability. This method preserves the core properties of IG while enhancing the quality of explanations, making it a significant contribution to the field.
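
As a rough illustration of the underlying idea, the sketch below computes standard Integrated Gradients against several baselines and combines the results with a weighted average. The summary does not say how WG selects or weights its baselines, so the weights here are hypothetical user-supplied numbers, and `model` is an invented toy function standing in for a network output; gradients are approximated by finite differences to keep the example dependency-free.

```python
def model(x):
    """Toy scoring function standing in for a network output (assumption)."""
    return x[0] * x[1] + x[2] ** 2

def numerical_grad(f, x, h=1e-5):
    """Central-difference approximation of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad

def integrated_gradients(f, x, baseline, steps=50):
    """Midpoint Riemann approximation of IG along the straight path baseline -> x."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        total = [t + g for t, g in zip(total, numerical_grad(f, point))]
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]

def weighted_ig(f, x, baselines, weights, steps=50):
    """Weighted average of IG attributions over baselines (weights are hypothetical)."""
    norm = sum(weights)
    out = [0.0] * len(x)
    for baseline, w in zip(baselines, weights):
        attr = integrated_gradients(f, x, baseline, steps)
        out = [o + (w / norm) * a for o, a in zip(out, attr)]
    return out

x = [1.0, 2.0, 3.0]
baselines = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
weights = [0.75, 0.25]
attributions = weighted_ig(model, x, baselines, weights)
print(attributions)
```

Because each per-baseline attribution sums to the difference between the model output at the input and at that baseline, the weighted average sums to the output at the input minus the weighted mean of the baseline outputs, which gives a quick sanity check on any implementation.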
