arXiv:2412.02542v3 Announce Type: replace 
Abstract: Diffusion models have shown remarkable abilities in generating realistic and high-quality images from text prompts. However, a trained model remains largely black-box; little do we know about the roles of its components in exhibiting a concept such as objects or styles. Recent works employ causal tracing to localize knowledge-storing layers in generative models without showing how other layers contribute to the target concept. In this work, we approach diffusion models' interpretability problem from a more general perspective and pose a question: \textit{``How do model components work jointly to demonstrate knowledge?''}. To answer this question, we decompose diffusion models using component attribution, systematically unveiling the importance of each component (specifically the model parameter) in generating a concept. The proposed framework, called \textbf{C}omponent \textbf{A}ttribution for \textbf{D}iffusion Model (CAD), discovers the localization of concept-inducing (positive) components, while interestingly uncovers another type of components that contribute negatively to generating a concept, which is missing in the previous knowledge localization work. Based on this holistic understanding of diffusion models, we introduce two fast, inference-time model editing algorithms, CAD-Erase and CAD-Amplify; in particular, CAD-Erase enables erasure and CAD-Amplify allows amplification of a generated concept by ablating the positive and negative components, respectively, while retaining knowledge of other concepts. Extensive experimental results validate the significance of both positive and negative components pinpointed by our framework, demonstrating the potential of providing a complete view of interpreting generative models. Our code is available \href{https://github.com/mail-research/CAD-attribution4diffusion}{here}.

أظهرت دراسة حديثة كيفية عمل نماذج الانتشار، المعروفة بقدرتها المذهلة على توليد صور عالية الجودة من النصوص. على الرغم من فعالية هذه النماذج، إلا أنها غالبًا ما تعمل كصناديق سوداء، مما يترك العديد من الأسئلة حول كيفية مساهمة المكونات المختلفة في توليد مفاهيم معينة مثل الكائنات أو الأنماط. تقدم هذه البحث طريقة لتتبع الأسباب تساعد في تحديد الطبقات التي تخزن المعرفة في هذه النماذج التوليدية، مما يعزز فهمنا لوظيفتها وقد يؤدي إلى تحسين تصميمها.

Un estudio reciente ha arrojado luz sobre el funcionamiento interno de los modelos de difusión, conocidos por su impresionante capacidad para generar imágenes de alta calidad a partir de descripciones textuales. Aunque estos modelos han sido efectivos, a menudo operan como cajas negras, dejando muchas preguntas sobre cómo diferentes componentes contribuyen a la generación de conceptos específicos como objetos o estilos. Esta investigación presenta un método de trazado causal que ayuda a identificar qué capas en estos modelos generativos almacenan conocimiento, mejorando nuestra comprensión de su funcionalidad y potencialmente su diseño.

Une étude récente a mis en lumière le fonctionnement interne des modèles de diffusion, connus pour leur capacité impressionnante à générer des images de haute qualité à partir de descriptions textuelles. Bien que ces modèles soient efficaces, ils fonctionnent souvent comme des boîtes noires, laissant de nombreuses questions sur la manière dont les différentes composantes contribuent à la génération de concepts spécifiques tels que des objets ou des styles. Cette recherche introduit une méthode de traçage causal qui aide à identifier quelles couches dans ces modèles génératifs stockent des connaissances, améliorant ainsi notre compréhension de leur fonctionnement et potentiellement leur conception.

A recent study has shed light on the inner workings of diffusion models, which are known for their impressive ability to generate high-quality images from text prompts. While these models have been effective, they often operate as black boxes, leaving many questions about how different components contribute to the generation of specific concepts like objects or styles. This research introduces a method for causal tracing that helps identify which layers in these generative models store knowledge, enhancing our understanding of their functionality and potentially improving their design.

Unveiling Concept Attribution in Diffusion Models

arXiv:2505.18227v3 Announce Type: replace 
Abstract: In Transformer architectures, tokens\textemdash discrete units derived from raw data\textemdash are formed by segmenting inputs into fixed-length chunks. Each token is then mapped to an embedding, enabling parallel attention computations while preserving the input's essential information. Due to the quadratic computational complexity of transformer self-attention mechanisms, token reduction has primarily been used as an efficiency strategy. This is especially true in single vision and language domains, where it helps balance computational costs, memory usage, and inference latency. Despite these advances, this paper argues that token reduction should transcend its traditional efficiency-oriented role in the era of large generative models. Instead, we position it as a fundamental principle in generative modeling, critically influencing both model architecture and broader applications. Specifically, we contend that across vision, language, and multimodal systems, token reduction can: (i) facilitate deeper multimodal integration and alignment, (ii) mitigate "overthinking" and hallucinations, (iii) maintain coherence over long inputs, and (iv) enhance training stability, etc. We reframe token reduction as more than an efficiency measure. By doing so, we outline promising future directions, including algorithm design, reinforcement learning-guided token reduction, token optimization for in-context learning, agentic framework design, and broader ML and scientific domains.

يبرز مقال حديث أن تقليل الرموز في هياكل المحولات يجب أن يتجاوز مجرد الكفاءة، داعيًا إلى دوره كمبدأ أساسي في النمذجة التوليدية عبر مجالات متعددة، بما في ذلك الرؤية واللغة.

Un artículo reciente enfatiza que la reducción de tokens en arquitecturas Transformer debería ir más allá de la mera eficiencia, abogando por su papel como un principio fundamental en la modelización generativa en varios dominios, incluyendo visión y lenguaje.

Un article récent souligne que la réduction des tokens dans les architectures Transformer devrait aller au-delà de la simple efficacité, plaidant pour son rôle en tant que principe fondamental dans la modélisation générative à travers divers domaines, y compris la vision et le langage.

A recent paper emphasizes that token reduction in Transformer architectures should extend beyond mere efficiency, advocating for its role as a fundamental principle in generative modeling across various domains, including vision and language.

Token Reduction Should Go Beyond Efficiency in Generative Models -- From Vision, Language to Multimodality

One More Thing in AI – Your Shortcut to AI Mastery

Unveiling Concept Attribution in Diffusion Models

Was this article worth reading? Share it

One More Thing in AI

LucidQuery AI

4o Image Gen

Blunge

The Visualizer

Uwear

Ready to build your own newsroom?