Now You See It, Now You Don't: Instant Concept Erasure for Safe Text-to-Image and Video Generation

arXiv — cs.CV · Tuesday, November 25, 2025 at 5:00:00 AM
  • Researchers have introduced Instant Concept Erasure (ICE), an approach for robust concept removal in text-to-image (T2I) and text-to-video (T2V) models. The method avoids costly retraining, adds minimal inference overhead, and is designed to withstand adversarial prompt attacks. ICE applies a training-free, one-shot weight modification that aims at precise, persistent unlearning without collateral damage to surrounding content; a minimal sketch of this style of weight edit follows the summary.
  • ICE matters for the safety and reliability of T2I and T2V models, which are increasingly deployed across applications. Because it requires no retraining, an unwanted concept can be removed from an already-deployed model quickly and cheaply, making safety updates practical and supporting user trust.
  • This advancement reflects a broader trend in AI research focused on improving the safety and effectiveness of generative models. As demand for high-quality text-to-image and text-to-video generation grows, addressing adversarial vulnerabilities and ensuring coherent outputs become critical. Innovations such as ICE, alongside other frameworks for optimizing video captions and enhancing semantic understanding, highlight the ongoing effort to refine generative AI technologies.
— via World Pulse Now AI Editorial System
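The summary does not give ICE's exact update rule, but training-free, one-shot concept erasure is often framed as a closed-form edit of a text-conditioned projection matrix, remapping the target concept's embedding to a harmless anchor. Below is a minimal PyTorch sketch of that general idea; the objective, the `lam` regularizer, and all names are illustrative assumptions, not the paper's method.

```python
import torch

def erase_concept(W: torch.Tensor,
                  c_target: torch.Tensor,
                  c_anchor: torch.Tensor,
                  lam: float = 0.1) -> torch.Tensor:
    """One-shot, training-free edit of a projection matrix W (d_out x d_in).

    Illustrative assumption: find W' minimizing
        ||W' c_t - W c_a||^2 + lam * ||W' - W||_F^2,
    i.e. the edited layer sends the target concept's embedding c_t to the
    output the original layer produced for the anchor c_a, while the
    Frobenius penalty keeps unrelated directions (surrounding content)
    close to the original weights. Closed form:
        W' = (W c_a c_t^T + lam W) (c_t c_t^T + lam I)^{-1}
    """
    d_in = W.shape[1]
    c_t = c_target.view(-1, 1)                      # (d_in, 1)
    c_a = c_anchor.view(-1, 1)                      # (d_in, 1)
    lhs = (W @ c_a) @ c_t.T + lam * W               # (d_out, d_in)
    rhs = c_t @ c_t.T + lam * torch.eye(d_in)       # (d_in, d_in)
    return lhs @ torch.linalg.inv(rhs)
```

Applied to, say, each cross-attention projection, an edit of this form costs one linear solve per layer and leaves the sampling path untouched, which is consistent with the "no retraining, minimal inference overhead" claim: the modified weights are written back once and generation proceeds as normal.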


Continue Reading
TRANSPORTER: Transferring Visual Semantics from VLM Manifolds
Positive · Artificial Intelligence
The paper introduces TRANSPORTER, a model-independent approach designed to enhance video generation by transferring visual semantics from Vision Language Models (VLMs). This method addresses the challenge of understanding how VLMs derive their predictions, particularly in complex scenes with various objects and actions. TRANSPORTER generates videos that reflect changes in captions across diverse attributes and contexts.
Spotlight: Identifying and Localizing Video Generation Errors Using VLMs
Positive · Artificial Intelligence
A new task named Spotlight has been introduced to identify and localize video generation errors in text-to-video models (T2V), which can produce high-quality videos but still exhibit nuanced errors. The research generated 600 videos using diverse prompts and three advanced video generators, annotating over 1600 specific errors across various categories such as motion and physics.
Frame-wise Conditioning Adaptation for Fine-Tuning Diffusion Models in Text-to-Video Prediction
Positive · Artificial Intelligence
A new method called Frame-wise Conditioning Adaptation (FCA) has been proposed to enhance text-to-video prediction (TVP) by improving the continuity of generated video frames based on initial frames and descriptive text. This approach addresses limitations in existing models that often rely on text-to-image pre-training, which can lead to disjointed video outputs.
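The summary does not spell out FCA's architecture; as a rough illustration of what "frame-wise conditioning" can look like, the sketch below mixes the caption embedding with a feature of the observed initial frame using a learned per-frame gate. Every name and shape here is a hypothetical stand-in, not the paper's design.

```python
import torch
import torch.nn as nn

class FramewiseConditioner(nn.Module):
    """Hypothetical per-frame conditioning adapter (illustrative only)."""

    def __init__(self, dim: int, num_frames: int):
        super().__init__()
        # Learned per-frame mixing weights, initialized so early frames
        # lean on the observed initial frame and later frames on the text.
        self.gate = nn.Parameter(torch.linspace(0.9, 0.1, num_frames))
        self.proj = nn.Linear(dim, dim)

    def forward(self, text_emb: torch.Tensor, frame_feat: torch.Tensor) -> torch.Tensor:
        # text_emb: (B, D) caption embedding; frame_feat: (B, D) initial-frame feature
        img = self.proj(frame_feat)                   # (B, D)
        g = self.gate.clamp(0.0, 1.0).view(-1, 1, 1)  # (T, 1, 1)
        # Convex per-frame mix: each frame gets its own image/text balance,
        # one simple way to keep generated frames continuous with frame 0.
        return g * img.unsqueeze(0) + (1.0 - g) * text_emb.unsqueeze(0)  # (T, B, D)
```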
A Training-Free Style-aligned Image Generation with Scale-wise Autoregressive Model
Positive · Artificial Intelligence
A new training-free method for style-aligned image generation has been introduced, utilizing a scale-wise autoregressive model. This approach addresses common issues in large-scale text-to-image models, such as style misalignment and slow inference speeds, by implementing initial feature replacement, pivotal feature interpolation, and dynamic style injection to ensure consistency across generated images.
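The three operations named in the summary map naturally onto simple feature-space manipulations. The sketch below is one plausible reading, assuming access to the reference image's intermediate features at each autoregressive scale; the blending schedule and all parameter names are assumptions, not the paper's definitions.

```python
import torch

def style_align(ref_feats: torch.Tensor,
                gen_feats: torch.Tensor,
                scale: int,
                num_scales: int,
                alpha: float = 0.5) -> torch.Tensor:
    """Illustrative feature-space style alignment across scales.

    - initial feature replacement: at the coarsest scale, copy the reference
      features outright so generation starts in the reference style;
    - pivotal feature interpolation: at later scales, blend reference and
      generated features instead of replacing them;
    - dynamic style injection: decay the blend weight as scales get finer,
      so fine content detail can diverge from the reference.
    """
    if scale == 0:
        return ref_feats.clone()                    # initial feature replacement
    w = alpha * (1.0 - scale / num_scales)          # dynamic style injection schedule
    return w * ref_feats + (1.0 - w) * gen_feats    # pivotal feature interpolation
```

Because everything happens at inference time on cached features, no gradient steps are needed, which matches the training-free framing of the paper.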
ImAgent: A Unified Multimodal Agent Framework for Test-Time Scalable Image Generation
Positive · Artificial Intelligence
The introduction of ImAgent marks a significant advancement in text-to-image (T2I) technology, presenting a unified multimodal agent framework that enhances image generation by integrating reasoning, generation, and self-evaluation into a single system. This approach aims to address the challenges of randomness and inconsistency in image outputs, particularly when prompts are vague or underspecified.
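At this level of description, the framework reads as a generate/evaluate/refine loop. The sketch below shows that control flow only; `generate`, `evaluate`, and `refine` are hypothetical callables standing in for the agent's generation, self-evaluation, and reasoning modules, and the stopping rule is an assumption.

```python
def imagent_loop(prompt, generate, evaluate, refine,
                 max_rounds: int = 3, threshold: float = 0.8):
    """Hypothetical test-time loop: generate, self-evaluate, refine the prompt."""
    image = generate(prompt)
    for _ in range(max_rounds):
        score, feedback = evaluate(prompt, image)   # self-evaluation module
        if score >= threshold:                      # good enough: stop early
            break
        prompt = refine(prompt, feedback)           # reasoning: sharpen a vague prompt
        image = generate(prompt)                    # regenerate with the refined prompt
    return image
```

Spending more evaluation/refinement rounds at test time is what "test-time scalable" suggests here: quality can be traded for compute without retraining the underlying generator.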
Reason2Attack: Jailbreaking Text-to-Image Models via LLM Reasoning
Neutral · Artificial Intelligence
A new approach called Reason2Attack (R2A) has been proposed to enhance the reasoning capabilities of large language models (LLMs) in generating adversarial prompts for text-to-image (T2I) models. This method addresses the limitations of existing jailbreaking techniques that require numerous queries to bypass safety filters, thereby exposing vulnerabilities in T2I systems. R2A incorporates jailbreaking into the post-training process of LLMs, aiming to streamline the attack process.