CoD: A Diffusion Foundation Model for Image Compression

arXiv — cs.CV · Tuesday, November 25, 2025 at 5:00:00 AM
  • CoD, a new compression-oriented diffusion foundation model, has been introduced to improve image compression efficiency, particularly at ultra-low bitrates. Unlike existing models that rely on text conditioning, CoD is designed for end-to-end optimization of both compression and generation, and it achieves state-of-the-art results when integrated with downstream codecs such as DiffC (a rough sketch of this codec-plus-diffusion pairing follows this summary).
  • The development is significant because it marks a shift in how image compression is approached: CoD's training is reported to be roughly 300 times faster than Stable Diffusion's, making it a promising tool for developers in the field.
  • The introduction of CoD aligns with ongoing advances in AI, particularly in generative models and object detection. As the industry grapples with challenges such as out-of-distribution objects and bias in image generation, innovations like CoD could play a crucial role in improving model reliability and efficiency across applications.
— via World Pulse Now AI Editorial System
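As a rough illustration of how a generative prior can back a codec at ultra-low bitrates, the sketch below transmits only a coarsely quantized latent and lets a conditional denoiser synthesize the detail the bitstream cannot afford. Every name here (TinyEncoder, the denoiser signature, the scalar quantizer) is a hypothetical placeholder, not the CoD or DiffC API.

```python
# Minimal sketch of pairing a diffusion prior with a learned codec at
# ultra-low bitrate. All module names are hypothetical placeholders,
# not the released CoD or DiffC code.
import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    """Maps an image to a small latent that is cheap to entropy-code."""
    def __init__(self, latent_ch=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=4), nn.GELU(),
            nn.Conv2d(32, latent_ch, 4, stride=4),
        )
    def forward(self, x):
        return self.net(x)

def compress(x, encoder, n_levels=16):
    # Coarse scalar quantization stands in for a real entropy coder.
    z = encoder(x)
    z_q = torch.round(z.clamp(-1, 1) * (n_levels / 2)) / (n_levels / 2)
    return z_q  # this small tensor is what would be transmitted

def decompress(z_q, denoiser, steps=10):
    # Reverse diffusion conditioned on the quantized latent: the
    # generative prior fills in detail the bitstream cannot afford.
    x = torch.randn(z_q.shape[0], 3, z_q.shape[2] * 16, z_q.shape[3] * 16)
    for t in reversed(range(steps)):
        x = denoiser(x, t, cond=z_q)
    return x
```

The appeal of such a pairing is that bits are spent only on the latent, while perceptual detail comes from the prior; end-to-end training of both halves is what the CoD abstract emphasizes.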


Continue Reading
Deepfake Geography: Detecting AI-Generated Satellite Images
Neutral · Artificial Intelligence
Recent advancements in AI, particularly with generative models like StyleGAN2 and Stable Diffusion, have raised concerns about the authenticity of satellite imagery, which is crucial for scientific and security analyses. A study has compared Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) for detecting AI-generated satellite images, revealing that ViTs outperform CNNs in accuracy and robustness.
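The comparison in the study comes down to fine-tuning standard backbones as binary real-vs-generated classifiers. Below is a minimal sketch of the ViT side, assuming a pretrained torchvision ViT-B/16 with a fresh two-class head; the paper's exact training setup may differ.

```python
# Minimal sketch of a binary real-vs-generated satellite image classifier.
# The two-class head and hyperparameters are assumptions, not the paper's setup.
import torch
import torch.nn as nn
from torchvision import models

# Pretrained ViT-B/16 backbone; replace the classification head
# (0 = real satellite image, 1 = AI-generated).
vit = models.vit_b_16(weights=models.ViT_B_16_Weights.IMAGENET1K_V1)
vit.heads.head = nn.Linear(vit.heads.head.in_features, 2)

optimizer = torch.optim.AdamW(vit.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    # images: (B, 3, 224, 224) normalized batch; labels: (B,) in {0, 1}
    logits = vit(images)
    loss = criterion(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```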
SYNAPSE: Synergizing an Adapter and Finetuning for High-Fidelity EEG Synthesis from a CLIP-Aligned Encoder
Positive · Artificial Intelligence
SYNAPSE is a newly introduced framework that integrates an adapter and fine-tuning techniques to enhance high-fidelity EEG synthesis from a CLIP-aligned encoder. This two-stage approach aims to improve the representation of EEG signals, addressing challenges such as noise and inter-subject variability that have hindered previous image generation methods based on brain signals.
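A minimal sketch of a two-stage adapter-then-finetune scheme of the kind the summary describes: stage 1 trains only a small adapter that maps frozen EEG-encoder features into CLIP's embedding space, and stage 2 unfreezes the encoder. The feature dimensions and alignment loss are illustrative assumptions, not SYNAPSE's released design.

```python
# Sketch of a two-stage adapter + fine-tuning scheme for aligning EEG
# features with CLIP embeddings. Shapes and loss are assumptions.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Small bottleneck MLP mapping EEG features into CLIP's 512-d space."""
    def __init__(self, eeg_dim=1024, clip_dim=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(eeg_dim, 256), nn.GELU(), nn.Linear(256, clip_dim),
        )
    def forward(self, h):
        return self.net(h)

def alignment_loss(eeg_emb, clip_emb):
    # Cosine alignment between adapted EEG embeddings and CLIP image embeddings.
    return 1 - nn.functional.cosine_similarity(eeg_emb, clip_emb, dim=-1).mean()

def set_stage(encoder, adapter, stage):
    # Stage 1: freeze the pretrained EEG encoder, train only the adapter.
    # Stage 2: unfreeze the encoder and fine-tune both.
    for p in encoder.parameters():
        p.requires_grad = (stage == 2)
    for p in adapter.parameters():
        p.requires_grad = True
```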
DICE: Distilling Classifier-Free Guidance into Text Embeddings
Positive · Artificial Intelligence
The paper presents DICE, a novel approach that distills Classifier-Free Guidance (CFG) into text embeddings, significantly reducing computational complexity while maintaining high-quality image generation in text-to-image diffusion models. This method addresses the common issue of misalignment between text prompts and generated images, which has been a challenge in the field.
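Classifier-free guidance normally costs two network passes per denoising step, combining them as eps_u + w * (eps_c - eps_u). DICE's idea, as summarized, is to learn an embedding whose single conditional pass reproduces that guided prediction. A minimal sketch of such a distillation loss follows; the `unet` signature is a stand-in for a real text-to-image backbone, not the DICE release.

```python
# Sketch of distilling classifier-free guidance (CFG) into a learned
# text embedding. `unet(x_t, t, emb)` is a hypothetical denoiser signature.
import torch

def cfg_target(unet, x_t, t, cond_emb, uncond_emb, w=7.5):
    # Standard CFG: two forward passes per denoising step.
    eps_c = unet(x_t, t, cond_emb)
    eps_u = unet(x_t, t, uncond_emb)
    return eps_u + w * (eps_c - eps_u)

def dice_style_loss(unet, x_t, t, distilled_emb, cond_emb, uncond_emb, w=7.5):
    # Train distilled_emb so ONE conditional pass matches the guided output,
    # halving the per-step cost at inference time.
    with torch.no_grad():
        target = cfg_target(unet, x_t, t, cond_emb, uncond_emb, w)
    pred = unet(x_t, t, distilled_emb)
    return torch.mean((pred - target) ** 2)
```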
SpecDiff: Accelerating Diffusion Model Inference with Self-Speculation
Positive · Artificial Intelligence
A new paradigm called SpecDiff has been introduced to accelerate diffusion model inference via self-speculation, which incorporates future information alongside historical data. The approach aims to improve both inference speed and accuracy through a training-free, multi-level feature-caching strategy, including a feature-selection algorithm driven by the self-speculative signal.
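A minimal sketch of the general training-free feature-caching pattern this builds on: reuse expensive deep features across adjacent denoising steps until a cheap probe signal drifts past a threshold. The probe choice and tolerance here are assumptions, not SpecDiff's published selection algorithm.

```python
# Sketch of training-free feature caching across diffusion steps.
# Threshold and probe are illustrative assumptions.
import torch

class FeatureCache:
    def __init__(self, tol=0.05):
        self.cached = None
        self.cached_probe = None
        self.tol = tol

    def maybe_reuse(self, compute_fn, probe):
        # `probe` is a cheap signal (e.g., a shallow block's output) used to
        # decide whether the expensive deep features have drifted enough to
        # recompute; otherwise the cached features are reused.
        if self.cached is not None:
            drift = (probe - self.cached_probe).norm() / self.cached_probe.norm()
            if drift < self.tol:
                return self.cached
        self.cached = compute_fn()
        self.cached_probe = probe.detach()
        return self.cached
```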
Learning Latent Transmission and Glare Maps for Lens Veiling Glare Removal
Positive · Artificial Intelligence
A new generative model named VeilGen has been proposed to address the challenge of veiling glare in compact optical systems, which is often exacerbated by stray-light scattering from non-ideal surfaces. This model learns to simulate veiling glare by estimating optical transmission and glare maps from target images in an unsupervised manner, marking a significant advancement in lens performance enhancement.
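The summary implies a scattering-style image formation model in which the observed frame is the clean scene attenuated by a transmission map plus an additive glare map. Below is a minimal sketch of that forward model and its inversion; the element-wise form I = T * J + G is an assumption, not VeilGen's published network.

```python
# Sketch of a transmission-plus-glare formation model and its inversion.
# The element-wise decomposition is an assumption drawn from the summary.
import torch

def apply_veiling_glare(clean, transmission, glare):
    # clean: (B, 3, H, W) glare-free scene
    # transmission: (B, 1, H, W) in [0, 1], attenuation from stray light
    # glare: (B, 3, H, W) additive low-frequency veiling component
    return transmission * clean + glare

def restore(observed, transmission, glare, eps=1e-4):
    # Invert the forward model once T and G have been estimated
    # (in VeilGen's case, learned without supervision).
    return (observed - glare) / transmission.clamp(min=eps)
```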
Model-Agnostic Gender Bias Control for Text-to-Image Generation via Sparse Autoencoder
Positive · Artificial Intelligence
A new framework called SAE Debias has been introduced to address gender bias in text-to-image (T2I) generation models, particularly those that generate stereotypical associations between professions and gendered subjects. This model-agnostic approach utilizes a k-sparse autoencoder to identify and suppress biased directions during image generation, aiming for more gender-balanced outputs without requiring model-specific adjustments.
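A minimal sketch of the underlying mechanism: a k-sparse autoencoder exposes near-interpretable latent units, and zeroing the unit(s) identified as gender-biased before decoding suppresses that direction in the embedding. The layer sizes and the bias index are illustrative, not SAE Debias's trained weights.

```python
# Sketch of suppressing a biased latent direction with a k-sparse
# autoencoder. Dimensions and bias_idx are illustrative assumptions.
import torch
import torch.nn as nn

class KSparseAE(nn.Module):
    def __init__(self, d_model=768, d_latent=4096, k=32):
        super().__init__()
        self.enc = nn.Linear(d_model, d_latent)
        self.dec = nn.Linear(d_latent, d_model)
        self.k = k

    def encode(self, x):
        z = self.enc(x)
        # Keep only the top-k activations; zero out the rest.
        topk = torch.topk(z, self.k, dim=-1)
        sparse = torch.zeros_like(z)
        return sparse.scatter(-1, topk.indices, topk.values)

    def suppress(self, x, bias_idx):
        # Zero the latent unit(s) identified as gender-biased, then decode
        # back into the model's embedding space.
        z = self.encode(x)
        z[..., bias_idx] = 0.0
        return self.dec(z)
```

Because the intervention happens in the autoencoder's latent space rather than in any one model's weights, the approach stays model-agnostic, which is the property the summary highlights.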