MeanFlow Transformers with Representation Autoencoders

arXiv — cs.LG · Tuesday, November 18, 2025 at 5:00:00 AM
  • The development of MeanFlow Transformers with Representation Autoencoders presents a significant advancement in generative modeling, particularly in enhancing the efficiency of few-step generation.
  • This innovation is crucial for improving the practicality of generative models, as it reduces the computational burden during inference and enhances the stability of the training process. The integration of a lightweight decoder with a pre-trained representation encoder keeps the sampling pipeline compact.
  • The advancements in MeanFlow and its connection to Stable Diffusion highlight ongoing efforts in the AI field to optimize generative models. As the demand for efficient text-to-image generation grows, few-step approaches like this one become increasingly practical; a sketch of the core sampling idea follows below.
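The key MeanFlow idea is that the network predicts the *average* velocity over a time interval rather than an instantaneous one, so sampling can collapse to a single evaluation. Below is a minimal sketch of one-step sampling in the autoencoder's latent space; `mean_flow` and `decoder` are hypothetical stand-ins for the trained average-velocity network and the representation autoencoder's lightweight decoder.

```python
import torch

@torch.no_grad()
def sample_one_step(mean_flow, decoder, batch, latent_shape, device="cuda"):
    """One-step MeanFlow sampling in the autoencoder's latent space.

    The network is trained to predict the average velocity u(z_t, r, t)
    over the interval [r, t], so a single evaluation can jump from pure
    noise (t = 1) to data (r = 0): z_0 = z_1 - (1 - 0) * u(z_1, 0, 1).
    """
    z1 = torch.randn(batch, *latent_shape, device=device)   # latent noise
    r = torch.zeros(batch, device=device)                   # target time
    t = torch.ones(batch, device=device)                    # start time
    u = mean_flow(z1, r, t)                                 # average velocity
    z0 = z1 - (t - r).view(-1, *([1] * len(latent_shape))) * u
    return decoder(z0)   # lightweight decoder maps latents back to pixels
```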
— via World Pulse Now AI Editorial System


Recommended Readings
Certified but Fooled! Breaking Certified Defences with Ghost Certificates
Negative · Artificial Intelligence
The article discusses the vulnerabilities of certified defenses in machine learning, particularly in the context of adversarial attacks. It highlights how malicious actors can exploit probabilistic certification frameworks to mislead classifiers and generate false robustness guarantees for adversarial inputs. The study reveals that small, imperceptible perturbations can be crafted to spoof certification processes, raising concerns about the reliability of certified models in real-world applications.
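The "probabilistic certification frameworks" in question are typified by randomized smoothing. The sketch below (with a hypothetical `classifier`; a normal approximation stands in for the Clopper-Pearson bound used in practice) shows the vote-and-bound procedure whose statistics a ghost certificate manipulates.

```python
import torch
from statistics import NormalDist

@torch.no_grad()
def certify(classifier, x, sigma=0.25, n=1000, alpha=0.001):
    """Randomized-smoothing certification, Cohen et al. (2019) style.

    Classify n Gaussian-noised copies of x; if one class dominates with
    high confidence, return it plus a certified L2 radius. An attacker
    who can bias this vote can spoof a certificate for a bad input.
    """
    noise = sigma * torch.randn(n, *x.shape)
    votes = classifier(x.unsqueeze(0) + noise).argmax(dim=1)
    top = votes.mode().values.item()
    n_top = (votes == top).sum().item()
    # Lower confidence bound on the top-class probability (the literature
    # uses Clopper-Pearson; a normal approximation keeps the sketch short).
    p_hat = n_top / n
    p_low = p_hat - NormalDist().inv_cdf(1 - alpha) * (p_hat * (1 - p_hat) / n) ** 0.5
    if p_low <= 0.5:
        return None, 0.0                          # abstain: no certificate
    radius = sigma * NormalDist().inv_cdf(p_low)  # certified L2 radius
    return top, radius
```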
MAVias: Mitigate any Visual Bias
Positive · Artificial Intelligence
MAVias is an innovative approach aimed at mitigating biases in computer vision models, which is crucial for enhancing the trustworthiness of artificial intelligence systems. Traditional bias mitigation techniques often address a limited range of predefined biases, which restricts their effectiveness in diverse visual datasets that may contain multiple, unknown biases. MAVias utilizes foundation models to identify spurious associations between visual attributes and target classes, capturing a broad spectrum of visual features and translating them into language-coded potential biases for further…
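As one illustration of the foundation-model step (not MAVias's actual pipeline; the prompts and names here are hypothetical), a CLIP model can score how strongly a language-coded candidate bias co-occurs with images of a target class:

```python
import clip
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

@torch.no_grad()
def attribute_scores(images, attributes):
    """Score each image against language-coded candidate biases.

    `images` is a preprocessed batch; `attributes` are free-text
    descriptions of potential spurious cues (e.g. "a photo with water
    in the background"). High correlation between an attribute's score
    and a class label flags a candidate bias to mitigate.
    """
    text = clip.tokenize(attributes).to(device)
    img_emb = model.encode_image(images.to(device))
    txt_emb = model.encode_text(text)
    img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)
    txt_emb = txt_emb / txt_emb.norm(dim=-1, keepdim=True)
    return img_emb @ txt_emb.T   # cosine similarity, [n_images, n_attributes]
```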
Cross-Domain Few-Shot Learning with Coalescent Projections and Latent Space Reservation
Positive · Artificial Intelligence
The article discusses advancements in cross-domain few-shot learning, highlighting a model that combines DINO with a prototypical classifier, outperforming current state-of-the-art methods. A significant challenge is the overfitting caused by updating too many transformer parameters due to limited labeled samples. To tackle this, the authors introduce coalescent projection as a successor to soft prompts and a novel pseudo-class generation method that utilizes self-supervised transformations, demonstrating effectiveness on the BSCD-FSL benchmark.
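For concreteness, here is a minimal sketch of the prototypical-classifier component over frozen backbone features; the paper's coalescent projections and pseudo-class generation are omitted, and all names are illustrative:

```python
import torch

def prototypical_predict(backbone, support_x, support_y, query_x, n_classes):
    """Nearest-prototype classification over frozen backbone features.

    Each class prototype is the mean embedding of its few labeled
    support samples; queries are assigned to the nearest prototype.
    `backbone` is assumed frozen (e.g. a DINO ViT), which avoids the
    overfitting that full fine-tuning causes in the few-shot regime.
    """
    with torch.no_grad():
        s = backbone(support_x)                     # [n_support, d]
        q = backbone(query_x)                       # [n_query, d]
    protos = torch.stack([s[support_y == c].mean(0) for c in range(n_classes)])
    dists = torch.cdist(q, protos)                  # [n_query, n_classes]
    return dists.argmin(dim=1)                      # predicted class ids
```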
Benchmarking Deep Learning-Based Object Detection Models on Feature Deficient Astrophotography Imagery Dataset
Neutral · Artificial Intelligence
The study benchmarks various deep learning-based object detection models using the MobilTelesco dataset, which features sparse astrophotography images. Traditional datasets like ImageNet and COCO focus on everyday objects, lacking the unique challenges presented by feature-deficient conditions. The research highlights the difficulties these models face when applied to non-commercial domains, emphasizing the need for specialized datasets in astrophotography.
FGM-HD: Boosting Generation Diversity of Fractal Generative Models through Hausdorff Dimension Induction
Positive · Artificial Intelligence
The article discusses a novel approach to enhancing the diversity of outputs in Fractal Generative Models (FGMs) while maintaining high visual quality. By incorporating the Hausdorff Dimension (HD), a concept from fractal geometry that quantifies structural complexity, the authors propose a learnable HD estimation method that predicts HD from image embeddings. This method aims to improve the diversity of generated images, addressing challenges such as image quality degradation and limited diversity enhancement in FGMs.
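In practice the Hausdorff dimension of an image is usually approximated by box counting. The generic NumPy estimator below (not the paper's learnable, embedding-based predictor) shows what the HD signal quantifies:

```python
import numpy as np

def box_counting_dimension(img, threshold=0.5):
    """Estimate the fractal (box-counting) dimension of a 2-D image.

    Count the boxes of side s that contain structure, across a range of
    scales, then fit log(count) against log(1/s); the slope approximates
    the Hausdorff dimension that FGM-HD's learnable head instead
    predicts directly from image embeddings.
    """
    binary = img > threshold
    n = min(binary.shape)
    sizes = 2 ** np.arange(1, int(np.log2(n)))
    counts = []
    for s in sizes:
        h, w = (binary.shape[0] // s) * s, (binary.shape[1] // s) * s
        blocks = binary[:h, :w].reshape(h // s, s, w // s, s)
        counts.append(blocks.any(axis=(1, 3)).sum())
    slope, _ = np.polyfit(np.log(1.0 / sizes), np.log(counts), 1)
    return slope
```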
ODE$_t$(ODE$_l$): Shortcutting the Time and the Length in Diffusion and Flow Models for Faster Sampling
Positive · Artificial Intelligence
The article discusses a novel approach called ODE$_t$(ODE$_l$) that optimizes the sampling process in continuous normalizing flows (CNFs) and diffusion models (DMs). By rewiring transformer-based architectures and introducing a length consistency term, this method reduces computational complexity and allows for flexible sampling with varying time steps and transformer blocks. This advancement aims to enhance the efficiency and quality of data generation from noise distributions.
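A rough sketch of the two knobs this exposes at sampling time, assuming a hypothetical transformer-based velocity model split into `embed`, a list of `blocks`, and a `head`:

```python
import torch

@torch.no_grad()
def flexible_euler_sample(blocks, embed, head, z, n_steps=4, depth=None):
    """Euler sampling with adjustable time steps AND network length.

    Fewer integration steps (n_steps) and a truncated stack of
    transformer blocks (depth) both cut compute; a length-consistency
    term at training time is what keeps the truncated stack usable.
    """
    depth = depth or len(blocks)
    dt = 1.0 / n_steps
    for i in range(n_steps):
        t = torch.full((z.shape[0],), 1.0 - i * dt, device=z.device)
        h = embed(z, t)
        for block in blocks[:depth]:   # only the first `depth` blocks run
            h = block(h)
        v = head(h)                    # predicted velocity field
        z = z - dt * v                 # Euler step from noise (t=1) to data
    return z
```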
Diffusion As Self-Distillation: End-to-End Latent Diffusion In One Model
Positive · Artificial Intelligence
Standard Latent Diffusion Models utilize a complex architecture comprising separate encoder, decoder, and diffusion network components, which are trained in multiple stages. This modular design is computationally inefficient and leads to suboptimal performance. The proposed solution aims to unify these components into a single, end-to-end trainable network. The authors identify issues of instability in naive joint training due to 'latent collapse' and introduce Diffusion as Self-Distillation (DSD), a framework that addresses these challenges.
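As a heavily simplified illustration (assumed for exposition, not the paper's actual DSD objective), the sketch below shows a joint training step and the stop-gradient that self-distillation-style training can use to block the latent-collapse shortcut; `t` is assumed to be a broadcastable tensor in [0, 1]:

```python
import torch
import torch.nn.functional as F

def joint_step(encoder, decoder, diff_net, x, t, noise):
    """One end-to-end training step (illustrative, not the exact DSD recipe).

    Training encoder + diffusion net + decoder jointly and naively lets
    the encoder shrink its latents toward a constant ("latent collapse"),
    trivially minimizing the diffusion loss. Detaching the latent where
    it serves as a regression target -- a self-distillation-style
    stop-gradient, assumed here for illustration -- removes that
    shortcut, while the reconstruction term keeps latents informative.
    """
    z = encoder(x)                          # latent, same shape as `noise`
    z_t = (1 - t) * z.detach() + t * noise  # noised latent (target side detached)
    v_target = noise - z.detach()           # flow-matching velocity target
    v_pred = diff_net(z_t, t)
    diffusion_loss = F.mse_loss(v_pred, v_target)
    recon_loss = F.mse_loss(decoder(z), x)  # gradient reaches the encoder here
    return diffusion_loss + recon_loss
```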
Fast Data Attribution for Text-to-Image Models
Positive · Artificial Intelligence
Data attribution for text-to-image models seeks to identify the training images that significantly influenced generated outputs. Current methods require substantial computational resources for each query, limiting their practicality. A novel approach is proposed for scalable and efficient data attribution, distilling a slow, unlearning-based method into a feature embedding space for quick retrieval of influential training images. The method, combined with efficient indexing and search techniques, demonstrates competitive performance on medium and large-scale models, achieving results faster than existing approaches.
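Once the slow attribution signal has been distilled into an embedding space, per-query attribution reduces to nearest-neighbor search; a minimal sketch with illustrative names:

```python
import numpy as np

def attribute_generation(query_emb, train_embs, k=10):
    """Retrieve likely-influential training images by embedding search.

    `train_embs` holds one precomputed row per training image in the
    distilled attribution space; attribution for a generated output's
    embedding `query_emb` is then a fast top-k cosine-similarity lookup.
    """
    q = query_emb / np.linalg.norm(query_emb)
    t = train_embs / np.linalg.norm(train_embs, axis=1, keepdims=True)
    sims = t @ q                    # cosine similarity to each training image
    top = np.argsort(-sims)[:k]     # indices of the most influential images
    return top, sims[top]
```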