U-REPA: Aligning Diffusion U-Nets to ViTs

arXiv — cs.CV · Tuesday, November 25, 2025 at 5:00:00 AM

Continue Reading
Understanding, Accelerating, and Improving MeanFlow Training
Positive · Artificial Intelligence
A recent analysis of MeanFlow training clarifies how the instantaneous and average velocity fields interact: the model learns average velocities effectively only once accurate instantaneous velocities are in place. Building on this, the authors design a training scheme that speeds up the formation of instantaneous velocities, which in turn accelerates overall training.
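For context, the link between the two fields can be written down explicitly. The sketch below uses the standard MeanFlow definitions, which this summary does not spell out; the symbols $z_t$, $r$, $t$, $u$ and $v$ are assumed notation, not taken from the summary.

```latex
% Average velocity u over [r, t], defined from the instantaneous velocity v
u(z_t, r, t) \;=\; \frac{1}{t - r} \int_{r}^{t} v(z_\tau, \tau)\, d\tau

% Differentiating (t - r)\,u with respect to t gives the MeanFlow identity
u(z_t, r, t) \;=\; v(z_t, t) \;-\; (t - r)\, \frac{d}{dt}\, u(z_t, r, t)

% In the limit r \to t the average velocity reduces to the instantaneous one
\lim_{r \to t} u(z_t, r, t) \;=\; v(z_t, t)
```

Because the training target for $u$ is built from $v$, an inaccurate instantaneous velocity carries straight through to the average velocity. This is consistent with the dependency the summary describes.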
Parallel qMRI Reconstruction from 4x Accelerated Acquisitions
Positive · Artificial Intelligence
A new deep learning framework has been proposed for Magnetic Resonance Imaging (MRI) that enables parallel reconstruction from 4x accelerated acquisitions, significantly reducing scan times while maintaining image quality. This method utilizes a two-module architecture that estimates coil sensitivity maps and reconstructs images from undersampled k-space data, addressing the limitations of traditional techniques like SENSE.
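To make "4x accelerated" concrete, here is a minimal single-coil sketch that keeps every fourth phase-encoding line of k-space and reconstructs by zero-filling. It is only an illustration of undersampling, not the proposed framework: it ignores coil sensitivity maps entirely, and the function names (`undersample_kspace`, `zero_filled_recon`) and the uniform sampling pattern are assumptions.

```python
import numpy as np


def undersample_kspace(image, accel=4):
    """Keep every `accel`-th row (phase-encoding line) of centered k-space."""
    kspace = np.fft.fftshift(np.fft.fft2(image))
    mask = np.zeros(image.shape[0], dtype=bool)
    mask[::accel] = True                      # uniform pattern (assumed)
    return kspace * mask[:, None], mask


def zero_filled_recon(kspace):
    """Naive reconstruction: inverse FFT with unmeasured lines left at zero."""
    return np.abs(np.fft.ifft2(np.fft.ifftshift(kspace)))


# Synthetic stand-in for an image; real data would come from a scanner.
rng = np.random.default_rng(0)
image = rng.random((64, 64))

kspace_us, mask = undersample_kspace(image, accel=4)
recon = zero_filled_recon(kspace_us)          # aliased, since 3/4 of lines are missing

# With accel=1 nothing is discarded and the image is recovered exactly.
kspace_full, _ = undersample_kspace(image, accel=1)
recon_full = zero_filled_recon(kspace_full)
```

The zero-filled result is aliased. Removing that aliasing is the job of parallel-imaging methods such as SENSE and of learned reconstructions like the one summarized above.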
BD-Net: Has Depth-Wise Convolution Ever Been Applied in Binary Neural Networks?
Positive · Artificial Intelligence
A recent study introduces BD-Net, which successfully applies depth-wise convolution in Binary Neural Networks (BNNs) by proposing a 1.58-bit convolution and a pre-BN residual connection to enhance expressiveness and stabilize training. The work advances model compression, achieving new state-of-the-art performance on ImageNet with MobileNet V1 and outperforming previous methods across various datasets.
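The "1.58-bit" figure is commonly read as log2(3) ≈ 1.585, the information content of a weight restricted to the three values {-1, 0, +1}. The sketch below assumes that reading and uses a generic absmean ternarization; BD-Net's actual quantizer may differ, and `ternarize` is a hypothetical helper.

```python
import math

import numpy as np


def ternarize(weights, eps=1e-8):
    """Map weights to {-1, 0, +1} with a single absmean scale (assumed scheme)."""
    scale = np.mean(np.abs(weights)) + eps
    ternary = np.clip(np.round(weights / scale), -1, 1).astype(np.int8)
    return ternary, scale


# Three possible values per weight carry log2(3) bits of information.
bits_per_weight = math.log2(3)

weights = np.array([[0.9, -0.05, -1.2],
                    [0.3, 0.0, -0.4]])
ternary, scale = ternarize(weights)

# Dequantized approximation of the original weights.
approx = ternary * scale
```

Small weights collapse to zero while large ones keep only their sign. That extra zero state is what a ternary weight adds over a strictly binary one.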
Flow Map Distillation Without Data
Positive · Artificial Intelligence
A new approach to flow map distillation has been introduced, which eliminates the need for external datasets traditionally used in the sampling process. This method aims to mitigate the risks associated with Teacher-Data Mismatch by relying solely on the prior distribution, ensuring that the teacher's generative capabilities are accurately represented without data dependency.
DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation
Positive · Artificial Intelligence
The newly proposed DeCo framework introduces a frequency-decoupled pixel diffusion method for end-to-end image generation, addressing the inefficiency of existing models that handle both high- and low-frequency signals within a single diffusion transformer. Separating the generation of high-frequency details from low-frequency semantics improves training and inference speed.
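As a rough illustration of splitting an image into low- and high-frequency parts, the sketch below uses block averaging as a low-pass filter and treats the residual as the high-frequency component. This is a generic decomposition chosen for simplicity, not DeCo's architecture; `split_frequencies` and the `factor` parameter are assumptions.

```python
import numpy as np


def split_frequencies(image, factor=4):
    """Split a 2-D image into a blocky low-pass part and a detail residual."""
    h, w = image.shape
    assert h % factor == 0 and w % factor == 0, "dims must divide by factor"
    # Low-pass: average over non-overlapping factor x factor blocks.
    low_small = image.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))
    # Upsample back to full resolution by repeating each block value.
    low = np.repeat(np.repeat(low_small, factor, axis=0), factor, axis=1)
    high = image - low                        # what the low-pass part misses
    return low, high


rng = np.random.default_rng(0)
image = rng.random((32, 32))
low, high = split_frequencies(image, factor=4)

# The two components sum back to the original image.
reconstructed = low + high
```

In a decoupled design, each component can then be handled by a model suited to it, which is the general idea the summary attributes to DeCo.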
Temporal-adaptive Weight Quantization for Spiking Neural Networks
Positive · Artificial Intelligence
A new study introduces Temporal-adaptive Weight Quantization (TaWQ) for Spiking Neural Networks (SNNs), which aims to reduce energy consumption while maintaining accuracy. The method leverages temporal dynamics to allocate ultra-low-bit weights, with a quantization loss of only 0.22% on ImageNet and high energy efficiency across extensive experiments.
Annotation-Free Class-Incremental Learning
Positive · Artificial Intelligence
A new paradigm in continual learning, Annotation-Free Class-Incremental Learning (AFCIL), has been introduced, addressing the challenge of learning from unlabeled data that arrives sequentially. This approach allows systems to adapt to new classes without supervision, marking a significant shift from traditional methods reliant on labeled data.
MammothModa2: A Unified AR-Diffusion Framework for Multimodal Understanding and Generation
Positive · Artificial Intelligence
MammothModa2, a new unified autoregressive-diffusion framework, has been introduced to enhance multimodal understanding and generation. This framework aims to bridge the gap between discrete semantic reasoning and high-fidelity visual synthesis, utilizing a serial design that couples autoregressive semantic planning with diffusion-based generation.