Density-Informed VAE (DiVAE): Reliable Log-Prior Probability via Density Alignment Regularization

arXiv — cs.LG · Thursday, December 4, 2025 at 5:00:00 AM
  • A new method, Density-Informed VAE (DiVAE), enhances the Variational Autoencoder (VAE) framework by aligning the log-prior probability of latent codes with data-derived log-density estimates. This alignment lets the model allocate posterior mass in proportion to data-space density and improves prior coverage, with results reported on synthetic datasets and on MNIST; a minimal sketch of the idea follows below.
  • The development is significant because it improves not only the interpretability of latent-variable models but also out-of-distribution (OOD) uncertainty calibration, potentially making VAE-based applications more reliable.
— via World Pulse Now AI Editorial System
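The core idea lends itself to a compact sketch. Assuming the alignment term penalizes disagreement between the latent log-prior and a precomputed data-space log-density estimate (e.g., from a kernel density estimator), a hypothetical PyTorch loss might look like this; `kde_logp_x`, the weight `lam`, and the exact penalty form are illustrative assumptions, not the paper's formulation:

```python
# A minimal sketch, assuming the DiVAE regularizer penalizes mismatch between
# the latent log-prior and a precomputed data-space log-density estimate
# (e.g., from a KDE). `kde_logp_x` and the squared-error penalty are
# assumptions, not the paper's exact formulation.
import math
import torch
import torch.nn.functional as F

def divae_loss(x, recon, mu, logvar, z, kde_logp_x, lam=1.0):
    # Standard VAE objective: reconstruction + KL to a N(0, I) prior.
    rec = F.mse_loss(recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1.0 + logvar - mu.pow(2) - logvar.exp())
    # Log-density of sampled latents under the standard-normal prior.
    log_prior = -0.5 * (z.pow(2).sum(dim=1) + z.size(1) * math.log(2.0 * math.pi))
    # Density alignment: the latent log-prior should track the data-derived
    # log-density estimate, so high-density inputs get high-prior latents.
    align = F.mse_loss(log_prior, kde_logp_x)
    return rec + kl + lam * align
```

In this sketch, inputs from high-density regions are pushed toward high-prior latents, which is the mechanism behind the claimed prior-coverage and OOD-calibration gains.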

Continue Reading
SimFlow: Simplified and End-to-End Training of Latent Normalizing Flows
Positive · Artificial Intelligence
SimFlow introduces a simplified, end-to-end training method for latent Normalizing Flows (NFs), addressing limitations of previous models that relied on complex noise-addition schemes and frozen VAE encoders. By fixing the variance of the latent distribution to a constant, the method yields a simpler, better-behaved encoder output distribution and stabilizes training, improving image reconstruction and generation quality.
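As a rough illustration of the fixed-variance idea: if the encoder predicts only a mean and the latent is sampled with a constant standard deviation, the flow prior and decoder can be trained jointly on a simple objective. The interface `flow(z) -> (u, logdet)` and the loss weighting below are assumptions for this sketch, not SimFlow's exact recipe:

```python
# Sketch of fixed-variance latent-flow training, assuming the encoder predicts
# only a mean and the latent standard deviation is a constant `sigma`.
import torch

def simflow_step(encoder, flow, decoder, x, sigma=0.1):
    mu = encoder(x)                        # encoder predicts only the mean
    z = mu + sigma * torch.randn_like(mu)  # fixed-variance Gaussian latent
    recon = decoder(z)
    rec_loss = torch.mean((recon - x) ** 2)
    # Normalizing-flow prior on z: maximize latent log-likelihood.
    u, logdet = flow(z)                    # assumed interface: z -> (base var, log|det J|)
    nll = 0.5 * (u ** 2).sum(dim=1) - logdet
    return rec_loss + nll.mean()
```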
DynamicCity: Large-Scale 4D Occupancy Generation from Dynamic Scenes
Positive · Artificial Intelligence
DynamicCity has introduced a groundbreaking 4D occupancy generation framework that enhances urban scene generation by focusing on the dynamic nature of real-world driving environments. This framework utilizes a VAE model and a novel Projection Module to create high-quality dynamic 4D scenes, significantly improving fitting quality and reconstruction accuracy.
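The announcement leaves the Projection Module underspecified; one plausible reading is that it compresses a 4D occupancy feature volume into compact 2D feature planes that the VAE operates on. The axis pairings below are purely illustrative:

```python
# Hypothetical sketch of a projection step, assuming the Projection Module
# flattens a 4D occupancy feature volume (T, X, Y, Z) into 2D feature planes.
import torch

def project_to_planes(feat):  # feat: (C, T, X, Y, Z)
    # Average-pool pairs of axes into 2D planes, e.g. (X, Y), (T, X), (T, Y).
    xy = feat.mean(dim=(1, 4))  # (C, X, Y): collapse time and height
    tx = feat.mean(dim=(3, 4))  # (C, T, X)
    ty = feat.mean(dim=(2, 4))  # (C, T, Y)
    return xy, tx, ty
```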
Multi-Scale Visual Prompting for Lightweight Small-Image Classification
Positive · Artificial Intelligence
A new approach called Multi-Scale Visual Prompting (MSVP) has been introduced to enhance small-image classification, integrating lightweight, learnable prompt parameters into the input space at multiple scales. The method significantly improves performance across various convolutional neural network (CNN) and Vision Transformer architectures while adding only a minimal number of parameters.
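A minimal sketch of the multi-scale prompting idea, assuming learnable prompt tensors stored at several reduced resolutions are upsampled and added to the input image; the class name, scales, and initialization are illustrative, not the paper's exact design:

```python
# Sketch of multi-scale visual prompting: learnable additive prompts at
# several resolutions, leaving the backbone frozen. Names are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleVisualPrompt(nn.Module):
    def __init__(self, channels=3, size=32, scales=(1, 2, 4)):
        super().__init__()
        # One learnable prompt per scale, stored at a reduced resolution.
        self.prompts = nn.ParameterList(
            nn.Parameter(torch.zeros(1, channels, size // s, size // s))
            for s in scales
        )
        self.size = size

    def forward(self, x):
        # Upsample each prompt to input resolution and add it to the image;
        # the prompts are the only new trainable parameters.
        for p in self.prompts:
            x = x + F.interpolate(p, size=(self.size, self.size),
                                  mode="bilinear", align_corners=False)
        return x
```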
Domain Feature Collapse: Implications for Out-of-Distribution Detection and Solutions
Neutral · Artificial Intelligence
A recent study has revealed that state-of-the-art out-of-distribution (OOD) detection methods fail catastrophically when trained on single-domain datasets due to a phenomenon termed domain feature collapse, where domain-specific information is discarded. This collapse leads to models relying solely on class-specific features, significantly impairing their ability to detect out-of-domain samples.
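To see why class-only features break out-of-domain detection, consider a maximum-softmax-probability (MSP) score, which depends on nothing but class logits: if domain cues were discarded during training, an out-of-domain input can still produce confident class logits and receive a high, in-distribution-looking score. MSP is a common baseline used here purely for illustration, not necessarily one of the methods the study evaluated:

```python
# Illustration: an MSP-style OOD score uses only class evidence, so a model
# that has collapsed away domain features cannot flag out-of-domain inputs.
import torch

def msp_ood_score(logits):
    # Higher score = more in-distribution; depends only on class logits.
    return torch.softmax(logits, dim=-1).max(dim=-1).values
```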
Unbiased Kinetic Langevin Monte Carlo with Inexact Gradients
Positive · Artificial Intelligence
A new unbiased estimation method for Bayesian posterior means has been introduced, leveraging kinetic Langevin dynamics combined with advanced splitting methods and enhanced gradient approximations. The approach eliminates the need for Metropolis correction by coupling Markov chains at different discretization levels within a multilevel Monte Carlo framework; theoretical analysis establishes unbiasedness and finite variance.
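For concreteness, one step of a standard OBABO splitting integrator for kinetic (underdamped) Langevin dynamics with a possibly inexact gradient oracle looks as follows; the multilevel Monte Carlo coupling that removes the discretization bias, which is the paper's actual contribution, is omitted here:

```python
# One OBABO splitting step for kinetic Langevin dynamics
# dX = V dt, dV = -grad U(X) dt - gamma V dt + sqrt(2 gamma) dW,
# where grad_fn may be an inexact (e.g., stochastic) gradient oracle.
import math
import torch

def obabo_step(x, v, grad_fn, h=0.01, gamma=1.0):
    eta = math.exp(-gamma * h / 2.0)
    c = math.sqrt(1.0 - eta ** 2)
    v = eta * v + c * torch.randn_like(v)  # O: half Ornstein-Uhlenbeck kick
    v = v - 0.5 * h * grad_fn(x)           # B: half gradient (possibly inexact)
    x = x + h * v                          # A: full position drift
    v = v - 0.5 * h * grad_fn(x)           # B: half gradient
    v = eta * v + c * torch.randn_like(v)  # O: half Ornstein-Uhlenbeck kick
    return x, v
```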
WeMMU: Enhanced Bridging of Vision-Language Models and Diffusion Models via Noisy Query Tokens
Positive · Artificial Intelligence
Recent advancements in multimodal large language models (MLLMs) have led to the introduction of Noisy Query Tokens, which facilitate a more efficient connection between Vision-Language Models (VLMs) and Diffusion Models. This approach addresses the issue of generalization collapse, allowing for improved continual learning across diverse tasks and enhancing the overall performance of these models.
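A speculative sketch of the bridge, assuming learnable query tokens are perturbed with Gaussian noise during training and cross-attend to VLM features before conditioning the diffusion model; the module name, noise injection, and attention interface are all assumptions:

```python
# Hypothetical noisy-query bridge between a VLM and a diffusion decoder.
import torch
import torch.nn as nn

class NoisyQueryBridge(nn.Module):
    def __init__(self, num_queries=64, dim=768, noise_std=0.1):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(num_queries, dim) * 0.02)
        self.noise_std = noise_std
        self.attn = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)

    def forward(self, vlm_features):  # vlm_features: (B, L, dim)
        b = vlm_features.size(0)
        q = self.queries.unsqueeze(0).expand(b, -1, -1)
        if self.training:  # noise injection regularizes the bridge
            q = q + self.noise_std * torch.randn_like(q)
        cond, _ = self.attn(q, vlm_features, vlm_features)
        return cond  # conditioning tokens for the diffusion model
```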
APTx Neuron: A Unified Trainable Neuron Architecture Integrating Activation and Computation
Positive · Artificial Intelligence
The APTx Neuron has been introduced as a novel neural computation unit that integrates non-linear activation and linear transformation into a single trainable expression, derived from the APTx activation function. This architecture eliminates the need for separate activation layers, enhancing optimization efficiency. Validation on the MNIST dataset demonstrated a test accuracy of 96.69% within 11 epochs using approximately 332K trainable parameters.
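Assuming the neuron applies the APTx activation form (α + tanh(βx)) · γx per input dimension with trainable α, β, γ and a shared bias δ, a sketch looks like this; the initialization choices are illustrative:

```python
# Sketch of an APTx-style neuron, assuming it computes
# y = sum_i (alpha_i + tanh(beta_i * x_i)) * gamma_i * x_i + delta,
# folding the activation into the neuron so no separate activation layer is needed.
import torch
import torch.nn as nn

class APTxNeuron(nn.Module):
    def __init__(self, in_features):
        super().__init__()
        self.alpha = nn.Parameter(torch.ones(in_features))
        self.beta = nn.Parameter(torch.ones(in_features))
        self.gamma = nn.Parameter(torch.randn(in_features) * 0.1)
        self.delta = nn.Parameter(torch.zeros(1))

    def forward(self, x):  # x: (B, in_features)
        out = (self.alpha + torch.tanh(self.beta * x)) * self.gamma * x
        return out.sum(dim=-1) + self.delta  # scalar output per sample
```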
Steering One-Step Diffusion Model with Fidelity-Rich Decoder for Fast Image Compression
Positive · Artificial Intelligence
A novel single-step diffusion image compression model, SODEC, has been introduced to address the excessive decoding latency and poor fidelity of traditional diffusion-based image compression methods. By leveraging a pre-trained VAE-based model, SODEC produces informative latents and replaces the iterative denoising process with single-step decoding, improving efficiency and output quality.
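A hypothetical sketch of the single-step decoding path, assuming the dequantized latent is treated as a noisy latent at one fixed noise level and denoised in a single pass before a fidelity-oriented pixel decoder; the module names and the fixed timestep are assumptions:

```python
# Sketch of single-step diffusion decoding for compression: no sampling loop,
# just one denoising pass at a fixed timestep. Interfaces are hypothetical.
import torch

def sodec_decode(denoiser, vae_decoder, z_compressed, t_fixed=0.5):
    # Treat the dequantized latent as a noisy latent at a fixed noise level.
    t = torch.full((z_compressed.size(0),), t_fixed, device=z_compressed.device)
    z_clean = denoiser(z_compressed, t)  # one-step prediction of the clean latent
    return vae_decoder(z_clean)          # fidelity-oriented pixel decoder
```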