Towards Synthesizing High-Dimensional Tabular Data with Limited Samples

arXiv (cs.LG), Wednesday, November 12, 2025 at 5:00:00 AM
CtrTab is a condition-controlled diffusion model for synthesizing high-dimensional tabular data, aimed at a known weakness of existing diffusion-based models. When data dimensionality grows and training samples are scarce, those models tend to degrade and can perform worse than simpler, non-diffusion-based approaches, because a small sample in a high-dimensional space gives the model too little signal to capture the distribution accurately. CtrTab addresses this by feeding perturbed ground-truth samples to the model as auxiliary control inputs during training, which stabilizes learning and regularizes how sensitive the model is to the control signal. The authors report that CtrTab outperforms state-of-the-art models across multiple datasets, with an average accuracy gap of over 90%. The result is relevant to machine learning and data science applications where high-dimensional tables are common but hard to synthesize well.
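The training idea described above can be sketched in a few lines. This is only an illustrative toy under stated assumptions, not the authors' implementation: the function names, the Gaussian perturbation, the `noise_scale` parameter, and the elementwise linear "denoiser" are all hypothetical stand-ins chosen to keep the example self-contained.

```python
import math
import random


def make_control(row, noise_scale, rng):
    """Perturb a ground-truth row with Gaussian noise to form the control input.

    Assumption: Gaussian perturbation with a single scale; the paper may differ.
    """
    return [v + noise_scale * rng.gauss(0.0, 1.0) for v in row]


def forward_noise(row, alpha_bar_t, rng):
    """Standard diffusion forward step: mix the clean row with noise eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in row]
    a = math.sqrt(alpha_bar_t)
    b = math.sqrt(1.0 - alpha_bar_t)
    x_t = [a * v + b * e for v, e in zip(row, eps)]
    return x_t, eps


def toy_denoiser(x_t, control, w_x, w_c):
    """Hypothetical stand-in for the network: a per-feature linear map
    that sees both the noisy row and the control input."""
    return [w_x * x + w_c * c for x, c in zip(x_t, control)]


def training_loss(rows, w_x, w_c, alpha_bar_t, noise_scale, rng):
    """Mean squared error between predicted and true noise over all features."""
    total, count = 0.0, 0
    for row in rows:
        control = make_control(row, noise_scale, rng)
        x_t, eps = forward_noise(row, alpha_bar_t, rng)
        eps_hat = toy_denoiser(x_t, control, w_x, w_c)
        total += sum((p - e) ** 2 for p, e in zip(eps_hat, eps))
        count += len(row)
    return total / count


# Few rows, many columns: the low-sample, high-dimensional setting.
rng = random.Random(0)
rows = [[rng.gauss(0.0, 1.0) for _ in range(100)] for _ in range(8)]
loss = training_loss(rows, w_x=0.0, w_c=0.0, alpha_bar_t=0.5,
                     noise_scale=0.1, rng=rng)
```

In this toy, setting `noise_scale=0` hands the denoiser the exact clean row, so the noise can be recovered algebraically and the task becomes trivial. Perturbing the control keeps it informative without letting the model simply copy it.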
— via World Pulse Now AI Editorial System
