MAC: An Efficient Gradient Preconditioning using Mean Activation Approximated Curvature

arXiv — cs.LG · Wednesday, November 12, 2025 at 5:00:00 AM
On November 12, 2025, researchers introduced MAC, an optimization method that speeds up neural network training by efficiently approximating the Fisher information matrix (FIM). The algorithm addresses the computational cost of second-order optimization methods such as KFAC, which improve convergence by exploiting curvature information but are often resource-intensive. MAC is the first algorithm to apply Kronecker factorization to the FIM of attention layers in transformers, integrating attention scores into its preconditioning. The study shows that MAC converges to global minima under specific conditions and outperforms KFAC and other state-of-the-art methods in accuracy, training time, and memory usage. This advance promises to streamline the training process and enhance the performance of neural networks.
— via World Pulse Now AI Editorial System
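To make the summary concrete, here is a minimal Python sketch of KFAC-style preconditioning for one linear layer, with the activation factor collapsed to a mean-activation statistic in the spirit of MAC's name. The function name, the rank-one approximation A ≈ āāᵀ, and the damping value are assumptions for illustration, not the authors' exact algorithm (which also covers attention layers and attention scores).

```python
import numpy as np

def mac_precondition(grad_W, acts, grad_out, damping=1e-3):
    """Hypothetical sketch of MAC-style preconditioning for a linear layer.

    KFAC approximates the layer's Fisher block as a Kronecker product
    A (x) G, with A built from input activations and G from output-side
    gradients. The sketch replaces the full activation covariance with a
    rank-one term from the mean activation (an assumption based on the
    method's name), which is cheaper to form and store.
    """
    a_bar = acts.mean(axis=0)                      # mean input activation, shape (d_in,)
    A = np.outer(a_bar, a_bar)                     # rank-one activation factor, (d_in, d_in)
    G = grad_out.T @ grad_out / grad_out.shape[0]  # output-gradient factor, (d_out, d_out)

    # Damped inverses keep the preconditioner well defined even though
    # the rank-one A is singular on its own.
    A_inv = np.linalg.inv(A + damping * np.eye(A.shape[0]))
    G_inv = np.linalg.inv(G + damping * np.eye(G.shape[0]))

    # (A (x) G)^{-1} vec(grad_W) corresponds to G_inv @ grad_W @ A_inv.
    return G_inv @ grad_W @ A_inv

# Usage on random data for a 4x8 weight matrix:
rng = np.random.default_rng(0)
acts = rng.standard_normal((32, 8))       # batch of 32 input activations
grad_out = rng.standard_normal((32, 4))   # matching output-side gradients
grad_W = grad_out.T @ acts / 32           # raw averaged gradient
update = mac_precondition(grad_W, acts, grad_out)
```

The rank-one activation factor is what drives the reported savings over KFAC: inverting a damped rank-one matrix is far cheaper than maintaining and inverting a full activation covariance per layer.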


Recommended Readings
CLAReSNet: When Convolution Meets Latent Attention for Hyperspectral Image Classification
Positive · Artificial Intelligence
CLAReSNet, a new hybrid architecture for hyperspectral image classification, integrates multi-scale convolutional extraction with transformer-style attention through an adaptive latent bottleneck. This model addresses challenges such as high spectral dimensionality, complex spectral-spatial correlations, and limited training samples with severe class imbalance. By combining convolutional networks and transformers, CLAReSNet aims to enhance classification accuracy and efficiency in hyperspectral imaging applications.
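The adaptive latent bottleneck is the distinctive piece here. Below is a minimal Perceiver-style sketch of that general pattern in PyTorch: a small set of learned latents cross-attends to a long sequence of convolutional features, compressing it to a fixed size. All names, sizes, and design choices are assumptions for illustration, not CLAReSNet's actual implementation.

```python
import torch
import torch.nn as nn

class LatentAttentionBottleneck(nn.Module):
    """Illustrative latent-attention bottleneck (hypothetical, Perceiver-style)."""

    def __init__(self, dim, num_latents=16, num_heads=4):
        super().__init__()
        self.latents = nn.Parameter(torch.randn(num_latents, dim))
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, tokens):
        # tokens: (batch, seq, dim) features from the convolutional stem.
        batch = tokens.shape[0]
        q = self.latents.unsqueeze(0).expand(batch, -1, -1)
        # Latents attend to the full token sequence, yielding a fixed-size
        # summary regardless of the spectral/spatial sequence length.
        out, _ = self.attn(q, tokens, tokens)
        return out  # (batch, num_latents, dim)

# Usage: compress 200 spectral-band tokens to 16 latents.
features = torch.randn(2, 200, 64)
bottleneck = LatentAttentionBottleneck(dim=64)
summary = bottleneck(features)  # shape (2, 16, 64)
```

Routing attention through a small latent set keeps the cost linear in sequence length, which matters when hyperspectral inputs have hundreds of spectral bands.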