MAC: An Efficient Gradient Preconditioning using Mean Activation Approximated Curvature
Artificial Intelligence
On November 12, 2025, researchers introduced MAC, an optimization method that speeds up neural network training by efficiently approximating the Fisher information matrix (FIM). The algorithm addresses the computational cost of second-order methods such as KFAC, which improve convergence by exploiting curvature information but are often resource-intensive. MAC stands out as the first algorithm to apply Kronecker factorization specifically to the FIM of attention layers in transformers, integrating attention scores into its preconditioning process. The study demonstrates that MAC not only converges to global minima under specific conditions but also significantly outperforms KFAC and other state-of-the-art methods in accuracy, training time, and memory usage. This advancement matters for the ongoing development of AI technologies, as it promises to streamline the training process and enhance the performance of neural networks.
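The paper itself is the authoritative description of the algorithm; the sketch below is only a rough illustration of one plausible reading of the name "Mean Activation Approximated Curvature". It shows a KFAC-style layerwise preconditioner for a single linear layer in which the activation second-moment factor E[a aᵀ] is replaced by the outer product of the batch-mean activation, so its inverse comes cheaply in closed form via the Sherman–Morrison identity. The function names, damping constant, and the rank-one approximation itself are assumptions made for illustration, and the attention-score integration described above is not modeled here.

```python
# Hypothetical sketch (not the authors' code): KFAC vs. a "mean activation"
# approximation of the activation factor for one linear layer.
import numpy as np

def kfac_precondition(grad_W, acts, grads_out, damping=1e-3):
    """KFAC-style update G^{-1} grad_W A^{-1}, where
    A = E[a a^T] (input activations), G = E[g g^T] (output gradients)."""
    n = acts.shape[0]
    A = acts.T @ acts / n + damping * np.eye(acts.shape[1])
    G = grads_out.T @ grads_out / n + damping * np.eye(grads_out.shape[1])
    return np.linalg.solve(G, grad_W) @ np.linalg.inv(A)

def mac_precondition(grad_W, acts, grads_out, damping=1e-3):
    """MAC-style sketch (assumption): approximate A by the rank-one matrix
    a_bar a_bar^T + damping*I built from the mean activation a_bar, and invert
    it in closed form via Sherman-Morrison (O(d) storage instead of O(d^2))."""
    n = acts.shape[0]
    a_bar = acts.mean(axis=0)  # mean activation, shape (d_in,)
    G = grads_out.T @ grads_out / n + damping * np.eye(grads_out.shape[1])
    # (damping*I + a_bar a_bar^T)^{-1}
    #   = (I - a_bar a_bar^T / (damping + ||a_bar||^2)) / damping
    coef = 1.0 / (damping + a_bar @ a_bar)
    right = (grad_W - np.outer(grad_W @ a_bar, a_bar) * coef) / damping
    return np.linalg.solve(G, right)

# Toy usage: a batch of layer inputs and output gradients for one linear layer.
rng = np.random.default_rng(0)
acts = rng.normal(size=(32, 16))       # layer inputs, batch of 32
grads_out = rng.normal(size=(32, 8))   # gradients w.r.t. layer outputs
grad_W = grads_out.T @ acts / 32       # averaged weight gradient, shape (8, 16)
print(mac_precondition(grad_W, acts, grads_out).shape)  # (8, 16)
```

The design point the sketch tries to convey is the one the article highlights: replacing a full d×d covariance with a single mean-activation vector removes the need to store and invert the activation factor, which is where the reported savings in training time and memory would come from.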
— via World Pulse Now AI Editorial System
