Low-Rank Curvature for Zeroth-Order Optimization in LLM Fine-Tuning
Positive · Artificial Intelligence
The recent introduction of LOREN, a curvature-aware zeroth-order optimization method, marks a notable advance in fine-tuning large language models (LLMs). Traditional zeroth-order methods often suffer from high-variance gradient estimates and poorly aligned search directions, which slow convergence and limit final accuracy. LOREN addresses these challenges by reformulating gradient preconditioning and employing a low-rank block-diagonal preconditioner, yielding improved accuracy and faster convergence. In reported experiments, LOREN outperforms state-of-the-art zeroth-order baselines while reducing peak memory usage by up to 27.3% compared to MeZO-Adam. By improving both accuracy and memory efficiency, this work makes zeroth-order fine-tuning of LLMs more practical.
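To make the idea of preconditioned zeroth-order optimization concrete, the sketch below shows a two-point SPSA-style gradient estimate combined with a simple diagonal second-moment preconditioner on a toy objective. This is only an illustration of the general technique, not LOREN's algorithm: LOREN uses a low-rank block-diagonal preconditioner, whereas the per-coordinate scaling here is a stand-in. All function names and hyperparameters are hypothetical.

```python
# Minimal sketch of a preconditioned zeroth-order (SPSA-style) update.
# Illustrative only; not the LOREN implementation described in the paper.
import numpy as np

def loss(theta):
    """Toy ill-conditioned quadratic standing in for a fine-tuning objective."""
    A = np.diag(np.linspace(1.0, 100.0, theta.size))
    return 0.5 * theta @ A @ theta

def zo_grad_estimate(loss_fn, theta, eps=1e-3, rng=None):
    """Two-point SPSA estimate: perturb along a random direction z and
    scale z by the finite difference of the loss."""
    rng = rng if rng is not None else np.random.default_rng()
    z = rng.standard_normal(theta.shape)
    proj = (loss_fn(theta + eps * z) - loss_fn(theta - eps * z)) / (2 * eps)
    return proj * z

def precond_zo_sgd(loss_fn, theta, lr=0.05, steps=500, beta2=0.99, delta=1e-8):
    """Zeroth-order SGD with a diagonal second-moment preconditioner
    (Adam-like). A low-rank block-diagonal preconditioner, as in LOREN,
    would replace the per-coordinate running estimate `v` below."""
    v = np.zeros_like(theta)              # running second-moment estimate
    rng = np.random.default_rng(0)
    for t in range(1, steps + 1):
        g = zo_grad_estimate(loss_fn, theta, rng=rng)
        v = beta2 * v + (1 - beta2) * g * g
        theta = theta - lr * g / (np.sqrt(v / (1 - beta2**t)) + delta)
    return theta

theta0 = np.ones(16)
theta_star = precond_zo_sgd(loss, theta0)
print("final loss:", loss(theta_star))
```

Because the gradient estimate is built from forward passes alone, no backward pass or optimizer state proportional to a full gradient is needed, which is the source of the memory savings that zeroth-order methods target.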
— via World Pulse Now AI Editorial System
