Arc Gradient Descent: A Mathematically Derived Reformulation of Gradient Descent with Phase-Aware, User-Controlled Step Dynamics

arXiv — cs.LG · Tuesday, December 9, 2025 at 5:00:00 AM
  • The paper introduces Arc Gradient Descent (ArcGD), a new optimizer that reformulates traditional gradient descent to incorporate phase-aware, user-controlled step dynamics. In the reported evaluation, ArcGD outperforms the Adam optimizer both on a challenging non-convex benchmark (the Rosenbrock function) and on a real-world ML task (CIFAR-10 image classification); a minimal benchmark sketch in the spirit of that comparison appears after this summary.
  • This development is significant as it highlights the potential for improved optimization techniques in machine learning, particularly in complex, high-dimensional spaces. By addressing learning-rate biases, ArcGD offers a promising alternative to existing optimizers, potentially enhancing model training efficiency and effectiveness.
  • The introduction of ArcGD aligns with ongoing advancements in optimization algorithms, where researchers are exploring various methods to bridge the performance gap between adaptive and non-adaptive optimizers. This trend reflects a broader effort to refine deep-learning training: recent studies continue to weigh the limitations and advantages of popular methods such as Adam and SGD, underscoring an active line of research on convergence rates and stability.
— via World Pulse Now AI Editorial System
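
The summary does not give ArcGD's update rule, so the sketch below (a minimal, assumed setup, not the paper's code) only reproduces the kind of comparison described: minimizing the Rosenbrock function with Adam as the baseline. ArcGD would be plugged in where the Adam optimizer is constructed.

```python
# Hedged sketch of the non-convex benchmark named in the summary: minimize the 2-D
# Rosenbrock function with Adam as the baseline optimizer. ArcGD is not implemented
# here because its update rule is not given in this summary; it would replace the
# torch.optim.Adam line below.
import torch

def rosenbrock(xy: torch.Tensor) -> torch.Tensor:
    """Classic non-convex benchmark with a narrow curved valley; global minimum at (1, 1)."""
    x, y = xy[0], xy[1]
    return (1.0 - x) ** 2 + 100.0 * (y - x ** 2) ** 2

params = torch.tensor([-1.5, 2.0], requires_grad=True)   # a standard hard starting point
opt = torch.optim.Adam([params], lr=1e-2)                # baseline; swap in ArcGD here

for step in range(5000):
    opt.zero_grad()
    loss = rosenbrock(params)
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        print(f"step {step:5d}  loss {loss.item():.6f}  params {params.detach().tolist()}")
```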


Continue Reading
PrunedCaps: A Case For Primary Capsules Discrimination
Positive · Artificial Intelligence
A recent study has introduced a pruned version of Capsule Networks (CapsNets), demonstrating that it can operate up to 9.90 times faster than traditional architectures by eliminating 95% of Primary Capsules while maintaining accuracy across various datasets, including MNIST and CIFAR-10.
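The blurb does not state the paper's discrimination criterion, so the sketch below assumes a generic score (mean activation length over a calibration batch) and simply keeps the top 5% of primary capsules, mirroring the 95% elimination figure. It is an illustration of the pruning step, not PrunedCaps itself.

```python
# Hedged sketch: keep only the most "discriminative" 5% of primary capsules, mirroring
# the 95% elimination figure in the summary. The scoring function (mean activation
# length over a calibration batch) is an assumption; the paper's criterion may differ.
import torch

def select_primary_capsules(capsule_outputs: torch.Tensor, keep_ratio: float = 0.05) -> torch.Tensor:
    """capsule_outputs: (batch, num_capsules, capsule_dim) activations from a calibration pass.
    Returns the indices of the capsules to keep."""
    lengths = capsule_outputs.norm(dim=-1)       # (batch, num_capsules) activation magnitudes
    scores = lengths.mean(dim=0)                 # per-capsule mean activation (assumed score)
    k = max(1, int(keep_ratio * scores.numel()))
    return torch.topk(scores, k).indices         # capsules retained by the pruned model

# Example with random stand-in activations (1152 primary capsules of dim 8, as in classic CapsNets).
caps = torch.rand(64, 1152, 8)
kept = select_primary_capsules(caps)
print(f"kept {kept.numel()} of {caps.shape[1]} primary capsules")
```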
Adaptive Dataset Quantization: A New Direction for Dataset Pruning
Positive · Artificial Intelligence
A new paper introduces an innovative dataset quantization method aimed at reducing storage and communication costs for large-scale datasets on resource-constrained edge devices. This approach focuses on compressing individual samples by minimizing intra-sample redundancy while retaining essential features, marking a shift from traditional inter-sample redundancy methods.
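The paper's actual method is not described in this blurb. As a loose stand-in for "compressing individual samples by reducing intra-sample redundancy", the sketch below quantizes each image's pixels to a small per-image codebook (k-means color quantization), so every sample is stored as a codebook plus low-bit indices; the choice of k-means and the 16-level codebook are assumptions for illustration only.

```python
# Hedged illustration only: per-sample compression via a small per-image codebook,
# as a stand-in for intra-sample redundancy reduction. Not the paper's method.
import numpy as np
from sklearn.cluster import KMeans

def quantize_sample(image: np.ndarray, levels: int = 16):
    """image: (H, W, 3) uint8. Returns (codebook, indices), a compressed per-sample form."""
    pixels = image.reshape(-1, 3).astype(np.float32)
    km = KMeans(n_clusters=levels, n_init=4, random_state=0).fit(pixels)
    codebook = km.cluster_centers_.astype(np.uint8)                   # levels x 3 colors
    indices = km.labels_.astype(np.uint8).reshape(image.shape[:2])    # low-bit index map
    return codebook, indices

def dequantize_sample(codebook: np.ndarray, indices: np.ndarray) -> np.ndarray:
    return codebook[indices]                                          # reconstructed (H, W, 3) image

img = (np.random.rand(32, 32, 3) * 255).astype(np.uint8)              # CIFAR-sized stand-in sample
cb, idx = quantize_sample(img)
print("per-sample storage:", cb.nbytes + idx.nbytes, "bytes vs", img.nbytes, "bytes uncompressed")
```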
Causal Interpretability for Adversarial Robustness: A Hybrid Generative Classification Approach
Neutral · Artificial Intelligence
A new study presents a hybrid generative classification approach aimed at enhancing adversarial robustness in deep learning models. The proposed deep ensemble model integrates a pre-trained discriminative network for feature extraction with a generative classification network, achieving high accuracy and robustness against adversarial attacks without the need for adversarial training. Extensive experiments on CIFAR-10 and CIFAR-100 validate its effectiveness.
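The blurb specifies the general recipe (a frozen, pre-trained discriminative network feeds a generative classification network) but not the generative head itself. The sketch below assumes a per-class diagonal-Gaussian model over backbone features and classifies by class-conditional log-likelihood; the Gaussian head and the random stand-in features are assumptions.

```python
# Hedged sketch of the recipe in the blurb: frozen discriminative features + a generative
# classifier over those features. The diagonal-Gaussian head is an assumption, not the
# paper's specific generative network.
import torch

class GaussianGenerativeHead:
    """Fits one diagonal Gaussian per class on feature vectors; predicts by max log-likelihood."""
    def fit(self, feats: torch.Tensor, labels: torch.Tensor):
        self.classes = labels.unique()
        self.means = torch.stack([feats[labels == c].mean(0) for c in self.classes])
        self.vars = torch.stack([feats[labels == c].var(0) + 1e-4 for c in self.classes])
        return self

    def predict(self, feats: torch.Tensor) -> torch.Tensor:
        # log N(x; mu_c, diag(var_c)) per class, up to an additive constant
        diff = feats[:, None, :] - self.means[None, :, :]
        logp = -0.5 * ((diff ** 2) / self.vars + self.vars.log()).sum(-1)
        return self.classes[logp.argmax(dim=1)]

# Stand-in for features from a frozen backbone (e.g., a pre-trained ResNet's penultimate layer).
feats = torch.randn(1000, 512)
labels = torch.randint(0, 10, (1000,))
head = GaussianGenerativeHead().fit(feats, labels)
print("train accuracy:", (head.predict(feats) == labels).float().mean().item())
```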
Structured Initialization for Vision Transformers
Positive · Artificial Intelligence
A new study proposes a structured initialization method for Vision Transformers (ViTs), aiming to integrate the strong inductive biases of Convolutional Neural Networks (CNNs) without altering the architecture. This approach is designed to enhance performance on small datasets while maintaining scalability as data increases.
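The precise initialization is not given in this blurb, so the sketch below shows one plausible flavor of the idea rather than the paper's scheme: a Gaussian locality bias over patch-token positions that, added to attention logits at initialization, makes attention start out local like a small convolution kernel. The grid size, bandwidth, and the additive-logit-bias mechanism are all assumptions.

```python
# Hedged sketch: a convolution-like locality prior for attention at initialization,
# expressed as an additive bias on attention logits. Not the paper's actual scheme.
import torch

def local_attention_bias(grid: int = 14, bandwidth: float = 1.5) -> torch.Tensor:
    """Returns an (N, N) attention-logit bias, N = grid*grid patch tokens, peaked on
    spatially nearby patches (a soft analogue of a small conv kernel)."""
    ys, xs = torch.meshgrid(torch.arange(grid), torch.arange(grid), indexing="ij")
    coords = torch.stack([ys.flatten(), xs.flatten()], dim=1).float()   # (N, 2) patch positions
    dist2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)    # squared grid distances
    return -dist2 / (2.0 * bandwidth ** 2)                              # large for nearby patches

bias = local_attention_bias()
attn_logits = torch.randn(bias.shape)                # stand-in for q @ k.T / sqrt(d) at init
attn = torch.softmax(attn_logits + bias, dim=-1)     # attention starts out local, conv-like
print("attention mass on token 0's nine largest entries:", attn[0].topk(9).values.sum().item())
```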
Stochastic Approximation with Block Coordinate Optimal Stepsizes
Neutral · Artificial Intelligence
The recent study on stochastic approximation with block-coordinate optimal stepsizes introduces adaptive stepsize rules designed to minimize the expected distance from an unknown target point. These rules utilize online estimates of the second moment of the search direction, leading to a new method that competes effectively with the widely used Adam algorithm while requiring less memory and fewer hyper-parameters.
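The paper's optimal-stepsize formula is not reproduced in this blurb, so the sketch below only shows the general shape it describes: one adaptive stepsize per parameter block, driven by an online estimate of the second moment of that block's search direction. The EMA-and-rescale rule is an assumption; the memory point, however, carries over directly: one scalar per block instead of Adam's two full-size moment tensors.

```python
# Hedged sketch: per-block adaptive stepsizes from an online (EMA) estimate of the
# block's gradient second moment. The specific rule below is an assumption, not the
# paper's optimal-stepsize formula.
import torch

class BlockAdaptiveSGD:
    def __init__(self, params, lr: float = 0.05, beta: float = 0.99, eps: float = 1e-8):
        self.blocks = list(params)                 # each tensor is treated as one block
        self.lr, self.beta, self.eps = lr, beta, eps
        self.second_moment = [torch.zeros((), device=p.device) for p in self.blocks]

    @torch.no_grad()
    def step(self):
        for p, m in zip(self.blocks, self.second_moment):
            if p.grad is None:
                continue
            g2 = p.grad.pow(2).mean()                       # block-level second-moment sample
            if m.item() == 0.0:
                m.copy_(g2)                                 # initialize from the first sample
            else:
                m.mul_(self.beta).add_((1 - self.beta) * g2)  # online EMA estimate
            p.add_(p.grad, alpha=-self.lr / (m.sqrt() + self.eps).item())

# Usage on a toy least-squares problem: minimize ||W x - y||^2.
W = torch.randn(20, 10, requires_grad=True)
opt = BlockAdaptiveSGD([W])
x, y = torch.randn(10), torch.randn(20)
for _ in range(200):
    loss = (W @ x - y).pow(2).mean()
    W.grad = None
    loss.backward()
    opt.step()
print("final loss:", loss.item())
```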
ADAM Optimization with Adaptive Batch Selection
Positive · Artificial Intelligence
The introduction of Adam with Combinatorial Bandit Sampling (AdamCB) enhances the widely used Adam optimizer by integrating combinatorial bandit techniques, allowing for adaptive sample selection during neural network training. This approach addresses the inefficiencies of treating all data samples equally, leading to improved convergence rates and theoretical guarantees over previous methods.
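AdamCB's actual bandit algorithm and its guarantees are in the paper; the sketch below only shows the loop shape the blurb suggests: per-sample weights maintained bandit-style, mini-batches drawn from those weights instead of uniformly, and an ordinary Adam step on the result. The EXP3-flavored weight update and the use of per-sample loss as the reward signal are assumptions.

```python
# Hedged sketch: bandit-weighted batch selection feeding ordinary Adam updates.
# The weight update and reward definition are assumptions, not AdamCB's algorithm.
import torch

def train_with_bandit_sampling(model, X, Y, loss_fn, steps=500, batch=32, eta=0.01, gamma=0.1):
    n = X.shape[0]
    log_w = torch.zeros(n)                                        # bandit weights over samples
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(steps):
        probs = (1 - gamma) * torch.softmax(log_w, 0) + gamma / n # mix in uniform exploration
        idx = torch.multinomial(probs, batch, replacement=False)  # bandit-selected mini-batch
        per_sample = loss_fn(model(X[idx]), Y[idx])               # shape (batch,)
        opt.zero_grad()
        per_sample.mean().backward()
        opt.step()
        # treat importance-weighted loss as reward: up-weight samples that remain informative
        log_w[idx] += eta * per_sample.detach() / (probs[idx] * n)
    return model

model = torch.nn.Linear(10, 2)
X, Y = torch.randn(1000, 10), torch.randint(0, 2, (1000,))
loss_fn = torch.nn.CrossEntropyLoss(reduction="none")
train_with_bandit_sampling(model, X, Y, loss_fn)
```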
Quantization Blindspots: How Model Compression Breaks Backdoor Defenses
Neutral · Artificial Intelligence
A recent study highlights the vulnerabilities of backdoor defenses in neural networks when subjected to post-training quantization, revealing that INT8 quantization leads to a 0% detection rate for all evaluated defenses while attack success rates remain above 99%. This raises concerns about the effectiveness of existing security measures in machine learning systems.
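The study's attacks and defenses are not reproduced here; the sketch below only shows the operation at issue, post-training INT8 quantization with PyTorch's dynamic quantization API, and the practical takeaway that a backdoor scan run on the FP32 model says nothing about the INT8 artifact that actually ships, so the quantized model should be scanned as well.

```python
# Hedged illustration of the operation the study examines: post-training INT8
# quantization of a trained model (dynamic quantization of Linear layers).
import torch
import torch.nn as nn

model_fp32 = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10))
model_fp32.eval()

# Post-training dynamic quantization: weights stored as INT8, activations quantized at runtime.
model_int8 = torch.ao.quantization.quantize_dynamic(
    model_fp32, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(4, 512)
print("fp32 logits:", model_fp32(x)[0, :3])
print("int8 logits:", model_int8(x)[0, :3])   # small numeric drift -- enough, per the study,
                                              # to blind detection while the backdoor survives
```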
FOAM: Blocked State Folding for Memory-Efficient LLM Training
Positive · Artificial Intelligence
The introduction of the Folded Optimizer with Approximate Moment (FOAM) presents a new approach to training large language models (LLMs) by compressing optimizer states through block-wise gradient means and a residual correction mechanism. This method aims to alleviate memory bottlenecks associated with traditional optimizers like Adam, which are often memory-intensive during training.
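FOAM's exact folding and residual-correction mechanism are in the paper; the sketch below only illustrates the memory idea the blurb describes: keep Adam-like moments at block resolution (one value per block of parameters) instead of one per parameter, with the gradient's within-block residual added back to the update. The block size, bias correction, and the way the residual is applied are assumptions.

```python
# Hedged sketch of the memory idea: optimizer moments stored per block of parameters
# rather than per parameter, with a simple residual correction. Not FOAM's algorithm.
import torch

class BlockFoldedAdamSketch:
    def __init__(self, params, lr=1e-3, betas=(0.9, 0.999), eps=1e-8, block=256):
        self.params = list(params)
        self.lr, (self.b1, self.b2), self.eps, self.block = lr, betas, eps, block
        # one scalar per block instead of one per parameter -> roughly block-fold less state memory
        self.m = [torch.zeros((p.numel() + block - 1) // block) for p in self.params]
        self.v = [torch.zeros((p.numel() + block - 1) // block) for p in self.params]
        self.t = 0

    @torch.no_grad()
    def step(self):
        self.t += 1
        for p, m, v in zip(self.params, self.m, self.v):
            if p.grad is None:
                continue
            g = p.grad.flatten()
            pad = m.numel() * self.block - g.numel()
            gb = torch.nn.functional.pad(g, (0, pad)).view(m.numel(), self.block)
            g_mean = gb.mean(dim=1)                               # block-wise gradient means
            m.mul_(self.b1).add_(g_mean, alpha=1 - self.b1)       # folded first moment
            v.mul_(self.b2).add_(g_mean.pow(2), alpha=1 - self.b2)
            m_hat = m / (1 - self.b1 ** self.t)                   # bias-corrected, as in Adam
            v_hat = v / (1 - self.b2 ** self.t)
            # Adam-like step on block means, plus the within-block residual applied directly
            # (the residual-correction form is an assumption).
            update = m_hat[:, None] / (v_hat.sqrt()[:, None] + self.eps) + (gb - g_mean[:, None])
            p.add_(update.flatten()[: g.numel()].view_as(p), alpha=-self.lr)

# Tiny usage on a stand-in parameter tensor, showing the reduced optimizer-state footprint.
W = torch.randn(4096, requires_grad=True)
opt = BlockFoldedAdamSketch([W])
loss = W.pow(2).sum()
loss.backward()
opt.step()
state_values = sum(m.numel() + v.numel() for m, v in zip(opt.m, opt.v))
print("optimizer state:", state_values, "values for", W.numel(), "parameters")
```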