CoGraM: Context-sensitive granular optimization method with rollback for robust model fusion

arXiv — cs.LG · Thursday, December 4, 2025 at 5:00:00 AM
  • CoGraM (Contextual Granular Merging) is a newly introduced optimization method designed to enhance the merging of neural networks without retraining, addressing issues of accuracy and stability that are prevalent in existing methods like Fisher merging. This multi-stage, context-sensitive approach utilizes rollback mechanisms to prevent harmful updates, thereby improving the robustness of the merged network.
  • The introduction of CoGraM is significant for federated and distributed learning, where separately trained networks must be merged effectively. By accepting or rolling back each merge decision based on loss differences measured against thresholds (a minimal sketch of this rule appears after this summary), CoGraM aims to maintain high accuracy in collaborative learning environments, which is crucial for the advancement of AI technologies.
  • This development reflects a growing trend in AI research focused on optimizing federated learning processes, particularly in addressing communication overhead and ensuring data privacy. As various innovative methods emerge, such as CG-FKAN and FedAdamW, the emphasis on enhancing model performance while managing data heterogeneity and local overfitting continues to shape the landscape of machine learning.
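For orientation, here is a minimal sketch of a threshold-and-rollback merge step in the spirit of the summary above: each candidate parameter tensor is interpolated between the two networks and kept only if validation loss does not worsen beyond a tolerance, otherwise it is rolled back. The names merge_with_rollback, loss_fn, tau, and alpha are illustrative assumptions introduced for this example; this is not CoGraM's actual procedure.

```python
import copy
import torch

def merge_with_rollback(model_a, model_b, loss_fn, val_batch, tau=1e-3, alpha=0.5):
    """Illustrative threshold-and-rollback merge: interpolate each parameter
    tensor, keep the update only if validation loss does not worsen by more
    than `tau`, otherwise roll that tensor back. Not CoGraM itself."""
    merged = copy.deepcopy(model_a)
    baseline = loss_fn(merged, val_batch)
    for p_m, p_b in zip(merged.parameters(), model_b.parameters()):
        backup = p_m.detach().clone()
        with torch.no_grad():
            p_m.copy_(alpha * backup + (1.0 - alpha) * p_b.detach())
        new_loss = loss_fn(merged, val_batch)
        if new_loss - baseline > tau:
            # Harmful update: restore the previous parameter values.
            with torch.no_grad():
                p_m.copy_(backup)
        else:
            # Accept the update and use the new loss as the reference.
            baseline = new_loss
    return merged
```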
— via World Pulse Now AI Editorial System


Continue Reading
RapidUn: Influence-Driven Parameter Reweighting for Efficient Large Language Model Unlearning
Positive · Artificial Intelligence
A new framework called RapidUn has been introduced to address the challenges of unlearning specific data influences in large language models (LLMs). This method utilizes an influence-driven approach to selectively update parameters, achieving significant efficiency improvements over traditional retraining methods, particularly on models like Mistral-7B and Llama-3-8B.
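As a rough illustration of influence-driven parameter reweighting (not RapidUn's actual algorithm), the sketch below performs gradient ascent on a forget-set loss and scales each parameter's step by a precomputed per-parameter influence weight. The names influence_weighted_unlearn_step, forget_batch, and influence are assumptions introduced for the example.

```python
import torch

def influence_weighted_unlearn_step(model, forget_batch, loss_fn, influence, lr=1e-5):
    """Illustrative influence-driven update: gradient ascent on the forget-set
    loss, with each parameter's step scaled by a precomputed influence weight
    in [0, 1]. `influence` maps parameter names to tensors of matching shape."""
    model.zero_grad()
    loss = loss_fn(model(forget_batch["inputs"]), forget_batch["labels"])
    loss.backward()
    with torch.no_grad():
        for name, p in model.named_parameters():
            if p.grad is None:
                continue
            w = influence.get(name, torch.zeros_like(p))
            p.add_(lr * w * p.grad)  # ascend only where influence is high
    return loss.item()
```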
Why Rectified Power Unit Networks Fail and How to Improve It: An Effective Field Theory Perspective
Positive · Artificial Intelligence
The introduction of the Modified Rectified Power Unit (MRePU) activation function addresses critical issues faced by deep Rectified Power Unit (RePU) networks, such as instability during training due to vanishing or exploding values. This new function retains the advantages of differentiability and universal approximation while ensuring stable training conditions, as demonstrated through extensive theoretical analysis and experiments.
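The paper's MRePU definition is not reproduced here, but the failure mode of the plain RePU it addresses is easy to demonstrate: RePU(x) = max(0, x)^p, and stacking such powers drives magnitudes below 1 toward zero and magnitudes above 1 toward infinity. The short sketch below only illustrates that instability.

```python
import torch

def repu(x, p=2):
    """Rectified Power Unit: max(0, x) ** p."""
    return torch.clamp(x, min=0.0) ** p

# Repeatedly applying RePU shows the vanishing/exploding behaviour attributed
# to the unmodified activation: values below 1 collapse, values above 1 blow up.
x_small, x_large = torch.tensor(0.9), torch.tensor(1.1)
for depth in range(1, 8):
    x_small, x_large = repu(x_small), repu(x_large)
    print(depth, x_small.item(), x_large.item())
```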
Learning to Solve Constrained Bilevel Control Co-Design Problems
Neutral · Artificial Intelligence
A new framework for Learning to Optimize (L2O) has been proposed to address the challenges of solving constrained bilevel control co-design problems, which are often complex and time-sensitive. This framework utilizes modern differentiation techniques to enhance the efficiency of finding solutions to these optimization problems.
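The framework itself is not spelled out in this summary, so the sketch below shows one common modern differentiation technique for bilevel problems: unrolling an inner gradient-descent solve with create_graph=True so the outer (design) variables receive gradients through it. The toy objective, step counts, and constraint-free setup are illustrative assumptions, not the paper's formulation.

```python
import torch

def inner_solve(design, x0, inner_loss, steps=20, lr=0.1):
    """Unrolled inner optimization: gradient descent on the lower-level
    objective, kept differentiable w.r.t. the upper-level design variable."""
    x = x0
    for _ in range(steps):
        g, = torch.autograd.grad(inner_loss(design, x), x, create_graph=True)
        x = x - lr * g
    return x

# Toy co-design: choose `design` so the inner optimum of (x - design)^2
# lands near a target value of 3.0.
design = torch.tensor(0.0, requires_grad=True)
opt = torch.optim.Adam([design], lr=0.05)
inner_loss = lambda d, x: (x - d) ** 2
for _ in range(100):
    x0 = torch.tensor(5.0, requires_grad=True)
    x_star = inner_solve(design, x0, inner_loss)
    outer_loss = (x_star - 3.0) ** 2  # upper-level objective
    opt.zero_grad()
    outer_loss.backward()
    opt.step()
```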
Comparison of neural network training strategies for the simulation of dynamical systems
Positive · Artificial Intelligence
A recent study has compared two neural network training strategies—parallel and series-parallel training—specifically for simulating nonlinear dynamical systems. The empirical analysis involved five neural network architectures and practical examples, including a pneumatic valve test bench and an industrial robot benchmark. The findings indicate that while series-parallel training is prevalent, parallel training offers superior long-term prediction accuracy.
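The two strategies compared in the study correspond to a standard distinction, sketched below under the assumption of a generic one-step model that takes the current input and the previous output: series-parallel training feeds the measured previous output (one-step-ahead prediction), while parallel training feeds the model's own previous prediction (free-run simulation), which is why it better reflects long-horizon accuracy.

```python
import torch

def series_parallel_loss(model, u, y):
    """One-step-ahead (series-parallel) training: the model always receives
    the measured previous output y[t-1] alongside the input u[t]."""
    preds = model(torch.stack([u[1:], y[:-1]], dim=-1))
    return torch.mean((preds.squeeze(-1) - y[1:]) ** 2)

def parallel_loss(model, u, y):
    """Free-run (parallel) training: the model receives its own previous
    prediction, so errors compound over the horizon as in simulation."""
    y_hat, loss = y[0], 0.0
    for t in range(1, len(y)):
        y_hat = model(torch.stack([u[t], y_hat]).unsqueeze(0)).squeeze()
        loss = loss + (y_hat - y[t]) ** 2
    return loss / (len(y) - 1)
```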
Mixed precision accumulation for neural network inference guided by componentwise forward error analysis
Positive · Artificial Intelligence
A new study proposes a mixed precision accumulation strategy for neural network inference, utilizing a componentwise forward error analysis to optimize error propagation in linear layers. This method suggests that the precision of each output component should be inversely proportional to the condition numbers of the weights and activation functions involved, potentially enhancing computational efficiency.
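A simplified reading of the stated rule can be sketched as follows: compute a componentwise condition number kappa_i = (|W||x|)_i / |(Wx)_i| for each output of a linear layer and accumulate well-conditioned components in lower precision. The threshold and the fp16/fp32 split below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def componentwise_condition(W, x, eps=1e-12):
    """kappa_i = (|W| @ |x|)_i / |(W @ x)_i|: how much componentwise rounding
    errors in the accumulation can be amplified in output component i."""
    y = W @ x
    return (np.abs(W) @ np.abs(x)) / (np.abs(y) + eps)

def choose_accumulation_precision(W, x, threshold=1e3):
    """Toy rule in the spirit of the summary: well-conditioned components are
    accumulated in low precision, badly conditioned ones in higher precision.
    The threshold is an illustrative assumption."""
    kappa = componentwise_condition(W, x)
    return np.where(kappa < threshold, "fp16", "fp32")
```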
Projecting Assumptions: The Duality Between Sparse Autoencoders and Concept Geometry
Neutral · Artificial Intelligence
Sparse Autoencoders (SAEs) have been analyzed to determine their effectiveness in uncovering meaningful concepts within neural network representations. A unified framework has been introduced, framing SAEs as solutions to a bilevel optimization problem, which highlights the inherent biases in concept detection based on the structural assumptions of different SAE architectures.
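For readers unfamiliar with the object being analyzed, here is a minimal standard sparse autoencoder (overcomplete linear encoder, ReLU, linear decoder, L1 penalty). The choices made at exactly these points, such as tied weights or top-k versus L1 sparsity, are the structural assumptions whose biases the paper's bilevel framing makes explicit. The class below is a generic sketch, not the paper's model.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Standard SAE on model activations: overcomplete linear encoder with a
    ReLU nonlinearity, linear decoder, and an L1 sparsity penalty."""
    def __init__(self, d_model, d_dict):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_dict)
        self.decoder = nn.Linear(d_dict, d_model, bias=False)

    def forward(self, x):
        z = torch.relu(self.encoder(x))
        return self.decoder(z), z

def sae_loss(model, x, l1_coeff=1e-3):
    # Reconstruction error plus sparsity penalty on the dictionary activations.
    x_hat, z = model(x)
    return torch.mean((x_hat - x) ** 2) + l1_coeff * z.abs().mean()
```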
Verifying Closed-Loop Contractivity of Learning-Based Controllers via Partitioning
Positive · Artificial Intelligence
A recent study has introduced a method for verifying closed-loop contractivity in nonlinear control systems using neural networks for both controllers and contraction metrics. This approach employs interval analysis and a domain partitioning strategy to ensure that the dominant eigenvalue of a symmetric Metzler matrix remains nonpositive, which is essential for confirming contractivity. The method was validated on an inverted pendulum system, showcasing its effectiveness in training neural network controllers.
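The verification pipeline is not reproduced here, but the outer partition-and-check loop it relies on can be sketched generically: split the state domain into boxes, obtain on each box a symmetric matrix whose largest eigenvalue upper-bounds the contraction condition (via interval analysis, abstracted here as a user-supplied bound_matrix callable), and bisect any box that cannot be certified. The function names and the bisection heuristic are assumptions for illustration.

```python
import numpy as np

def verify_by_partitioning(box, bound_matrix, max_depth=12):
    """Illustrative partition-and-check loop. `box` is a list of (lo, hi)
    intervals; `bound_matrix(box)` must return a symmetric matrix whose
    largest eigenvalue upper-bounds the contraction condition on that box.
    Returns True only if every leaf box certifies a nonpositive eigenvalue."""
    stack = [(box, 0)]
    while stack:
        b, depth = stack.pop()
        lam_max = np.linalg.eigvalsh(bound_matrix(b)).max()
        if lam_max <= 0.0:
            continue  # this region is certified
        if depth >= max_depth:
            return False  # could not certify within the splitting budget
        # Bisect the widest dimension and re-check both halves.
        i = max(range(len(b)), key=lambda k: b[k][1] - b[k][0])
        lo, hi = b[i]
        mid = 0.5 * (lo + hi)
        for half in ((lo, mid), (mid, hi)):
            child = list(b)
            child[i] = half
            stack.append((child, depth + 1))
    return True
```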
Using physics-inspired Singular Learning Theory to understand grokking & other phase transitions in modern neural networks
Positive · Artificial Intelligence
A recent study has applied Singular Learning Theory (SLT), a physics-inspired framework, to better understand the complexities of modern neural networks, particularly focusing on phenomena like grokking and phase transitions. The research empirically tests SLT in various toy models, including a grokking modulo-arithmetic model and Anthropic's Toy Models of Superposition, to explore the scaling of learning coefficients with problem difficulty.
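The SLT measurements themselves are beyond a short sketch, but the grokking toy task the summary mentions is standard: train a small network on modular addition using only a fraction of all pairs, and watch held-out accuracy jump long after training accuracy saturates. The data-preparation sketch below, with an assumed modulus and train fraction, sets up that task.

```python
import torch

def modular_addition_dataset(p=113, train_frac=0.3, seed=0):
    """All pairs (a, b) with label (a + b) mod p, split into a small train set
    and a large held-out set; grokking appears as a late jump in test accuracy."""
    pairs = torch.cartesian_prod(torch.arange(p), torch.arange(p))
    labels = (pairs[:, 0] + pairs[:, 1]) % p
    g = torch.Generator().manual_seed(seed)
    perm = torch.randperm(len(pairs), generator=g)
    n_train = int(train_frac * len(pairs))
    tr, te = perm[:n_train], perm[n_train:]
    return (pairs[tr], labels[tr]), (pairs[te], labels[te])
```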