Activation-Guided Consensus Merging for Large Language Models
Positive | Artificial Intelligence
Recent research has sought to reconcile the deliberate, step-by-step reasoning of System 2 models with the fast, low-cost inference of System 1 models. Existing training-based and prompt-based approaches struggle with both efficiency and stability. Model merging has emerged as an alternative strategy that integrates the capabilities of different Large Language Models (LLMs) into a single unified model. The proposed Activation-Guided Consensus Merging (ACM) framework sets a separate merging coefficient for each layer based on the mutual information between the activations of the pre-trained and fine-tuned models. This preserves task-specific capabilities without requiring gradient computations.
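The idea of layer-specific coefficients driven by activation mutual information can be illustrated with a rough sketch. Everything below is an assumption made for illustration, not the ACM authors' implementation: the function names are hypothetical, mutual information is estimated with a simple 2-D histogram, and the mapping from mutual information to a coefficient (`1 / (1 + MI)`, so that layers whose activations changed more lean further toward the fine-tuned weights) is one plausible choice that the summary does not specify. Only forward-pass activations are used, so no gradients are computed.

```python
import numpy as np


def mutual_information(x, y, bins=16):
    """Histogram (plug-in) estimate of mutual information, in nats,
    between two 1-D samples of equal length."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))


def layer_coefficients(pre_acts, ft_acts, bins=16):
    """Return one merging coefficient per layer.

    pre_acts / ft_acts: lists of activation arrays (one per layer) recorded
    from the pre-trained and fine-tuned models on the same inputs.
    ASSUMED mapping: coefficient = 1 / (1 + MI), so low MI gives a
    coefficient near 1 and high MI gives a smaller coefficient.
    """
    coeffs = []
    for a_pre, a_ft in zip(pre_acts, ft_acts):
        mi = mutual_information(a_pre.ravel(), a_ft.ravel(), bins=bins)
        coeffs.append(1.0 / (1.0 + mi))
    return coeffs


def merge_layers(pre_weights, ft_weights, coeffs):
    """Interpolate each layer from pre-trained toward fine-tuned weights."""
    return [
        w_pre + c * (w_ft - w_pre)
        for w_pre, w_ft, c in zip(pre_weights, ft_weights, coeffs)
    ]
```

In practice the activations would be collected by running both models on a small shared calibration set. The histogram estimator is used here only because it is short and dependency-free; a different estimator or coefficient mapping could be swapped in without changing the overall structure.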
— via World Pulse Now AI Editorial System