PERP: Rethinking the Prune-Retrain Paradigm in the Era of LLMs

arXiv — cs.LG · Wednesday, December 3, 2025 at 5:00:00 AM
  • A recent study titled 'PERP: Rethinking the Prune-Retrain Paradigm in the Era of LLMs' builds on the established observation that neural networks can be compressed through pruning, which reduces storage and compute demands while maintaining performance. Its central finding is that, instead of retraining all parameters after pruning, updating only a small subset of highly expressive parameters can restore or even enhance performance, particularly in large language models (LLMs) such as GPT-style architectures.
  • This development is significant as it allows for the retraining of models with up to 30 billion parameters on a single GPU in minutes, addressing the challenges posed by memory and compute constraints in the era of LLMs. By demonstrating that only 0.01%-0.05% of parameters need retraining, the study offers a more efficient approach to model optimization, potentially transforming practices in AI development.
  • The findings contribute to ongoing discussions about the efficiency of AI models, particularly in the context of large-scale implementations. As traditional methods of pruning and retraining require extensive resources and expert knowledge, the new approach aligns with a growing trend towards more accessible and efficient AI solutions. This shift may influence future research directions and practical applications in various fields, including natural language processing and machine learning.
— via World Pulse Now AI Editorial System
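The prune-then-partially-retrain idea summarized above can be illustrated with a toy numpy sketch: magnitude-prune a weight matrix, then recover part of the lost accuracy by updating only the bias vector while the weights stay frozen. The task, names, and the choice of bias as the retrained subset are illustrative assumptions, not the paper's actual code (the paper works at LLM scale, where the retrained fraction is 0.01%-0.05%).

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 32))
W_true = rng.normal(size=(32, 8))
b_true = rng.normal(size=8)
Y = X @ W_true + b_true

# Dense "trained" model (here simply the exact solution, for brevity).
W, b = W_true.copy(), b_true.copy()

def mse(W, b):
    E = X @ W + b - Y
    return float((E ** 2).mean())

# 1) Magnitude pruning: zero out the 70% smallest-magnitude weights.
thresh = np.quantile(np.abs(W), 0.7)
W_pruned = np.where(np.abs(W) >= thresh, W, 0.0)
loss_pruned = mse(W_pruned, b)

# 2) Retrain ONLY the bias (8 of 264 parameters, ~3% in this toy model)
#    by gradient descent; all weights stay frozen.
lr = 0.5
for _ in range(200):
    grad_b = 2.0 * (X @ W_pruned + b - Y).mean(axis=0)
    b = b - lr * grad_b
loss_retrained = mse(W_pruned, b)

print(loss_pruned, loss_retrained)  # bias-only retraining recovers part of the loss
```

The point of the sketch is proportionality: the pruning damage is partially repaired by touching a tiny, cheap-to-optimize parameter subset, which is why this scales to a single GPU at LLM size.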


Continue Reading
MORPH: PDE Foundation Models with Arbitrary Data Modality
Positive · Artificial Intelligence
MORPH has been introduced as a modality-agnostic, autoregressive foundation model designed for partial differential equations (PDEs), utilizing a convolutional vision transformer backbone to manage diverse spatiotemporal datasets across various resolutions and data modalities. The model incorporates advanced techniques such as component-wise convolution and inter-field cross-attention to enhance its predictive capabilities.
Optimizing Fine-Tuning through Advanced Initialization Strategies for Low-Rank Adaptation
Positive · Artificial Intelligence
Recent advancements in fine-tuning methodologies have led to the introduction of IniLoRA, a novel initialization strategy designed to optimize Low-Rank Adaptation (LoRA) for large language models. IniLoRA initializes low-rank matrices to closely approximate original model weights, addressing limitations in performance seen with traditional LoRA methods. Experimental results demonstrate that IniLoRA outperforms LoRA across various models and tasks, with two additional variants, IniLoRA-$\alpha$ and IniLoRA-$\beta$, further enhancing performance.
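The initialization idea described above can be sketched as choosing low-rank factors whose product approximates the original weight matrix, here via a truncated SVD. The function name and rank are illustrative assumptions; IniLoRA's exact procedure may differ from this sketch.

```python
import numpy as np

def init_lora_like(W: np.ndarray, rank: int):
    """Return B (d_out x r) and A (r x d_in) with B @ A ~= W (truncated SVD)."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    B = U[:, :rank] * S[:rank]   # absorb singular values into B
    A = Vt[:rank, :]
    return B, A

rng = np.random.default_rng(1)
W = rng.normal(size=(64, 32))
B, A = init_lora_like(W, rank=16)

# The rank-16 product is the best rank-16 approximation of W in Frobenius norm.
err_svd = np.linalg.norm(W - B @ A)

# Compare against a standard LoRA-style init of the same rank, which
# zero-initializes one factor so the low-rank product starts at zero.
B_rand = np.zeros((64, 16))
A_rand = rng.normal(size=(16, 32))
err_rand = np.linalg.norm(W - B_rand @ A_rand)
print(err_svd, err_rand)
```

Starting the factors near the original weights, rather than at zero, is the contrast the blurb draws with traditional LoRA initialization.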
Dual LoRA: Enhancing LoRA with Magnitude and Direction Updates
Positive · Artificial Intelligence
A novel method called Dual LoRA has been proposed to enhance the performance of Low-Rank Adaptation (LoRA) in fine-tuning large language models (LLMs). This method introduces two distinct groups within low-rank matrices: a magnitude group for controlling the extent of parameter updates and a direction group for determining the update direction, thereby improving the adaptation process.
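The magnitude/direction split described above can be illustrated as a generic decomposition of an update matrix into unit-norm direction columns and a per-column magnitude scale, so the two groups could be learned or constrained separately. This is only one plausible reading of the decomposition; the paper's exact formulation of the two groups inside the low-rank matrices may differ.

```python
import numpy as np

rng = np.random.default_rng(2)
delta = rng.normal(size=(16, 8))  # stand-in for a raw low-rank update B @ A

# Direction group: unit-norm columns. Magnitude group: per-column scale.
norms = np.linalg.norm(delta, axis=0, keepdims=True)
direction = delta / norms   # each column now has norm 1 (the "where to move")
magnitude = norms           # controls the extent of each column's update

recomposed = magnitude * direction
print(np.allclose(recomposed, delta))
```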
NAS-LoRA: Empowering Parameter-Efficient Fine-Tuning for Visual Foundation Models with Searchable Adaptation
Positive · Artificial Intelligence
The introduction of NAS-LoRA represents a significant advancement in the adaptation of the Segment Anything Model (SAM) for specialized tasks, particularly in medical and agricultural imaging. This new Parameter-Efficient Fine-Tuning (PEFT) method integrates a Neural Architecture Search (NAS) block to enhance SAM's performance by addressing its limitations in acquiring high-level semantic information due to the lack of spatial priors in its Transformer encoder.
LoRA Patching: Exposing the Fragility of Proactive Defenses against Deepfakes
Negative · Artificial Intelligence
A recent study highlights the vulnerabilities of proactive defenses against deepfakes, revealing that these defenses often lack the necessary robustness and reliability. The research introduces a novel technique called Low-Rank Adaptation (LoRA) patching, which effectively bypasses existing defenses by injecting adaptable patches into deepfake generators. This method also includes a Multi-Modal Feature Alignment loss to ensure semantic consistency in outputs.
Delta Sampling: Data-Free Knowledge Transfer Across Diffusion Models
Positive · Artificial Intelligence
Delta Sampling (DS) has been introduced as a novel method for enabling data-free knowledge transfer across different diffusion models, particularly addressing the challenges faced when upgrading base models like Stable Diffusion. This method operates at inference time, utilizing the delta between model predictions before and after adaptation, thus facilitating the reuse of adaptation components across varying architectures.
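The inference-time delta described above can be sketched with stand-in functions: take the difference between the adapted and unadapted predictions on the old base model, and add that delta to the new base model's prediction. The "models" below are toy linear functions and every name is illustrative; real Delta Sampling operates on diffusion-model noise predictions.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=16)  # a latent at some sampling step

def base_old(x):    return 0.9 * x        # old base model's prediction
def adapted_old(x): return 0.9 * x + 0.3  # old base + adaptation effect
def base_new(x):    return 1.1 * x        # upgraded base model

delta = adapted_old(x) - base_old(x)  # what the adaptation contributed
pred = base_new(x) + delta            # reuse it on the new base, no retraining

# The transferred prediction carries the same adaptation offset (0.3 here).
print(np.allclose(pred - base_new(x), 0.3))
```

No data or gradient updates are involved, which is why the blurb calls the transfer "data-free" and architecture-agnostic.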
Glance: Accelerating Diffusion Models with 1 Sample
Positive · Artificial Intelligence
Recent advancements in diffusion models have led to the development of a phase-aware strategy that accelerates image generation by applying different speedups to various stages of the process. This approach utilizes lightweight LoRA adapters, named Slow-LoRA and Fast-LoRA, to enhance efficiency without extensive retraining of models.
An Empirical Survey of Model Merging Algorithms for Social Bias Mitigation
Neutral · Artificial Intelligence
A recent empirical survey examined seven model merging algorithms aimed at mitigating social bias in large language models (LLMs), including Linear, Karcher Mean, and SLERP, among others. The study evaluated their effectiveness using 13 open-weight models from the GPT, LLaMA, and Qwen families against three bias datasets: BBQ, BOLD, and HONEST, while also assessing their impact on downstream performance in tasks from the SuperGLUE benchmark.
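Two of the merging algorithms named above can be sketched on flat parameter vectors: a linear (weighted-average) merge and SLERP, spherical linear interpolation along the angle between the two checkpoints. Real merges operate per-tensor across full model checkpoints; this toy version is illustrative only.

```python
import numpy as np

def linear_merge(a, b, t=0.5):
    """Weighted average of two parameter vectors."""
    return (1 - t) * a + t * b

def slerp(a, b, t=0.5, eps=1e-8):
    """Spherical linear interpolation using the angle between a and b."""
    a_n = a / (np.linalg.norm(a) + eps)
    b_n = b / (np.linalg.norm(b) + eps)
    omega = np.arccos(np.clip(a_n @ b_n, -1.0, 1.0))
    if omega < eps:  # nearly parallel vectors: fall back to linear
        return linear_merge(a, b, t)
    so = np.sin(omega)
    return (np.sin((1 - t) * omega) / so) * a + (np.sin(t * omega) / so) * b

rng = np.random.default_rng(4)
theta_a = rng.normal(size=128)  # parameters of model A
theta_b = rng.normal(size=128)  # parameters of model B

merged_lin = linear_merge(theta_a, theta_b)
merged_slerp = slerp(theta_a, theta_b)
print(merged_lin.shape, merged_slerp.shape)
```

Both interpolators recover the endpoints at t=0 and t=1; the survey's question is which of these (and the other five algorithms) best trades bias reduction against downstream performance.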