SparseSwaps: Tractable LLM Pruning Mask Refinement at Scale
Positive · Artificial Intelligence
- SparseSwaps introduces a scalable method for refining pruning masks in large language models (LLMs), addressing the computational cost of traditional pruning techniques, which often degrade model quality. It improves the choice of pruning mask without full retraining, which is typically the most resource-intensive step (a generic illustration of mask refinement follows this list).
- SparseSwaps is significant because it enables more effective model compression with little loss in performance, lowering the resources needed to deploy LLMs. That, in turn, could broaden access to LLMs across research and commercial applications.
- The work reflects a broader trend in AI research toward more efficient neural networks through methods such as pruning and quantization. As LLMs become increasingly prevalent, techniques that cut computational cost while preserving model quality are essential, and approaches that improve compressed models without extensive retraining point toward more sustainable AI practice.
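
The summary above does not spell out the SparseSwaps procedure. Purely as an illustration of what "pruning-mask refinement without retraining" can mean, the sketch below builds a magnitude-based mask for a toy layer and then greedily swaps a pruned weight back in for a kept weight whenever the exchange reduces a layer-wise reconstruction error on calibration data. The function names, the calibration matrix `X`, and the greedy 1-for-1 swap rule are assumptions made for this example; they are not taken from the paper.

```python
import numpy as np

def magnitude_mask(W, sparsity):
    """Initial mask: keep the largest-|w| entries in each row of W."""
    k = int(round(W.shape[1] * (1.0 - sparsity)))          # weights kept per row
    mask = np.zeros(W.shape, dtype=bool)
    keep = np.argsort(-np.abs(W), axis=1)[:, :k]           # top-k indices by magnitude
    np.put_along_axis(mask, keep, True, axis=1)
    return mask

def row_error(w, m, X):
    """Data-aware error of one row: ||X @ (w * ~m)||^2, i.e. the pruned
    weight mass as seen through the calibration inputs X."""
    return float(np.sum((X @ (w * ~m)) ** 2))

def refine_row_by_swaps(w, m, X, max_swaps=100):
    """Greedy 1-for-1 refinement: re-activate one pruned weight and prune one
    kept weight whenever the exchange lowers the error. Sparsity never changes."""
    m = m.copy()
    best = row_error(w, m, X)
    for _ in range(max_swaps):
        best_pair, best_err = None, best
        for j_in in np.flatnonzero(~m):          # candidate weight to bring back
            for j_out in np.flatnonzero(m):      # candidate weight to prune instead
                trial = m.copy()
                trial[j_in], trial[j_out] = True, False
                err = row_error(w, trial, X)
                if err < best_err:
                    best_pair, best_err = (j_in, j_out), err
        if best_pair is None:                    # no swap improves the error: done
            break
        m[best_pair[0]], m[best_pair[1]] = True, False
        best = best_err
    return m, best

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.normal(size=(4, 32))                 # toy layer weights
    X = rng.normal(size=(256, 32))               # toy calibration activations
    X *= 1.0 + 3.0 * rng.random(32)              # uneven input scales make data-aware swaps matter
    mask = magnitude_mask(W, sparsity=0.5)
    before = sum(row_error(W[i], mask[i], X) for i in range(W.shape[0]))
    for i in range(W.shape[0]):
        mask[i], _ = refine_row_by_swaps(W[i], mask[i], X)
    after = sum(row_error(W[i], mask[i], X) for i in range(W.shape[0]))
    print(f"reconstruction error: {before:.3f} -> {after:.3f}")
```

The 1-for-1 swap rule keeps the sparsity level fixed while only the placement of the zeros changes, which is the property any mask-refinement scheme of this kind must preserve; a real method at LLM scale would need a far cheaper way to score candidate swaps than this brute-force search.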
— via World Pulse Now AI Editorial System
