Intrinsic Structure as a Proxy for Saliency: SVD-Based Weight Preservation for Mixed-Precision Quantization in Large Language Models
Artificial Intelligence
arXiv:2512.01343v1 Announce Type: new
Abstract: As Large Language Models (LLMs) continue to scale in parameter count, deploying them on commodity hardware has become increasingly challenging. Post-Training Quantization (PTQ) addresses this by reducing the precision of model weights, typically to 4 bits or fewer. However, uniform quantization often leads to significant performance degradation due to the presence of "outlier features": weights that, while few in number, are critical for maintaining model accuracy. Current state-of-the-art methods such as AWQ (Activation-aware Weight Quantization) and SpQR (Sparse Quantization Representations) rely on calibration data to identify these salient weights via activation magnitudes or Hessian sensitivity. In scenarios where data privacy is paramount or calibration data is unavailable, these methods cannot be applied.
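To make the motivation concrete, the sketch below (not from the paper) shows why uniform low-bit quantization suffers in the presence of outliers: with symmetric per-tensor 4-bit rounding, a handful of large-magnitude weights inflate the quantization scale and leave only a few integer levels for the remaining weights. The matrix shape, the injected outliers, and the function names are illustrative assumptions.

```python
import numpy as np

def quantize_uniform_4bit(w: np.ndarray):
    """Symmetric per-tensor 4-bit quantization: map weights to integers in [-8, 7]."""
    scale = np.abs(w).max() / 7.0            # a single outlier inflates this scale
    q = np.clip(np.round(w / scale), -8, 7)
    return q.astype(np.int8), scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Map 4-bit integers back to floating point."""
    return q.astype(np.float32) * scale

# A few outlier weights dominate the dynamic range, so the "normal"
# weights are rounded onto only a handful of integer levels.
w = np.random.randn(512, 512).astype(np.float32)
w[0, :4] *= 50.0                             # inject synthetic outliers
q, s = quantize_uniform_4bit(w)
err = np.mean((w - dequantize(q, s)) ** 2)
print(f"per-tensor 4-bit reconstruction MSE with outliers: {err:.6f}")
```

Calibration-based methods such as AWQ and SpQR avoid this by detecting and protecting the salient weights, but they require forward passes over calibration data to do so.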
In this work, we propose a data-free, structure-aware hypothesis: the weights most strongly aligned with the principal components obtained via Singular Value Decomposition (SVD) are intrinsically important to the model's downstream performance. We introduce a novel selection heuristic that preserves the top-$k$ weights aligned with the principal components in FP32 while aggressively quantizing the residual weights. We compare our method against activation-aware (AWQ) and second-order (SpQR) methods on GLUE benchmarks (MRPC, RTE, QNLI) using a DistilBERT backbone. Our experiments reveal that structural importance is highly correlated with functional importance. On the challenging RTE task, our SVD-based method achieves an accuracy of 66.06%, outperforming both AWQ (65.34%) and SpQR (65.34%) at high protection budgets, validating that intrinsic matrix structure can serve as a robust proxy for weight saliency without the need for forward passes or calibration data.
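The abstract does not spell out the exact selection rule, but a minimal sketch of the general idea might look like the following: compute a truncated SVD of a weight matrix, score each entry by the magnitude of its rank-k reconstruction (one plausible reading of "aligned with the principal components"), keep the top-scoring fraction in FP32, and apply 4-bit rounding to the rest. The rank, protection budget, scoring rule, and function names are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def svd_saliency_mask(w: np.ndarray, rank: int, keep_frac: float) -> np.ndarray:
    """Score each weight by the magnitude of its rank-`rank` SVD reconstruction
    and mark the top `keep_frac` fraction of entries for FP32 preservation."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    w_lowrank = (u[:, :rank] * s[:rank]) @ vt[:rank, :]   # projection onto top principal components
    scores = np.abs(w_lowrank)
    k = max(1, int(keep_frac * w.size))
    thresh = np.partition(scores.ravel(), -k)[-k]
    return scores >= thresh                               # True = keep in FP32

def mixed_precision_quantize(w: np.ndarray, rank: int = 16, keep_frac: float = 0.01):
    """Keep SVD-salient weights in FP32; quantize the residual weights to 4 bits."""
    mask = svd_saliency_mask(w, rank, keep_frac)
    residual = np.where(mask, 0.0, w)                     # protected weights do not affect the scale
    scale = np.abs(residual).max() / 7.0 + 1e-12
    q = np.clip(np.round(residual / scale), -8, 7) * scale
    return np.where(mask, w, q), mask

w = np.random.randn(768, 768).astype(np.float32)
w_hat, mask = mixed_precision_quantize(w)
print(f"protected {mask.mean():.2%} of weights in FP32")
```

Because the mask depends only on the weight matrix itself, this selection requires no calibration inputs or forward passes, which is the property the paper exploits.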
