DCIS: Efficient Length Extrapolation of LLMs via Divide-and-Conquer Scaling Factor Search

arXiv (cs.CL), Tuesday, November 25, 2025 at 5:00:00 AM
  • A novel framework called Divide-and-Conquer Incremental Search (DCIS) has been proposed to enhance the fine-tuning of large language models (LLMs) by optimizing the scaling factors of Rotary Position Embedding (RoPE). This approach aims to extend the context length of LLMs while mitigating performance decay during fine-tuning, addressing the limitations of traditional methods that often lead to increased costs and reduced efficiency.
  • DCIS matters most for tasks that require longer context windows. By refining the RoPE scaling factors, the method aims to reduce the computational cost of fine-tuning, which could make long-context LLMs more practical to use.
  • This development reflects a broader trend in artificial intelligence where researchers are increasingly focused on optimizing model architectures and training methodologies. As the demand for more capable and efficient AI systems grows, innovations like DCIS highlight the ongoing efforts to overcome existing limitations in model performance and resource utilization, paralleling advancements in other areas such as multimodal understanding and real-time inference.
— via World Pulse Now AI Editorial System
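
To make the idea of RoPE scaling factors concrete, here is a minimal Python sketch of how a per-dimension scaling factor changes the rotation angles RoPE assigns to a position. It is an illustration under stated assumptions, not the DCIS algorithm: the summary above does not describe the search steps, so no search is shown. The function names `rope_angles` and `apply_rope`, the default base of 10000, and the convention of dividing each frequency by its scaling factor are all assumptions made for this example.

```python
import math


def rope_angles(position, head_dim, base=10000.0, scale_factors=None):
    """Return one rotation angle per dimension pair for a token position.

    scale_factors is a hypothetical list with one entry per pair; each
    frequency is divided by its factor (1.0 means unscaled RoPE).
    """
    if head_dim % 2 != 0:
        raise ValueError("head_dim must be even")
    num_pairs = head_dim // 2
    if scale_factors is None:
        scale_factors = [1.0] * num_pairs
    if len(scale_factors) != num_pairs:
        raise ValueError("need one scaling factor per dimension pair")
    angles = []
    for i, factor in enumerate(scale_factors):
        inv_freq = base ** (-2.0 * i / head_dim)  # standard RoPE frequency
        angles.append(position * inv_freq / factor)
    return angles


def apply_rope(vector, position, base=10000.0, scale_factors=None):
    """Rotate consecutive (even, odd) pairs of `vector` by the scaled angles."""
    angles = rope_angles(position, len(vector), base, scale_factors)
    rotated = []
    for i, angle in enumerate(angles):
        x, y = vector[2 * i], vector[2 * i + 1]
        rotated.append(x * math.cos(angle) - y * math.sin(angle))
        rotated.append(x * math.sin(angle) + y * math.cos(angle))
    return rotated


# With a uniform factor of 2, position 8192 gets the same angles that
# position 4096 gets without scaling, so longer inputs are mapped back
# into the angle range seen during pre-training.
scaled = rope_angles(8192, head_dim=8, scale_factors=[2.0] * 4)
unscaled = rope_angles(4096, head_dim=8)
```

A uniform factor like the one above is only the simplest choice. A scaling-factor search such as the one the summary attributes to DCIS would choose the factors rather than fix them by hand, and the factors need not be equal across dimension pairs.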

Continue Reading
Shape-Adapting Gated Experts: Dynamic Expert Routing for Colonoscopic Lesion Segmentation
Positive | Artificial Intelligence
The introduction of Shape-Adapting Gated Experts (SAGE) marks a significant advancement in computer-aided cancer detection, particularly for colonoscopic lesion segmentation. This innovative framework addresses the challenges posed by cellular heterogeneity in gigapixel Whole Slide Images (WSIs) by enabling dynamic expert routing, thus enhancing adaptability to input variability.
BCWildfire: A Long-term Multi-factor Dataset and Deep Learning Benchmark for Boreal Wildfire Risk Prediction
Positive | Artificial Intelligence
A new dataset titled 'BCWildfire' has been introduced, providing a comprehensive 25-year daily-resolution record of wildfire risk across 240 million hectares in British Columbia. This dataset includes 38 covariates such as active fire detections, weather variables, fuel conditions, terrain features, and human activity, addressing the scarcity of publicly available benchmark datasets for wildfire risk prediction.
Life-IQA: Boosting Blind Image Quality Assessment through GCN-enhanced Layer Interaction and MoE-based Feature Decoupling
Positive | Artificial Intelligence
A new study has introduced Life-IQA, a framework designed to enhance blind image quality assessment (BIQA) by utilizing GCN-enhanced layer interaction and MoE-based feature decoupling. This approach addresses the limitations of existing BIQA methods that often overlook the varying contributions of shallow and deep features in quality prediction.
Learning Plug-and-play Memory for Guiding Video Diffusion Models
Positive | Artificial Intelligence
A new study introduces a plug-and-play memory system for video generation models based on the Diffusion Transformer (DiT), enhancing their ability to incorporate world knowledge and improve visual coherence. This development addresses the models' frequent violations of physical laws and commonsense dynamics, which have been a significant limitation in their application.
DualGazeNet: A Biologically Inspired Dual-Gaze Query Network for Salient Object Detection
Positive | Artificial Intelligence
DualGazeNet has been introduced as a biologically inspired dual-gaze query network aimed at enhancing salient object detection (SOD) while minimizing architectural complexity. This framework seeks to overcome challenges faced by existing SOD methods, which often suffer from feature redundancy and performance bottlenecks due to their intricate designs. By simplifying the architecture, DualGazeNet aims to achieve state-of-the-art accuracy and computational efficiency.
Selective Rotary Position Embedding
Positive | Artificial Intelligence
The introduction of Selective Rotary Position Embedding (Selective RoPE) presents a novel input-dependent rotary embedding mechanism that generalizes existing Rotary Position Embeddings (RoPE) for both linear and softmax transformers. This mechanism allows for arbitrary angle rotations, enhancing the encoding of positional information essential for language modeling.
Generalizable Radio-Frequency Radiance Fields for Spatial Spectrum Synthesis
Positive | Artificial Intelligence
The introduction of Generalizable Radio-Frequency (RF) Radiance Fields, or GRaF, marks a significant advancement in modeling RF signal propagation, allowing for the synthesis of spatial spectra at arbitrary transmitter or receiver locations. This framework utilizes an interpolation theory that approximates the spatial spectrum from nearby transmitters, enhancing the understanding of RF signal behavior in various environments.
A Unified Voxel Diffusion Module for Point Cloud 3D Object Detection
Positive | Artificial Intelligence
A novel Voxel Diffusion Module (VDM) has been proposed to enhance voxel-level representation and diffusion in point cloud data, addressing limitations in detection accuracy associated with traditional voxel-based representations. This module integrates sparse 3D convolutions and residual connections, allowing for improved processing of point cloud data in 3D object detection tasks.