Random Initialization of Gated Sparse Adapters

arXiv, cs.LG | Tuesday, November 4, 2025 at 5:00:00 AM
Random Initialization of Gated Sparse Adapters (RIGSA) is a new approach to catastrophic forgetting, the loss of previously learned capabilities that language models can suffer when fine-tuned on new tasks. Unlike low-rank methods such as LoRA, RIGSA uses sparse adaptation, which places no rank constraint on the weight update, and is presented as a promising alternative for adapting models to new tasks.
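To make the contrast with LoRA concrete, here is a minimal NumPy sketch. It is not the paper's implementation. The matrix size `d`, the LoRA rank `r`, the random mask, and the scalar `gate` are all illustrative assumptions. It compares a low-rank update with a sparse update that has the same number of trainable entries, and shows that only the former is capped at rank `r`.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 64, 4  # hypothetical layer width and LoRA rank

# LoRA-style update: the product of two thin matrices, so its rank is at most r.
B = rng.standard_normal((d, r))
A = rng.standard_normal((r, d))
lora_delta = B @ A

# Sparse update with the same trainable-parameter budget (2 * d * r entries),
# placed at randomly chosen positions of the full d x d matrix.
n_params = 2 * d * r
flat_mask = np.zeros(d * d, dtype=bool)
flat_mask[rng.choice(d * d, size=n_params, replace=False)] = True
mask = flat_mask.reshape(d, d)

# Hypothetical scalar gate scaling the adapter; a gate of 0 would leave the
# base weights untouched. Set to 1.0 here so the update is visible.
gate = 1.0
sparse_delta = gate * mask * rng.standard_normal((d, d))

lora_rank = np.linalg.matrix_rank(lora_delta)
sparse_rank = np.linalg.matrix_rank(sparse_delta)
print("LoRA update rank:  ", lora_rank)
print("Sparse update rank:", sparse_rank)
```

With the same budget of trainable entries, the sparse update is free to reach a much higher rank than `r`, which is the sense in which sparse adaptation has no rank constraint.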
— Curated by the World Pulse Now AI Editorial System

