NAS-LoRA: Empowering Parameter-Efficient Fine-Tuning for Visual Foundation Models with Searchable Adaptation

arXiv — cs.CV · Thursday, December 4, 2025, 5:00 AM
  • NAS-LoRA is a new Parameter-Efficient Fine-Tuning (PEFT) method for adapting the Segment Anything Model (SAM) to specialized tasks, particularly in medical and agricultural imaging. It integrates a Neural Architecture Search (NAS) block into the adapter to address SAM's difficulty in acquiring high-level semantic information, a limitation stemming from the lack of spatial priors in its Transformer encoder (see the sketch after this summary).
  • By bridging the semantic gap between the pre-trained model and specialized tasks, NAS-LoRA lets SAM adapt to diverse domains while updating only a small fraction of parameters, making it a practical tool for researchers and practitioners who need precise image segmentation.
  • The evolution of SAM and its adaptations, such as NAS-LoRA, reflects a broader trend in artificial intelligence towards improving model efficiency and adaptability. As various frameworks emerge to tackle challenges like low-rank adaptation and segmentation granularity, the ongoing innovations signify a concerted effort to refine visual foundation models, ultimately aiming for enhanced performance across multiple applications, including medical imaging and open-vocabulary semantic segmentation.
— via World Pulse Now AI Editorial System
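
The summary does not give NAS-LoRA's exact architecture, but the core idea of a searchable adapter can be illustrated with a minimal sketch: a standard LoRA branch whose bottleneck passes through a small set of candidate operators (including a depthwise convolution that injects a local spatial prior), mixed by DARTS-style softmax architecture weights. All module choices and hyperparameters below are assumptions for illustration, not the paper's design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DepthwiseConvOp(nn.Module):
    """Token-mixing depthwise conv: a cheap way to add a local spatial prior."""
    def __init__(self, rank, kernel=3):
        super().__init__()
        self.conv = nn.Conv1d(rank, rank, kernel, padding=kernel // 2, groups=rank)

    def forward(self, h):                       # h: (batch, tokens, rank)
        return self.conv(h.transpose(1, 2)).transpose(1, 2)

class SearchableLoRA(nn.Module):
    """LoRA adapter whose bottleneck is routed through a softmax-weighted mix of
    candidate operators (DARTS-style relaxation). Illustrative sketch only."""
    def __init__(self, dim, rank=4, alpha=8):
        super().__init__()
        self.down = nn.Linear(dim, rank, bias=False)     # LoRA "A"
        self.up = nn.Linear(rank, dim, bias=False)       # LoRA "B"
        nn.init.zeros_(self.up.weight)                   # adapter starts as a no-op
        self.scale = alpha / rank
        self.ops = nn.ModuleList([
            nn.Identity(),                               # plain LoRA path
            nn.Sequential(nn.Linear(rank, rank), nn.GELU()),
            DepthwiseConvOp(rank),                       # local spatial prior
        ])
        self.arch = nn.Parameter(torch.zeros(len(self.ops)))  # architecture weights

    def forward(self, x):                                # x: (batch, tokens, dim)
        h = self.down(x)
        w = F.softmax(self.arch, dim=0)
        mixed = sum(wi * op(h) for wi, op in zip(w, self.ops))
        return self.scale * self.up(mixed)               # added to the frozen layer's output
```

In a setup like this, the frozen SAM encoder would keep its weights while the adapter and architecture weights are trained jointly; after the search, one would typically retain only the highest-weighted operator.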


Continue Reading
Malicious Image Analysis via Vision-Language Segmentation Fusion: Detection, Element, and Location in One-shot
Positive · Artificial Intelligence
A new zero-shot pipeline has been introduced for detecting illicit visual content, which not only identifies harmful images but also pinpoints the specific objects and their locations within the images. This system utilizes a foundation segmentation model to generate object masks and employs a vision-language model to assess the malicious relevance of these objects, culminating in a consolidated malicious object map.
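
The summary describes a three-stage flow: segment everything, score each region with a vision-language model, and fuse the scores into a single map. The sketch below follows that description; `segmenter` and `vlm_scorer` are hypothetical callables standing in for the actual foundation segmentation model and vision-language model, and the prompt and threshold are placeholders.

```python
import numpy as np

def malicious_object_map(image, segmenter, vlm_scorer, threshold=0.5):
    """Sketch of the described pipeline: segment, score, and fuse into a
    per-pixel malicious object map. Models are passed in as callables."""
    masks = segmenter(image)                        # list of boolean HxW masks
    h, w = image.shape[:2]
    score_map = np.zeros((h, w), dtype=np.float32)

    detections = []
    for mask in masks:
        region = image * mask[..., None]            # isolate the candidate object
        score = vlm_scorer(region, prompt="Is this object illicit or harmful?")
        score_map = np.maximum(score_map, score * mask)   # keep strongest evidence per pixel
        if score >= threshold:
            ys, xs = np.nonzero(mask)
            detections.append({
                "score": float(score),
                "bbox": (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())),
            })

    is_malicious = len(detections) > 0
    return is_malicious, detections, score_map      # detection, elements, locations
```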
MORPH: PDE Foundation Models with Arbitrary Data Modality
Positive · Artificial Intelligence
MORPH has been introduced as a modality-agnostic, autoregressive foundation model designed for partial differential equations (PDEs), utilizing a convolutional vision transformer backbone to manage diverse spatiotemporal datasets across various resolutions and data modalities. The model incorporates advanced techniques such as component-wise convolution and inter-field cross-attention to enhance its predictive capabilities.
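
The summary's "inter-field cross-attention" suggests that tokens of one physical field attend to the tokens of the other fields. A minimal sketch of that idea follows; the field layout, dimensions, and residual structure are assumptions for illustration, not MORPH's actual architecture.

```python
import torch
import torch.nn as nn

class InterFieldCrossAttention(nn.Module):
    """Each field's tokens query the tokens of all other fields.
    Illustrative sketch; layout and dimensions are assumptions."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, fields):
        # fields: list of (batch, tokens, dim) tensors, one per physical field
        out = []
        for i, f in enumerate(fields):
            others = torch.cat([g for j, g in enumerate(fields) if j != i], dim=1)
            attended, _ = self.attn(self.norm(f), others, others)
            out.append(f + attended)                # residual update per field
        return out

# Example with two fields (e.g., velocity and pressure token streams)
u = torch.randn(2, 64, 128)
p = torch.randn(2, 64, 128)
u2, p2 = InterFieldCrossAttention(dim=128)([u, p])
```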
Optimizing Fine-Tuning through Advanced Initialization Strategies for Low-Rank Adaptation
Positive · Artificial Intelligence
Recent advancements in fine-tuning methodologies have led to the introduction of IniLoRA, a novel initialization strategy designed to optimize Low-Rank Adaptation (LoRA) for large language models. IniLoRA initializes low-rank matrices to closely approximate original model weights, addressing limitations in performance seen with traditional LoRA methods. Experimental results demonstrate that IniLoRA outperforms LoRA across various models and tasks, with two additional variants, IniLoRA-α and IniLoRA-β, further enhancing performance.
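
The paper's exact initialization procedure is not spelled out in this summary, but one natural way to make the low-rank factors approximate the pretrained weight is a truncated SVD, sketched below; the IniLoRA-α/β variants are not reproduced here.

```python
import torch

def lora_init_from_weight(W, rank):
    """Initialize LoRA factors A, B so that B @ A approximates the original
    weight W via truncated SVD. This follows the stated idea of starting the
    low-rank matrices close to the pretrained weights; the paper's exact
    procedure may differ."""
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    B = U[:, :rank] * S[:rank].sqrt()                 # (out_features, rank)
    A = S[:rank].sqrt().unsqueeze(1) * Vh[:rank]      # (rank, in_features)
    return A, B

W = torch.randn(768, 768)
A, B = lora_init_from_weight(W, rank=16)
rel_err = (torch.linalg.norm(W - B @ A) / torch.linalg.norm(W)).item()
print(f"relative approximation error at r=16: {rel_err:.3f}")
```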
Boundary-Aware Test-Time Adaptation for Zero-Shot Medical Image Segmentation
Positive · Artificial Intelligence
A new framework named BA-TTA-SAM has been proposed to enhance zero-shot medical image segmentation by integrating test-time adaptation mechanisms with the Segment Anything Model (SAM). This approach addresses the challenges posed by limited annotated data and domain shifts in medical datasets, aiming to improve segmentation performance without extensive retraining.
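
The boundary-aware objective itself is not detailed in the summary; the sketch below shows only the generic test-time adaptation loop such frameworks build on, refining a small set of parameters on a single test image by minimizing prediction entropy. `model` and `adapt_params` are hypothetical stand-ins for a SAM-based segmenter and its adapter or normalization parameters.

```python
import torch

def test_time_adapt(model, image, adapt_params, steps=5, lr=1e-4):
    """Generic test-time adaptation sketch: update a small parameter set on one
    test image by minimizing the entropy of the predicted mask logits. The
    boundary-aware loss from BA-TTA-SAM is not reproduced here."""
    optimizer = torch.optim.Adam(adapt_params, lr=lr)
    for _ in range(steps):
        logits = model(image)                        # (B, 1, H, W) mask logits
        p = torch.sigmoid(logits)
        entropy = -(p * torch.log(p + 1e-6)
                    + (1 - p) * torch.log(1 - p + 1e-6)).mean()
        optimizer.zero_grad()
        entropy.backward()
        optimizer.step()
    with torch.no_grad():
        return torch.sigmoid(model(image)) > 0.5     # final binary mask
```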
Dual LoRA: Enhancing LoRA with Magnitude and Direction Updates
Positive · Artificial Intelligence
A novel method called Dual LoRA has been proposed to enhance the performance of Low-Rank Adaptation (LoRA) in fine-tuning large language models (LLMs). This method introduces two distinct groups within low-rank matrices: a magnitude group for controlling the extent of parameter updates and a direction group for determining the update direction, thereby improving the adaptation process.
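
Based only on the summary's description, a magnitude/direction split could look like the sketch below: the low-rank product supplies a normalized direction, and a separate parameter group scales it per output row. The exact parameterization used in Dual LoRA may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualLoRALinear(nn.Module):
    """Frozen linear layer plus a low-rank update split into a direction group
    (normalized low-rank product) and a magnitude group (per-output scale).
    Sketch based on the summary; not the paper's exact formulation."""
    def __init__(self, base: nn.Linear, rank=8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                      # keep pretrained weights frozen
        out_f, in_f = base.weight.shape
        self.A = nn.Parameter(torch.randn(rank, in_f) * 0.01)   # direction group
        self.B = nn.Parameter(torch.randn(out_f, rank) * 0.01)
        self.magnitude = nn.Parameter(torch.zeros(out_f))        # magnitude group; update starts at zero

    def forward(self, x):
        delta = self.B @ self.A                          # (out_features, in_features)
        direction = F.normalize(delta, dim=1)            # unit-norm rows set the direction
        update = self.magnitude.unsqueeze(1) * direction # magnitude controls the extent
        return self.base(x) + F.linear(x, update)
```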
LoRA Patching: Exposing the Fragility of Proactive Defenses against Deepfakes
Negative · Artificial Intelligence
A recent study highlights the vulnerabilities of proactive defenses against deepfakes, revealing that these defenses often lack the necessary robustness and reliability. The research introduces a novel technique called Low-Rank Adaptation (LoRA) patching, which effectively bypasses existing defenses by injecting adaptable patches into deepfake generators. This method also includes a Multi-Modal Feature Alignment loss to ensure semantic consistency in outputs.
AfroBeats Dance Movement Analysis Using Computer Vision: A Proof-of-Concept Framework Combining YOLO and Segment Anything Model
Positive · Artificial Intelligence
A new study has introduced a proof-of-concept framework for analyzing AfroBeats dance movements using advanced computer vision techniques, specifically integrating YOLOv8 and v11 for dancer detection alongside the Segment Anything Model (SAM) for precise segmentation. This innovative approach allows for the tracking and quantification of dancer movements in video recordings without the need for specialized equipment or markers.
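
A minimal sketch of the detection-then-segmentation step is shown below, assuming the `ultralytics` YOLO package and the `segment_anything` `SamPredictor` API; checkpoint names are placeholders, and the tracking and movement-quantification stages of the framework are omitted.

```python
import cv2
from ultralytics import YOLO                               # assumes the ultralytics package
from segment_anything import sam_model_registry, SamPredictor

# Checkpoint paths below are placeholders.
detector = YOLO("yolov8n.pt")
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)

def dancer_masks(frame_bgr):
    """Detect people with YOLO, then refine each detection box into a pixel
    mask with SAM. Sketch of the detection/segmentation stage only."""
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    predictor.set_image(rgb)
    results = detector(rgb, classes=[0])                   # class 0 = person in COCO
    masks = []
    for box in results[0].boxes.xyxy.cpu().numpy():
        m, _, _ = predictor.predict(box=box, multimask_output=False)
        masks.append(m[0])                                 # boolean HxW mask per dancer
    return masks

# Per-frame mask areas and centroids could then be used to quantify movement over a video.
```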
On Efficient Variants of Segment Anything Model: A Survey
Neutral · Artificial Intelligence
A comprehensive survey has been published on efficient variants of the Segment Anything Model (SAM), highlighting its strong generalization capabilities for image segmentation tasks while addressing its high computational demands. The survey categorizes various acceleration strategies and discusses future research directions aimed at improving efficiency without sacrificing accuracy.