Collaborative Learning with Multiple Foundation Models for Source-Free Domain Adaptation

arXiv — cs.LG · Tuesday, November 25, 2025 at 5:00:00 AM
  • A new framework called Collaborative Multi-foundation Adaptation (CoMA) has been proposed to enhance Source-Free Domain Adaptation (SFDA) by drawing on multiple Foundation Models (FMs) such as CLIP and BLIP. The approach improves task adaptation in unlabeled target domains by capturing diverse contextual cues and aligning each FM with the target model while preserving the FMs' semantic distinctiveness (a minimal sketch of this idea follows the summary below).
  • The introduction of CoMA is significant as it addresses the limitations of relying on a single FM, which often leads to biased adaptation and restricted semantic coverage. By leveraging complementary properties of multiple FMs, this framework enhances the adaptability and performance of models in various applications, particularly in scenarios where source data is unavailable.
  • This development reflects a broader trend in artificial intelligence towards collaborative learning and the integration of diverse models to tackle complex tasks. The emphasis on improving semantic understanding and contextual awareness is echoed in various recent advancements, such as enhancing open-vocabulary semantic segmentation and addressing safety concerns in vision-language models, indicating a growing recognition of the need for robust and versatile AI systems.
— via World Pulse Now AI Editorial System
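
The core idea lends itself to a compact illustration. Below is a minimal, hypothetical sketch of multi-foundation-model feature alignment: a target model's features are projected through one head per frozen FM and pulled toward that FM's embeddings, so each model contributes its own semantic view. The class name MultiFMAligner, the cosine-alignment loss, and the CLIP/BLIP feature dimensions are assumptions for illustration, not the CoMA implementation.

```python
# Illustrative sketch only; not the CoMA method. One projection head per
# foundation model keeps their semantic spaces separate instead of collapsing
# them into a single target representation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiFMAligner(nn.Module):
    def __init__(self, target_dim: int, fm_dims: dict[str, int]):
        super().__init__()
        self.heads = nn.ModuleDict({
            name: nn.Linear(target_dim, dim) for name, dim in fm_dims.items()
        })

    def forward(self, target_feat: torch.Tensor, fm_feats: dict[str, torch.Tensor]):
        loss = 0.0
        for name, fm_feat in fm_feats.items():
            proj = F.normalize(self.heads[name](target_feat), dim=-1)
            ref = F.normalize(fm_feat, dim=-1)
            # Cosine-alignment loss against each (frozen) foundation model.
            loss = loss + (1.0 - (proj * ref).sum(dim=-1)).mean()
        return loss / len(fm_feats)

# Dummy features standing in for CLIP- and BLIP-style embeddings.
aligner = MultiFMAligner(target_dim=512, fm_dims={"clip": 512, "blip": 768})
target_feat = torch.randn(8, 512)
fm_feats = {"clip": torch.randn(8, 512), "blip": torch.randn(8, 768)}
print(aligner(target_feat, fm_feats))
```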


Continue Reading
Are Neuro-Inspired Multi-Modal Vision-Language Models Resilient to Membership Inference Privacy Leakage?
Positive · Artificial Intelligence
A recent study investigates the resilience of neuro-inspired multi-modal vision-language models (VLMs) against membership inference attacks, which can lead to privacy leakage of sensitive training data. The research introduces a neuroscience-inspired topological regularization framework to analyze the vulnerability of these models to privacy attacks, highlighting a gap in existing literature that primarily focuses on unimodal systems.
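
As context for what such an attack measures, here is a minimal sketch of a classic loss-threshold membership inference baseline, not the paper's neuroscience-inspired topological regularization framework: a sample is guessed to be a training member when the model's per-sample loss falls below a threshold. The stand-in linear model and the threshold value are illustrative assumptions.

```python
# Loss-threshold membership inference baseline (illustrative only).
import torch
import torch.nn.functional as F

@torch.no_grad()
def membership_scores(model, inputs, labels):
    # Lower per-sample loss -> more likely the sample was seen in training.
    logits = model(inputs)
    losses = F.cross_entropy(logits, labels, reduction="none")
    return -losses  # higher score = "member"

def attack(model, inputs, labels, threshold: float):
    return membership_scores(model, inputs, labels) > threshold  # boolean guesses

# Toy usage with a stand-in linear classifier over pre-extracted features.
model = torch.nn.Linear(16, 4)
inputs, labels = torch.randn(8, 16), torch.randint(0, 4, (8,))
print(attack(model, inputs, labels, threshold=-1.5))
```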
AnchorOPT: Towards Optimizing Dynamic Anchors for Adaptive Prompt Learning
Positive · Artificial Intelligence
The recent introduction of AnchorOPT marks a significant advancement in prompt learning methodologies, particularly for CLIP models. This framework enhances the adaptability of anchor tokens by allowing them to learn dynamically from task-specific data and optimizing their positional relationships with soft tokens based on the training context.
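
For intuition, the sketch below shows a generic CoOp-style prompt with separate learnable anchor and soft context tokens prepended to class-name embeddings. The token counts, initialization, and fixed concatenation order are assumptions; the learned positional optimization that distinguishes AnchorOPT is not reproduced here.

```python
# Generic learnable-prompt sketch for a CLIP-style text encoder (illustrative;
# not the AnchorOPT method).
import torch
import torch.nn as nn

class AnchoredPrompt(nn.Module):
    """Prepends learnable anchor and soft context tokens to class-name embeddings."""
    def __init__(self, n_anchor: int = 4, n_soft: int = 12, dim: int = 512):
        super().__init__()
        self.anchor = nn.Parameter(torch.randn(n_anchor, dim) * 0.02)  # learned from task data
        self.soft = nn.Parameter(torch.randn(n_soft, dim) * 0.02)      # generic context tokens

    def forward(self, class_embed: torch.Tensor) -> torch.Tensor:
        # class_embed: (n_cls, n_name_tokens, dim)
        n_cls = class_embed.size(0)
        anchor = self.anchor.unsqueeze(0).expand(n_cls, -1, -1)
        soft = self.soft.unsqueeze(0).expand(n_cls, -1, -1)
        # The full prompt sequence would be fed to the frozen text encoder.
        return torch.cat([anchor, soft, class_embed], dim=1)

prompt = AnchoredPrompt()
class_embed = torch.randn(10, 3, 512)  # 10 classes, 3 name tokens each
print(prompt(class_embed).shape)       # torch.Size([10, 19, 512])
```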
Earth-Adapter: Bridge the Geospatial Domain Gaps with Mixture of Frequency Adaptation
Positive · Artificial Intelligence
Earth-Adapter has been introduced as a novel Parameter-Efficient Fine-Tuning (PEFT) method designed for Remote Sensing (RS) scenarios, particularly the handling of artifacts that affect image features. The method employs a Mixture of Frequency Adaptation process that uses the Discrete Fourier Transform to separate artifacts from the original features.
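
A rough sketch of the frequency-separation idea follows: a feature map is split into low- and high-frequency bands with a 2D FFT mask, and each band passes through its own lightweight adapter. The cutoff, 1x1-conv adapters, and residual mixing are assumptions for illustration, not the Earth-Adapter design.

```python
# Frequency-split adapters (illustrative only).
import torch
import torch.nn as nn
import torch.fft

def split_frequencies(x: torch.Tensor, cutoff: float = 0.25):
    # x: (B, C, H, W). Frequencies inside a centered square count as "low".
    B, C, H, W = x.shape
    freq = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
    mask = torch.zeros(H, W, device=x.device)
    h, w = int(H * cutoff), int(W * cutoff)
    mask[H // 2 - h : H // 2 + h, W // 2 - w : W // 2 + w] = 1.0
    low = torch.fft.ifft2(torch.fft.ifftshift(freq * mask, dim=(-2, -1))).real
    return low, x - low

class FrequencyAdapter(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Lightweight 1x1-conv adapters, one per frequency band.
        self.low_adapter = nn.Conv2d(channels, channels, kernel_size=1)
        self.high_adapter = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        low, high = split_frequencies(x)
        # Adapting the bands separately lets the model down-weight the band
        # where artifacts concentrate.
        return x + self.low_adapter(low) + self.high_adapter(high)

adapter = FrequencyAdapter(channels=64)
print(adapter(torch.randn(2, 64, 32, 32)).shape)
```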
Disentangled Geometric Alignment with Adaptive Contrastive Perturbation for Reliable Domain Transfer
Positive · Artificial Intelligence
A novel framework named GAMA++ has been introduced to enhance geometry-aware domain adaptation, addressing issues of disentanglement and rigid perturbation schemes that affect performance. This method employs latent space disentanglement and an adaptive contrastive perturbation strategy tailored to class-specific needs, achieving state-of-the-art results on benchmarks like DomainNet, Office-Home, and VisDA.
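
To illustrate the flavor of class-adaptive perturbation, the sketch below scales random feature perturbations by a per-class budget and enforces consistency with an InfoNCE-style contrastive loss; the scaling rule and the loss form are assumptions, not the GAMA++ method.

```python
# Class-adaptive perturbation with a contrastive consistency loss (illustrative only).
import torch
import torch.nn.functional as F

def adaptive_perturb(feats: torch.Tensor, labels: torch.Tensor, class_eps: torch.Tensor):
    # Scale random perturbations by a per-class budget (e.g. larger for harder classes).
    noise = F.normalize(torch.randn_like(feats), dim=-1)
    return feats + class_eps[labels].unsqueeze(-1) * noise

def contrastive_consistency(clean: torch.Tensor, perturbed: torch.Tensor, tau: float = 0.1):
    # InfoNCE between each clean feature and its perturbed counterpart,
    # with the other samples in the batch acting as negatives.
    clean = F.normalize(clean, dim=-1)
    perturbed = F.normalize(perturbed, dim=-1)
    logits = clean @ perturbed.t() / tau
    targets = torch.arange(clean.size(0), device=clean.device)
    return F.cross_entropy(logits, targets)

feats = torch.randn(16, 128)
labels = torch.randint(0, 4, (16,))
class_eps = torch.tensor([0.05, 0.1, 0.2, 0.1])  # per-class perturbation budgets
print(contrastive_consistency(feats, adaptive_perturb(feats, labels, class_eps)))
```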
Geometrically Regularized Transfer Learning with On-Manifold and Off-Manifold Perturbation
Positive · Artificial Intelligence
A novel framework named MAADA (Manifold-Aware Adversarial Data Augmentation) has been introduced to tackle the challenges of transfer learning under domain shift, effectively decomposing adversarial perturbations into on-manifold and off-manifold components. This approach enhances model robustness and generalization by minimizing geodesic discrepancies between source and target data manifolds, as demonstrated through experiments on DomainNet, VisDA, and Office-Home.
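
The on-/off-manifold decomposition can be illustrated with a simple projection: using a PCA basis as a stand-in tangent space, a perturbation's projection onto that basis is its on-manifold part and the residual is off-manifold. MAADA estimates the manifold differently; this is only a conceptual sketch.

```python
# On-/off-manifold perturbation decomposition via a PCA tangent basis (illustrative only).
import torch

def manifold_basis(features: torch.Tensor, k: int = 10) -> torch.Tensor:
    # Top-k principal directions of the centered feature cloud approximate
    # the data manifold's tangent space.
    centered = features - features.mean(dim=0, keepdim=True)
    _, _, vh = torch.linalg.svd(centered, full_matrices=False)
    return vh[:k]  # (k, d), orthonormal rows

def decompose(delta: torch.Tensor, basis: torch.Tensor):
    # Projection onto the tangent basis is the on-manifold component;
    # the residual is the off-manifold component.
    on = (delta @ basis.t()) @ basis
    return on, delta - on

features = torch.randn(256, 64)
delta = 0.1 * torch.randn(8, 64)   # stand-in adversarial perturbations
on, off = decompose(delta, manifold_basis(features))
print(on.norm(dim=-1), off.norm(dim=-1))
```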
WeatherDiffusion: Controllable Weather Editing in Intrinsic Space
Positive · Artificial Intelligence
WeatherDiffusion has been introduced as a diffusion-based framework that enables controllable weather editing in intrinsic space, utilizing an inverse renderer to estimate material properties and scene geometry from input images. This framework enhances the ability to manipulate weather conditions in generated images through an intrinsic map-aware attention mechanism and CLIP-space interpolation.
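
CLIP-space interpolation can be sketched as blending two prompt embeddings, for example between a clear and a snowy description of the same scene. The spherical interpolation and the dummy vectors standing in for real CLIP encodings are assumptions; the intrinsic-space diffusion pipeline itself is not shown.

```python
# Blending two CLIP-style text embeddings by spherical interpolation (illustrative only).
import torch
import torch.nn.functional as F

def slerp(a: torch.Tensor, b: torch.Tensor, t: float) -> torch.Tensor:
    # Spherical interpolation keeps the blended embedding on the unit sphere,
    # which tends to behave better than a straight average for CLIP features.
    a, b = F.normalize(a, dim=-1), F.normalize(b, dim=-1)
    omega = torch.acos((a * b).sum(-1).clamp(-1 + 1e-7, 1 - 1e-7))
    so = torch.sin(omega)
    return (torch.sin((1 - t) * omega) / so).unsqueeze(-1) * a + \
           (torch.sin(t * omega) / so).unsqueeze(-1) * b

clear_embed = torch.randn(1, 512)   # stand-in for CLIP("a clear street")
snowy_embed = torch.randn(1, 512)   # stand-in for CLIP("a snowy street")
half_snow = slerp(clear_embed, snowy_embed, t=0.5)  # condition for "light snow"
print(half_snow.shape)
```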
Face, Whole-Person, and Object Classification in a Unified Space Via The Interleaved Multi-Domain Identity Curriculum
Positive · Artificial Intelligence
A new study introduces the Interleaved Multi-Domain Identity Curriculum (IMIC), enabling models to perform object recognition, face recognition from varying image qualities, and person recognition in a unified embedding space without significant catastrophic forgetting. This approach was tested on foundation models DINOv3, CLIP, and EVA-02, demonstrating comparable performance to domain experts across all tasks.
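
A minimal sketch of interleaved multi-domain training is shown below: batches are drawn round-robin from face, whole-person, and object loaders so that no single domain dominates consecutive updates. The toy datasets, batch size, and strict round-robin schedule are assumptions, not the IMIC curriculum.

```python
# Round-robin interleaving of batches from several identity domains (illustrative only).
import itertools
import torch
from torch.utils.data import DataLoader, TensorDataset

def make_loader(n: int, n_classes: int) -> DataLoader:
    x = torch.randn(n, 3, 32, 32)
    y = torch.randint(0, n_classes, (n,))
    return DataLoader(TensorDataset(x, y), batch_size=8, shuffle=True)

loaders = {
    "faces": make_loader(64, 10),
    "persons": make_loader(64, 10),
    "objects": make_loader(64, 10),
}

# Cycling over domains keeps updates interleaved, the intuition behind
# avoiding catastrophic forgetting of any one domain.
iterators = {name: iter(dl) for name, dl in loaders.items()}
for step, name in zip(range(12), itertools.cycle(loaders)):
    try:
        images, labels = next(iterators[name])
    except StopIteration:
        iterators[name] = iter(loaders[name])
        images, labels = next(iterators[name])
    # The model update for this domain's batch would go here.
    print(step, name, images.shape)
```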
stable-pretraining-v1: Foundation Model Research Made Simple
Positive · Artificial Intelligence
The stable-pretraining library has been introduced as a modular and performance-optimized tool for foundation model research, built on PyTorch, Lightning, Hugging Face, and TorchMetrics. This library aims to simplify self-supervised learning (SSL) by providing essential utilities and enhancing the visibility of training dynamics through comprehensive logging.