The paper titled 'Preserving Cross-Modal Consistency for CLIP-based Class-Incremental Learning' addresses the challenges of class-incremental learning (CIL) in vision-language models like CLIP. It introduces a two-stage framework called DMC, which separates the adaptation of the vision encoder from the optimization of textual soft prompts. This approach aims to mitigate classifier bias and maintain cross-modal alignment, enhancing the model's ability to learn new categories without forgetting previously acquired knowledge.
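The cross-modal alignment that DMC tries to preserve is CLIP's zero-shot classification rule: an image is assigned to the class whose text embedding is most cosine-similar to its image embedding. The sketch below illustrates that rule only (not DMC's two-stage training itself); the toy embeddings are illustrative.

```python
import math

def normalize(v):
    """Scale a vector to unit length, as CLIP does before comparing embeddings."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def clip_classify(image_emb, text_embs):
    """Return the index of the class whose text embedding has the highest
    cosine similarity with the image embedding (CLIP's zero-shot rule)."""
    img = normalize(image_emb)
    scores = [sum(a * b for a, b in zip(img, normalize(t))) for t in text_embs]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy 3-D embeddings for two classes; values are made up for illustration.
text_embs = [[1.0, 0.1, 0.0],   # e.g. "a photo of a cat"
             [0.0, 0.9, 0.2]]   # e.g. "a photo of a dog"
image_emb = [0.9, 0.2, 0.1]     # image embedding closest to the first prompt
print(clip_classify(image_emb, text_embs))  # → 0
```

Because both stages of DMC leave one side of this similarity computation fixed (the text prompts in stage one, the vision encoder in stage two), the image and text embedding spaces cannot drift apart arbitrarily as new classes arrive.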
The article presents CLIPPan, an unsupervised pansharpening framework that uses CLIP, a vision-language model, as a supervisor. This approach addresses the challenges faced by supervised pansharpening methods, particularly the domain-adaptation issues arising from the disparity between simulated low-resolution training data and real-world high-resolution scenarios. The framework is designed to improve the model's understanding of the pansharpening process and enhance its ability to recognize various image types, ultimately setting a new state of the art in unsupervised full-resolution pansharpening.
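For readers unfamiliar with the task: pansharpening fuses low-resolution multispectral bands with a high-resolution panchromatic image to produce high-resolution multispectral output. The sketch below is a classic component-substitution baseline (the Brovey transform), shown only to make the task concrete; it is not CLIPPan's method.

```python
def brovey_pansharpen(ms_bands, pan, eps=1e-8):
    """Brovey-transform pansharpening: scale each multispectral band by the
    per-pixel ratio of panchromatic intensity to the sum of the MS bands.
    `ms_bands` is a list of 2-D lists (bands already upsampled to the
    panchromatic resolution); `pan` is a 2-D list of the same shape."""
    h, w = len(pan), len(pan[0])
    out = []
    for band in ms_bands:
        sharpened = [[band[i][j] * pan[i][j] /
                      (sum(b[i][j] for b in ms_bands) + eps)
                      for j in range(w)] for i in range(h)]
        out.append(sharpened)
    return out

# Toy 2x2 example: two identical MS bands, pan twice as bright in column 1.
ms = [[[1.0, 1.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]]
pan = [[2.0, 4.0], [2.0, 4.0]]
sharp = brovey_pansharpen(ms, pan)
```

Supervised deep methods learn this fusion from simulated low-resolution pairs, which is exactly where the domain gap to real full-resolution imagery arises; CLIPPan's contribution is replacing that supervision with a CLIP-based signal.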
The Long Range (LoRa) protocol is increasingly used in tags for mentally incapacitated persons (MIPs) to prevent them from going missing. A new study introduces LoRaCompass, a reinforcement learning model aimed at efficiently locating these LoRa tags in unknown environments. The model addresses challenges such as domain shift and signal fluctuation, which can lead to significant localization errors, by learning robust spatial representations from received signal strength indicator (RSSI) measurements.
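To see why raw RSSI is a fragile localization signal, consider the standard log-distance path-loss model, RSSI(d) = RSSI(1 m) − 10·n·log₁₀(d). The sketch below inverts it for a rough range estimate; the reference power and path-loss exponent are illustrative values, and it is this naive inversion (not the paper's learned approach) being shown.

```python
import math

def rssi_to_distance(rssi_dbm, rssi_at_1m=-40.0, path_loss_exp=2.7):
    """Invert the log-distance path-loss model
        RSSI(d) = RSSI(1 m) - 10 * n * log10(d)
    to estimate range (in metres) from a LoRa tag's RSSI. The reference
    power `rssi_at_1m` and exponent `path_loss_exp` are illustrative:
    both drift with the environment, which is the domain shift that
    makes naive inversion unreliable in practice."""
    return 10 ** ((rssi_at_1m - rssi_dbm) / (10 * path_loss_exp))

print(rssi_to_distance(-40.0))  # → 1.0 (at the reference distance)
print(rssi_to_distance(-67.0))  # → 10.0
```

Small fluctuations in RSSI translate into exponentially large errors in estimated distance, which motivates learning robust spatial representations instead of inverting the model directly.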