UINO-FSS: Unifying Representation Learning and Few-shot Segmentation via Hierarchical Distillation and Mamba-HyperCorrelation

arXiv — cs.CV · Thursday, November 20, 2025 at 5:00:00 AM
  • UINO-FSS unifies representation learning and few-shot segmentation within a single framework.
  • The development of UINO-FSS rests on the two mechanisms named in the title: hierarchical distillation and Mamba-based hyper-correlation (see the sketch after this list).
  • This innovation reflects a broader trend in AI towards more adaptable and efficient models, echoed by related work on segmentation granularity and semantic diversity, and signals a growing emphasis on model capability in complex environments.
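
The bullets above carry no technical detail, so here is a minimal background sketch of the hypercorrelation idea named in the title, assuming an HSNet-style formulation: a 4D cosine-correlation volume between query features and masked support features. UINO-FSS's hierarchical distillation and its Mamba-based handling of this volume are not reproduced; all function names and shapes are illustrative.

```python
import torch
import torch.nn.functional as F

def hypercorrelation(query_feat, support_feat, support_mask):
    """4D correlation volume between query and masked support features.

    query_feat, support_feat: (B, C, H, W); support_mask: (B, 1, H, W) in {0, 1}.
    Returns a (B, H, W, H, W) volume of non-negative cosine correlations.
    """
    support_feat = support_feat * support_mask                 # keep foreground only
    q = F.normalize(query_feat.flatten(2), dim=1)              # (B, C, H*W)
    s = F.normalize(support_feat.flatten(2), dim=1)            # (B, C, H*W)
    corr = torch.einsum("bcq,bcs->bqs", q, s).clamp(min=0)     # (B, H*W, H*W)
    B, _, H, W = query_feat.shape
    return corr.view(B, H, W, H, W)
```

Presumably the Mamba component then consumes this volume as a long sequence, in place of the 4D convolutions used by earlier hypercorrelation networks.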
— via World Pulse Now AI Editorial System


Recommended Readings
Deep Learning for Accurate Vision-based Catch Composition in Tropical Tuna Purse Seiners
Positive · Artificial Intelligence
Purse seiners are essential in tuna fishing, accounting for about 69% of the global catch of tropical tuna. To enhance monitoring, Regional Fisheries Management Organizations have mandated the use of electronic monitoring (EM) alongside traditional observers. However, the identification of tuna species remains challenging for AI systems, which require balanced training data. This study highlights the difficulties experts face in distinguishing between bigeye tuna and yellowfin tuna using EM-captured images.
Unbiased Semantic Decoding with Vision Foundation Models for Few-shot Segmentation
Positive · Artificial Intelligence
The paper presents an Unbiased Semantic Decoding (USD) strategy integrated with the Segment Anything Model (SAM) for few-shot segmentation tasks. This approach aims to enhance the model's generalization ability by extracting target information from both support and query sets simultaneously, addressing the limitations of previous methods that relied heavily on explicit prompts. The study highlights the potential of USD in improving segmentation accuracy across unknown classes.
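
The summary does not spell out how USD extracts target information, so for orientation, the sketch below shows the classic prototype baseline such methods build on: masked average pooling of support features into a class prototype, scored against query features by cosine similarity. This is background, not USD itself; names and shapes are assumptions.

```python
import torch
import torch.nn.functional as F

def masked_average_prototype(support_feat, support_mask):
    # support_feat: (C, H, W) backbone features; support_mask: (H, W) binary mask
    mask = support_mask.unsqueeze(0).float()                        # (1, H, W)
    proto = (support_feat * mask).sum(dim=(1, 2)) / mask.sum().clamp(min=1)
    return proto                                                    # (C,) prototype

def similarity_prior(query_feat, proto):
    # Cosine similarity between every query location and the support prototype
    q = F.normalize(query_feat, dim=0)                              # (C, H, W)
    p = F.normalize(proto, dim=0)                                   # (C,)
    return torch.einsum("chw,c->hw", q, p)                          # (H, W) prior map
```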
Multi-Text Guided Few-Shot Semantic Segmentation
Positive · Artificial Intelligence
Recent advancements in few-shot semantic segmentation using CLIP-based methods have highlighted limitations in capturing the semantic diversity of complex categories. The proposed Multi-Text Guided Few-Shot Semantic Segmentation Network (MTGNet) addresses these issues by employing a dual-branch framework that integrates multiple textual prompts, enhancing segmentation performance through refined textual priors and improved cross-modal optimization of visual priors.
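
To make the multi-text idea concrete, a minimal sketch follows that fuses several CLIP text embeddings of one class into a single textual prior by normalized averaging. MTGNet's actual prompt set and fusion module are more elaborate; the prompt templates and the mean fusion here are assumptions.

```python
import torch
import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)

# Several phrasings of one class; purely illustrative placeholders.
prompts = [
    "a photo of a bicycle",
    "a close-up of a bicycle wheel and frame",
    "a bicycle parked on a street",
]

with torch.no_grad():
    tokens = clip.tokenize(prompts).to(device)
    text_feats = model.encode_text(tokens).float()              # (3, 512)
    text_feats = text_feats / text_feats.norm(dim=-1, keepdim=True)
    class_prior = text_feats.mean(dim=0)                        # fused textual prior
    class_prior = class_prior / class_prior.norm()              # ready to correlate
```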
D-PerceptCT: Deep Perceptual Enhancement for Low-Dose CT Images
Positive · Artificial Intelligence
D-PerceptCT is a new architecture designed to enhance the quality of Low Dose Computed Tomography (LDCT) images, which are commonly used in medical imaging but often suffer from poor quality due to reduced radiation doses. Traditional enhancement methods can lead to excessive smoothing and loss of important details. D-PerceptCT aims to improve image quality by preserving perceptually relevant features, inspired by the Human Visual System. It includes a Visual Dual-path Extractor (ViDex) to integrate semantic information for better image clarity.
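
The summary names ViDex only at a high level; the sketch below is one guess at what a "dual-path" extractor could look like: a detail path at full resolution fused with a pooled semantic path. D-PerceptCT's real architecture is not reproduced; every layer choice here is an assumption.

```python
import torch
import torch.nn as nn

class DualPathBlock(nn.Module):
    """Illustrative dual-path feature extractor (not the actual ViDex)."""
    def __init__(self, ch=64):
        super().__init__()
        self.detail = nn.Conv2d(ch, ch, kernel_size=3, padding=1)   # local detail
        self.semantic = nn.Sequential(                              # pooled context
            nn.AvgPool2d(4),
            nn.Conv2d(ch, ch, kernel_size=1),
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
        )
        self.fuse = nn.Conv2d(2 * ch, ch, kernel_size=1)

    def forward(self, x):  # x: (B, ch, H, W) with H, W divisible by 4
        return self.fuse(torch.cat([self.detail(x), self.semantic(x)], dim=1))
```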
Weight Variance Amplifier Improves Accuracy in High-Sparsity One-Shot Pruning
Positive · Artificial Intelligence
Deep neural networks excel at visual recognition, but their large parameter counts hinder practical deployment. One-shot pruning has emerged as a way to reduce model size without retraining, yet aggressive pruning often causes significant accuracy drops. Existing sharpness-aware optimizers such as SAM (Sharpness-Aware Minimization, not the segmentation model) and CrAM mitigate this but require extra computation. The proposed Variance Amplifying Regularizer (VAR) instead increases parameter variance during training, improving pruning robustness while maintaining accuracy.
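
The summary says only that VAR "increases parameter variance during training"; a plausible minimal form is to subtract a scaled weight-variance term from the task loss, so gradient descent pushes variance up. The paper's exact regularizer may differ; var_weight and the per-tensor averaging below are assumptions.

```python
import torch
import torch.nn as nn

def variance_regularizer(model: nn.Module) -> torch.Tensor:
    """Mean variance across weight tensors (biases and 1-D params skipped)."""
    variances = [p.var() for p in model.parameters() if p.dim() > 1]
    return torch.stack(variances).mean()

def train_step(model, batch, criterion, optimizer, var_weight=1e-4):
    inputs, targets = batch
    task_loss = criterion(model(inputs), targets)
    # Subtracting the term rewards higher weight variance; per the summary,
    # this improves robustness to aggressive one-shot pruning.
    loss = task_loss - var_weight * variance_regularizer(model)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```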
LENS: Learning to Segment Anything with Unified Reinforced Reasoning
Positive · Artificial Intelligence
LENS is a reinforcement-learning framework for text-prompted image segmentation, a capability crucial for human-computer interaction and robotics. Unlike traditional supervised fine-tuning, LENS performs explicit chain-of-thought reasoning at test time, improving generalization to unseen prompts. Built on a 3-billion-parameter vision-language model, it achieves an average cIoU of 81.2% on benchmark datasets, surpassing existing fine-tuning methods.
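
For reference, cIoU here is the cumulative IoU commonly reported on referring-segmentation benchmarks: total intersection over total union accumulated across the whole dataset, rather than a per-image average. A short implementation of that standard metric:

```python
import numpy as np

def cumulative_iou(pred_masks, gt_masks):
    """cIoU: sum of intersections over sum of unions across all samples."""
    inter, union = 0, 0
    for pred, gt in zip(pred_masks, gt_masks):
        pred, gt = pred.astype(bool), gt.astype(bool)
        inter += np.logical_and(pred, gt).sum()
        union += np.logical_or(pred, gt).sum()
    return inter / max(union, 1)   # guard against an all-empty union
```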
UnSAMv2: Self-Supervised Learning Enables Segment Anything at Any Granularity
Positive · Artificial Intelligence
The Segment Anything Model (SAM) has gained popularity as a vision foundation model, but it struggles with controlling segmentation granularity, often requiring manual refinement by users. To overcome this challenge, UnSAMv2 has been introduced, allowing segmentation at any granularity without human annotations. This model builds on the divide-and-conquer strategy of its predecessor, UnSAM, by identifying numerous mask-granularity pairs and implementing a new granularity control embedding for precise segmentation scale management. The model demonstrates effectiveness with only 6,000 unlabeled …
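
The "granularity control embedding" is described only in words above; the sketch below is one guess at the mechanism: map a scalar granularity to a learned token that joins SAM's prompt embeddings. UnSAMv2's real design is not reproduced; the MLP shape and the [0, 1] convention are assumptions.

```python
import torch
import torch.nn as nn

class GranularityEmbedding(nn.Module):
    """Illustrative: a scalar granularity in [0, 1] becomes a prompt token."""
    def __init__(self, dim=256):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(1, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, granularity: torch.Tensor) -> torch.Tensor:
        # granularity: (B,) scalars; 0 = coarsest (whole objects), 1 = finest (parts)
        return self.mlp(granularity.unsqueeze(-1))              # (B, dim) token

# The token would be appended to the mask decoder's sparse prompt embeddings so a
# user can trade whole-object masks for part-level masks at inference time.
```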