Click2Graph: Interactive Panoptic Video Scene Graphs from a Single Click

arXiv — cs.CV · Friday, November 21, 2025 at 5:00:00 AM
  • Click2Graph introduces an interactive approach to Panoptic Video Scene Graph Generation, using a single user click to guide and refine visual understanding in video analysis.
  • This development represents a notable step forward for user-guided AI in video understanding.
  • The integration of interactive frameworks like Click2Graph reflects a growing trend in AI research toward greater user engagement and precision. The trend is especially visible in applications such as surgical video analysis, where models like SAM2 are also being evaluated for their effectiveness.
— via World Pulse Now AI Editorial System


Continue Reading
UniUltra: Interactive Parameter-Efficient SAM2 for Universal Ultrasound Segmentation
Positive · Artificial Intelligence
The Segment Anything Model 2 (SAM2) has shown impressive universal segmentation capabilities on natural images, but its performance on ultrasound images is hindered by domain disparities. To tackle this issue, UniUltra is proposed, featuring a context-edge hybrid adapter (CH-Adapter) for enhanced ultrasound imaging perception and a deep-supervised knowledge distillation (DSKD) technique to facilitate effective deployment in clinical settings.
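The summary does not detail how UniUltra's deep-supervised knowledge distillation (DSKD) is formulated, but distillation generally trains a compact student to match a larger teacher's temperature-softened outputs. A minimal sketch of the standard (Hinton-style) distillation loss, not the paper's specific DSKD objective:

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; higher T produces softer distributions.
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # KL(teacher || student) on temperature-softened distributions,
    # scaled by T^2 as in standard knowledge distillation.
    p = softmax(teacher_logits, T)  # soft targets from the teacher
    q = softmax(student_logits, T)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return T * T * kl
```

The loss is zero when the student's logits match the teacher's and grows as the two distributions diverge; DSKD additionally supervises intermediate layers, which this generic sketch omits.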
Segmenting Collision Sound Sources in Egocentric Videos
Positive · Artificial Intelligence
The proposed task of Collision Sound Source Segmentation (CS3) aims to identify and segment objects responsible for collision sounds in egocentric videos. This task addresses challenges such as cluttered visual scenes and brief interactions, utilizing a weakly-supervised method that leverages audio cues and foundation models like CLIP and SAM2. The focus on egocentric video allows for clearer sound identification despite visual complexity.
VideoSeg-R1: Reasoning Video Object Segmentation via Reinforcement Learning
Positive · Artificial Intelligence
VideoSeg-R1 is a novel framework that integrates reinforcement learning into video object segmentation, overcoming limitations of traditional supervised methods. It features a decoupled architecture that combines referring image segmentation with video mask propagation, utilizing a hierarchical text-guided frame sampler, a reasoning model, and a segmentation-propagation stage. This approach enhances efficiency and accuracy in complex video reasoning tasks, achieving state-of-the-art performance across multiple benchmarks.
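Reinforcement learning for segmentation needs a scalar reward signal for each predicted mask. The summary does not specify VideoSeg-R1's reward, but a common, simple choice in mask-quality objectives is intersection-over-union (IoU) against a reference mask; a hypothetical sketch on flattened binary masks:

```python
def mask_iou(pred, ref):
    # IoU between two flattened binary masks (lists of 0/1).
    # Serves as a dense-free scalar reward: 1.0 = perfect overlap.
    inter = sum(p & r for p, r in zip(pred, ref))
    union = sum(p | r for p, r in zip(pred, ref))
    return inter / union if union else 1.0  # two empty masks agree
```

A policy-gradient learner could then reinforce segmentation decisions in proportion to this reward; the paper's actual reward design and training loop may differ.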
SAM2S: Segment Anything in Surgical Videos via Semantic Long-term Tracking
Positive · Artificial Intelligence
The Segment Anything Model 2 (SAM2) has been enhanced with the introduction of SAM2S, a model designed for surgical video segmentation. This development addresses challenges in long-term tracking and domain gaps in surgical scenarios by utilizing the SA-SV benchmark, which includes extensive spatio-temporal annotations. The model incorporates a diverse memory mechanism and temporal semantic learning to improve instrument and tissue tracking in surgical videos.