Learning Spatio-Temporal Feature Representations for Video-Based Gaze Estimation
Positive · Artificial Intelligence
- A new model, the Spatio-Temporal Gaze Network (ST-Gaze), has been proposed to improve video-based gaze estimation by capturing both the spatial and temporal dynamics of human eye gaze across multiple frames. The model combines a CNN backbone with channel-attention and self-attention modules to fuse eye and face features, and it achieves state-of-the-art performance on the EVE dataset.
- ST-Gaze is significant because it addresses a key limitation of existing gaze-estimation methods, which often lose accuracy when modeling temporal dynamics and feature representations. More accurate gaze estimation could benefit applications such as human-computer interaction and augmented reality.
- This advancement reflects a broader trend in artificial intelligence where models are increasingly designed to leverage temporal information and multi-modal data. Similar innovations in image segmentation and pose estimation highlight the growing emphasis on integrating complex feature representations to enhance performance across diverse AI applications.
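The fusion pipeline described above can be sketched in simplified form. The snippet below is a minimal NumPy illustration, not the authors' implementation: it assumes per-frame eye and face feature vectors, applies a squeeze-and-excitation-style channel gate to each stream, fuses the streams by concatenation, and runs scaled dot-product self-attention across the frame axis before regressing a (pitch, yaw) gaze direction. All function names, dimensions, and random weights here are hypothetical placeholders.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def channel_attention(feats, W):
    # Squeeze-and-excitation-style gate: reweight the C channels of a (T, C) sequence.
    s = feats.mean(axis=0)                    # squeeze: pool over frames
    gate = 1.0 / (1.0 + np.exp(-(W @ s)))     # excitation: sigmoid channel weights
    return feats * gate                       # broadcast gate over all frames

def self_attention(x, Wq, Wk, Wv):
    # Scaled dot-product attention across the T frames of a (T, C) sequence.
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = (q @ k.T) / np.sqrt(k.shape[-1])
    return softmax(scores, axis=-1) @ v

rng = np.random.default_rng(0)
T, C, D = 8, 16, 16                           # frames, channels, attention dim
eye = rng.standard_normal((T, C))             # per-frame eye-crop features (e.g. from a CNN)
face = rng.standard_normal((T, C))            # per-frame face features

# Gate each stream with channel attention, then fuse by concatenation.
fused = np.concatenate(
    [channel_attention(eye, rng.standard_normal((C, C))),
     channel_attention(face, rng.standard_normal((C, C)))], axis=1)

# Temporal self-attention over the fused (T, 2C) sequence.
Wq, Wk, Wv = (rng.standard_normal((2 * C, D)) for _ in range(3))
temporal = self_attention(fused, Wq, Wk, Wv)

# Pool over frames and regress a (pitch, yaw) gaze direction.
W_head = rng.standard_normal((D, 2))
gaze = temporal.mean(axis=0) @ W_head
print(gaze.shape)  # (2,)
```

In a real model the random matrices would be learned end-to-end, and the per-frame features would come from the CNN backbone rather than random draws; the sketch only shows how the channel-gating and temporal-attention stages compose.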
— via World Pulse Now AI Editorial System
