Explainable Parkinsons Disease Gait Recognition Using Multimodal RGB-D Fusion and Large Language Models

arXiv, cs.CV | Friday, December 5, 2025 at 5:00:00 AM
  • A new framework for recognizing Parkinsonian gait patterns has been developed, utilizing a multimodal approach that fuses RGB and Depth (RGB-D) data. This system employs dual YOLOv11-based encoders and a Multi-Scale Local-Global Extraction module to enhance gait analysis, particularly in challenging conditions such as low lighting or occlusion.
  • This advancement is significant for early detection of Parkinson's disease, as it improves the accuracy and interpretability of gait analysis, addressing limitations of existing single-modality approaches that lack robustness and clinical transparency.
  • The use of Large Language Models (LLMs) here reflects a growing trend in AI research: multimodal frameworks are increasingly applied to tasks such as medical diagnostics and robotics, with the aim of making AI systems more interpretable and effective across domains.
— via World Pulse Now AI Editorial System
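
The fusion idea in the summary above can be illustrated with a minimal sketch. This is not the paper's implementation: the summary gives no details of the YOLOv11-based encoders or the Multi-Scale Local-Global Extraction module, so the code assumes that per-frame RGB and depth feature vectors already exist and that each modality comes with a scalar confidence score. The function names (`fuse_frame`, `local_global`), the softmax gating, and the sliding-window averaging are all hypothetical choices made for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_frame(rgb_feat, depth_feat, rgb_conf, depth_conf):
    """Confidence-weighted sum of two same-length feature vectors.

    Assumption: a low rgb_conf (for example, a dark frame) shifts
    weight toward the depth features, and vice versa.
    """
    if len(rgb_feat) != len(depth_feat):
        raise ValueError("RGB and depth features must have the same length")
    w_rgb, w_depth = softmax([rgb_conf, depth_conf])
    return [w_rgb * r + w_depth * d for r, d in zip(rgb_feat, depth_feat)]


def local_global(sequence, windows=(3, 5)):
    """Append multi-scale local means and a global mean to every frame.

    For each frame, one local mean is computed per window size
    (centered, truncated at the sequence edges). The sequence-wide
    mean is appended last as the global context.
    """
    n = len(sequence)
    dim = len(sequence[0])
    global_mean = [sum(f[j] for f in sequence) / n for j in range(dim)]
    out = []
    for i in range(n):
        frame_out = []
        for w in windows:
            lo = max(0, i - w // 2)
            hi = min(n, i + w // 2 + 1)
            frame_out += [
                sum(sequence[k][j] for k in range(lo, hi)) / (hi - lo)
                for j in range(dim)
            ]
        out.append(frame_out + global_mean)
    return out
```

With equal confidence scores the two modalities contribute equally; lowering one score shifts the fused vector toward the other modality, which is one simple way to picture why a fused system might tolerate low lighting or occlusion better than a single-modality one.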

Continue Reading
EoS-FM: Can an Ensemble of Specialist Models act as a Generalist Feature Extractor?
Positive | Artificial Intelligence
Recent advancements in Earth Observation have led to the development of the Ensemble-of-Specialists framework, which aims to create Remote Sensing Foundation Models (RSFMs) that generalize across tasks with limited supervision. This approach contrasts with the current trend of scaling model size, which is resource-intensive and environmentally unsustainable.
SignRoundV2: Closing the Performance Gap in Extremely Low-Bit Post-Training Quantization for LLMs
Positive | Artificial Intelligence
SignRoundV2 has been introduced as a post-training quantization framework aimed at improving the efficiency of deploying Large Language Models (LLMs) while minimizing performance degradation typically associated with low-bit quantization. This framework employs a fast sensitivity metric and a lightweight pre-tuning search to optimize layer-wise bit allocation and quantization scales, achieving competitive accuracy even at extremely low-bit levels.
Refaçade: Editing Object with Given Reference Texture
Positive | Artificial Intelligence
Recent advancements in diffusion models have led to the introduction of Refaçade, a novel method for Object Retexture, which allows for the transfer of local textures from a reference object to a target object in images or videos. This method addresses the limitations of existing approaches by enhancing controllability and precision in texture transfer through innovative designs, including a texture remover trained on 3D mesh renderings.
OmniScaleSR: Unleashing Scale-Controlled Diffusion Prior for Faithful and Realistic Arbitrary-Scale Image Super-Resolution
Positive | Artificial Intelligence
OmniScaleSR has been introduced as a novel approach to arbitrary-scale super-resolution (ASSR), addressing the limitations of traditional super-resolution methods that only function at fixed scales. This model utilizes a scale-controlled diffusion prior to enhance the realism and detail in generated images, overcoming challenges faced by existing diffusion-based models that lack explicit scale control.
A dynamic memory assignment strategy for dilation-based ICP algorithm on embedded GPUs
Positive | Artificial Intelligence
A new memory-efficient optimization strategy for the VANICP point cloud registration algorithm has been proposed, enabling its lightweight execution on embedded GPUs with limited hardware resources. This strategy addresses the high memory demands of the original implementation, which hindered its deployment in resource-constrained environments.
4DLangVGGT: 4D Language-Visual Geometry Grounded Transformer
Positive | Artificial Intelligence
The introduction of 4DLangVGGT, a Transformer-based framework for 4D language grounding, marks a significant advancement in the construction of 4D language fields, essential for applications in embodied AI and augmented/virtual reality. This framework integrates geometric perception and language alignment, addressing limitations of existing methods that rely on scene-specific Gaussian splatting.
Average Calibration Losses for Reliable Uncertainty in Medical Image Segmentation
Positive | Artificial Intelligence
Recent research has introduced a differentiable formulation of marginal L1 Average Calibration Error (mL1-ACE) as an auxiliary loss for deep neural networks in medical image segmentation, addressing the issue of overconfidence in predictions. The study demonstrated that incorporating mL1-ACE significantly reduces calibration errors across four datasets, including ACDC and BraTS, while maintaining high Dice Similarity Coefficients.
Shadow in the Cache: Unveiling and Mitigating Privacy Risks of KV-cache in LLM Inference
Positive | Artificial Intelligence
A recent study has unveiled significant privacy risks associated with the Key-Value (KV) cache used in Large Language Model (LLM) inference. The research highlights that attackers can reconstruct sensitive user inputs from the KV-cache, demonstrating vulnerabilities through various attack vectors, including direct Inversion, Collision, and semantic-based Injection Attacks. To address these issues, the study proposes KV-Cloak, a novel defense mechanism designed to enhance privacy during LLM operations.