Knowledge-Guided Textual Reasoning for Explainable Video Anomaly Detection via LLMs

arXiv — cs.CV · Wednesday, November 12, 2025
The newly proposed Text-based Explainable Video Anomaly Detection (TbVAD) framework leverages language-driven techniques to enhance video anomaly detection, moving away from traditional models that depend purely on visual features. TbVAD operates in three stages: it first transforms video content into detailed captions using a vision-language model; it then organizes these captions into four semantic slots (action, object, context, and environment), forming a structured knowledge base; finally, an LLM reasons over this knowledge base to generate explanations that clarify which semantic factors drive the anomaly decision. Evaluated on the UCF-Crime and XD-Violence benchmarks, TbVAD demonstrates that textual knowledge reasoning can deliver reliable and interpretable results, which is crucial for surveillance applications.
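To make the three-stage flow (caption, then semantic slots, then LLM explanation) concrete, here is a minimal Python sketch. It is illustrative only and not the authors' implementation: the callables `caption_fn`, `slot_fn`, and `llm_fn` and the `SemanticEntry` structure are hypothetical stand-ins for a vision-language captioner, a slot extractor, and an LLM reasoning call.

```python
from dataclasses import dataclass
from typing import Callable, List


# Hypothetical record for one captioned clip, split into the four
# semantic slots described above (action, object, context, environment).
@dataclass
class SemanticEntry:
    action: str
    object: str
    context: str
    environment: str

    def as_prompt_block(self) -> str:
        # Serialize the slots into a small text block for the LLM prompt.
        return (
            f"action: {self.action}\n"
            f"object: {self.object}\n"
            f"context: {self.context}\n"
            f"environment: {self.environment}"
        )


def detect_and_explain(
    clip_frames: List,
    caption_fn: Callable[[List], str],        # stand-in for a vision-language captioner
    slot_fn: Callable[[str], SemanticEntry],  # stand-in for caption -> slot parsing
    llm_fn: Callable[[str], str],             # stand-in for an LLM reasoning call
) -> str:
    """Three-stage sketch: caption -> structured slots -> textual explanation."""
    # Stage 1: turn the clip into a detailed caption.
    caption = caption_fn(clip_frames)

    # Stage 2: organize the caption into the four semantic slots.
    entry = slot_fn(caption)

    # Stage 3: ask the LLM whether the clip is anomalous and why.
    prompt = (
        "Given this structured description of a video clip, decide whether it is "
        "anomalous and explain which semantic slots drive the decision.\n\n"
        + entry.as_prompt_block()
    )
    return llm_fn(prompt)


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs without any model weights.
    demo_caption = "A person smashes a shop window with a bat at night on an empty street."
    explanation = detect_and_explain(
        clip_frames=[],
        caption_fn=lambda frames: demo_caption,
        slot_fn=lambda cap: SemanticEntry(
            action="smashing a window with a bat",
            object="bat, shop window",
            context="night, empty street",
            environment="urban storefront",
        ),
        llm_fn=lambda prompt: (
            "Anomalous: the action slot (smashing a window) combined with the "
            "context slot (night, empty street) points to vandalism."
        ),
    )
    print(explanation)
```

In a real pipeline the lambdas would be replaced by calls to an actual captioning model and an LLM; the point of the sketch is that the anomaly decision is made over the textual slots rather than raw visual features, which is what makes the explanation step straightforward to produce.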