Curing Semantic Drift: A Dynamic Approach to Grounding Generation in Large Vision-Language Models
Large Vision-Language Models (LVLMs) often suffer from 'semantic drift': as generation proceeds, the output progressively detaches from the visual input, which leads to hallucinations. Existing training-free decoding strategies are limited by high computational cost and by reliance on unreliable proxies. Dynamic Logits Calibration (DLC) is introduced as a novel, efficient alternative. It operates in real time, running visual alignment checks as text is generated so that the output stays grounded in visual evidence.
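The summary does not describe how DLC adjusts logits, so the following is only a toy sketch of the general idea of calibrating next-token logits with a visual alignment signal. The function names, the alignment scores, the `baseline`, and the `weight` parameter are all assumptions made for illustration and are not taken from the DLC method itself.

```python
import math


def calibrate_logits(logits, alignment_scores, baseline, weight=1.0):
    """Shift each candidate logit by its visual alignment gain (hypothetical rule).

    logits: dict mapping candidate token -> raw language-model logit
    alignment_scores: dict mapping candidate token -> assumed image-text
        alignment score if that token were appended to the text so far
    baseline: assumed alignment score of the text generated so far
    weight: assumed strength of the visual correction
    """
    return {
        tok: logit + weight * (alignment_scores[tok] - baseline)
        for tok, logit in logits.items()
    }


def softmax(logits):
    """Convert a dict of logits into a dict of probabilities."""
    m = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(v - m) for tok, v in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


# Toy numbers: the language model slightly prefers "cat",
# but the (made-up) alignment scores say the image supports "dog".
logits = {"dog": 2.0, "cat": 2.2}
alignment = {"dog": 0.9, "cat": 0.2}

calibrated = calibrate_logits(logits, alignment, baseline=0.5, weight=2.0)
probs = softmax(calibrated)
best = max(calibrated, key=calibrated.get)
```

In this toy case the uncalibrated logits favor "cat", while the calibrated ones favor "dog", because "dog" raises the assumed alignment score above the baseline and "cat" lowers it. A real system would obtain the alignment scores from an image-text model rather than from hand-written numbers.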
— via World Pulse Now AI Editorial System
