MoHoBench: Assessing Honesty of Multimodal Large Language Models via Unanswerable Visual Questions
Neutral · Artificial Intelligence
- MoHoBench has been introduced as a benchmark to evaluate the honesty of Multimodal Large Language Models (MLLMs) in response to unanswerable visual questions, addressing a critical gap in assessing AI reliability. The study analyzed 28 MLLMs using over 12,000 visual question samples, revealing significant shortcomings in their honesty.
- This development is crucial as it underscores the importance of trustworthiness in AI systems, particularly in applications where accurate information is vital. The findings indicate that many MLLMs fail to provide reliable answers, raising concerns about their deployment in real-world settings.
- The challenges of ensuring honesty in AI models are echoed in broader discussions about the reliability of AI systems.
— via World Pulse Now AI Editorial System
