Reasoning Models Hallucinate More: Factuality-Aware Reinforcement Learning for Large Reasoning Models
Recent research highlights a significant issue with large language models (LLMs) optimized through reinforcement learning (RL) for reasoning tasks: while reasoning-oriented RL fine-tuning yields impressive capabilities, it also makes the resulting models more prone to hallucination. The finding underscores the complexity of training large reasoning models and the risks of deploying them in real-world applications, and it motivates factuality-aware approaches to RL that aim to preserve factual accuracy during reasoning-focused training.
— via World Pulse Now AI Editorial System
