Hallucinate or Memorize? The Two Sides of Probabilistic Learning in Large Language Models

arXiv — cs.CL · November 13, 2025
The study 'Hallucinate or Memorize? The Two Sides of Probabilistic Learning in Large Language Models', published on arXiv, examines how a paper's citation count affects the accuracy of bibliographic records generated by large language models (LLMs). Using GPT-4.1, the authors generated and then verified 100 citations spanning several computer science domains. The results show a strong correlation between citation count and factual accuracy: the model produces far more reliable citations for highly cited papers, with accuracy improving markedly once a paper's citation count exceeds roughly 1,000. This pattern suggests that for well-known work the model recalls memorized bibliographic records rather than probabilistically composing new ones. The finding matters because hallucinated references undermine the credibility of LLM outputs in academic and professional settings, underscoring the need for mechanisms that verify generated citations.
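
As a rough illustration of the analysis the summary describes, here is a minimal sketch, not the paper's code: the records, the choice of a point-biserial correlation, and all names are assumptions. It pairs each generated citation with a verified/unverified flag and a citation count, measures how strongly the two are related, and applies the paper's approximately-1,000-citation threshold as a simple decision rule.

```python
"""Sketch of a citation-verification analysis (placeholder data, not the
paper's 100 GPT-4.1-generated records)."""
from dataclasses import dataclass
import math

@dataclass
class Citation:
    title: str
    citation_count: int   # citations of the referenced paper
    accurate: bool        # did the generated record match the real one?

# Hypothetical verification outcomes standing in for the paper's data.
records = [
    Citation("Attention Is All You Need", 100_000, True),
    Citation("Some Obscure Workshop Paper", 12, False),
    Citation("BERT: Pre-training of Deep Bidirectional Transformers", 80_000, True),
    Citation("Rarely Cited Technical Report", 40, False),
]

def point_biserial(xs: list, ys: list) -> float:
    """Point-biserial correlation between a continuous variable (log citation
    count here) and a binary outcome (record verified accurate or not)."""
    n = len(xs)
    mean_x = sum(xs) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    group1 = [x for x, y in zip(xs, ys) if y]
    group0 = [x for x, y in zip(xs, ys) if not y]
    p = len(group1) / n
    m1 = sum(group1) / len(group1)
    m0 = sum(group0) / len(group0)
    return (m1 - m0) / std_x * math.sqrt(p * (1 - p))

# Log-scale the counts, since they span several orders of magnitude.
log_counts = [math.log10(c.citation_count + 1) for c in records]
flags = [c.accurate for c in records]
print(f"point-biserial r = {point_biserial(log_counts, flags):.3f}")

# The headline observation as a simple decision rule: records for papers
# above roughly 1,000 citations were far more likely to be accurate.
for c in records:
    regime = "likely memorized" if c.citation_count > 1_000 else "hallucination-prone"
    print(f"{c.title[:40]:40s} {c.citation_count:>7d}  {regime}")
```

On the placeholder data this prints a correlation near 1.0; the paper's actual statistic and verification procedure may differ.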
— via World Pulse Now AI Editorial System
