Look It Up: Analysing Internal Web Search Capabilities of Modern LLMs
Neutral · Artificial Intelligence
- Modern large language models (LLMs) such as GPT-5-mini and Claude Haiku 4.5 have been evaluated for their internal web search capabilities. The results show that web access improves accuracy on static queries but does little for dynamic queries, largely because the models formulate poor search queries. The study introduces a benchmark that measures both when a web search is actually needed and how effectively models use it when answering in real time (a minimal illustration of this kind of scoring follows these notes).
- These findings carry significant implications for LLM development: models need better calibration in recognizing when to invoke web search. Progress here could change how such models are trained and deployed, ultimately strengthening user experience and trust in AI-generated responses.
- The ongoing exploration of LLM capabilities reflects a broader trend in AI research, where benchmarks such as Bench360 and metrics such as ConCISE are being developed to evaluate different aspects of model performance. These efforts target common challenges, including the trade-off between accuracy and conciseness and the influence of training data on model behavior.
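
To make the "necessity and effectiveness" idea concrete, here is a minimal sketch of how one might score a model's search behavior on a labeled set of static and dynamic queries. This is not the paper's benchmark or code; the data structure, function names, and metrics below are illustrative assumptions only.

```python
# Hypothetical sketch: score whether a model's decision to search matches
# what each query actually needs. All names are illustrative, not the
# benchmark's real API.
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Example:
    query: str
    needs_search: bool    # True for dynamic queries that require fresh information
    model_searched: bool  # did the model invoke its web-search tool?
    answer_correct: bool  # graded against a reference answer


def search_calibration(examples: List[Example]) -> Dict[str, float]:
    """Report how often the model searches when it should (and when it
    should not), plus answer accuracy split by query type."""
    def rate(hits: int, total: int) -> float:
        return hits / total if total else 0.0

    dynamic = [e for e in examples if e.needs_search]
    static = [e for e in examples if not e.needs_search]
    return {
        "search_recall": rate(sum(e.model_searched for e in dynamic), len(dynamic)),
        "search_overuse": rate(sum(e.model_searched for e in static), len(static)),
        "dynamic_accuracy": rate(sum(e.answer_correct for e in dynamic), len(dynamic)),
        "static_accuracy": rate(sum(e.answer_correct for e in static), len(static)),
    }


if __name__ == "__main__":
    demo = [
        Example("Who wrote Pride and Prejudice?", False, False, True),
        Example("What is the current EUR/USD exchange rate?", True, True, False),
        Example("When was the transistor invented?", False, True, True),
    ]
    print(search_calibration(demo))
```

Under this framing, a well-calibrated model would show high search recall on dynamic queries and low search overuse on static ones; the reported gap on dynamic queries would surface as low dynamic accuracy even when recall is high.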
— via World Pulse Now AI Editorial System

