When Reject Turns into Accept: Quantifying the Vulnerability of LLM-Based Scientific Reviewers to Indirect Prompt Injection

arXiv — cs.CLFriday, December 12, 2025 at 5:00:00 AM
  • A recent study has examined the vulnerability of Large Language Model (LLM)-based scientific reviewers to indirect prompt injection, focusing on the potential to alter peer review decisions from 'Reject' to 'Accept'. This research introduces a new metric, the Weighted Adversarial Vulnerability Score (WAVS), and evaluates 15 attack strategies across 13 LLMs, including GPT-5 and DeepSeek, using a dataset of 200 scientific papers.
  • The findings are significant as they highlight the risks associated with the increasing reliance on LLMs in scientific peer review processes, particularly as institutions like AAAI and Stanford implement AI-driven assessment systems. Understanding these vulnerabilities is crucial for maintaining the integrity of scientific evaluations.
  • This development reflects broader concerns regarding the reliability of AI in critical tasks, such as political fact-checking and error detection in published literature. As LLMs become more integrated into various domains, the need for robust evaluation frameworks and safeguards against manipulation becomes increasingly urgent.
— via World Pulse Now AI Editorial System

Was this article worth reading? Share it

Recommended apps based on your readingExplore all apps
Continue Readings
AAAI 2025 presidential panel on the future of AI research – video discussion on AGI
NeutralArtificial Intelligence
In March 2025, the Association for the Advancement of Artificial Intelligence (AAAI) released a comprehensive report on the Future of AI Research, led by outgoing President Francesca Rossi. The report addresses 17 key AI topics, aiming to outline the future trajectory of AI research in a structured manner.
TheMCPCompany: Creating General-purpose Agents with Task-specific Tools
NeutralArtificial Intelligence
TheMCPCompany has introduced a benchmark for evaluating tool-calling agents that utilize the Model Context Protocol (MCP) to interact with various real-world services, significantly expanding the tool sets available for Large Language Models (LLMs). This initiative aims to enhance the performance and cost-effectiveness of these agents by leveraging over 18,000 tools through REST APIs.
From Lab to Reality: A Practical Evaluation of Deep Learning Models and LLMs for Vulnerability Detection
NeutralArtificial Intelligence
A recent study evaluated the effectiveness of deep learning models and large language models (LLMs) for vulnerability detection, focusing on models like ReVeal and LineVul across four datasets: Juliet, Devign, BigVul, and ICVul. The research highlights the gap between benchmark performance and real-world applicability, emphasizing the need for systematic evaluation in practical scenarios.
The 2025 Foundation Model Transparency Index
NegativeArtificial Intelligence
The 2025 Foundation Model Transparency Index reveals a significant decline in transparency among foundation model developers, with the average score dropping from 58 in 2024 to 40 in 2025. This index evaluates companies like Alibaba, DeepSeek, and xAI for the first time, highlighting their opacity regarding training data and model usage.
Workflow is All You Need: Escaping the "Statistical Smoothing Trap" via High-Entropy Information Foraging and Adversarial Pacing
PositiveArtificial Intelligence
A new study introduces the DeepNews Framework, which aims to overcome the limitations of large language models (LLMs) in long-form text generation by addressing the 'Statistical Smoothing Trap.' This framework incorporates cognitive processes similar to those of expert financial journalists, enhancing the quality of generated content.
LabelFusion: Learning to Fuse LLMs and Transformer Classifiers for Robust Text Classification
PositiveArtificial Intelligence
LabelFusion has been introduced as a novel fusion ensemble for text classification, combining traditional transformer-based classifiers like RoBERTa with Large Language Models (LLMs) such as OpenAI GPT and Google Gemini. This approach aims to enhance the accuracy and cost-effectiveness of predictions across multi-class and multi-label tasks by integrating embeddings and per-class scores into a multi-layer perceptron for final predictions.
Nvidia Denies ‘Far-Fetched’ Claims as Chip-Smuggling Allegations Target China’s DeepSeek
NegativeArtificial Intelligence
Nvidia has denied allegations of chip smuggling involving its products and the Chinese AI startup DeepSeek, labeling the claims as 'far-fetched.' These allegations highlight potential failures in physical export controls and raise concerns about the ongoing battle against black-market chip sales.
Squashing 'fantastic bugs' hidden in AI benchmarks
NegativeArtificial Intelligence
A Stanford team has identified that approximately 5% of the benchmarks used in AI development may contain significant flaws, which could have serious implications for the reliability of AI systems. This review involved an extensive analysis of thousands of benchmarks, raising concerns about the integrity of AI evaluations.

Ready to build your own newsroom?

Subscribe to unlock a personalised feed, podcasts, newsletters, and notifications tailored to the topics you actually care about