When Reject Turns into Accept: Quantifying the Vulnerability of LLM-Based Scientific Reviewers to Indirect Prompt Injection
Neutral · Artificial Intelligence
- A recent study examines how vulnerable Large Language Model (LLM)-based scientific reviewers are to indirect prompt injection, in which text embedded in a submission manipulates the reviewer into flipping a peer-review decision from 'Reject' to 'Accept'. The work introduces a new metric, the Weighted Adversarial Vulnerability Score (WAVS), and evaluates 15 attack strategies across 13 LLMs, including GPT-5 and DeepSeek, on a dataset of 200 scientific papers (see the sketch after this list).
- The findings matter because they highlight the risks of growing reliance on LLMs in scientific peer review, particularly as organizations such as AAAI and Stanford adopt AI-driven assessment systems. Understanding these vulnerabilities is essential for preserving the integrity of scientific evaluation.
- This development reflects broader concerns regarding the reliability of AI in critical tasks, such as political fact-checking and error detection in published literature. As LLMs become more integrated into various domains, the need for robust evaluation frameworks and safeguards against manipulation becomes increasingly urgent.
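The summary does not define how WAVS is computed, so the Python sketch below is purely illustrative: it shows one plausible way to aggregate per-attack decision-flip rates into a single weighted score per reviewer model. The attack names, weights, and flip rates are invented placeholders, not results from the study.

```python
# Illustrative sketch only: WAVS is not specified in the summary, so the
# weighting scheme, attack names, and numbers here are hypothetical.

def weighted_adversarial_vulnerability_score(flip_rates, weights):
    """Aggregate per-attack decision-flip rates (Reject -> Accept) into a
    weighted score for one LLM reviewer.

    flip_rates: dict mapping attack name -> fraction of papers whose review
                decision flipped after the injected text was added.
    weights:    dict mapping attack name -> relative weight (e.g., reflecting
                attack stealth or practicality); need not sum to 1.
    """
    total_weight = sum(weights[a] for a in flip_rates)
    return sum(weights[a] * flip_rates[a] for a in flip_rates) / total_weight


# Hypothetical example: three injection strategies evaluated on one reviewer model.
flip_rates = {
    "hidden_white_text": 0.42,   # instruction hidden in the PDF text layer
    "fake_reviewer_note": 0.31,  # injected "meta-review" paragraph
    "benign_praise": 0.08,       # subtle flattering framing, no explicit command
}
weights = {
    "hidden_white_text": 1.0,
    "fake_reviewer_note": 0.8,
    "benign_praise": 0.5,
}

print(f"WAVS (sketch): {weighted_adversarial_vulnerability_score(flip_rates, weights):.3f}")
```

In this sketch, stealthier or more practical attacks receive larger weights, so a model that only falls for blatant injections scores lower than one that also succumbs to subtle framing.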
— via World Pulse Now AI Editorial System