World PulseNowPowered by AI

Trending:

Anthropic Finds LLMs Can Be Poisoned Using Small Number of Documents

InfoQ — AI, ML & Data Engineering•Tuesday, November 11, 2025 at 2:00:00 PM

NeutralArtificial Intelligence

Anthropic Finds LLMs Can Be Poisoned Using Small Number of Documents

Anthropic's Alignment Science team released a study on poisoning attacks on LLM training, demonstrating that only 250 malicious examples are sufficient to create a backdoor vulnerability in LLMs. The findings suggest that these attacks become easier as models scale up, raising concerns about the security of AI systems.
This development is significant for Anthropic as it highlights potential vulnerabilities in their LLMs, prompting a need for enhanced security measures in AI training processes. The ability to exploit such weaknesses could undermine trust in AI technologies.
While no related articles were identified, the findings resonate with ongoing discussions in the AI community regarding the security and robustness of machine learning models, emphasizing the importance of safeguarding against potential threats.

— via World Pulse Now AI Editorial System

Was this article worth reading? Share it

Recommended Readings

Sector HQ Weekly Digest - November 17, 2025

DEV Community2 days ago

Sector HQ Weekly Digest - November 17, 2025

NeutralArtificial Intelligence

The Sector HQ Weekly Digest for November 17, 2025, highlights the latest developments in the AI industry, focusing on the performance of top companies. OpenAI leads with a score of 442385.7 and 343 events, followed by Anthropic and Amazon. The report also notes significant movements, with Sony jumping 277 positions in the rankings, reflecting the dynamic nature of the AI sector.

Read full article

via DEV Community

From Fact to Judgment: Investigating the Impact of Task Framing on LLM Conviction in Dialogue Systems

arXiv — cs.CL3 days ago

From Fact to Judgment: Investigating the Impact of Task Framing on LLM Conviction in Dialogue Systems

NeutralArtificial Intelligence

The article investigates the impact of task framing on the conviction of large language models (LLMs) in dialogue systems. It explores how LLMs assess tasks requiring social judgment, contrasting their performance on factual queries with conversational judgment tasks. The study reveals that reframing a task can significantly alter an LLM's judgment, particularly under conversational pressure, highlighting the complexities of LLM decision-making in social contexts.

Read full article

via arXiv — cs.CL

Expert-Guided Prompting and Retrieval-Augmented Generation for Emergency Medical Service Question Answering

arXiv — cs.CL3 days ago

Expert-Guided Prompting and Retrieval-Augmented Generation for Emergency Medical Service Question Answering

PositiveArtificial Intelligence

Large language models (LLMs) have shown potential in medical question answering but often lack the domain-specific expertise required in emergency medical services (EMS). The study introduces EMSQA, a dataset with 24.3K questions across 10 clinical areas and 4 certification levels, along with knowledge bases containing 40K documents and 2M tokens. It also presents Expert-CoT and ExpertRAG, strategies that enhance performance by integrating clinical context, resulting in improved accuracy and exam pass rates for EMS certification.

Read full article

via arXiv — cs.CL

Can LLMs Detect Their Own Hallucinations?

arXiv — cs.CL3 days ago

Can LLMs Detect Their Own Hallucinations?

PositiveArtificial Intelligence

Large language models (LLMs) are capable of generating fluent responses but can sometimes produce inaccurate information, referred to as hallucinations. A recent study investigates whether these models can recognize their own inaccuracies. The research formulates hallucination detection as a classification task and introduces a framework utilizing Chain-of-Thought (CoT) to extract knowledge from LLM parameters. Experimental results show that GPT-3.5 Turbo with CoT detected 58.2% of its own hallucinations, suggesting that LLMs can identify inaccuracies if they possess sufficient knowledge.

Read full article

via arXiv — cs.CL

PustakAI: Curriculum-Aligned and Interactive Textbooks Using Large Language Models

arXiv — cs.CL3 days ago

PustakAI: Curriculum-Aligned and Interactive Textbooks Using Large Language Models

PositiveArtificial Intelligence

PustakAI is a framework designed to create interactive textbooks aligned with the NCERT curriculum for grades 6 to 8 in India. Utilizing Large Language Models (LLMs), it aims to enhance personalized learning experiences, particularly in areas with limited educational resources. The initiative addresses challenges in adapting LLMs to specific curricular content, ensuring accuracy and pedagogical relevance.

Read full article

via arXiv — cs.CL

LaoBench: A Large-Scale Multidimensional Lao Benchmark for Large Language Models

arXiv — cs.CL3 days ago

LaoBench: A Large-Scale Multidimensional Lao Benchmark for Large Language Models

PositiveArtificial Intelligence

LaoBench is a newly introduced large-scale benchmark dataset aimed at evaluating large language models (LLMs) in the Lao language. It consists of over 17,000 curated samples that assess knowledge application, foundational education, and bilingual translation among Lao, Chinese, and English. The dataset is designed to enhance the understanding and reasoning capabilities of LLMs in low-resource languages, addressing the current challenges faced by models in mastering Lao.

Read full article

via arXiv — cs.CL

ICL-Router: In-Context Learned Model Representations for LLM Routing

arXiv — cs.LG3 days ago

ICL-Router: In-Context Learned Model Representations for LLM Routing

PositiveArtificial Intelligence

The research paper titled 'ICL-Router: In-Context Learned Model Representations for LLM Routing' presents a novel routing method for large language models (LLMs) that utilizes in-context vectors to enhance model representation. This two-stage method first embeds queries into vectors and then profiles candidate models based on their performance. The approach aims to improve routing performance and allows for the integration of new models without the need for retraining, addressing scalability challenges in LLM applications.

Read full article

via arXiv — cs.LG

Empirical Characterization of Temporal Constraint Processing in LLMs

arXiv — cs.CL3 days ago

Empirical Characterization of Temporal Constraint Processing in LLMs

NeutralArtificial Intelligence

The study titled 'Empirical Characterization of Temporal Constraint Processing in LLMs' investigates how large language models (LLMs) handle temporal constraints in decision-making. It evaluates eight production-scale models with parameter counts ranging from 2.8 to 8 billion. The findings reveal significant risks, including a bimodal performance distribution, high sensitivity to prompt changes, and a 100% false positive rate in some models. Fine-tuning on synthetic examples improved performance by up to 37 percentage points, but the ability to satisfy temporal constraints remains unproven.

Read full article

via arXiv — cs.CL