FlipVQA-Miner: Cross-Page Visual Question-Answer Mining from Textbooks

arXiv — cs.LG•Friday, November 21, 2025 at 5:00:00 AM

PositiveArtificial Intelligence

FlipVQA
This innovation is significant as it addresses the high costs and limitations of existing datasets, providing a more efficient means to harness the wealth of information contained in textbooks and exercise materials.
The emergence of such methodologies highlights ongoing challenges in the field of AI, particularly the need for high

— via World Pulse Now AI Editorial System

Read Original

Was this article worth reading? Share it

Continue Readings

arXiv — cs.CL2 days ago

False Sense of Security: Why Probing-based Malicious Input Detection Fails to Generalize

NegativeArtificial Intelligence

Recent research highlights the limitations of probing-based approaches for detecting malicious inputs in Large Language Models (LLMs). Despite their potential, these methods often fail to generalize, as they tend to identify superficial patterns rather than the semantic harmfulness of inputs. Controlled experiments confirm that probes primarily learn instructional patterns and trigger words, raising concerns about the safety and reliability of LLMs in practical applications.

Read full article

via arXiv — cs.CL

arXiv — cs.CV2 days ago

An Image Is Worth Ten Thousand Words: Verbose-Text Induction Attacks on VLMs

PositiveArtificial Intelligence

The paper discusses the challenges associated with Vision-Language Models (VLMs) in generating lengthy outputs with low information density, which leads to increased energy consumption and costs. It introduces a novel verbose-text induction attack (VTIA) that uses adversarial perturbations to optimize output token length, addressing the limitations of existing methods that merely delay the end of output without maximizing length.

Read full article

via arXiv — cs.CV

arXiv — cs.CL2 days ago

Chain of Summaries: Summarization Through Iterative Questioning

PositiveArtificial Intelligence

The article discusses a novel method called Chain of Summaries (CoS) designed to enhance the summarization capabilities of Large Language Models (LLMs). By employing a dialectical approach inspired by Hegel, CoS iteratively refines initial summaries through questioning, resulting in more comprehensive and contextually relevant outputs. Experiments show that CoS significantly outperforms existing summarization techniques, improving Q&A performance and addressing the challenges posed by LLM-unfriendly web content.

Read full article

via arXiv — cs.CL

arXiv — cs.CL2 days ago

An Iterative Question-Guided Framework for Knowledge Base Question Answering

PositiveArtificial Intelligence

The paper presents iQUEST, an innovative framework for Knowledge Base Question Answering (KBQA) that addresses the challenges of multi-hop reasoning. By iteratively breaking down complex queries into simpler sub-questions, iQUEST ensures coherent reasoning paths and retains critical connections. The framework incorporates a Graph Neural Network to enhance reasoning capabilities, making it a significant advancement in the integration of Large Language Models and knowledge graphs.

Read full article

via arXiv — cs.CL

arXiv — cs.CL2 days ago

Multi-dimensional Data Analysis and Applications Basing on LLM Agents and Knowledge Graph Interactions

PositiveArtificial Intelligence

The paper discusses a novel approach to multi-dimensional data analysis that leverages interactions between Large Language Models (LLMs) and Knowledge Graphs (KGs). It addresses the challenges of extracting insights from complex data by proposing a dynamic analytical ecosystem that allows real-time updates and visualization. This method enhances the ability to explore and analyze data, overcoming limitations associated with static knowledge storage in KGs.

Read full article

via arXiv — cs.CL

arXiv — cs.LG2 days ago

KVTuner: Sensitivity-Aware Layer-Wise Mixed-Precision KV Cache Quantization for Efficient and Nearly Lossless LLM Inference

PositiveArtificial Intelligence

KVTuner is a proposed framework aimed at enhancing the efficiency of Large Language Models (LLMs) through sensitivity-aware layer-wise mixed-precision KV cache quantization. This approach addresses existing challenges in LLM inference, such as layer-wise sensitivity and high overhead in decision-making. By optimizing KV quantization precision pairs, KVTuner aims to improve throughput and latency while maintaining the effectiveness of LLMs in various contexts.

Read full article

via arXiv — cs.LG

arXiv — cs.CL2 days ago

Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models

NeutralArtificial Intelligence

Research indicates that adversarial poetry serves as a universal single-turn jailbreak mechanism for Large Language Models (LLMs). High attack-success rates were observed across 25 models, with some exceeding 90%. The study also shows that converting harmful prompts into poetic formats significantly increases the success of these attacks, achieving rates up to 18 times higher than prose. Evaluations by LLM judges confirmed the effectiveness of poetic framing in bypassing safety measures.

Read full article

via arXiv — cs.CL

arXiv — cs.CL2 days ago

PSM: Prompt Sensitivity Minimization via LLM-Guided Black-Box Optimization

PositiveArtificial Intelligence

The paper presents a framework for enhancing the security of system prompts used in Large Language Models (LLMs) through a method called shield appending. This approach adds a protective layer to the original prompt, addressing vulnerabilities that can be exploited by adversarial queries. The study formalizes prompt hardening as a utility-constrained optimization problem, aiming to minimize information leakage while maintaining model performance.

Read full article

via arXiv — cs.CL