ToDRE: Effective Visual Token Pruning via Token Diversity and Task Relevance

arXiv — cs.CV · Thursday, November 20, 2025 at 5:00:00 AM
  • ToDRE introduces a novel approach to visual token pruning, scoring tokens by diversity and task relevance to improve inference efficiency in large vision-language models (LVLMs); a conceptual sketch follows below.
  • The significance of ToDRE lies in its potential to speed up LVLM inference, addressing the redundancy and inefficiency that have limited earlier pruning methods.
  • This development reflects a broader trend in AI research towards improving model efficiency through token management strategies, as seen in related works that explore compact representations and the limitations of language models.
— via World Pulse Now AI Editorial System
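
Below is a minimal, hypothetical sketch of the general idea: score each visual token by a blend of diversity (greedy farthest-point selection) and task relevance (similarity to a text-query embedding), then keep only the top-scoring tokens. The weighting, scoring rule, and function names are illustrative assumptions, not ToDRE's published algorithm.

```python
import numpy as np

def prune_tokens(vis, query, keep, alpha=0.5):
    """vis: (N, d) visual token embeddings; query: (d,) text embedding.
    Returns indices of the `keep` tokens retained (hypothetical scoring)."""
    vis_n = vis / np.linalg.norm(vis, axis=1, keepdims=True)
    relevance = vis_n @ (query / np.linalg.norm(query))  # task relevance per token
    kept = [int(relevance.argmax())]                     # seed: most relevant token
    min_dist = np.linalg.norm(vis - vis[kept[0]], axis=1)
    while len(kept) < keep:
        # diversity term: distance to the closest already-kept token
        score = alpha * min_dist / (min_dist.max() + 1e-8) + (1 - alpha) * relevance
        score[kept] = -np.inf                            # never re-select a kept token
        nxt = int(score.argmax())
        kept.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(vis - vis[nxt], axis=1))
    return np.array(kept)

# toy usage: keep 32 of 576 patch tokens
tokens = np.random.randn(576, 768).astype(np.float32)
print(prune_tokens(tokens, np.random.randn(768).astype(np.float32), keep=32).shape)
```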


Recommended Readings
Deterministic RAG: A Drop-in Replacement for GraphRAG’s Unstable Planning
Positive · Artificial Intelligence
The article discusses the development of a deterministic RAG (Retrieval-Augmented Generation) system designed to replace GraphRAG's unstable planning. Current RAG systems face issues with reproducibility and debugging due to their reliance on LLM-driven dynamic planning. The new deterministic approach aims to enhance stability and auditability while maintaining the system's generative capabilities.
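
As a concrete illustration, a deterministic plan can replace LLM-driven decomposition and ranking with fixed rules and log every step for audit. A minimal sketch, assuming a toy rule-based decomposition and token-overlap ranking (the class and rules are hypothetical, not the paper's API):

```python
from dataclasses import dataclass, field

@dataclass
class AuditedPlan:
    steps: list = field(default_factory=list)

    def decompose(self, query: str) -> list[str]:
        # fixed rule: split on " and ", so the plan is identical run-to-run
        subqueries = [q.strip() for q in query.split(" and ")]
        self.steps.append(("decompose", query, subqueries))
        return subqueries

    def retrieve(self, subquery: str, corpus: dict[str, str], k: int = 2) -> list[str]:
        # deterministic ranking: token-overlap score, ties broken by doc id
        terms = set(subquery.lower().split())
        scored = sorted(
            corpus,
            key=lambda d: (-len(terms & set(corpus[d].lower().split())), d),
        )[:k]
        self.steps.append(("retrieve", subquery, scored))
        return scored

corpus = {"d1": "graph construction for RAG", "d2": "planning with LLMs",
          "d3": "stable retrieval plans"}
plan = AuditedPlan()
for sq in plan.decompose("graph construction and stable plans"):
    plan.retrieve(sq, corpus)
print(plan.steps)  # full audit trail, identical on every run
```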
In-N-Out: A Parameter-Level API Graph Dataset for Tool Agents
Positive · Artificial Intelligence
The article introduces In-N-Out, a novel dataset designed for tool agents that utilize large language models (LLMs) to interact with external APIs. As tasks grow more complex, these agents often struggle to identify and sequence the correct APIs. In-N-Out addresses this by converting API documentation into a structured graph that captures dependencies, significantly enhancing performance in tool retrieval and multi-tool query generation, nearly doubling the effectiveness of LLMs relying solely on documentation.
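
A minimal sketch of a parameter-level graph of this kind: an edge A → B is added whenever an output field of API A can feed an input parameter of API B, and a topological sort then yields a valid call order. The schema format below is an illustrative assumption, not In-N-Out's actual dataset format.

```python
from graphlib import TopologicalSorter

# toy parameter-level schemas: outputs of one API feed inputs of another
apis = {
    "search_flights": {"inputs": ["city"], "outputs": ["flight_id"]},
    "book_flight":    {"inputs": ["flight_id"], "outputs": ["booking_id"]},
    "send_receipt":   {"inputs": ["booking_id", "email"], "outputs": []},
}

# deps[b] = APIs whose outputs b consumes (edges of the API graph)
deps = {name: set() for name in apis}
for a, sa in apis.items():
    for b, sb in apis.items():
        if a != b and set(sa["outputs"]) & set(sb["inputs"]):
            deps[b].add(a)

# any topological order is a valid multi-tool call sequence
print(list(TopologicalSorter(deps).static_order()))
# ['search_flights', 'book_flight', 'send_receipt']
```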
Unveiling Intrinsic Dimension of Texts: from Academic Abstract to Creative Story
Neutral · Artificial Intelligence
The study on intrinsic dimension (ID) in large language models (LLMs) reveals its significance in understanding text properties. It highlights that ID is uncorrelated with entropy-based metrics, indicating a distinct measure of geometric complexity. The research also shows genre stratification in ID, with scientific texts having lower ID compared to creative writing, suggesting that LLMs perceive scientific text as simpler. This work utilizes cross-encoder analysis and sparse autoencoders for its findings.
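
For readers unfamiliar with ID estimation, the TwoNN estimator (Facco et al., 2017) is one standard way to measure the geometric complexity described here; whether this paper uses TwoNN specifically is an assumption of this sketch.

```python
import numpy as np

def two_nn_id(x: np.ndarray) -> float:
    """x: (N, D) embeddings; returns the TwoNN intrinsic-dimension estimate."""
    # pairwise distances; inf on the diagonal so a point isn't its own neighbour
    d = np.linalg.norm(x[:, None] - x[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    d.sort(axis=1)
    mu = d[:, 1] / d[:, 0]              # ratio of 2nd- to 1st-NN distance
    return len(x) / np.sum(np.log(mu))  # maximum-likelihood estimate

# sanity check: a 2-D plane embedded in 50-D should give ID close to 2
rng = np.random.default_rng(0)
flat = rng.normal(size=(500, 2)) @ rng.normal(size=(2, 50))
print(round(two_nn_id(flat), 1))
```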
Confidential Prompting: Privacy-preserving LLM Inference on Cloud
Positive · Artificial Intelligence
The paper presents a concept called confidential prompting, aimed at securing user prompts from untrusted cloud-hosted large language models (LLMs). It introduces Petridish, a system utilizing confidential computing and a technology named Secure Partitioned Decoding (SPD). Petridish operates within a confidential virtual machine (CVM) to protect LLM parameters and user prompts from external threats, while efficiently managing user requests through a dual-process system.
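
A toy simulation of the data flow such a partition implies: the prompt and its KV cache stay inside a "CVM" object, and the host exchanges only query and attention-output vectors with it. This is a conceptual sketch only; the actual SPD protocol is more involved and runs on real confidential-computing hardware.

```python
import numpy as np

class CVM:
    """Holds the user prompt; never exposes it or its KV cache."""
    def __init__(self, prompt_embeds: np.ndarray):
        self._k = prompt_embeds            # toy stand-in for prompt keys
        self._v = prompt_embeds            # ... and values

    def attend(self, query: np.ndarray) -> np.ndarray:
        # host sends a query vector, gets back only the attention output
        w = np.exp(self._k @ query)
        w /= w.sum()
        return w @ self._v

class Host:
    """Runs decoding with the model weights but never sees the raw prompt."""
    def __init__(self, cvm: CVM):
        self.cvm = cvm

    def decode_step(self, query: np.ndarray) -> np.ndarray:
        ctx = self.cvm.attend(query)       # prompt context, prompt unseen
        return ctx                         # toy: next-token state = context

prompt = np.random.randn(16, 8)            # 16 secret prompt tokens, dim 8
host = Host(CVM(prompt))
print(host.decode_step(np.random.randn(8)).shape)  # (8,)
```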
Fairshare Data Pricing via Data Valuation for Large Language Models
Positive · Artificial Intelligence
The paper discusses the exploitative pricing practices in data markets for large language models (LLMs), which often marginalize data providers. It proposes a fairshare pricing mechanism based on data valuation to enhance seller participation and improve data quality. The framework aims to align incentives between buyers and sellers, ensuring optimal outcomes for both parties while maintaining market sustainability.
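
A minimal sketch of valuation-driven pricing, using leave-one-out marginal utility as a stand-in for the paper's more general valuation framework; the utility function and price scale below are toy assumptions.

```python
SIZES = {"seller_a": 200, "seller_b": 50, "seller_c": 120}

def utility(datasets) -> float:
    # toy utility: diminishing returns in total examples contributed
    n = sum(SIZES[d] for d in datasets)
    return n / (n + 100)

all_sellers = set(SIZES)
for seller in SIZES:
    # leave-one-out valuation: how much quality drops without this seller
    loo = utility(all_sellers) - utility(all_sellers - {seller})
    print(f"{seller}: marginal value {loo:.4f}, fairshare price ${1000 * loo:.2f}")
```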
IWR-Bench: Can LVLMs reconstruct interactive webpage from a user interaction video?
Neutral · Artificial Intelligence
The paper introduces IWR-Bench, a new benchmark designed to evaluate Large Vision-Language Models (LVLMs) in reconstructing interactive webpages from user interaction videos. Unlike existing benchmarks that focus on static screenshots, IWR-Bench includes 113 tasks from 100 real-world websites, featuring diverse interaction complexities and visual styles. Each task is accompanied by user interaction videos and static assets, aiming to enhance multi-modal reasoning capabilities in web applications.
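
An illustrative task record and scoring stub for a benchmark of this shape; the field names and the marker-based functional score are assumptions, not IWR-Bench's actual schema or metrics.

```python
from dataclasses import dataclass

@dataclass
class IWRTask:
    site: str
    video_path: str            # user interaction video
    assets: list[str]          # static assets shipped with the task
    checks: list[str]          # expected markers in the reconstructed page

def evaluate(task: IWRTask, generated_html: str) -> float:
    # toy functional score: fraction of expected markers present in output
    hits = sum(marker in generated_html for marker in task.checks)
    return hits / len(task.checks)

task = IWRTask("shop.example", "videos/shop.mp4",
               ["logo.png"], ['<button id="add-to-cart"', "cart-count"])
print(evaluate(task, '<button id="add-to-cart">Add</button>'))  # 0.5
```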
Scaling Textual Gradients via Sampling-Based Momentum
Positive · Artificial Intelligence
The article discusses the challenges and potential of scaling prompt optimization using LLM-provided textual gradients. While this method has proven effective for automatic prompt engineering, issues arise when increasing training data due to context-length limits and diminishing returns from long-context degradation. The authors propose a new approach called Textual Stochastic Gradient Descent with Momentum (TSGD-M), which utilizes momentum sampling to enhance training stability and scalability.
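
A toy sketch of momentum-weighted sampling over past textual feedback: each step samples prior critiques with exponentially decaying weights (recent feedback favoured) before asking an LLM to revise the prompt. The weighting scheme and the `llm` stub are illustrative assumptions, not the paper's exact TSGD-M procedure.

```python
import random

def llm(instruction: str) -> str:
    return f"revised({instruction[:40]}...)"  # stand-in for a real LLM call

def tsgd_m(prompt: str, critiques: list[str], steps=3, beta=0.7, batch=2):
    history = list(critiques)
    for t in range(steps):
        # momentum: weight critique i by beta^age, so the newest weighs most
        weights = [beta ** (len(history) - 1 - i) for i in range(len(history))]
        sampled = random.choices(history, weights=weights, k=batch)
        prompt = llm(f"improve '{prompt}' using feedback: {sampled}")
        history.append(f"critique of step {t}")  # new textual gradient
    return prompt

print(tsgd_m("Summarize the document.", ["too verbose", "misses key facts"]))
```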
Node-Level Uncertainty Estimation in LLM-Generated SQL
Positive · Artificial Intelligence
A new framework for detecting errors in SQL generated by large language models (LLMs) has been introduced, focusing on estimating uncertainty at the node level within the query's abstract syntax tree (AST). The method employs a semantically aware labeling algorithm to assess node correctness and utilizes a classifier to predict error probabilities for each node. This approach allows for precise diagnostics, significantly improving error detection compared to traditional token log-probabilities across various databases and datasets.
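
A minimal sketch of node-level error flagging on a SQL AST: each node receives an error probability from a (stubbed) classifier and high-risk nodes are surfaced. The toy AST and hard-coded probabilities are assumptions; the paper works on full ASTs with a learned, semantically aware labeler.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str                  # e.g. "column", "table", "predicate"
    text: str
    children: list["Node"] = field(default_factory=list)

def error_prob(node: Node) -> float:
    # stand-in for a trained classifier over node features
    risky = {"predicate": 0.4, "column": 0.2}
    return risky.get(node.kind, 0.05)

def flag_nodes(root: Node, threshold: float = 0.3):
    stack = [root]
    while stack:               # depth-first walk of the query's AST
        n = stack.pop()
        if error_prob(n) >= threshold:
            yield n.kind, n.text, error_prob(n)
        stack.extend(n.children)

ast = Node("select", "SELECT", [
    Node("column", "revenue"),
    Node("table", "sales"),
    Node("predicate", "region = 'EU'"),
])
print(list(flag_nodes(ast)))   # [('predicate', "region = 'EU'", 0.4)]
```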