NLP Datasets for Idiom and Figurative Language Tasks

arXiv (cs.CL), Friday, November 21, 2025 at 5:00:00 AM
  • The paper highlights the ongoing difficulties LLMs face in processing idiomatic and figurative language, despite the availability of large datasets. New datasets are introduced to help bridge this gap and improve model performance.
  • This development is significant as it addresses a critical limitation in LLMs, which impacts their effectiveness in real
  • The challenges of understanding non
— via World Pulse Now AI Editorial System

Continue Reading
Large language models and research progress: Q&A with an aerospace engineer
Neutral · Artificial Intelligence
The rapid expansion of large language models' (LLMs) capabilities—including web search, code execution, data analysis, and hypothesis generation—is outpacing critical reflection on their role in academic research. This raises questions about the implications of LLMs in various fields and the need for a more structured approach to their integration into research methodologies.
Adaptive Guided Upsampling for Low-light Image Enhancement
Positive · Artificial Intelligence
Adaptive Guided Upsampling (AGU) is a novel method for enhancing low-light images by optimizing multiple quality characteristics simultaneously, such as noise reduction and sharpness improvement. This technique utilizes a guided image approach to transfer features from a reference image to the target image. AGU addresses the challenges posed by high noise levels and low brightness in low-light images, enabling real-time high-quality image rendering from low-resolution inputs.
Video-as-Answer: Predict and Generate Next Video Event with Joint-GRPO
Positive · Artificial Intelligence
The article discusses the introduction of Video-Next-Event Prediction (VNEP), a new modality for predicting the next event in videos using dynamic video responses. This approach aims to enhance procedural learning by providing intuitive visual answers instead of text-based predictions. The challenge is that models must both understand multimodal inputs and reason conditioned on instructions.
Simple Lines, Big Ideas: Towards Interpretable Assessment of Human Creativity from Drawings
Positive · Artificial Intelligence
The paper proposes a data-driven framework for assessing human creativity through drawings, addressing the limitations of subjective expert scoring. It emphasizes that creativity can be evaluated based on both content and style, enhancing the understanding of artistic expression. The framework includes an enriched dataset and a conditional model that predicts creativity scores, content, and style simultaneously.
Multi-Objective $\textit{min-max}$ Online Convex Optimization
Neutral · Artificial Intelligence
The paper discusses advancements in multi-objective online convex optimization (OCO), where an algorithm must select actions based on multiple loss function sequences revealed over time. The focus is on minimizing the 'min-max' regret, which compares the algorithm's performance to an optimal offline benchmark that knows all sequences in advance. This approach broadens the scope of traditional OCO by addressing multiple objectives simultaneously.
CaberNet: Causal Representation Learning for Cross-Domain HVAC Energy Prediction
Positive · Artificial Intelligence
CaberNet is a proposed deep sequence model aimed at improving cross-domain HVAC energy prediction. It addresses the challenges of data scarcity and variability across different buildings and climates, which often lead to overfitting and reliance on expert intervention. By learning invariant representations without prior knowledge, CaberNet enhances the robustness of energy predictions in diverse settings.
Attention-Based Feature Online Conformal Prediction for Time Series
Positive · Artificial Intelligence
The paper presents Attention-Based Feature Online Conformal Prediction (AFOCP) for time series analysis, enhancing online conformal prediction (OCP) by addressing limitations in output space and historical observation treatment. AFOCP utilizes feature space from pre-trained neural networks and incorporates an attention mechanism to adaptively weight historical data, improving prediction accuracy amidst non-stationarity and distribution shifts.
Automatic Uncertainty-Aware Synthetic Data Bootstrapping for Historical Map Segmentation
Positive · Artificial Intelligence
The automated analysis of historical maps has significantly improved due to advancements in deep learning, particularly in computer vision. However, the scarcity of annotated training data for specific historical map corpora poses a challenge. To address this, a method for generating synthetic historical maps by transferring the cartographic style of original maps onto vector data has been proposed, enabling the creation of an unlimited number of training samples for machine learning tasks.