GSAP-ERE: Fine-Grained Scholarly Entity and Relation Extraction Focused on Machine Learning

arXiv, cs.CL | Thursday, November 13, 2025 at 5:00:00 AM
GSAP-ERE is a manually curated dataset for fine-grained scholarly information extraction in machine learning research. It defines 10 entity types and 18 relation types, and contains mentions of 63,000 entities and 35,000 relations annotated in the full text of 100 machine learning publications. By supporting the extraction of fine-grained information, GSAP-ERE aims to improve the understanding and reproducibility of AI research. Models fine-tuned on the dataset outperform state-of-the-art large language model (LLM) prompting methods, reaching 80.6% on Named Entity Recognition (NER) and 54.0% on Relation Extraction (RE), compared with 44.4% and 10.1%, respectively, for LLM prompting. These results suggest that curated datasets such as GSAP-ERE remain important for scholarly information extraction.
— via World Pulse Now AI Editorial System
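To put the reported figures in proportion, the short Python sketch below works through the arithmetic. It uses only the numbers quoted in the summary. The summary does not name the evaluation metric, so the scores are treated as plain percentages, and the variable and function names are illustrative assumptions rather than anything defined by the dataset's authors.

```python
# Scores as quoted in the summary; the metric is not named there,
# so they are handled as plain percentages.
scores = {
    "NER": {"fine_tuned": 80.6, "llm_prompting": 44.4},
    "RE": {"fine_tuned": 54.0, "llm_prompting": 10.1},
}


def gap_in_points(task: str) -> float:
    """Absolute difference, in percentage points, between the two approaches."""
    s = scores[task]
    return round(s["fine_tuned"] - s["llm_prompting"], 1)


# Dataset size figures as reported in the summary.
publications = 100
entity_mentions = 63_000
relation_mentions = 35_000

# Average annotation density per publication.
entities_per_paper = entity_mentions / publications
relations_per_paper = relation_mentions / publications

for task in scores:
    print(f"{task}: fine-tuned models lead by {gap_in_points(task)} points")
print(f"about {entities_per_paper:.0f} entity mentions per publication")
print(f"about {relations_per_paper:.0f} relation mentions per publication")
```

On these numbers the gap is considerably wider for relation extraction than for entity recognition, and each publication carries several hundred annotations on average.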


Recommended Readings
FlakyGuard: Automatically Fixing Flaky Tests at Industry Scale
Positive | Artificial Intelligence
Flaky tests, which unpredictably pass or fail, hinder developer productivity and delay software releases. FlakyGuard is introduced as a solution that leverages large language models (LLMs) to automatically repair these tests. Unlike previous methods like FlakyDoctor, FlakyGuard effectively addresses the context problem by structuring code as a graph and selectively exploring relevant contexts. Evaluation of FlakyGuard on real-world tests indicates a repair success rate of 47.6%, with 51.8% of fixes accepted by developers, marking a significant improvement over existing approaches.
Mobile Jamming Mitigation in 5G Networks: A MUSIC-Based Adaptive Beamforming Approach
Positive | Artificial Intelligence
Mobile jammers present a significant threat to 5G networks, especially in military settings. An innovative anti-jamming framework has been proposed, utilizing Multiple Signal Classification (MUSIC) for precise Direction-of-Arrival (DoA) estimation and Minimum Variance Distortionless Response (MVDR) beamforming for adaptive interference suppression. Machine learning enhances DoA prediction for mobile jammers. Simulations indicate an average Signal-to-Noise Ratio (SNR) improvement of 9.58 dB and a DoA estimation accuracy of up to 99.8%, showcasing the framework's effectiveness in dynamic environ…
Soft-Label Training Preserves Epistemic Uncertainty
Positive | Artificial Intelligence
The article discusses the concept of soft-label training in machine learning, which preserves epistemic uncertainty by treating annotation distributions as ground truth. Traditional methods often collapse diverse human judgments into single labels, leading to misalignment between model certainty and human perception. Empirical results show that soft-label training reduces KL divergence from human annotations by 32% and enhances correlation between model and annotation entropy by 61%, while maintaining accuracy comparable to hard-label training.
DataSage: Multi-agent Collaboration for Insight Discovery with External Knowledge Retrieval, Multi-role Debating, and Multi-path Reasoning
Positive | Artificial Intelligence
DataSage is a novel multi-agent framework designed to enhance insight discovery in data analytics. It addresses limitations of existing data insight agents by incorporating external knowledge retrieval, a multi-role debating mechanism, and multi-path reasoning. These features aim to improve the depth of analysis and the accuracy of insights generated, thereby assisting organizations in making informed decisions in a data-driven environment.
A Machine Learning-Based Multimodal Framework for Wearable Sensor-Based Archery Action Recognition and Stress Estimation
Positive | Artificial Intelligence
A new machine learning-based multimodal framework has been developed for wearable sensor-based archery action recognition and stress estimation. This innovative system utilizes a wrist-worn device equipped with an accelerometer and photoplethysmography (PPG) sensor to collect synchronized motion and physiological data during archery sessions. The framework achieves high accuracy in motion recognition and stress estimation, marking a significant advancement in the analysis of athletes' performance in precision sports.
Scalable Feature Learning on Huge Knowledge Graphs for Downstream Machine Learning
Positive | Artificial Intelligence
The paper presents SEPAL, a Scalable Embedding Propagation Algorithm aimed at improving the use of large knowledge graphs in machine learning. Current models face limitations in optimizing for link prediction and require extensive engineering for large graphs due to GPU memory constraints. SEPAL addresses these issues by ensuring global embedding consistency through localized optimization and message passing, evaluated across seven large-scale knowledge graphs for various downstream tasks.
Automatic Fact-checking in English and Telugu
Neutral | Artificial Intelligence
The research paper explores the challenge of false information and the effectiveness of large language models (LLMs) in verifying factual claims in English and Telugu. It presents a bilingual dataset and evaluates various approaches for classifying the veracity of claims. The study aims to enhance the efficiency of fact-checking processes, which are often labor-intensive and time-consuming.
GMAT: Grounded Multi-Agent Clinical Description Generation for Text Encoder in Vision-Language MIL for Whole Slide Image Classification
Positive | Artificial Intelligence
The article presents a new framework called GMAT, which enhances Multiple Instance Learning (MIL) for whole slide image (WSI) classification. By integrating vision-language models (VLMs), GMAT aims to improve the generation of clinical descriptions that are more expressive and medically specific. This addresses limitations in existing methods that rely on large language models (LLMs) for generating descriptions, which often lack domain grounding and detailed medical specificity, thus improving alignment with visual features.