GSAP-ERE: Fine-Grained Scholarly Entity and Relation Extraction Focused on Machine Learning
GSAP-ERE is a manually curated dataset for fine-grained scholarly information extraction in machine learning research. It defines 10 entity types and 18 relation types and contains mentions of 63,000 entities and 35,000 relations drawn from the full text of 100 machine learning publications. By supporting the extraction of fine-grained information, GSAP-ERE aims to improve the understanding and reproducibility of AI research. Models fine-tuned on the dataset outperform state-of-the-art large language model (LLM) prompting methods by a wide margin: 80.6% versus 44.4% on Named Entity Recognition (NER) and 54.0% versus 10.1% on Relation Extraction (RE). This gap suggests that curated datasets such as GSAP-ERE remain important for advancing scholarly information extraction.
— via World Pulse Now AI Editorial System
