Ground Truth Generation for Multilingual Historical NLP using LLMs

arXiv — cs.CL, Wednesday, November 19, 2025 at 5:00:00 AM
  • The research focuses on employing large language models to create ground truth data for multilingual historical NLP.
  • This development is crucial as it demonstrates that even small amounts of synthetic data can substantially enhance NLP tools, particularly for under-resourced languages; a minimal sketch of such a generation loop follows below.
— via World Pulse Now AI Editorial System
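
The summary above only gestures at the workflow, but the core idea, prompting an LLM to emit structured annotations that are then treated as synthetic ground truth, can be sketched. The snippet below is a minimal, hypothetical illustration assuming the OpenAI chat API as the backend, named-entity annotation as the target task, and an invented prompt and model name; the article excerpt does not specify any of these.

```python
# Hypothetical sketch: generating silver-standard NER annotations for
# historical text with an LLM. Backend, model, prompt, and JSON output
# contract are all assumptions made for illustration.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def build_prompt(sentence: str) -> str:
    # Prompt wording is invented, not taken from the paper.
    return (
        "You are an expert annotator of historical texts. Mark every "
        "person (PER), place (LOC), and organisation (ORG) in the sentence "
        "below. Reply with a JSON array of objects shaped like "
        '{"text": "...", "label": "..."} and nothing else.\n\n'
        "Sentence: " + sentence
    )

def annotate(sentence: str) -> list[dict]:
    """Ask the LLM for one sentence's entity annotations."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model; the article names none
        messages=[{"role": "user", "content": build_prompt(sentence)}],
        temperature=0.0,  # deterministic output aids reproducibility
    )
    return json.loads(response.choices[0].message.content)

# Invented example sentence in an early-modern register.
print(annotate("Item, paid to Master Coverdale for his passage to Antwerp."))
```

Output produced this way is silver-standard at best and would normally be spot-checked by a human before being used as ground truth.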


Continue Reading
STAGE: A Benchmark for Knowledge Graph Construction, Question Answering, and In-Script Role-Playing over Movie Screenplays
Neutral · Artificial Intelligence
The introduction of STAGE (Screenplay Text, Agents, Graphs and Evaluation) marks a significant advance in narrative understanding. The benchmark evaluates knowledge graph construction, scene-level event summarization, long-context screenplay question answering, and in-script character role-playing across 150 films in English and Chinese.
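
The summary does not say how STAGE scores any of its four tasks, so the following is only a generic illustration of how knowledge-graph-construction benchmarks are commonly scored: exact-match precision, recall, and F1 over predicted (subject, relation, object) triples. The toy triples are invented.

```python
# Generic triple-level scoring, shown for orientation only; STAGE's actual
# metrics and data format are not described in the summary above.

def triple_f1(predicted: set, gold: set) -> tuple:
    """Exact-match precision, recall, and F1 between two sets of triples."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    return precision, recall, (2 * precision * recall / denom if denom else 0.0)

gold = {("Rick", "owns", "Rick's Cafe"), ("Ilsa", "married_to", "Victor")}
predicted = {("Rick", "owns", "Rick's Cafe"), ("Ilsa", "loves", "Rick")}
print(triple_f1(predicted, gold))  # (0.5, 0.5, 0.5)
```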
It's All About the Confidence: An Unsupervised Approach for Multilingual Historical Entity Linking using Large Language Models
Positive · Artificial Intelligence
A new approach called MHEL-LLaMo has been introduced for multilingual historical entity linking, utilizing a combination of a Small Language Model (SLM) and a Large Language Model (LLM). This unsupervised ensemble method addresses challenges in processing historical texts, such as linguistic variation and noisy inputs, by leveraging a multilingual bi-encoder for candidate retrieval and an instruction-tuned LLM for predictions.
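
The two-stage design described above, retrieval with a multilingual bi-encoder followed by an LLM prediction, maps onto a short pipeline sketch. The encoder name, the toy knowledge base, and the llm_choose() stub below are illustrative assumptions, not details of MHEL-LLaMo.

```python
# Hedged sketch of retrieve-then-predict entity linking. Stage 1 shortlists
# knowledge-base candidates with a multilingual bi-encoder; stage 2 would
# hand them to an instruction-tuned LLM (stubbed out here).
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

KB = {  # toy knowledge base: entity id -> short description
    "Q90": "Paris, capital city of France",
    "Q167646": "Paris, prince of Troy in Greek mythology",
    "Q830149": "Paris, city in Texas, United States",
}
kb_ids = list(KB)
kb_embeddings = encoder.encode([KB[i] for i in kb_ids], convert_to_tensor=True)

def retrieve(mention_in_context: str, k: int = 2) -> list:
    """Return the k entity ids whose descriptions best match the context."""
    query = encoder.encode(mention_in_context, convert_to_tensor=True)
    scores = util.cos_sim(query, kb_embeddings)[0]
    return [kb_ids[i] for i in scores.topk(k).indices.tolist()]

def llm_choose(mention: str, candidates: list) -> str:
    """Hypothetical stand-in for the instruction-tuned LLM whose prediction
    (and confidence) the unsupervised ensemble would use."""
    prompt = f"Mention: {mention}\nCandidates: {[KB[c] for c in candidates]}"
    _ = prompt  # a real system sends this to the LLM; here we take rank 1
    return candidates[0]

mention = "The treaty was signed at Paris in the year of our Lord 1763."
print(llm_choose(mention, retrieve(mention)))
```

Noisy historical spellings are exactly where dense retrieval helps: the bi-encoder matches on meaning in context rather than on exact surface forms.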
How Order-Sensitive Are LLMs? OrderProbe for Deterministic Structural Reconstruction
Neutral · Artificial Intelligence
A recent study introduced OrderProbe, a deterministic benchmark designed to evaluate the structural reconstruction capabilities of large language models (LLMs) using fixed four-character expressions in Chinese, Japanese, and Korean. This sidesteps a weakness of sentence-level restoration from scrambled inputs, which often lacks a unique solution; a fixed idiom admits exactly one correct order.
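
Because each expression has exactly one canonical order, scoring reduces to exact match over scrambled inputs. A minimal harness in that spirit might look like the sketch below; the idiom list and the restore() stub are illustrative assumptions (Chinese examples only, though the benchmark also covers Japanese and Korean).

```python
# Illustrative OrderProbe-style harness: scramble a four-character
# expression, ask a model to restore it, and score by exact match.
import random

IDIOMS = ["画蛇添足", "四面楚歌", "一石二鸟"]  # well-known chengyu, for illustration

def scramble(idiom: str, rng: random.Random) -> str:
    """Return a permutation of the four characters that differs from the input."""
    chars = list(idiom)
    while True:
        rng.shuffle(chars)
        candidate = "".join(chars)
        if candidate != idiom:
            return candidate

def restore(scrambled: str) -> str:
    """Hypothetical stand-in for the LLM under test."""
    return scrambled  # a real harness would prompt the model here

rng = random.Random(0)
correct = sum(restore(scramble(idiom, rng)) == idiom for idiom in IDIOMS)
print(f"exact-match accuracy: {correct}/{len(IDIOMS)}")
```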
Analyzing Bias in False Refusal Behavior of Large Language Models for Hate Speech Detoxification
Neutral · Artificial Intelligence
A recent study analyzed the false refusal behavior of large language models (LLMs) in the context of hate speech detoxification, revealing that these models disproportionately refuse tasks involving higher semantic toxicity and specific target groups, particularly in English datasets.
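
Measuring that behavior comes down to counting refusals per annotated target group. The sketch below assumes a keyword-based refusal detector, a stubbed model call, and a toy dataset; none of these reflect the study's actual setup.

```python
# Hedged sketch: tally false-refusal rates of a detoxification model by
# target group. Refusal markers, the detoxify() stub, and the toy dataset
# are illustrative assumptions.
from collections import Counter

REFUSAL_MARKERS = ("i can't", "i cannot", "i'm sorry", "as an ai")

def is_refusal(reply: str) -> bool:
    reply = reply.lower()
    return any(marker in reply for marker in REFUSAL_MARKERS)

def detoxify(sentence: str) -> str:
    """Hypothetical stand-in for the LLM being probed."""
    return "I cannot help with that."  # degenerate model that always refuses

dataset = [  # (sentence to rewrite, annotated target group), placeholders only
    ("<toxic sentence about group A>", "group A"),
    ("<toxic sentence about group B>", "group B"),
]

refusals, totals = Counter(), Counter()
for sentence, group in dataset:
    totals[group] += 1
    refusals[group] += is_refusal(detoxify(sentence))

for group in totals:
    print(f"{group}: refusal rate {refusals[group] / totals[group]:.0%}")
```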
