Ground Truth Generation for Multilingual Historical NLP using LLMs

arXiv — cs.CL · Wednesday, November 19, 2025 at 5:00:00 AM
  • The research focuses on employing large language models to create ground truth data for multilingual historical NLP.
  • This development is crucial as it demonstrates that even small amounts of synthetic data can substantially enhance NLP tools, particularly for under-resourced languages.
— via World Pulse Now AI Editorial System


Continue Reading
Sequoia-Backed Pennylane Eyes Funding at $4.3 Billion Valuation
Positive · Artificial Intelligence
Pennylane, a French startup specializing in accounting software, is reportedly in discussions for a new funding round that could value the company at $4.25 billion, nearly double its previous valuation from just seven months ago.
How Language Directions Align with Token Geometry in Multilingual LLMs
Positive · Artificial Intelligence
A recent study on multilingual large language models (LLMs) reveals that language information is distinctly organized within their internal representation space, particularly showing significant separation in the first transformer block. This comprehensive probing study analyzed six multilingual LLMs across all 268 transformer layers, utilizing both linear and nonlinear probes alongside a new Token-Language Alignment analysis.
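The linear probes mentioned above can be pictured as small classifiers trained to predict a token's language from a frozen hidden-state vector. The sketch below is illustrative only: the "hidden states" are synthetic Gaussian clusters standing in for real transformer activations, and the probe is a plain multinomial logistic regression trained with gradient descent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for hidden states: three "languages", each a Gaussian
# cluster in a 16-dimensional representation space. A real probing study
# would extract these vectors from a chosen transformer layer instead.
n_per_lang, dim, n_langs = 100, 16, 3
centers = rng.normal(0, 3, size=(n_langs, dim))
X = np.vstack([rng.normal(c, 1.0, size=(n_per_lang, dim)) for c in centers])
y = np.repeat(np.arange(n_langs), n_per_lang)

def train_linear_probe(X, y, n_classes, lr=0.1, steps=300):
    """Multinomial logistic regression fit with plain gradient descent."""
    W = np.zeros((X.shape[1], n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[y]
    for _ in range(steps):
        logits = X @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / len(X)
        W -= lr * (X.T @ grad)
        b -= lr * grad.sum(axis=0)
    return W, b

W, b = train_linear_probe(X, y, n_langs)
acc = ((X @ W + b).argmax(axis=1) == y).mean()
print(f"probe accuracy: {acc:.2f}")
```

High probe accuracy at a given layer is the kind of evidence the study uses to argue that language identity is linearly recoverable from that layer's representations.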
Shona spaCy: A Morphological Analyzer for an Under-Resourced Bantu Language
Positive · Artificial Intelligence
A new open-source morphological analyzer for the Shona language, named Shona spaCy, has been developed using the spaCy framework. This tool integrates a curated JSON lexicon and linguistically grounded rules to enhance the analysis of noun-class prefixes, verbal subject concords, and other morphological features, achieving 90% accuracy in part-of-speech tagging and 88% in morphological features.
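The noun-class prefix analysis described above can be sketched as a longest-prefix lookup against a small table. This is a heavily simplified, hypothetical illustration: the real Shona spaCy tool uses a curated JSON lexicon and spaCy pipeline components, and the prefix table below covers only a handful of the Bantu noun classes.

```python
# Illustrative subset of Shona noun-class prefixes (not exhaustive; the
# same surface prefix can mark more than one class, e.g. "mu" for 1 and 3,
# which is why the real tool also consults a lexicon).
NOUN_CLASS_PREFIXES = {
    "mu": 1,   # e.g. mu-nhu "person"
    "va": 2,   # e.g. va-nhu "people"
    "chi": 7,  # e.g. chi-ngwa "bread"
    "zvi": 8,  # plural counterpart of class 7
}

def analyze_noun(word: str):
    """Return (prefix, noun_class, stem) for the longest matching prefix,
    or (None, None, word) if no known prefix matches."""
    for prefix in sorted(NOUN_CLASS_PREFIXES, key=len, reverse=True):
        if word.startswith(prefix) and len(word) > len(prefix):
            return prefix, NOUN_CLASS_PREFIXES[prefix], word[len(prefix):]
    return None, None, word

print(analyze_noun("vanhu"))    # ('va', 2, 'nhu')
print(analyze_noun("chingwa"))  # ('chi', 7, 'ngwa')
```

A rule table like this is where linguistically grounded rules help most: ambiguous prefixes are disambiguated by the lexicon rather than by the prefix string alone.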
How Well Do LLMs Understand Tunisian Arabic?
Negative · Artificial Intelligence
A recent study highlights the limitations of Large Language Models (LLMs) in understanding Tunisian Arabic, also known as Tunizi. This research introduces a new dataset that includes parallel translations in Tunizi, standard Tunisian Arabic, and English, aiming to benchmark LLMs on their comprehension of this low-resource language. The findings indicate that the neglect of such dialects may hinder millions of Tunisians from engaging with AI in their native language.
MUCH: A Multilingual Claim Hallucination Benchmark
Positive · Artificial Intelligence
A new benchmark named MUCH has been introduced to assess Claim-level Uncertainty Quantification (UQ) in Large Language Models (LLMs). This benchmark includes 4,873 samples in English, French, Spanish, and German, and provides 24 generation logits per token, enhancing the evaluation of UQ methods under realistic conditions.
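Per-token generation logits are the raw material for many UQ baselines. As a hedged sketch (not one of MUCH's evaluated methods), a simple claim-level score is the mean negative log-probability of the generated tokens under a softmax over the stored logits; the toy example stores 4 logits per token where the benchmark stores 24.

```python
import math

def token_logprob(logits, chosen_index):
    """Log-probability of the chosen token under a softmax over the
    stored logits, computed with the log-sum-exp trick."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return logits[chosen_index] - log_z

def claim_uncertainty(token_logits, chosen):
    """Mean negative log-probability across a claim's tokens: a simple
    baseline score where higher means less confident."""
    nll = [-token_logprob(l, c) for l, c in zip(token_logits, chosen)]
    return sum(nll) / len(nll)

# Toy claim of two tokens, 4 stored logits each.
peaked = claim_uncertainty([[5.0, 1.0, 0.5, 0.1], [2.0, 1.9, 1.8, 1.7]],
                           chosen=[0, 0])
flat = claim_uncertainty([[1.0, 1.0, 1.0, 1.0]] * 2, chosen=[0, 0])
print(peaked < flat)  # near-flat logits yield higher uncertainty
```

Shipping the logits with the benchmark lets scores like this be recomputed without rerunning the model, which is what makes evaluation "under realistic conditions" cheap to reproduce.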
LangMark: A Multilingual Dataset for Automatic Post-Editing
Positive · Artificial Intelligence
LangMark has been introduced as a new multilingual dataset aimed at enhancing automatic post-editing (APE) for machine-translated texts, featuring 206,983 triplets across seven languages including Brazilian Portuguese, French, and Japanese. This dataset is human-annotated by expert linguists to improve translation quality and reduce reliance on human intervention.