Can LLMs Faithfully Explain Themselves in Low-Resource Languages? A Case Study on Emotion Detection in Persian

arXiv — cs.CL · Wednesday, November 26, 2025 at 5:00:00 AM
  • A recent study investigates whether large language models (LLMs) can provide faithful self-explanations in low-resource languages, using emotion detection in Persian as a case study. The research compares model-generated explanations with rationales from human annotators and finds discrepancies in faithfulness despite strong classification performance. Two prompting strategies were tested to assess their impact on explanation reliability; one common way to score such explanation-rationale agreement is sketched after these bullets.
  • This development is significant because it highlights the challenges LLMs face in low-resource languages, where misinterpretation can undermine the accuracy of emotional analysis. Understanding these limitations is crucial for improving LLM applications in diverse linguistic contexts.
  • The findings resonate with ongoing discussions about the reliability of LLMs, particularly regarding issues like hallucinations and consistency in outputs. As LLMs continue to evolve, addressing these concerns through frameworks that enhance explanation faithfulness and mitigate biases will be essential for their broader acceptance and effectiveness.
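The summary does not give the paper's faithfulness metric, so the following is a minimal sketch of one common approach: measuring token overlap between the words a model cites in its explanation and the words human annotators marked as evidence. The function name, example words, and F1 formulation are illustrative assumptions, not the paper's method.

```python
def rationale_f1(model_tokens, human_tokens):
    """F1 overlap between tokens a model cites in its explanation
    and tokens human annotators marked as evidence (illustrative metric)."""
    model, human = set(model_tokens), set(human_tokens)
    if not model or not human:
        return 0.0
    overlap = len(model & human)
    precision = overlap / len(model)
    recall = overlap / len(human)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical Persian example annotated for "joy":
model_explanation = ["خوشحال", "عالی"]          # words the model cited
human_rationale = ["خوشحال", "لبخند", "عالی"]   # words annotators marked
print(rationale_f1(model_explanation, human_rationale))  # ≈ 0.8
```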
— via World Pulse Now AI Editorial System


Continue Reading
The Journey of a Token: What Really Happens Inside a Transformer
Neutral · Artificial Intelligence
Large language models (LLMs) utilize the transformer architecture, a sophisticated deep neural network that processes input as sequences of token embeddings. This architecture is crucial for enabling LLMs to understand and generate human-like text, making it a cornerstone of modern artificial intelligence applications.
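As a concrete illustration of that journey, the sketch below follows a batch of token ids through an embedding lookup and a single self-attention step. It is a deliberately bare-bones rendering under stated assumptions (toy dimensions, one head, no positional encodings or MLP), not a full transformer block.

```python
import torch

torch.manual_seed(0)
vocab, d_model, seq_len = 100, 16, 5

# 1. Tokens enter as integer ids and are mapped to dense embeddings.
token_ids = torch.randint(0, vocab, (1, seq_len))
embed = torch.nn.Embedding(vocab, d_model)
x = embed(token_ids)                      # (1, seq_len, d_model)

# 2. Self-attention lets every position mix information from the others.
Wq, Wk, Wv = (torch.nn.Linear(d_model, d_model) for _ in range(3))
q, k, v = Wq(x), Wk(x), Wv(x)
scores = q @ k.transpose(-2, -1) / d_model ** 0.5
attn = torch.softmax(scores, dim=-1)      # (1, seq_len, seq_len) weights
x = attn @ v                              # contextualized token embeddings

print(x.shape)  # torch.Size([1, 5, 16])
```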
A Systematic Analysis of Large Language Models with RAG-enabled Dynamic Prompting for Medical Error Detection and Correction
Positive · Artificial Intelligence
A systematic analysis has been conducted on large language models (LLMs) utilizing retrieval-augmented dynamic prompting (RDP) for medical error detection and correction. The study evaluated various prompting strategies, including zero-shot and static prompting, using the MEDEC dataset to assess the performance of nine instruction-tuned LLMs, including GPT and Claude, in identifying and correcting clinical documentation errors.
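The exact RDP prompt format is not specified in this summary; below is a plausible sketch of dynamic prompting, where a retriever (any stand-in, e.g. BM25 or embedding search over labeled MEDEC-style training data) pulls the k most similar solved examples for each new clinical note and splices them into the prompt. All names and the example record are hypothetical.

```python
def build_rdp_prompt(note, retrieve_similar, k=3):
    """Assemble a dynamic prompt: task instructions plus the k most
    similar solved examples retrieved for this specific clinical note."""
    examples = retrieve_similar(note, k)
    shots = "\n\n".join(
        f"Note: {ex['note']}\nError: {ex['error']}\nCorrection: {ex['fix']}"
        for ex in examples
    )
    return (
        "Identify and correct any clinical documentation error.\n\n"
        f"{shots}\n\n"
        f"Note: {note}\nError:"
    )

# Toy usage with a one-record corpus and a trivial retriever:
corpus = [{"note": "Started on penicillin despite documented allergy.",
           "error": "allergy conflict", "fix": "switch to azithromycin"}]
prompt = build_rdp_prompt("Pt given penicillin; chart lists PCN allergy.",
                          lambda note, k: corpus[:k])
print(prompt)
```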
Improved LLM Agents for Financial Document Question Answering
Positive · Artificial Intelligence
Recent advancements in large language models (LLMs) have led to the development of improved critic and calculator agents for financial document question answering. The research highlights a limitation of traditional critic agents: when oracle labels are unavailable, their performance drops sharply. The new agents not only improve accuracy but also make the interaction between critic and calculator safer in exactly those label-free settings.
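The summary does not detail the agents' design; the sketch below illustrates one label-free pattern consistent with it: a calculator agent that independently re-evaluates arithmetic extracted from a document, and a critic that accepts the LLM's answer only when it matches the recomputation. All function names and the example figures are invented for illustration.

```python
import ast, operator as op

OPS = {ast.Add: op.add, ast.Sub: op.sub, ast.Mult: op.mul, ast.Div: op.truediv}

def calculator(expr):
    """Calculator agent: evaluate an arithmetic expression extracted
    from a financial document, without resorting to eval()."""
    def walk(node):
        if isinstance(node, ast.BinOp):
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Constant):
            return node.value
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval").body)

def critic(answer, recomputed, tol=1e-6):
    """Label-free critic: accept the LLM's answer only if it matches
    an independent recomputation of the cited figures."""
    return abs(answer - recomputed) <= tol

# Toy check: revenue growth (4500 - 4000) / 4000 = 0.125
print(critic(0.125, calculator("(4500 - 4000) / 4000")))  # True
```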
Enhancing Reasoning Skills in Small Persian Medical Language Models Can Outperform Large-Scale Data Training
Positive · Artificial Intelligence
A recent study has demonstrated that enhancing reasoning capabilities in small Persian medical language models can outperform traditional large-scale data training methods. Utilizing Reinforcement Learning with AI Feedback (RLAIF) and Direct Preference Optimization (DPO), researchers translated a medical question-answering dataset into Persian, significantly improving the model's performance in medical reasoning tasks.
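For reference, DPO optimizes a simple contrastive objective over preferred and dispreferred answers. The sketch below shows the standard loss; the batch shapes, beta value, and toy log-probabilities are illustrative, not taken from the study.

```python
import torch
import torch.nn.functional as F

def dpo_loss(pol_chosen, pol_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Direct Preference Optimization loss for one batch.
    Inputs are summed log-probs of the chosen/rejected answers under
    the policy being trained and the frozen reference model."""
    chosen_ratio = pol_chosen - ref_chosen        # log pi/pi_ref, preferred
    rejected_ratio = pol_rejected - ref_rejected  # log pi/pi_ref, dispreferred
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

# Toy values: the policy already prefers the chosen answer slightly.
loss = dpo_loss(torch.tensor([-12.0]), torch.tensor([-15.0]),
                torch.tensor([-13.0]), torch.tensor([-14.0]))
print(loss.item())
```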
Large language models replicate and predict human cooperation across experiments in game theory
Positive · Artificial Intelligence
Large language models (LLMs) have been tested in game-theoretic experiments to evaluate their ability to replicate human cooperation. The study found that the Llama model closely mirrors human cooperation patterns, while Qwen aligns with Nash equilibrium predictions, highlighting the potential of LLMs in simulating human behavior in decision-making contexts.
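The summary does not include the experimental prompts; the sketch below shows one way cooperation rates can be measured in an iterated prisoner's dilemma, with two hand-written stand-in policies in place of actual LLM calls. The 0.9 cooperation probability and the tit-for-tat opponent are assumptions for illustration only.

```python
import random

def cooperation_rate(policy, rounds=1000, seed=0):
    """Fraction of rounds in which `policy` cooperates against a
    tit-for-tat opponent in an iterated prisoner's dilemma."""
    rng = random.Random(seed)
    opp_last, coop = "C", 0
    for _ in range(rounds):
        move = policy(opp_last, rng)
        coop += move == "C"
        opp_last = move  # tit-for-tat copies our previous move
    return coop / rounds

# Stand-ins for the behaviors the summary reports: a mostly cooperative
# "human-like" policy vs. an always-defect Nash-equilibrium policy.
human_like = lambda opp, rng: "C" if rng.random() < 0.9 else "D"
nash = lambda opp, rng: "D"
print(cooperation_rate(human_like), cooperation_rate(nash))
```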
Training-Free Active Learning Framework in Materials Science with Large Language Models
Positive · Artificial Intelligence
A new active learning framework utilizing large language models (LLMs) has been introduced to enhance materials science research by proposing experiments based on text descriptions, overcoming limitations of traditional machine learning models. This framework, known as LLM-AL, was benchmarked against conventional models across four diverse datasets, demonstrating its effectiveness in an iterative few-shot setting.
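LLM-AL's internals are not described here; the sketch below captures only the general shape of such a loop under stated assumptions: the LLM scores text descriptions of candidate experiments, the top candidate is run, and the result grows the few-shot context for the next round. `llm_score` and `run_experiment` are placeholders, not the framework's API.

```python
def llm_active_learning(pool, llm_score, run_experiment, budget=10):
    """Sketch of an LLM-driven active-learning loop over text-described
    candidate experiments. `llm_score` stands in for the model call,
    `run_experiment` for the lab or simulation step."""
    shots = []
    for _ in range(budget):
        # Pick the candidate the LLM rates highest given results so far.
        best = max(pool, key=lambda cand: llm_score(cand, shots))
        pool.remove(best)
        result = run_experiment(best)
        shots.append((best, result))  # iterative few-shot context grows
    return shots
```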
Interpretable Reward Model via Sparse Autoencoder
Positive · Artificial Intelligence
A novel architecture called Sparse Autoencoder-enhanced Reward Model (SARM) has been introduced to improve the interpretability of reward models used in Reinforcement Learning from Human Feedback (RLHF). This model integrates a pretrained Sparse Autoencoder into traditional reward models, aiming to provide clearer insights into how human preferences are mapped to LLM behaviors.
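SARM's exact architecture is beyond this summary; below is a generic sketch of the idea it describes: a sparse autoencoder (linear encoder with ReLU, linear decoder, L1 sparsity penalty) sits between a backbone's hidden state and a scalar reward head, so the reward becomes a function of sparse, individually inspectable features. Dimensions and the penalty weight are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SparseAutoencoderHead(nn.Module):
    """Sketch of an SAE-enhanced reward head: hidden states are mapped
    to a wide sparse feature space, reconstructed, and the sparse
    features drive the scalar reward."""
    def __init__(self, d_model=768, d_features=8192, l1=1e-3):
        super().__init__()
        self.enc = nn.Linear(d_model, d_features)
        self.dec = nn.Linear(d_features, d_model)
        self.reward = nn.Linear(d_features, 1)
        self.l1 = l1

    def forward(self, h):
        z = torch.relu(self.enc(h))          # sparse, inspectable features
        recon_loss = (self.dec(z) - h).pow(2).mean()
        sparsity = self.l1 * z.abs().mean()  # L1 keeps features sparse
        return self.reward(z).squeeze(-1), recon_loss + sparsity

head = SparseAutoencoderHead()
r, aux_loss = head(torch.randn(4, 768))
print(r.shape)  # torch.Size([4])
```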
Point of Order: Action-Aware LLM Persona Modeling for Realistic Civic Simulation
Positive · Artificial Intelligence
A new study introduces a reproducible pipeline for transforming public Zoom recordings into speaker-attributed transcripts, enhancing the realism of civic simulations using large language models (LLMs). This approach includes metadata such as persona profiles and pragmatic action tags, which significantly improve the models' performance in simulating multi-party deliberation.
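As a rough illustration of the data model such a pipeline might produce (field names and the tag vocabulary here are hypothetical, not the study's schema), each transcript turn can carry a speaker, a persona profile, and a pragmatic action tag before being rendered into a conditioning prompt:

```python
from dataclasses import dataclass

@dataclass
class Turn:
    """One speaker-attributed turn with the kinds of metadata the
    summary describes: persona profile and pragmatic action tag."""
    speaker: str
    persona: str   # e.g. a role drawn from a per-speaker profile
    action: str    # pragmatic tag: "motion", "objection", "question", ...
    text: str

def to_prompt(turns):
    """Render tagged turns into a transcript-style conditioning prompt."""
    return "\n".join(
        f"[{t.action}] {t.speaker} ({t.persona}): {t.text}" for t in turns
    )

meeting = [
    Turn("Chair", "council chair", "motion", "I move we adopt the budget."),
    Turn("Member A", "fiscal skeptic", "objection", "Point of order."),
]
print(to_prompt(meeting))
```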