Fair In-Context Learning via Latent Concept Variables

arXiv — cs.LG · Tuesday, November 18, 2025 at 5:00:00 AM
  • The research investigates in-context learning with large language models (LLMs), using latent concept variables to promote fairness in model predictions (a brief illustrative sketch follows the summary).
  • This development is significant as it addresses the ethical concerns surrounding LLMs, particularly their tendency to inherit biases from training data, which can lead to unfair outcomes in critical domains.
  • The findings contribute to ongoing discussions about the reliability and ethical implications of LLMs, highlighting the need for improved methodologies to mitigate bias and enhance fairness in AI applications across various sectors.
— via World Pulse Now AI Editorial System
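The summary above stays high level, so here is a rough illustration of the general idea rather than the authors' exact method: select in-context demonstrations by similarity to a latent concept vector while keeping the prompt balanced across a sensitive attribute. All names (`latent_concept`, `demo_embeddings`, the balancing rule) are hypothetical assumptions.

```python
import numpy as np

def select_fair_demonstrations(demo_embeddings: np.ndarray,
                               sensitive: np.ndarray,
                               latent_concept: np.ndarray,
                               k: int = 8) -> list[int]:
    """Illustrative sketch: rank candidate demonstrations by cosine similarity
    to a latent concept vector, then greedily balance the sensitive attribute.
    This is an assumption-laden toy, not the paper's algorithm."""
    demos = demo_embeddings / np.linalg.norm(demo_embeddings, axis=1, keepdims=True)
    concept = latent_concept / np.linalg.norm(latent_concept)
    scores = demos @ concept

    # Greedily pick the highest-scoring demos while capping how many come
    # from any one sensitive group, so the prompt stays roughly balanced.
    order = np.argsort(-scores)
    picked, counts = [], {}
    per_group_cap = k // 2 + 1
    for idx in order:
        g = int(sensitive[idx])
        if counts.get(g, 0) < per_group_cap:
            picked.append(int(idx))
            counts[g] = counts.get(g, 0) + 1
        if len(picked) == k:
            break
    return picked

# Toy usage with random data.
rng = np.random.default_rng(0)
emb = rng.normal(size=(100, 32))
groups = rng.integers(0, 2, size=100)
concept = rng.normal(size=32)
print(select_fair_demonstrations(emb, groups, concept, k=4))
```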


Recommended Readings
Nearest Neighbor Projection Removal Adversarial Training
Positive · Artificial Intelligence
Deep neural networks have shown remarkable success in image classification but are still susceptible to adversarial examples. Traditional adversarial training methods improve robustness but often overlook inter-class feature overlap, which contributes to vulnerability. This study introduces a new adversarial training framework that reduces inter-class proximity by projecting out dependencies from both adversarial and clean samples in the feature space. The proposed method enhances feature separability and theoretically lowers the Lipschitz constant of neural networks, improving generalization.
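The blurb names the core operation (projecting out inter-class dependencies in feature space) without details. The sketch below is one hedged reading of what such a projection removal could look like: each feature loses its component along the direction to its nearest other-class neighbor. Function and variable names are illustrative, not the paper's formulation.

```python
import torch

def remove_nn_projection(features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Illustrative sketch: for each feature vector, find its nearest neighbor
    from a *different* class and project out the component along the direction
    to that neighbor, reducing inter-class proximity in feature space."""
    dists = torch.cdist(features, features)              # pairwise distances
    same_class = labels.unsqueeze(0) == labels.unsqueeze(1)
    dists = dists.masked_fill(same_class, float("inf"))  # ignore same-class pairs
    nn_idx = dists.argmin(dim=1)                          # nearest other-class sample

    direction = features[nn_idx] - features               # direction toward that neighbor
    direction = direction / (direction.norm(dim=1, keepdim=True) + 1e-8)
    proj = (features * direction).sum(dim=1, keepdim=True) * direction
    return features - proj                                # feature minus its projection

# Toy usage: 8 samples, 2 classes, 16-dimensional features.
feats = torch.randn(8, 16)
labs = torch.tensor([0, 0, 0, 0, 1, 1, 1, 1])
print(remove_nn_projection(feats, labs).shape)  # torch.Size([8, 16])
```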
Reason-KE++: Aligning the Process, Not Just the Outcome, for Faithful LLM Knowledge Editing
Positive · Artificial Intelligence
The paper introduces Reason-KE++, a new framework designed to enhance the alignment of Large Language Models (LLMs) with new knowledge, particularly in complex reasoning tasks. It identifies a significant issue with existing methods, such as Reason-KE, which focus on format mimicry rather than genuine reasoning, leading to factual inaccuracies. Reason-KE++ employs a Stage-aware Reward mechanism to ensure process-level faithfulness, addressing the limitations of naive outcome-only reinforcement learning that can compromise reasoning integrity.
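The summary mentions a Stage-aware Reward mechanism but not its form. The toy function below only illustrates the general shape of such a reward: partial credit for each verified intermediate reasoning stage plus credit for the final answer. The weights and the notion of a "verified stage" are assumptions, not the Reason-KE++ definition.

```python
def stage_aware_reward(stages: list[bool], final_correct: bool,
                       stage_weight: float = 0.5, outcome_weight: float = 0.5) -> float:
    """Toy sketch: reward process-level faithfulness by crediting each verified
    intermediate stage, then add credit for the final answer, instead of
    rewarding the outcome alone."""
    if not stages:
        return outcome_weight * float(final_correct)
    process_score = sum(stages) / len(stages)          # fraction of faithful stages
    return stage_weight * process_score + outcome_weight * float(final_correct)

# Example: 3 of 4 reasoning stages check out, but the final answer is wrong.
print(stage_aware_reward([True, True, True, False], final_correct=False))  # 0.375
```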
Reconstruction of Manifold Distances from Noisy Observations
Neutral · Artificial Intelligence
The article discusses reconstructing the intrinsic geometry of a manifold from noisy pairwise distance observations. It considers a d-dimensional manifold of diameter 1 equipped with a probability measure that is absolutely continuous with respect to the volume measure. By observing noisy-distance random variables related to the true geodesic distances, the authors propose a new framework for recovering distances among points in a dense subsample of the manifold, improving on previous methods that relied on known moments of the noise.
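As a point of reference only (not the paper's estimator), a common baseline for this setting averages repeated noisy observations, trusts only short-range distances, and recovers long-range geodesics as shortest paths over the resulting neighborhood graph. The observation model and parameters below are assumptions for illustration.

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path

def reconstruct_distances(noisy_obs: np.ndarray, k: int = 10) -> np.ndarray:
    """Illustrative baseline: average m noisy observations of the n x n distance
    matrix, keep each point's k smallest (short-range, more reliable) distances,
    and estimate long-range geodesics via shortest paths on that graph."""
    d_hat = noisy_obs.mean(axis=0)                     # averaging reduces noise variance
    d_hat = 0.5 * (d_hat + d_hat.T)                    # symmetrize
    d_hat = np.maximum(d_hat, 0.0)                     # distances cannot be negative
    np.fill_diagonal(d_hat, 0.0)

    # Sparsify: keep each row's k nearest entries, drop the rest.
    graph = np.zeros_like(d_hat)
    for i in range(d_hat.shape[0]):
        nn = np.argsort(d_hat[i])[1:k + 1]
        graph[i, nn] = d_hat[i, nn]
    graph = np.maximum(graph, graph.T)                 # keep the graph symmetric

    return shortest_path(graph, method="D", directed=False)

# Toy usage: points on a circle observed through additive Gaussian noise.
n, m = 60, 5
theta = np.linspace(0, 2 * np.pi, n, endpoint=False)
true_d = np.abs(theta[:, None] - theta[None, :])
true_d = np.minimum(true_d, 2 * np.pi - true_d)        # geodesic distance on the circle
obs = true_d[None] + 0.05 * np.random.randn(m, n, n)
est = reconstruct_distances(obs, k=8)
print(float(np.abs(est - true_d).mean()))              # mean reconstruction error
```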
Breaking the Dyadic Barrier: Rethinking Fairness in Link Prediction Beyond Demographic Parity
Neutral · Artificial Intelligence
Link prediction is a crucial task in graph machine learning, applicable in areas like social recommendation and knowledge graph completion. Ensuring fairness in link prediction is vital, as biased outcomes can worsen societal inequalities. Traditional methods focus on demographic parity between intra-group and inter-group predictions, but this approach may overlook deeper disparities among subgroups. The authors propose a new framework for assessing fairness in link prediction that goes beyond demographic parity, aiming to better address systemic biases.
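To make the contrast concrete, the sketch below computes the standard dyadic demographic-parity gap (intra-group vs. inter-group positive rates) alongside per-subgroup-pair rates, which can expose disparities that a small aggregate gap hides. The metric formulation here is a common one, assumed for illustration rather than taken from the paper.

```python
from collections import defaultdict
import numpy as np

def dyadic_parity_gap(preds: np.ndarray, groups_u: np.ndarray, groups_v: np.ndarray) -> float:
    """Dyadic demographic parity: |P(link | intra-group) - P(link | inter-group)|."""
    intra = groups_u == groups_v
    return abs(float(preds[intra].mean()) - float(preds[~intra].mean()))

def subgroup_link_rates(preds, groups_u, groups_v):
    """Positive-prediction rate for every (group, group) pair; disparities here
    can remain hidden behind a small dyadic parity gap."""
    rates = defaultdict(list)
    for p, gu, gv in zip(preds, groups_u, groups_v):
        rates[tuple(sorted((int(gu), int(gv))))].append(p)
    return {pair: float(np.mean(vals)) for pair, vals in rates.items()}

# Toy usage: binary link predictions over node pairs from 3 demographic groups.
rng = np.random.default_rng(1)
gu, gv = rng.integers(0, 3, 500), rng.integers(0, 3, 500)
preds = rng.integers(0, 2, 500).astype(float)
print(dyadic_parity_gap(preds, gu, gv))
print(subgroup_link_rates(preds, gu, gv))
```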
Silenced Biases: The Dark Side LLMs Learned to Refuse
Negative · Artificial Intelligence
Safety-aligned large language models (LLMs) are increasingly used in sensitive applications where fairness is crucial. Evaluating their fairness is complex, often relying on standard question-answer schemes that may misinterpret refusal responses as indicators of fairness. This paper introduces the concept of silenced biases, which are unfair preferences hidden within the models' latent space, masked by safety-alignment. Previous methods have limitations, prompting the need for a new approach to assess these biases effectively.
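One generic way to probe preferences that refusals would otherwise mask, offered here purely as an assumed illustration and not as the paper's method, is to score candidate completions by log-probability on counterfactual prompts that differ only in a demographic term, instead of relying on free-form generations.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def completion_logprob(model, tokenizer, prompt: str, completion: str) -> float:
    """Sum of token log-probabilities the model assigns to `completion` given
    `prompt`. Note: splitting at the prompt boundary assumes the tokenizer
    tokenizes prompt + completion consistently (a simplification)."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + completion, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)   # predictions for tokens 1..T-1
    targets = full_ids[0, 1:]
    start = prompt_ids.shape[1] - 1                          # score only completion tokens
    return float(log_probs[start:].gather(1, targets[start:, None]).sum())

# Hypothetical usage: a consistent log-prob gap between counterfactual prompts
# that differ only in a demographic term would suggest a latent preference.
# model = AutoModelForCausalLM.from_pretrained("gpt2")
# tokenizer = AutoTokenizer.from_pretrained("gpt2")
# gap = (completion_logprob(model, tokenizer, prompt_a, " yes")
#        - completion_logprob(model, tokenizer, prompt_b, " yes"))
```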
Classification of Hope in Textual Data using Transformer-Based Models
Positive · Artificial Intelligence
This paper presents a transformer-based approach for classifying hope expressions in text. Three architectures (BERT, GPT-2, and DeBERTa) were developed and compared for binary classification (Hope vs. Not Hope) and multiclass categorization (five hope-related categories). The BERT implementation achieved 83.65% binary and 74.87% multiclass accuracy, with superior performance in extended comparisons. GPT-2 showed the lowest accuracy, while DeBERTa had moderate results but at a higher computational cost. Error analysis highlighted architecture-specific strengths in detecting nuanced hope expressions.
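For readers unfamiliar with the setup, a minimal fine-tuning step for the binary task looks roughly like the sketch below. The toy examples, labels, and hyperparameters are assumptions; the paper's dataset and training configuration are not reproduced here.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hypothetical toy examples standing in for the hope-classification data.
texts = ["I believe things will get better", "Nothing will ever change"]
labels = torch.tensor([1, 0])            # 1 = Hope, 0 = Not Hope

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# One illustrative training step: BERT encodes the text, a classification head
# predicts Hope vs. Not Hope, and the model returns the cross-entropy loss.
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
outputs = model(**batch, labels=labels)
outputs.loss.backward()
optimizer.step()

print(outputs.logits.argmax(dim=-1))     # predicted classes for the toy batch
```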
On the Entropy Calibration of Language Models
Neutral · Artificial Intelligence
The study on entropy calibration of language models investigates whether the entropy of a model's text generation aligns with its log loss on human text. Previous findings indicate that models often exhibit miscalibration, where entropy increases and text quality declines with longer generations. This paper explores whether scaling can improve miscalibration and if calibration can be achieved without trade-offs, focusing on the relationship between dataset size and miscalibration behavior.
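The two quantities being compared can be computed directly from a model's logits, as in the illustrative sketch below: per-token predictive entropy versus per-token log loss on reference (human) text. The paper's exact estimators and evaluation protocol may differ.

```python
import torch
import torch.nn.functional as F

def entropy_and_logloss(logits: torch.Tensor, targets: torch.Tensor):
    """Per-token predictive entropy vs. per-token log loss on human text.
    Entropy calibration, as summarized above, asks whether the two match.

    logits:  (seq_len, vocab) model logits at each position
    targets: (seq_len,) the human-written next tokens"""
    log_p = F.log_softmax(logits, dim=-1)
    entropy = -(log_p.exp() * log_p).sum(dim=-1).mean()   # H(p_model), averaged per token
    log_loss = F.nll_loss(log_p, targets)                 # -log p(human token), averaged
    return float(entropy), float(log_loss)

# Toy usage with random logits over a vocabulary of 50 tokens.
logits = torch.randn(20, 50)
targets = torch.randint(0, 50, (20,))
h, nll = entropy_and_logloss(logits, targets)
print(f"entropy per token: {h:.3f}  log loss per token: {nll:.3f}")
```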
Consistency Is the Key: Detecting Hallucinations in LLM Generated Text By Checking Inconsistencies About Key Facts
Positive · Artificial Intelligence
Large language models (LLMs) are known for their impressive text generation abilities but often produce factually incorrect content, a phenomenon termed 'hallucination.' This issue is particularly concerning in critical fields such as healthcare and finance. Traditional methods for detecting these inaccuracies require multiple API calls, leading to increased costs and latency. The introduction of CONFACTCHECK offers a novel solution, allowing for efficient hallucination detection by ensuring consistency in factual responses generated by LLMs without needing external knowledge bases.
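A generic consistency-based check of this kind can be sketched as follows: extract key facts from the generated text, re-ask the same model about each fact, and flag facts whose re-generated answer disagrees. The helper callables (`extract_facts`, `ask_model`) and the string-matching comparison are placeholders; CONFACTCHECK's actual pipeline may extract and compare facts differently.

```python
from typing import Callable

def flag_inconsistent_facts(text: str,
                            extract_facts: Callable[[str], list[tuple[str, str]]],
                            ask_model: Callable[[str], str]) -> list[str]:
    """Sketch of a consistency check: `extract_facts` yields (question,
    claimed_answer) pairs for key facts in the generated text, `ask_model`
    re-queries the same LLM, and disagreements are flagged as likely
    hallucinations, with no external knowledge base involved."""
    flagged = []
    for question, claimed in extract_facts(text):
        reanswer = ask_model(question)
        if claimed.strip().lower() not in reanswer.strip().lower():
            flagged.append(f"{question} -> claimed '{claimed}', re-answered '{reanswer}'")
    return flagged

# Toy usage with stub callables standing in for a real LLM.
facts = lambda t: [("In what year was the company founded?", "1998")]
model = lambda q: "It was founded in 2001."
print(flag_inconsistent_facts("The company, founded in 1998, ...", facts, model))
```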