On Extending Direct Preference Optimization to Accommodate Ties

arXiv, cs.CL • Wednesday, November 5, 2025 at 5:00:00 AM

A recent study investigates two variants of Direct Preference Optimization (DPO) that explicitly accommodate ties in pairwise comparisons. Standard DPO rests on the Bradley-Terry model, which assigns probability only to one response being preferred over the other; the new variants replace it with the tie-aware extensions of Rao and Kupper and of Davidson. The authors apply both variants to datasets in neural machine translation and summarization. Their findings indicate that explicitly labeling ties in preference data can improve the quality of the datasets used for training and evaluation. This suggests that ties carry a more nuanced preference signal than forced win/loss labels, which may in turn lead to better model performance.
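To make the difference between the three models concrete, here is a minimal sketch of the textbook Rao-Kupper and Davidson outcome probabilities, written as functions of a reward margin `d` between two responses. The function names, the parameter names `theta` and `nu`, and the parameterization are illustrative assumptions; the study itself may use a different but equivalent form, and in a DPO setting `d` would be the implicit reward difference derived from policy and reference log-probabilities.

```python
import math


def rao_kupper(d: float, theta: float) -> tuple[float, float, float]:
    """Return (P(win), P(tie), P(loss)) under the Rao-Kupper model.

    d is the reward margin r_w - r_l. theta >= 1 is the tie parameter
    (an assumed name); theta = 1 recovers Bradley-Terry with no ties.
    """
    if theta < 1.0:
        raise ValueError("theta must be >= 1")
    # Item strengths, defined only up to a common scale factor.
    a, b = math.exp(d / 2.0), math.exp(-d / 2.0)
    p_win = a / (a + theta * b)
    p_loss = b / (b + theta * a)
    p_tie = (theta**2 - 1.0) * a * b / ((a + theta * b) * (b + theta * a))
    return p_win, p_tie, p_loss


def davidson(d: float, nu: float) -> tuple[float, float, float]:
    """Return (P(win), P(tie), P(loss)) under the Davidson model.

    nu >= 0 is the tie parameter (an assumed name); nu = 0 recovers
    Bradley-Terry with no ties.
    """
    if nu < 0.0:
        raise ValueError("nu must be >= 0")
    a, b = math.exp(d / 2.0), math.exp(-d / 2.0)
    tie_mass = nu * math.sqrt(a * b)  # geometric mean of the two strengths
    z = a + b + tie_mass
    return a / z, tie_mass / z, b / z


if __name__ == "__main__":
    # Tie probability is largest when the two responses are evenly matched.
    for margin in (0.0, 1.0, 3.0):
        print(margin, rao_kupper(margin, theta=2.0), davidson(margin, nu=1.0))
```

In both models the tie probability shrinks as the margin grows, so a pair labeled as a tie pulls the two rewards together rather than forcing one response above the other. A tie-aware training loss would then take the negative log of whichever outcome was actually labeled.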

— via World Pulse Now AI Editorial System
