Improving the accuracy and generalizability of molecular property regression models with a substructure-substitution-rule-informed framework

arXiv cs.LG, Wednesday, November 12, 2025 at 5:00:00 AM
The MolRuleLoss framework targets a persistent problem in AI-aided drug discovery: accurately predicting molecular properties. Existing models often perform poorly on regression tasks, particularly for out-of-distribution (OOD) molecules. MolRuleLoss addresses this by incorporating substructure-substitution rules into the loss function of molecular property regression models (MPRMs) such as GEM and UniMol. The approach lowered root mean squared error (RMSE) on three property-prediction tasks: from 0.660 to 0.587 for lipophilicity, from 0.798 to 0.777 for water solubility, and from 1.877 to 1.252 for solvation free energy. These reductions, ranging from 2.6% to 33.3%, underscore the framework's potential to improve …
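The percentage range quoted above can be checked directly from the reported RMSE pairs. The short Python sketch below does only that arithmetic; the names `reported_rmse` and `relative_reduction` are illustrative and do not come from the paper or its code.

```python
# RMSE values (baseline, with MolRuleLoss) as reported in the summary above.
reported_rmse = {
    "lipophilicity": (0.660, 0.587),
    "water solubility": (0.798, 0.777),
    "solvation free energy": (1.877, 1.252),
}


def relative_reduction(before: float, after: float) -> float:
    """Return the percentage by which `after` is lower than `before`."""
    return 100.0 * (before - after) / before


# Relative RMSE reduction per task, in percent.
reductions = {
    task: relative_reduction(before, after)
    for task, (before, after) in reported_rmse.items()
}

for task, pct in reductions.items():
    print(f"{task}: {pct:.1f}% lower RMSE")
```

Under this calculation, the 2.6% end of the range corresponds to water solubility and the 33.3% end to solvation free energy, with lipophilicity in between at about 11.1%.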
— via World Pulse Now AI Editorial System
