Intra-tree Column Subsampling Hinders XGBoost Learning of Ratio-like Interactions

arXiv — cs.LG · Wednesday, January 14, 2026
  • A recent study finds that intra-tree column subsampling in XGBoost (the per-level and per-node variants, `colsample_bylevel` and `colsample_bynode`) can hinder the model's ability to learn ratio-like interactions, in which a meaningful signal emerges only by combining several raw measurements. Using synthetic data with cancellation-style structure, the authors show that subsampling degrades the model's ability to recover such signals; a minimal sketch of the setup follows this list.
  • The finding matters for practitioners using XGBoost: when a target depends on ratios or rates of raw features, per-node or per-level subsampling can prevent the relevant features from being available together while a branch is grown, plausibly keeping the cancellation structure from ever being split on effectively.
  • The implications extend to the broader field of machine learning, where feature engineering and model optimization are central. The result raises questions about how current boosting methods handle cancellation structure and points to new strategies, such as explicitly constructed ratio features or relaxed subsampling, that could improve performance in similar settings.
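
The claimed mechanism lends itself to a quick experiment. Below is a minimal sketch, not the paper's actual setup: the synthetic target, the noise columns, and all hyperparameters are illustrative assumptions. Two XGBoost regressors differ only in `colsample_bynode`, the per-node flavor of intra-tree column subsampling, on a target driven by a ratio of two raw features.

```python
# Minimal sketch (illustrative, not the paper's experiment): a target
# driven by the ratio x0/x1 plus distractor noise columns. With per-node
# column subsampling, splits are often evaluated without both x0 and x1
# available in the same branch, which can obscure cancellation structure.
import numpy as np
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
n, n_noise = 20_000, 8
x0 = rng.uniform(1.0, 10.0, n)
x1 = rng.uniform(1.0, 10.0, n)
noise = rng.normal(size=(n, n_noise))
X = np.column_stack([x0, x1, noise])
y = x0 / x1 + 0.1 * rng.normal(size=n)  # ratio-like (cancellation-style) signal

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for colsample in (1.0, 0.3):  # full columns vs. aggressive per-node subsampling
    model = xgb.XGBRegressor(
        n_estimators=300, max_depth=6, learning_rate=0.1,
        colsample_bynode=colsample, random_state=0,
    )
    model.fit(X_tr, y_tr)
    rmse = mean_squared_error(y_te, model.predict(X_te)) ** 0.5
    print(f"colsample_bynode={colsample}: test RMSE = {rmse:.4f}")
```

Comparing the two held-out RMSE values gives a rough read on how much the subsampled model loses on this kind of target; the paper's synthetic designs are presumably more carefully controlled.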
— via World Pulse Now AI Editorial System

Continue Reading
Regression-adjusted Monte Carlo Estimators for Shapley Values and Probabilistic Values
Positive · Artificial Intelligence
A new study introduces regression-adjusted Monte Carlo estimators for Shapley values and probabilistic values, making these computations in explainable AI more sample-efficient. The method combines Monte Carlo sampling with a regression adjustment; the adjustment can be fit with various function families, from linear models to tree-based models such as XGBoost, while preserving unbiasedness. A toy sketch of the control-variate idea follows.
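
Below is a toy sketch of the regression-adjusted idea under loud assumptions: the cooperative game, the permutation sampler, and the linear control variate are illustrative, not the paper's construction, and fitting the adjustment on the same samples introduces a small finite-sample bias that a careful estimator would avoid.

```python
# Toy regression-adjusted (control-variate) Monte Carlo Shapley estimator.
# The game and the linear adjustment are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_players = 5
w = np.array([1.0, 2.0, 0.5, 3.0, 1.5])

def value(coalition):
    # Toy value function: mildly nonlinear in the members' total weight.
    s = w[coalition].sum()
    return s + 0.1 * s**2

def shapley_mc(player, n_samples=2000, adjust=True):
    feats, marginals = [], []
    for _ in range(n_samples):
        perm = rng.permutation(n_players)
        pos = int(np.where(perm == player)[0][0])
        S = perm[:pos]                       # players preceding `player`
        m = value(np.append(S, player)) - value(S)
        z = np.zeros(n_players)
        z[S] = 1.0
        feats.append(z[np.arange(n_players) != player])  # drop own column
        marginals.append(m)
    feats, marginals = np.array(feats), np.array(marginals)
    if not adjust:
        return marginals.mean()
    # Control variate: regress marginal contributions on coalition
    # indicators. Under uniform random permutations each other player
    # precedes `player` with probability 1/2, so the expectation of the
    # fitted prediction is known in closed form.
    reg = LinearRegression().fit(feats, marginals)
    expected_pred = reg.intercept_ + reg.coef_ @ np.full(n_players - 1, 0.5)
    return (marginals - reg.predict(feats)).mean() + expected_pred

phi_plain = shapley_mc(player=3, adjust=False)
phi_adj = shapley_mc(player=3, adjust=True)
print(f"plain MC: {phi_plain:.4f}   regression-adjusted: {phi_adj:.4f}")
```

The adjustment subtracts the part of the marginal contribution that the regression explains and adds back its known expectation, which reduces variance without changing the target quantity; swapping the linear model for a tree-based one such as XGBoost is the kind of generalization the paper describes.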
