RaX-Crash: A Resource Efficient and Explainable Small Model Pipeline with an Application to City Scale Injury Severity Prediction

arXiv cs.LG | Wednesday, December 10, 2025 at 5:00:00 AM
  • RaX-Crash is a resource-efficient, explainable small-model pipeline for predicting injury severity from motor vehicle collisions in New York City, using a dataset of more than one hundred thousand records. It relies on compact tree-based ensembles, specifically Random Forest and XGBoost, and is reported to achieve notable accuracy relative to small language models.
  • This advancement is significant as it addresses the substantial public health burden caused by motor vehicle collisions in urban settings, providing a tool that can enhance decision-making and resource allocation for injury prevention and response.
  • The development of RaX-Crash reflects a growing trend in machine learning applications across various domains, including health risk prediction and injury prevention, where models like Random Forest and XGBoost are increasingly favored for their performance and interpretability. This trend underscores the importance of integrating advanced analytics into public health strategies.
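As a rough illustration of the tree-ensemble idea mentioned in the summary, the sketch below bags decision stumps (depth-1 trees) over bootstrap samples and takes a majority vote. It is a toy simplification written for this digest, not the RaX-Crash pipeline: the function names, the one-random-feature-per-stump rule, and the tiny dataset are all assumptions made for illustration, and a real Random Forest or XGBoost model grows deeper trees with more careful split selection.

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature):
    """Fit a depth-1 tree on one feature by minimizing training errors."""
    overall = Counter(labels).most_common(1)[0][0]
    best = None
    for t in sorted({r[feature] for r in rows}):
        left = [y for r, y in zip(rows, labels) if r[feature] <= t]
        right = [y for r, y in zip(rows, labels) if r[feature] > t]
        # Each side predicts its majority label (overall majority if empty).
        l_lab = Counter(left).most_common(1)[0][0] if left else overall
        r_lab = Counter(right).most_common(1)[0][0] if right else overall
        errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
        if best is None or errors < best[0]:
            best = (errors, feature, t, l_lab, r_lab)
    return best[1:]


def predict_stump(stump, row):
    feature, threshold, l_lab, r_lab = stump
    return l_lab if row[feature] <= threshold else r_lab


def fit_bagged_stumps(rows, labels, n_estimators=25, seed=0):
    """Train stumps on bootstrap samples, each using one random feature."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    ensemble = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feature = rng.randrange(n_features)
        ensemble.append(
            fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feature)
        )
    return ensemble


def predict(ensemble, row):
    """Majority vote across all stumps."""
    votes = Counter(predict_stump(s, row) for s in ensemble)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two made-up numeric features, binary severity label.
toy_rows = [(i, 2 * i) for i in range(10)]
toy_labels = [0] * 5 + [1] * 5
toy_ensemble = fit_bagged_stumps(toy_rows, toy_labels)
```

Because each stump is a single threshold on a single feature, the ensemble stays small and every vote can be inspected directly, which is the kind of property that makes compact tree models attractive when explanations matter.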
— via World Pulse Now AI Editorial System

Continue Reading
Long-Sequence LSTM Modeling for NBA Game Outcome Prediction Using a Novel Multi-Season Dataset
Positive | Artificial Intelligence
A new study introduces a Long Short-Term Memory (LSTM) model designed to predict NBA game outcomes using a comprehensive dataset spanning from the 2004-05 to 2024-25 seasons. This model utilizes an extensive sequence of 9,840 games to effectively capture evolving team dynamics and dependencies across seasons, addressing challenges faced by traditional prediction models.
An Improved Ensemble-Based Machine Learning Model with Feature Optimization for Early Diabetes Prediction
Positive | Artificial Intelligence
A new ensemble-based machine learning model has been developed to enhance early diabetes prediction using the BRFSS dataset, which includes over 253,000 health records. The model employs techniques like SMOTE and Tomek Links to address class imbalance and achieves a strong ROC-AUC score of approximately 0.96 through various algorithms, including Random Forest and XGBoost.
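For readers unfamiliar with the ROC-AUC figure quoted above, here is a minimal sketch of how the metric can be computed from labels and scores. It uses the pairwise-ranking definition (the probability that a randomly chosen positive scores higher than a randomly chosen negative, with ties counted as half); the function name `roc_auc` and the sample values are illustrative assumptions, not taken from the study.

```python
def roc_auc(y_true, scores):
    """ROC-AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # A positive outranking a negative counts 1, a tie counts 0.5.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up labels and scores for illustration only.
example_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 3 of 4 pairs ranked correctly
```

This quadratic-time form is fine for small examples; library implementations typically sort the scores once instead.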
Predictive Modeling of I/O Performance for Machine Learning Training Pipelines: A Data-Driven Approach to Storage Optimization
Positive | Artificial Intelligence
A recent study has introduced a machine learning approach to predict I/O performance for machine learning training pipelines, addressing the growing issue of data I/O bottlenecks that hinder GPU utilization. By systematically benchmarking various storage backends, the research identified optimal configurations, achieving an impressive R-squared of 0.991 with the XGBoost model, which predicts I/O throughput with an average error of 11.8%.
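A high R-squared and a noticeable percentage error can coexist, because the two metrics measure different things. The sketch below computes both on made-up throughput values. It assumes the "average error" in the summary is a mean absolute percentage error, which the summary does not confirm, and the function names are illustrative.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_abs_pct_error(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be nonzero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical throughput measurements and predictions (arbitrary units).
measured = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

r2 = r_squared(measured, predicted)
mape = mean_abs_pct_error(measured, predicted)
```

R-squared is dominated by how well the model tracks large variations across the range, while the percentage error weights every point by its own magnitude, so small absolute misses on low-throughput configurations raise the latter without much effect on the former.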
Graph Convolutional Long Short-Term Memory Attention Network for Post-Stroke Compensatory Movement Detection Based on Skeleton Data
Positive | Artificial Intelligence
A new study has introduced the Graph Convolutional Long Short-Term Memory Attention Network (GCN-LSTM-ATT) for detecting compensatory movements in stroke patients, utilizing skeleton data captured by a Kinect depth camera. The model demonstrated a detection accuracy of 0.8580, outperforming traditional methods such as Support Vector Machine, K-Nearest Neighbor, and Random Forest.
Machine learning in an expectation-maximisation framework for nowcasting
Positive | Artificial Intelligence
A new study introduces an expectation-maximisation framework for nowcasting, utilizing machine learning techniques to address the challenges posed by incomplete information in decision-making processes. This framework incorporates neural networks and XGBoost to model both the occurrence and reporting processes of events, particularly in the context of Argentinian Covid-19 data.
Learning to Select MCP Algorithms: From Traditional ML to Dual-Channel GAT-MLP
Positive | Artificial Intelligence
A novel learning-based framework has been proposed to address the Maximum Clique Problem (MCP), an NP-hard problem with significant applications. This framework integrates traditional machine learning techniques and graph neural networks, specifically utilizing a dual-channel model known as GAT-MLP, which combines a Graph Attention Network with a Multilayer Perceptron to enhance algorithm selection based on graph instance characteristics.
A Comprehensive Study of Supervised Machine Learning Models for Zero-Day Attack Detection: Analyzing Performance on Imbalanced Data
Neutral | Artificial Intelligence
A comprehensive study evaluates five supervised machine learning models for detecting zero-day attacks, which are particularly challenging because the attacks are previously unseen. The research aims to improve detection on imbalanced training data using techniques such as grid search and oversampling.