PERFORMANCE ANALYSIS OF ENSEMBLE LEARNING ALGORITHMS IN PREDICTING SOIL COMPACTION CHARACTERISTICS

Authors

  • Muhammad Naveed Khalil
  • Yaser Farman
  • Mah E Kinaan
  • Farhan Ahmad Zubair
  • Fahad Ur Rehman

Keywords:

Maximum Dry Density, Random Forest, Gradient Boosting, Soil Compaction, Feature Selection, SHAP Analysis, Machine Learning, Geotechnical Prediction, Ensemble Models, Cross-Validation

Abstract

The accurate determination of Maximum Dry Density (MDD) is vital for geotechnical design, yet conventional laboratory methods are resource-intensive and time-consuming. This study presents a comparative machine learning framework to predict MDD using seven index soil properties. A dataset of 486 samples was pre-processed, and feature importance was assessed using both univariate regression and the RReliefF algorithm, revealing a critical divergence: Optimum Moisture Content (OMC) was the strongest linear predictor, whereas Sand Content ranked highest in the non-linear, multivariate analysis. Two ensemble algorithms, Random Forest (RF) and Gradient Boosting (GB), were developed using an 80-20 train-test split, with hyperparameters fixed at the same values for a fair comparison (100 trees, maximum depth of 3). Model performance was evaluated using a suite of metrics (MSE, RMSE, MAE, MAPE, R², CVRMSE), supplemented by 20-fold cross-validation, error distribution analysis, and SHAP interpretability. Both models achieved high accuracy, with RF showing a slight edge on the test set (R² = 0.927, RMSE = 0.489) over GB (R² = 0.924, RMSE = 0.498). However, GB was more stable across cross-validation folds, while RF showed greater performance variability. SHAP analysis identified OMC and Gravel Content as the most influential and physically interpretable features. The study concludes that Random Forest is marginally superior for point prediction accuracy, but the choice between RF and GB should weigh test-set performance against model stability. The research underscores the value of multi-method feature selection and model interpretability in developing reliable geotechnical predictive tools.
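
For readers who want to see how the metric suite named above fits together, the following is a minimal sketch in plain Python. It is not the authors' code. The function name `regression_metrics` and the sample values are hypothetical, and the CVRMSE definition used here (RMSE divided by the mean observed value, as a percentage) is an assumption, since the abstract does not state the formula.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, MAE, MAPE (%), R2 and CVRMSE (%) for two equal-length lists.

    Assumption: CVRMSE = 100 * RMSE / mean(y_true). The abstract names the
    metric but does not define it, so this is one common convention.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mae = sum(abs(e) for e in errors) / n

    # MAPE divides by each observed value, so observations must be non-zero
    # (dry density measurements are strictly positive).
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n

    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are identical")
    r2 = 1.0 - ss_res / ss_tot

    cvrmse = 100.0 * rmse / mean_true

    return {"MSE": mse, "RMSE": rmse, "MAE": mae,
            "MAPE": mape, "R2": r2, "CVRMSE": cvrmse}


# Hypothetical observed and predicted values, for illustration only;
# these are not data from the study.
observed = [16.0, 18.0, 20.0]
predicted = [16.5, 17.5, 20.0]
metrics = regression_metrics(observed, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Applying the same function to the test-set predictions of each model, and to each held-out fold in cross-validation, would give the kind of side-by-side comparison the abstract describes: a single test-set score per model, plus a spread of fold scores that indicates stability.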

Published

2025-12-10

How to Cite

Muhammad Naveed Khalil, Yaser Farman, Mah E Kinaan, Farhan Ahmad Zubair, & Fahad Ur Rehman. (2025). PERFORMANCE ANALYSIS OF ENSEMBLE LEARNING ALGORITHMS IN PREDICTING SOIL COMPACTION CHARACTERISTICS. Policy Research Journal, 3(12), 143–164. Retrieved from https://policyrj.com/1/article/view/1340