A COMPARATIVE STUDY OF CLASSICAL STATISTICAL METHODS AND MODERN DATA SCIENCE ALGORITHMS

Authors

  • Wali Rehman
  • Sohail Anwer
  • Rana Waseem Ahmad
  • Abou Bakar Siddique
  • Aamir Hayyat

Keywords

Machine Learning, Regression Analysis, Random Forest, Predictive Modeling, Engineering Data Analytics

Abstract

This study presents a comparative analysis of classical statistical methods and modern data science algorithms for predictive modeling on the Boston Housing dataset. The research integrates descriptive statistics, correlation analysis, regression modeling, and machine learning techniques to evaluate both predictive performance and interpretability. Classical models (Linear, Ridge, and Lasso Regression) provided strong interpretive insights but were constrained by the assumption of linearity and a limited capacity to model complex interactions. In contrast, modern algorithms, including Decision Tree, Support Vector Regression, and Random Forest, demonstrated superior predictive accuracy and robustness to nonlinearity. Among all models, Random Forest achieved the highest performance, with an R² of 0.88 and the lowest RMSE of 2.9, indicating its effectiveness in capturing multidimensional relationships among socioeconomic, environmental, and structural variables. The findings highlight the complementary strengths of traditional and data-driven approaches and suggest that hybrid analytical frameworks offer the most balanced strategy for accurate and interpretable engineering data analysis and process automation.
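For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how R² and RMSE are conventionally computed from observed and predicted values. It is not the authors' code: the function names `rmse` and `r_squared` and the toy numbers are assumptions chosen purely for illustration, and they do not reproduce the study's results.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted values (illustrative only,
# not taken from the study or from the Boston Housing dataset).
y_true = [24.0, 21.6, 34.7, 33.4, 36.2]
y_pred = [25.0, 22.6, 33.7, 32.4, 35.2]

print("RMSE:", round(rmse(y_true, y_pred), 3))
print("R^2: ", round(r_squared(y_true, y_pred), 3))
```

A higher R² means the model explains a larger share of the variance in the target, while a lower RMSE means smaller typical prediction errors in the target's own units, which is why the abstract reports the two together.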

Published

2025-10-13

How to Cite

Wali Rehman, Sohail Anwer, Rana Waseem Ahmad, Abou Bakar Siddique, & Aamir Hayyat. (2025). A COMPARATIVE STUDY OF CLASSICAL STATISTICAL METHODS AND MODERN DATA SCIENCE ALGORITHMS. Policy Research Journal, 3(10), 281–293. Retrieved from https://policyrj.com/index.php/1/article/view/1152