One-Click Regression Benchmarking with SHAP Explainability: An Integrated Python Pipeline for Linear, Regularized, and Tree-Boosting Models

Authors

  • Yucheng Yan, Zhongshan Sanxin School International Department, Zhongshan, Guangdong, China

Keywords

regression benchmarking, model comparison, Explainable AI, machine learning pipeline

Abstract

Applied researchers often need a fast, reproducible way to (a) compare multiple regression algorithms under a consistent preprocessing and evaluation protocol and (b) interpret model behavior beyond scalar accuracy metrics. This paper presents a turnkey Python pipeline that benchmarks five widely used regressors (Ordinary Least Squares, Ridge, Random Forests, XGBoost, and LightGBM) while natively integrating SHAP-based explainability. The system accepts mixed-type datasets, performs robust preprocessing (median imputation and standardization for numeric predictors; most-frequent imputation and one-hot encoding for categorical predictors), and evaluates models on a hold-out set using R², MSE, RMSE, and MAE. Results are exported to a clean, analysis-ready Excel workbook for immediate reuse in empirical reports. To move beyond aggregate metrics, the pipeline automatically generates SHAP global importance summaries (bar and beeswarm plots) and feature-dependence plots with interaction highlighting, providing multi-level insight into main effects and potential interactions. The implementation is designed for portability and minimal configuration: users specify a data file and target column, while optional flags control the test split, random seed, and number of visualizations. When no data are provided, a synthetic mixed-type dataset is generated to demonstrate the full workflow end to end. By combining standardized benchmarking with model-agnostic interpretability, the proposed tool lowers the barrier to rigorous, transparent model comparison and accelerates the translation of machine-learning methods into substantive research across domains.
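The benchmarking protocol described above (shared preprocessing, hold-out evaluation, and a per-model metrics table) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the synthetic data, column names, and the reduced model list (OLS, Ridge, Random Forest only, omitting XGBoost and LightGBM to keep dependencies to scikit-learn) are assumptions for the example.

```python
# Sketch of the shared-preprocessing benchmarking protocol:
# median imputation + standardization for numeric columns,
# most-frequent imputation + one-hot encoding for categorical columns,
# then hold-out evaluation with R^2, MSE, RMSE, and MAE per model.
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Illustrative synthetic mixed-type dataset (stand-in for a user-supplied file).
rng = np.random.default_rng(42)
n = 400
df = pd.DataFrame({
    "x1": rng.normal(size=n),
    "x2": rng.normal(size=n),
    "cat": rng.choice(["a", "b", "c"], size=n),
})
df["y"] = 2 * df["x1"] - df["x2"] + (df["cat"] == "a") + rng.normal(scale=0.1, size=n)

numeric = ["x1", "x2"]
categorical = ["cat"]
preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric),
    ("cat", Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                      ("onehot", OneHotEncoder(handle_unknown="ignore"))]), categorical),
])

models = {
    "OLS": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "RandomForest": RandomForestRegressor(n_estimators=100, random_state=42),
}

X_train, X_test, y_train, y_test = train_test_split(
    df[numeric + categorical], df["y"], test_size=0.2, random_state=42)

rows = []
for name, model in models.items():
    pipe = Pipeline([("prep", preprocess), ("model", model)])
    pipe.fit(X_train, y_train)
    pred = pipe.predict(X_test)
    mse = mean_squared_error(y_test, pred)
    rows.append({"model": name, "R2": r2_score(y_test, pred),
                 "MSE": mse, "RMSE": mse ** 0.5,
                 "MAE": mean_absolute_error(y_test, pred)})

results = pd.DataFrame(rows)
print(results)
```

From here, `results.to_excel(...)` would produce the analysis-ready workbook the abstract mentions, and fitting `shap.Explainer` on the trained tree models would yield the bar, beeswarm, and dependence plots.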

Published

2026-02-17