OPTIMISING MACHINE LEARNING TECHNIQUES FOR IRREGULAR SAMPLING

Authors

  • Zhenyu Xu, Negar Riazifar

DOI:

https://doi.org/10.52152/5943t304

Keywords:

Machine Learning, Random Forest, Simple Linear Interpolation, XGBoost

Abstract

This study examines how simple linear interpolation (SLI) and natural-neighbour interpolation (NNI) affect machine learning model performance on irregularly sampled commercial data. The Seoul bike-sharing rental dataset is pre-processed with SLI and NNI to manage missing values and inconsistencies. The performance of SLI and NNI is then evaluated by constructing several machine learning models, including XGBoost, Random Forest, k-nearest neighbours (KNN), and a stacking model. Results show that SLI consistently improved accuracy, particularly for the stacking model, as demonstrated by the area under the receiver operating characteristic curve (AUC) and Kolmogorov-Smirnov (KS) statistics. Conversely, NNI produced more variable outcomes, occasionally reducing performance. These findings underscore the critical role of data pre-processing throughout the machine learning workflow, particularly in domains where data irregularities are prevalent, and provide empirical support for employing interpolation methods to improve both model reliability and accuracy in business data modelling.
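The sketch below illustrates the kind of pipeline the abstract describes, not the authors' actual code: linear interpolation of missing values in an irregularly sampled series, followed by a stacked XGBoost/Random Forest/KNN ensemble scored with AUC and the KS statistic. The synthetic data, column names, and high/low-demand binary target are illustrative assumptions.

    # Minimal sketch (not the paper's implementation): SLI pre-processing plus a
    # stacking model evaluated with AUC and KS. Data and column names are assumed.
    import numpy as np
    import pandas as pd
    from scipy.stats import ks_2samp
    from sklearn.ensemble import RandomForestClassifier, StackingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier
    from xgboost import XGBClassifier

    # Synthetic stand-in for an hourly rental series with missing observations.
    rng = np.random.default_rng(0)
    n = 2000
    df = pd.DataFrame({
        "temperature": rng.normal(15, 10, n),
        "humidity": rng.uniform(20, 90, n),
    })
    df["rented_bike_count"] = (
        700 + 30 * df["temperature"] - 5 * df["humidity"] + rng.normal(0, 100, n)
    )
    df.loc[rng.choice(n, size=200, replace=False), ["temperature", "humidity"]] = np.nan

    # Simple linear interpolation (SLI) across the time-ordered index.
    df = df.interpolate(method="linear", limit_direction="both")

    # Illustrative binary target: high demand vs. low demand.
    y = (df["rented_bike_count"] > df["rented_bike_count"].median()).astype(int)
    X = df[["temperature", "humidity"]]
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)

    # Stacking model over XGBoost, Random Forest, and KNN base learners.
    stack = StackingClassifier(
        estimators=[
            ("xgb", XGBClassifier(n_estimators=200, eval_metric="logloss")),
            ("rf", RandomForestClassifier(n_estimators=200, random_state=42)),
            ("knn", KNeighborsClassifier(n_neighbors=15)),
        ],
        final_estimator=LogisticRegression(max_iter=1000),
    )
    stack.fit(X_tr, y_tr)
    proba = stack.predict_proba(X_te)[:, 1]

    # AUC and KS: KS is the maximum separation between score distributions
    # of the positive and negative classes.
    auc = roc_auc_score(y_te, proba)
    ks = ks_2samp(proba[y_te == 1], proba[y_te == 0]).statistic
    print(f"AUC = {auc:.3f}, KS = {ks:.3f}")

The same pipeline can be rerun with an alternative interpolation step (for example, a natural-neighbour scheme in place of df.interpolate) to compare pre-processing strategies on identical models and metrics.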

Published

2025-08-12

Issue

Section

Article

How to Cite

OPTIMISING MACHINE LEARNING TECHNIQUES FOR IRREGULAR SAMPLING. (2025). Lex Localis - Journal of Local Self-Government, 23(S5), 1126-1142. https://doi.org/10.52152/5943t304