Abstract
Missing data (MD) in heterogeneous sensor networks presents a major challenge, particularly for non-stationary climate data whose correlations with other variables are inconsistent. The problem stems from differences in spatial and temporal resolution across sensors, together with complex interdependencies among climate factors. To address these challenges, a novel framework is introduced that combines Seasonal-Trend decomposition using LOESS (STL) for data harmonization, a Convolutional Autoencoder (CAE) for data imputation, and a Gradient Boosting-based Neural Network (G_NN) for final data reconstruction. The STL component improves the alignment between multisource datasets by isolating their trend and seasonal components, enabling more reliable synthesis across heterogeneous sensors. The CAE captures the multiscale dependencies used to impute MD, while the G_NN refines the final reconstruction by correcting residual errors. Precipitation is used as a case study because of its high variability and the difficulty of aligning it with other climate variables across different scales. The model was evaluated under different percentages of MD. At around 30% MD, it achieved R² values between 0.85 and 0.98. Even with 50% MD, the model retained good accuracy and scalability, maintaining an R² of approximately 0.7. Overall, the proposed method improved on advanced approaches such as generative adversarial networks (GAN) and multiple imputation by chained equations (MICE) by 26% and 32%, respectively.
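To make the evaluation protocol concrete, the following is a minimal sketch of how R² can be measured at different MD percentages: values are masked at random, imputed, and scored only on the masked positions. It is not the proposed framework. The synthetic series, the function names (`mask_values`, `interpolate_linear`, `r2_score`), and the linear-interpolation imputer are all assumptions introduced for illustration, with the imputer standing in for the STL, CAE, and G_NN stages.

```python
import math
import random


def mask_values(series, missing_fraction, seed=0):
    """Return a copy of `series` with a random fraction of interior points set to None."""
    rng = random.Random(seed)
    # Keep both endpoints observed so that every gap has two observed neighbours.
    interior = list(range(1, len(series) - 1))
    n_missing = min(round(missing_fraction * len(series)), len(interior))
    missing_idx = sorted(rng.sample(interior, n_missing))
    masked = list(series)
    for i in missing_idx:
        masked[i] = None
    return masked, missing_idx


def interpolate_linear(masked):
    """Placeholder imputer (assumption): linear interpolation between observed neighbours."""
    filled = list(masked)
    observed = [i for i, v in enumerate(masked) if v is not None]
    for left, right in zip(observed, observed[1:]):
        for i in range(left + 1, right):
            w = (i - left) / (right - left)
            filled[i] = (1 - w) * masked[left] + w * masked[right]
    return filled


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic seasonal series with a mild trend (illustrative only, not precipitation data).
series = [10 + 0.01 * t + 5 * math.sin(2 * math.pi * t / 50) for t in range(500)]

scores = {}
for fraction in (0.1, 0.3, 0.5):
    masked, missing_idx = mask_values(series, fraction)
    filled = interpolate_linear(masked)
    # Score only the positions that were hidden from the imputer.
    truth = [series[i] for i in missing_idx]
    preds = [filled[i] for i in missing_idx]
    scores[fraction] = r2_score(truth, preds)
    print(f"{int(fraction * 100)}% missing: R^2 = {scores[fraction]:.3f}")
```

Scoring only the masked positions keeps the observed values from inflating R². The smooth synthetic series is far easier to impute than real precipitation, so the scores printed here are not comparable to the results reported above.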