Abstract
Developing data-driven models for spatiotemporal hydrological prediction is challenging: such models must manage complexity, capture fine spatial and temporal resolution, and remain resilient across diverse regions. This study introduces a surrogate deep learning (SDL) architecture that predicts daily soil moisture (DSM) and daily actual evapotranspiration (DAE) by integrating climate data and geophysical information, with a focus on mountainous areas such as the Adige catchment. The proposed framework aims to improve the quality of parameter calibration. The process begins by mapping the statistical characteristics of DAE and DSM across the whole region using an unsupervised fusion technique. Model accuracy is assessed by comparing the similarity of Fuzzy C-Means (FCM) clusters before and after fusion, which provides a metric for feature reduction. A data transformation based on Gradient Boosting Regression (GBR) is then applied to each homogeneous subregion identified by a Random Forest classifier (RFC) from elevation parameters (Wflow_dem). Kernel density estimation is further used to ensure that the RFC-GBR process is reproducible in large-scale applications. Multiple SDL architectures, including LSTM, GRU, TCN, and ConvLSTM, are compared over 50 epochs to evaluate the effect of the transformed parameters on model performance and accuracy. The results indicate that the adjusted parameter calibration improves model performance in all cases, with better alignment to the Wflow ground truth during both wet and dry periods. The proposed model increases accuracy by 20% to 42% when using simpler SDL models such as LSTM and GRU, even with fewer epochs.
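The abstract does not specify how kernel density estimation is applied to check reproducibility. As one possible reading, the sketch below compares the estimated densities of a variable in two subregions and reports their overlap, where a value near 1 suggests the two distributions are similar. The function names, the normal-reference bandwidth rule, the overlap coefficient, and the synthetic soil moisture samples are all illustrative assumptions, not the authors' implementation.

```python
import math
import random


def rule_of_thumb_bandwidth(samples):
    """Normal-reference bandwidth, 1.06 * std * n^(-1/5) (an assumed choice)."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
    if std == 0.0:
        raise ValueError("samples must not all be identical")
    return 1.06 * std * n ** (-0.2)


def gaussian_kde(samples, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def overlap_coefficient(a, b, grid_points=400):
    """Integrate min(f_a, f_b) on a shared grid using the trapezoidal rule."""
    h_a, h_b = rule_of_thumb_bandwidth(a), rule_of_thumb_bandwidth(b)
    f_a, f_b = gaussian_kde(a, h_a), gaussian_kde(b, h_b)
    pad = 3.0 * max(h_a, h_b)  # extend the grid to cover the kernel tails
    lo = min(min(a), min(b)) - pad
    hi = max(max(a), max(b)) + pad
    step = (hi - lo) / (grid_points - 1)
    vals = [min(f_a(lo + i * step), f_b(lo + i * step)) for i in range(grid_points)]
    return step * (sum(vals) - 0.5 * (vals[0] + vals[-1]))


# Synthetic stand-ins for soil moisture values in two hypothetical subregions.
rng = random.Random(0)
region_a = [rng.gauss(0.30, 0.05) for _ in range(200)]
region_b = [rng.gauss(0.30, 0.05) for _ in range(200)]  # same distribution
region_c = [rng.gauss(0.60, 0.05) for _ in range(200)]  # shifted distribution

print("overlap a vs b:", round(overlap_coefficient(region_a, region_b), 3))
print("overlap a vs c:", round(overlap_coefficient(region_a, region_c), 3))
```

In this sketch, a high overlap between a calibration subregion and a new subregion would be taken as a hint that a transformation fitted on the former might transfer to the latter; the threshold for "high" would have to be chosen for the application.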