Abstract
Accurate prediction of Global Horizontal Irradiance (GHI) is crucial for optimizing solar power generation systems, especially in mountainous regions characterized by complex topography and specific microclimates. In these areas, reliable GHI data are scarce, and the accuracy of the available data suffers from the dynamic nature of the atmosphere and local weather conditions. This scarcity of precise GHI measurements impedes the development of accurate solar energy prediction models, with both economic and environmental consequences. In this context, this paper proposes a novel methodology to address data scarcity in solar energy prediction, focusing on Alpine regions. We employ machine learning techniques such as Random Forest (RF) and Extreme Gradient Boosting (XGBoost) regressors, in conjunction with synthetic data generation, to predict GHI. To assess the accuracy of our approach, we selected Bolzano as a case study and modelled the photovoltaic (PV) AC power output before and after optimizing the GHI data.