Abstract
Length of hospital stay (LoS) is a key metric for healthcare quality and hospital resource management. This study investigates factors influencing LoS within the Italian healthcare system, using patient-level hospitalization records from standardized hospital discharge forms (Schede di Dimissione Ospedaliera, SDO) for the population served by a local health authority in Piedmont. The dataset included 37,526 patients across 66 facilities over four years. We analyzed patient characteristics, comorbidities, admission details, and hospital-specific factors. LoS was significantly associated with age group, comorbidity burden, admission type, and month of admission. Two machine learning models, CatBoost and Random Forest, were trained to predict LoS; CatBoost performed better, with a validation R-squared of 0.49. Historical LoS and procedure were the most influential predictors. Because diagnosis and procedure information is recorded at discharge, the models are intended for retrospective analysis rather than real-time prediction at admission. The results demonstrate the potential of administrative SDO data for understanding LoS patterns and supporting hospital planning.