Abstract
To better understand the long-term effect of Recommender Systems (RSs) on users' choices, some recent studies have simulated users' interactions with RSs. The impact of the RS on users is then quantified by measuring global properties of the simulated choices, in particular their distribution and quality. The accuracy of the simulated users' Choice Model (CM), i.e., how the simulated users choose among the recommended items, significantly contributes to the validity of the results. However, while some CMs have been suggested as plausible, none of them has been shown to generate choices “close” to the actual choices, i.e., to those that real users have made, or will make, when exposed to the same recommendations. In this paper, we study two CMs: the Multinomial Logit (MNL) and one based on CatBoost, an algorithm for gradient boosting on decision trees (referred to as ML). We train these models to predict the target users' choices, given a set of system-generated recommendations. We found that the ML model outperforms the MNL one with regard to classical accuracy metrics (precision and balanced accuracy), while MNL generates choices that better reproduce the distribution of the real choices (as measured by the Gini index, Shannon entropy and catalogue coverage). We therefore argue that MNL is more suitable for simulating users' behaviour when the goal is to understand the global impact of a deployed RS.
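As an illustration of the quantities named above, the following minimal sketch samples choices from an MNL model and computes the three distribution measures on the resulting choice counts. It is not the implementation used in this study: the function names, the utility values, and the simplification that every simulated user sees the same recommendation list are assumptions made for the example, and the exact metric definitions used in the paper may differ.

```python
import math
import random
from collections import Counter


def mnl_probabilities(utilities):
    """MNL choice probabilities: P(i) = exp(u_i) / sum_j exp(u_j)."""
    m = max(utilities)  # subtract the max for numerical stability
    weights = [math.exp(u - m) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def simulate_choices(utilities, n_users, seed=0):
    """Sample one choice per user; return a count for every catalogue item.

    Assumption: all users face the same items with the same utilities.
    """
    rng = random.Random(seed)
    probs = mnl_probabilities(utilities)
    chosen = rng.choices(range(len(utilities)), weights=probs, k=n_users)
    tally = Counter(chosen)
    return [tally.get(i, 0) for i in range(len(utilities))]


def gini_index(counts):
    """Gini index of the counts: 0 means evenly spread choices."""
    xs = sorted(counts)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


def shannon_entropy(counts):
    """Shannon entropy (in bits) of the empirical choice distribution."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def catalogue_coverage(counts):
    """Fraction of catalogue items chosen at least once."""
    return sum(1 for c in counts if c > 0) / len(counts)


if __name__ == "__main__":
    # Hypothetical utilities for a five-item recommendation list.
    counts = simulate_choices([2.0, 1.0, 0.5, 0.0, -1.0], n_users=1000)
    print("counts:", counts)
    print("Gini:", gini_index(counts))
    print("entropy:", shannon_entropy(counts))
    print("coverage:", catalogue_coverage(counts))
```

Computing the same three measures on the real choices and on the choices produced by each CM gives one way to compare how closely a simulated choice distribution matches the observed one.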