Abstract
Recommender Systems (RSs) are software tools that assist consumers in their decision-making. They typically leverage historical information about user/system interactions, e.g., ratings and purchases, to generate relevant recommendations for users. RSs are often evaluated by measuring metrics such as precision and recall on the generated recommendations. In this thesis, we instead investigate the possibility of assessing RSs by their “long-term” impact on users’ choice behaviour, an aspect that has received much less attention so far. To do so, we design algorithmic offline simulations that model user-RS interactions over a long (simulated) period of time. In this approach, artificial agents (users) are assumed to adopt a “choice model” (CM) and make repeated choices over time, while a target RS influences these choices. An RS’s long-term impact can then be quantified by measuring metrics that characterise the users’ choice distribution, e.g., the Gini index and the coverage of the catalogue. We leverage the proposed simulation to compare alternative RSs by their impact on users’ choice distribution. We find several important and non-trivial effects of RSs, which shows the importance of anticipating the effects of an RS before deploying it. We also use the designed simulation to study how specific user choice behaviours, e.g., the tendency to choose popular or novel items, influence the choice distribution and the performance of an RS. By simulating such choice behaviours with algorithmic choice models and analysing the simulated choices, we find that, besides the RS, the prevalent CM of a user population can independently affect the choice distribution. For instance, when users tend to choose more recent items, the choice diversity is smaller in the long run than when they tend to choose more popular items, irrespective of the (simulated) RS. On the other hand, the RS and the CM can also jointly affect the choice distribution.
Finally, we identify a research gap related to the validity of simulation studies, which stems from the lack of any assessment of the accuracy of the CMs used in simulations. Accordingly, we use simulations to assess the accuracy of alternative CMs in predicting users’ choices when they are exposed to recommendations. Our results show that sophisticated Machine Learning (ML) based CMs are more accurate in predicting an individual’s choice than a CM that comes from economics research, namely the multinomial logit (MNL) CM. However, the MNL choice model reproduces the true choice distribution better than the ML-based models.
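To make the quantities named above concrete, the following Python sketch illustrates a multinomial logit choice rule and two choice-distribution metrics, the Gini index and catalogue coverage. It is only an illustration under stated assumptions, not the simulation used in the thesis: the function names, the utility values, and the use of the standard Gini coefficient over per-item choice counts are all hypothetical choices made for this example.

```python
import math
import random
from typing import List, Sequence


def mnl_probabilities(utilities: Sequence[float]) -> List[float]:
    """Multinomial logit: P(i) = exp(u_i) / sum_j exp(u_j)."""
    m = max(utilities)  # subtract the maximum for numerical stability
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def simulate_choices(utilities: Sequence[float], n_choices: int, seed: int = 0) -> List[int]:
    """Draw repeated choices from the MNL distribution and count them per item.

    Assumption: utilities are fixed over time; a real simulation would let
    the RS and past choices change them between steps.
    """
    rng = random.Random(seed)
    probs = mnl_probabilities(utilities)
    counts = [0] * len(utilities)
    for item in rng.choices(range(len(utilities)), weights=probs, k=n_choices):
        counts[item] += 1
    return counts


def gini_index(counts: Sequence[int]) -> float:
    """Standard Gini coefficient of the choice counts (0 = perfectly even)."""
    n = len(counts)
    total = sum(counts)
    if n == 0 or total == 0:
        return 0.0
    ordered = sorted(counts)
    weighted = sum((rank + 1) * c for rank, c in enumerate(ordered))
    return (2 * weighted) / (n * total) - (n + 1) / n


def coverage(counts: Sequence[int]) -> float:
    """Fraction of catalogue items chosen at least once."""
    return sum(1 for c in counts if c > 0) / len(counts)


if __name__ == "__main__":
    # Hypothetical utilities for a five-item catalogue.
    counts = simulate_choices([2.0, 1.0, 0.5, 0.0, -1.0], n_choices=1000, seed=42)
    print(counts, gini_index(counts), coverage(counts))
```

In this sketch, a higher Gini index means that choices are concentrated on fewer items, while coverage reports how much of the catalogue is chosen at all; comparing such values across simulated RSs or CMs is the kind of analysis described above.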