Abstract
The performance of a Recommender System (RS) is often assessed offline, by measuring the system's accuracy in predicting or reconstructing the observed user ratings or choices. As a consequence, RSs optimised for that performance measure may suggest items that the user would judge correct but uninteresting, because they lack novelty. In fact, these systems are hardly able to generalise the preferences directly derived from the user’s observed behaviour. To overcome this problem, a novel RS approach has been proposed. It applies clustering to users’ observed sequences of choices in order to identify like-behaving users and to learn a user behavioural model for each cluster. It then leverages the learned behavioural model to generate novel and relevant recommendations, rather than directly recommending the users’ predicted choices. In this paper we assess, in a live user study, how users evaluate recommendations produced by more traditional approaches and by the proposed one along different dimensions. The results illustrate the differences between the compared approaches, as well as the benefits and limitations of the proposed RS.