Abstract
Microblogs and social media applications continue to grow in reach and importance. Users of Twitter, currently the most popular microblogging platform, create more than a billion posts (called tweets) every week. Among the many types of information being shared, some users post their music listening behavior, which has made Twitter interesting for the Music Information Retrieval (MIR) community. Depending on the device and personal settings, some users also provide geographic coordinates for their microposts.
Having continuously crawled and analyzed tweets for more than 500 days (17 months), we can now present the “Million Musical Tweet Dataset” (MMTD), the largest publicly available source of microblog-based music listening histories that includes geographic, temporal, and other contextual information. This additional information sets the MMTD apart from other datasets that provide music listening histories.
We introduce the dataset, give basic statistics about its composition, and show how it enables the detection of new contextual music listening patterns by performing a comprehensive statistical investigation of the correlation between music taste and day of the week, hour of the day, and country.