Abstract
There is a rich literature on the effects of wine characteristics on prices, particularly on the impact of expert ratings. Oczkowski and Doucouliagos (2015) review key papers in this literature and show that expert ratings have a moderate influence on wine prices across various sample designs, countries or regions, and time periods. Schamel (2014) analyses the impact of winery reputation and expert ratings on retail prices, comparing co-ops and private wineries in our study regions, Trentino and Alto Adige/Südtirol.
With the rise of social media platforms, ordinary consumers can rate wines and make their opinions known to others. Popular examples of platforms dedicated to wine ratings are vivino.com and cellartracker.com. This paper was motivated by the observation that frequently rated (i.e., highly popular) but relatively ordinary wines receive high average ratings.
To date, several authors have explored consumer wine ratings found online. Caldas and Rebelo (2013) use CellarTracker data and expert ratings to assess the consistency of ratings for a small sample of Portuguese wines. Their model shows that, minor differences aside, the different rating systems are reasonably consistent. Oczkowski and Pawsey (2019) assess the relative impact of consumer ratings from vivino.com and expert ratings on Australian wine prices using a hedonic model for wines rated by both consumers and experts. Their analysis lends some credibility to the claim made by vivino.com, in the sense that consumer ratings correlate strongly with the ratings of several wine experts. The consumer ratings used in their analysis are transformed to have the same quantiles as the expert ratings by James Halliday to which they are compared. Differences between expert and consumer ratings for wine have been explained by several factors, including different objectives when evaluating a wine, heterogeneous consumers, and the different contexts in which expert and consumer (community) ratings are assigned. Gokcekus and Nottebaum (2011) also note that consumer ratings tend to be made after expert ratings are available and that they are not the result of blind testing. Kotonya et al. (2018) analysed over one million vivino.com ratings collected between November 2016 and March 2017. They found that wines are typically assessed within their geographical region of origin and that community comments and ratings express a rich knowledge of the wines that is comparable to expert opinion. However, some open research questions remain, particularly regarding wines that are normally not rated by experts but are rated within online communities. We claim that a missing topic in the literature is how consumer ratings are affected by the characteristics of the rated wines.
Moreover, the literature lacks a critical review of the reliability of data gathered from these platforms, which requires detailed knowledge and checks of the sample wines rated.
Therefore, this exploratory study investigates the extent to which wine ratings assigned by consumers on social media platforms can be explained by the wine's popularity, variety, producer, regional origin, and vintage. Moreover, it provides an initial analysis of data reliability for econometric purposes.
We collect a detailed data set of consumer wine ratings from the vivino.com platform. To be able to check data quality thoroughly, we restrict our analysis to recent-vintage wines from two specific provinces in northern Italy (Trentino and Alto Adige/Südtirol). The original data set was obtained in 2019 and contains 2534 wine ratings. We retained only wine ratings associated with a specific vintage, dropping those with an unclear or non-unique vintage. The final data set analysed includes 2354 wines: 83% from Alto Adige/Südtirol (88 distinct producers) and 17% from Trentino (66 distinct producers). Despite considering only recent vintages, it was difficult to obtain consistent price data, as only about 29% of the wines rated were sold directly on the platform. Moreover, we noticed errors in the allocation of wines to the appropriate geographical indication (GI). Thus, we chose to omit information on prices and GIs from the quantitative analysis. The data set includes the name of the producer (producer), the vintage (vintage), the average rating from consumers who purchased the wine (rating), the number of ratings received by each vintage (nratings), the province from which the wine originates (region), and the wine variety (variety).
After a thorough descriptive analysis that helped assess data reliability, we apply binomial logistic regression (LR) to explore the factors determining ratings below and above the sample mean (3.82). Specifically, we identify wines rated one standard deviation (0.253) above or below the sample mean and create two dummy variables identifying high-ratings and low-ratings. We use these dummies as dependent variables in two LR models. We classify 621 wines as high-ratings and 327 wines as low-ratings. In our LR models, we consider the number of ratings obtained by each wine (nratings), vintage, producer, and variety as regressors. The nratings variable may be considered a proxy for the wine's popularity on the platform. The original nratings variable is continuous and is recoded into an ordinal variable based on quantiles to handle non-normality. Vintages are represented by five dummies coded as follows: (vintage1) 2013 or earlier, (vintage2) 2014, (vintage3) 2015, (vintage4) 2016, (vintage5) 2017. Varieties and producers are also included as dummies.
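The construction of the two dependent variables and the quantile recoding of nratings can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names are hypothetical, the cut-off is assumed to be "at least one sample standard deviation from the mean", and the number of quantile bins (four) is an assumption, since the text does not state it.

```python
from bisect import bisect_left
from statistics import mean, quantiles, stdev


def rating_dummies(ratings):
    """Return (high, low) lists of 0/1 flags.

    Assumption: a wine is 'high' if its rating is at least one sample
    standard deviation above the mean, and 'low' if it is at least one
    sample standard deviation below the mean.
    """
    m = mean(ratings)
    sd = stdev(ratings)
    high = [1 if r >= m + sd else 0 for r in ratings]
    low = [1 if r <= m - sd else 0 for r in ratings]
    return high, low


def quantile_code(values, n_bins=4):
    """Recode a continuous variable into ordinal codes 1..n_bins.

    Assumption: quartiles (n_bins=4); cut points come from
    statistics.quantiles with its default 'exclusive' method.
    """
    cuts = quantiles(values, n=n_bins)
    return [bisect_left(cuts, v) + 1 for v in values]


# Toy data, invented purely for illustration.
toy_ratings = [3.0, 3.5, 4.0, 4.5, 5.0]
toy_nratings = [1, 2, 3, 4, 5, 6, 7, 8]
high, low = rating_dummies(toy_ratings)
codes = quantile_code(toy_nratings)
```

Wines within one standard deviation of the mean are coded 0 on both dummies, so each of the two models contrasts one tail of the distribution with all remaining wines.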
The two LR models (low-ratings and high-ratings) have an R2 of 16% and 17%, respectively. Multicollinearity tests revealed no significant concerns, as all VIF values are below the recommended maximum threshold of 10 (Hair et al., 1995). Our preliminary findings suggest that wines rated by a greater number of consumers (nratings) are less likely to fall into the low-ratings category. Wines from the 2014 vintage, widely acknowledged as a climatically difficult vintage, have higher odds of falling into the low-ratings category. Significant producer and variety effects also emerged, all increasing the likelihood of a wine being rated in the low-ratings category. Among the significant producers and varieties analysed, Cantina La-Vis (β = 4.12; p < 0.0001) and Schiava (β = 2.78; p < 0.0001), respectively, record the greatest effects on the dependent variable (i.e., low-ratings). Müller-Thurgau, Chardonnay and Pinot Noir positively predict low-ratings as well, with sizeable effects (Müller-Thurgau: β = 2.81; p < 0.0001; Chardonnay: β = 1.28; p = 0.036; Pinot Noir: β = 1.28; p = 0.029).
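To give a sense of magnitude, the reported coefficients can be converted to odds ratios. The sketch below assumes that the β values are unstandardised log-odds coefficients from the low-ratings model, which the text does not state explicitly; under that assumption exp(β) is the multiplicative change in the odds of a low rating relative to the omitted reference category.

```python
import math

# Coefficients of the low-ratings model as reported in the text.
reported_betas = {
    "Cantina La-Vis": 4.12,
    "Schiava": 2.78,
    "Müller-Thurgau": 2.81,
    "Chardonnay": 1.28,
    "Pinot Noir": 1.28,
}


def odds_ratio(beta):
    """Return exp(beta), valid only if beta is a log-odds coefficient."""
    return math.exp(beta)


odds_ratios = {name: odds_ratio(b) for name, b in reported_betas.items()}

for name, value in odds_ratios.items():
    print(f"{name}: odds ratio = {value:.1f}")
```

Read this way, a coefficient of 1.28 corresponds to odds of a low rating roughly 3.6 times those of the reference category, holding the other regressors constant.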
For the high-ratings model, vintages older than 2013 and the 2015 vintage have positive and significant coefficients (β = 0.93; p < 0.0001, and β = 0.68; p = 0.006, respectively). Interestingly, the number of ratings is not a significant predictor of high-ratings. Producer and variety effects emerge as well, but almost all of them decrease the odds of a wine being assigned to the high-ratings category. White blends and the Cabernet-Merlot-Lagrein blend are the only exceptions, being more likely to receive high-ratings (β = 0.78; p = 0.072, and β = 0.85; p = 0.055, respectively).
A preliminary indicator of producer reputation on the platform is computed as the product of the total number of ratings received by a producer's wines and the producer's average rating, and is then standardised as a z-score (Prodrepindex_std). When Prodrepindex_std is introduced into the models, the effect of nratings remains consistent. Moreover, Prodrepindex_std negatively predicts low-ratings while increasing the odds of high-ratings.
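A minimal sketch of how such an index could be computed is given below. The function name and toy data are hypothetical. Two details are assumptions, because the text leaves them open: the producer's average rating is taken as the unweighted mean of its wines' average ratings, and the z-score uses the population standard deviation across producers.

```python
from collections import defaultdict
from statistics import mean, pstdev


def producer_reputation(wines):
    """Return {producer: z-scored reputation index}.

    wines: iterable of (producer, nratings, rating) tuples.
    Raw index = total number of ratings * mean rating of the producer.
    Assumptions: unweighted mean rating; population SD for the z-score.
    Requires at least two producers with different raw index values.
    """
    total_ratings = defaultdict(int)
    wine_ratings = defaultdict(list)
    for producer, nratings, rating in wines:
        total_ratings[producer] += nratings
        wine_ratings[producer].append(rating)

    raw = {p: total_ratings[p] * mean(wine_ratings[p]) for p in total_ratings}
    values = list(raw.values())
    mu, sigma = mean(values), pstdev(values)
    return {p: (v - mu) / sigma for p, v in raw.items()}


# Toy data, invented purely for illustration.
toy_wines = [("A", 10, 4.0), ("A", 30, 4.0), ("B", 20, 3.0)]
index_std = producer_reputation(toy_wines)
```

Because the raw index multiplies volume by average rating, producers with many rated wines dominate it, so it mixes platform presence with perceived quality.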
To conclude, the results of this study on factors influencing wine ratings in online communities reveal that receiving more ratings (i.e., greater popularity of the wine on the platform) appears to discourage low-ratings without encouraging high-ratings. Consumers seem to recognise specific vintages as either higher quality (e.g., 2015) or lower quality (e.g., 2014), demonstrating specific product knowledge, in line with Kotonya et al. (2018). Variety and producer effects emerge in both models but mainly explain assignment to the low-ratings category. Lastly, producers' reputation on the platform seems to positively affect wine ratings.
Regarding data reliability, the Vivino sample analysed does not represent all producers adequately, despite including all Trentino-Alto Adige wines on the platform. Indeed, some producers offer a considerably larger selection of wines than the one available on Vivino. Hence, the sample does not necessarily represent the actual offer, which may bias the results. Additionally, some producers may have been on the platform for longer than others, with repercussions for their reputation in the community. This is a variable we did not control for in this study. These limitations and the incorrect assignment of GIs raise some concerns about the reliability of wine community data for econometric applications.