Abstract
Recent years have seen increased attention in the dairy sector towards cattle feeding regimens, with grass-based ones leading to healthier and more expensive products that are therefore more susceptible to adulteration. Hence, statistical tools that guarantee milk authenticity and discriminate between samples from different diets are needed. Spectroscopy data are routinely used in this context; nonetheless, they introduce challenges, such as high dimensionality and the peculiar relationships among wavelengths, that have to be tackled. In this work, a modification of standard Factor Analysis is proposed. The data are mapped into a low-dimensional latent space while the observed variables are simultaneously clustered, thus highlighting redundancies and providing more parsimonious summaries of the data, as well as insights into diet-induced differences in the milk.