Clustering dependent observations with copula functions
This paper deals with the problem of clustering dependent observations according to their underlying complex generating process. Di Lascio and Giannerini (2012) introduced the CoClust, a clustering algorithm based on copula functions that achieves the task but has a high computational burden. Moreover, the CoClust automatically allocates all the observations to the clusters; thus it cannot discard potentially irrelevant observations. In this paper we introduce an improved version of the CoClust that both overcomes these issues and performs better in many respects. By means of a Monte Carlo study we investigate the features of the proposed algorithm and show that it consistently improves on the original CoClust. The validity of our proposal is also supported by applications to real data sets of human breast tumor samples, for which the algorithm provides a meaningful biological interpretation. The new algorithm is implemented and made available through an updated version of the R package CoClust.