From biological to social networks: Link prediction based on multi-way spectral clustering
MetadataShow full item record
SubjectBioinformatics databases; Cross-disciplinary applications; Clustering; Information filtering
Link prediction in protein-protein interaction networks (PPINs) is an important task in biology, since the vast majority of biological functions involve such protein interactions. Link prediction is also important for online social networks (OSNs), which provide predictions about who is a friend of whom. Many link prediction methods for PPINs/OSNs are local-based and do not exploit all network structure, which limits prediction accuracy. On the other hand, there are global approaches to detect the overall path structure in a network, being computationally prohibitive for huge-size PPINs/OSNs. In this paper, we enhance a previously proposed multi-way spectral clustering method by introducing new ways to capture node proximity in both PPINs/OSNs. Our new enhanced method uses information obtained from the top few eigenvectors of the normalized Laplacian matrix. As a result, it produces a less noisy matrix, which is smaller and more compact than the original one. In this way, we are able to provide faster and more accurate link predictions. Moreover, our new spectral clustering model is based on the well-known Bray-Curtis coefficient to measure proximity between two nodes. Compared to traditional clustering algorithms, such as k-means and DBSCAN, which assume globular (convex) regions in Euclidean space, our approach is more flexible in capturing the non-connected components of a social graph and a wider range of cluster geometries. We perform an extensive experimental comparison of the proposed method against existing link prediction algorithms and k-means algorithm, using two synthetic data sets, three real social networks and three real human protein data sets. Our experimental results show that our SpectralLink algorithm outperforms the local approaches, the k-means algorithm and another spectral clustering method in terms of effectiveness, whereas it is more efficient than the global approaches. © 2013 Elsevier B.V. All rights reserved.
Showing items related by title, author, creator and subject.
Affordable and Energy-Efficient Cloud Computing Clusters: The Bolzano Raspberry Pi Cloud Cluster Experiment Abrahamsson P; Helmer S; Phaphoom N; Nicolodi L; Preda N; Miori L; Angriman M; Rikkilä J; Wang X; Hamily K; Bugoloni S (IEEE, 2013)We present our ongoing work building a Raspberry Pi cluster consisting of 300 nodes. The unique characteristics of this single board computer pose several challenges, but also offer a number of interesting opportunities. ...
Durante, F; Pappadà, R; Torelli, N (2014)A methodology is presented for clustering financial time series according to the association in the tail of their distribution. The procedure is based on the calculation of suitable pairwise conditional Spearman’s correlation ...