Logo image
Probgraph: High-performance and high-accuracy graph mining with probabilistic set representations
Conference proceeding   Open access   Peer reviewed

Probgraph: High-performance and high-accuracy graph mining with probabilistic set representations

M Besta, C Miglioli, Paolo Sylos Labini, J Tětek, P Iff, R Kanakagiri, S Ashkboos, K Janda, M Podstawski, G Kwaśniewski, …
Proceedings of SC22: The International Conference for High Performance Computing, Networking, Storage and Analysis, Dallas, Texas, November 13-18, 2022, pp.1-17
2022 International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2022 (Dallas, Texas, 13/11/2022–18/11/2022)
2022
Handle:
https://hdl.handle.net/10863/51380

Abstract

Approximate Community Detection Approximate Graph Clustering Approximate Graph Mining Approximate Graph Pattern Matching Approximate Triangle Counting Bloom Filters Graph Sketching High-Performance Graph Computations K Minimum Values MinHash
Important graph mining problems such as Clustering are computationally demanding. To significantly accelerate these problems, we propose ProbGraph: a graph representation that enables simple and fast approximate parallel graph mining with strong theoretical guarantees on work, depth, and result accuracy. The key idea is to represent sets of vertices using probabilistic set representations such as Bloom filters. These representations are much faster to process than the original vertex sets thanks to vectorizability and small size. We use these representations as building blocks in important parallel graph mining algorithms such as Clique Counting or Clustering. When enhanced with ProbGraph, these algorithms significantly outperform tuned parallel exact baselines (up to nearly 50 x on 32 cores) while ensuring accuracy of more than 90% for many input graph datasets. Our novel bounds and algorithms based on probabilistic set representations with desirable statistical properties are of separate interest for the data analytics community. Proofs of theorems & more results: http://arxiv.org/abs/2208.11469
pdf
ProbGraph1.77 MBDownloadView
Open Access
pdf
ProbGraph_High-Performance_and_High-Accuracy_Graph_Mining_with_Probabilistic_Set_Representations1.92 MBDownloadView
Open Access
url
https://ieeexplore.ieee.org/abstract/document/10046121View

Details

Metrics

1 Record Views
Logo image