Abstract
Introduction: Within the field of computational metabolomics, significant efforts have been made to develop open-source software for (pre)processing untargeted LC-MS data. However, advances in tools for targeted preprocessing have been comparatively limited, and the few existing open-source options are often neither maintained nor thoroughly tested on biological data. Furthermore, reliance on vendor-specific software can hamper the translation and transparency of methods and results, and remains time-consuming. We therefore present TARDIS, an open-source R package with a graphical user interface (GUI) for targeted LC-MS data analysis.
Aim: To develop and benchmark an R package that enables high-throughput, reliable and reproducible LC-MS peak integration.
Methods: A total of 547 saliva, 329 urine and 292 fecal samples from three different cohorts (FAME, ENVIRONAGE and FGFP) were analyzed in-house (LC-MS metabolomics and lipidomics) and preprocessed using TARDIS and Xcalibur™. Preprocessing time and the coefficients of variation (CV) of targets reported by each software were compared, and the correlation between peak areas was quantified by Spearman’s rank correlation (ρ) and linear regression (R²). Integrated peaks were labelled as “Bad”, “Ambiguous” or “Good”, and a machine learning classifier was trained on TARDIS-derived quality metrics to predict these labels.
Results: TARDIS detected 215 targets in 700 LC-MS runs in less than four hours. The CV of target peaks was comparable between TARDIS and Xcalibur™, and ρ and R² between reported areas were high (mean ρ = 0.96, mean R² = 0.94). A random forest classifier trained on multi-matrix data correctly predicted 98.57% of peaks labelled “Bad”, allowing low-quality targets to be filtered out without falsely discarding high-quality targets.
Conclusion: Our results show that TARDIS provides results comparable to those of state-of-the-art vendor software while enhancing reproducibility, transparency and time efficiency. The included quality metrics revealed the potential of machine learning to enhance peak filtering. TARDIS is freely available at https://github.com/UGent-LIMET/TARDIS.
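
The three agreement metrics named in the Methods (CV, Spearman’s ρ and R²) can be illustrated with a minimal Python sketch. It is not taken from the TARDIS code base. The function names and the peak areas are hypothetical, the CV is assumed to be the sample standard deviation divided by the mean, and R² is assumed to come from a simple least-squares regression of one software's areas on the other's, in which case it equals the squared Pearson correlation.

```python
from statistics import mean, stdev


def coefficient_of_variation(areas):
    """CV in percent: sample standard deviation divided by the mean."""
    return 100.0 * stdev(areas) / mean(areas)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def r_squared(x, y):
    """R² of a simple linear regression of y on x (squared Pearson r)."""
    return pearson_r(x, y) ** 2


# Made-up peak areas for one target across five runs (illustrative only).
tardis_areas = [1.00e6, 1.52e6, 0.81e6, 2.10e6, 1.25e6]
vendor_areas = [1.05e6, 1.49e6, 0.85e6, 2.02e6, 1.31e6]

rho = spearman_rho(tardis_areas, vendor_areas)
r2 = r_squared(tardis_areas, vendor_areas)
cv_tardis = coefficient_of_variation(tardis_areas)
cv_vendor = coefficient_of_variation(vendor_areas)
print(f"rho={rho:.3f}  R2={r2:.3f}  CV={cv_tardis:.1f}% vs {cv_vendor:.1f}%")
```

In a comparison of this kind, ρ and R² would be computed per target and then averaged across targets, which is one plausible reading of the reported mean values.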