Abstract
Untargeted LC-MS/MS is a powerful approach for large-scale metabolomics studies, yet reproducible and efficient analysis of such data remains a major challenge. While R offers highly customizable workflows suited to diverse experimental and instrumental setups, integrating specialized packages into coherent, scalable pipelines, especially for large cohort analyses, is often complex and fragmented. To address this gap, we present Metabonaut, an educational resource comprising a series of reproducible tutorials for untargeted LC-MS/MS metabolomics data analysis using R and Bioconductor. Built around a representative LC-MS/MS dataset, the tutorials demonstrate how to construct an end-to-end analysis workflow using tools such as xcms and other packages from the RforMassSpectrometry ecosystem. Each tutorial guides users step by step through the analysis process, from raw data preprocessing and feature detection to statistical analysis and annotation, emphasizing reproducibility, adaptability, and interoperability. As a case study, we include an analysis of human plasma samples comparing individuals with cardiovascular disease to healthy controls, illustrating quality control, normalization, and differential abundance analysis. Beyond core workflows, Metabonaut offers modules on data inspection and quality assessment, flexible alignment for integrating new data into existing preprocessed datasets, and cross-language interoperability, highlighted through spectral annotation using Python’s matchms library. All tutorials are designed to remain executable over time and can be used independently or combined into a comprehensive “super-vignette.”
This work is supported by the European Union under the HORIZON-MSCA-2021 project 101073062: HUMAN – Harmonising and Unifying Blood Metabolic Analysis Networks.