Mar 20, 2026


This protocol is a draft, published without a DOI.
(Un)Targeted Metabolomics_DataAnalysis_MTH Tours
  • Jérémy Monteiro1,2,
  • Antoine Lefèvre1,2,3,
  • Camille Dupuy1,2,3,
  • Lydie Nadal-Desbarats1,2,3
  • 1Plateforme de Métabolomique et d'Analyses Chimiques, US61 ASB, Université de Tours, CHRU Tours, Inserm, Tours, France;
  • 2MetaboHUB-Tours, Tours, France;
  • 3Université de Tours, INSERM, Imaging Brain & Neuropsychiatry iBraiN U1253, 37032, Tours, France
Protocol Citation: Jérémy Monteiro, Antoine Lefèvre, Camille Dupuy, Lydie Nadal-Desbarats 2026. (Un)Targeted Metabolomics_DataAnalysis_MTH Tours. protocols.io https://protocols.io/view/un-targeted-metabolomics-dataanalysis-mth-tours-hm8mb49u7
License: This is an open access protocol distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Protocol status: Working
We use this protocol and it's working
Created: January 23, 2026
Last Modified: March 20, 2026
Protocol Integer ID: 239597
Keywords: LC-MS, metabolomics, data analysis, data interpretation, lipidomic data, metabohub, identified compound, sample preparation, sample, data
Funders Acknowledgements:
Agence Nationale de la Recherche
Grant ID: ANR-11-INBS-0010 MetaboHUB
ANR-21-ESRE-0035
Grant ID: ANR-21-ESRE-0035
Abstract
This protocol describes how metabolomic and lipidomic data produced at MetaboHUB-Tours are analysed. It follows on from the sample preparation and LC-MS parameter protocols.
The result is a list of identified compounds (with a stable signal in the sequence) in the samples, based on a confidence level for the identification.
To further interpret the data, software and web page suggestions are provided.
Guidelines
For every software tool used in this protocol, the original paper and the website link are given in the protocol.

Materials
Computer, with at least 4 cores.

All the required software is listed in the protocol, with a link to the official website or repository.
Troubleshooting
Problem
MSconvert can create artefacts and false fragments during conversion.
Solution
Not provided
Safety warnings
This protocol has only been used on LC-MS and LC-MS/MS data.

Regarding data storage: if your data are of clinical origin, remember that they must be anonymised (NIH recommendations: https://pmc.ncbi.nlm.nih.gov/articles/PMC9373195/).
Ethics statement
This protocol only concerns the analysis of data files, so no specific ethics statement is necessary.
Before start
Remember that at MetaboHUB-Tours every sample is injected three times (positive reversed-phase, negative reversed-phase and positive HILIC), so some steps of the protocol must be carried out for each modality.

1.1. In-House MS² Spectral Library
All standard compounds were injected on the LC-MS system in LC-MS² mode.
The standards come from the MSMLS, OAMLS, BACSMLS and FAMLS chemical libraries from IROA Technologies (purchased from Merck).

Feature detection and MS² spectrum extraction were carried out on the raw files with Spec2Xtract.

Note
Currently, Spec2Xtract does not work with Waters files. You can use MSconvert instead (see step 6).

Software: Spec2Xtract
Developer: Sylvain Dechaumet
The output files are in ".msp" format.
All these files were concatenated with FragHub, giving a single library file for positive mode (POS) and one for negative mode (NEG).

Software: FragHub
Developer: Axel Dablanc

Citation: Axel Dablanc, Solweig Hennechart, Amélie Perez, Guillaume Cabanac, Yann Guitton, Nils Paulhe, Bernard Lyan, Emilien L. Jamin, Franck Giacomoni, Guillaume Marti (2026). FragHub: A Mass Spectral Library Data Integration... Analytical Chemistry.

1.2. Online Available MS² Spectral Library
GNPS libraries related to the clinical context were downloaded: GNPS-NIH-CLINICALCOLLECTION1, GNPS-NIH-CLINICALCOLLECTION2, GNPS-NIH-SMALLMOLECULEPHARMACOLOGICALLYACTIVE, GNPS-COLLECTIONS-PESTICIDES-POSITIVE, GNPS-COLLECTIONS-PESTICIDES-NEGATIVE, GNPS-NIST14-MATCHES and MONA.
Software: GNPS
Developer: Mingxun Wang, Jeremy J Carver, Vanessa V Phelan, Laura M Sanchez, Neha Garg, Yao Peng, Don Duy Nguyen, Jeramie Watrous, Clifford...

Citation: Mingxun Wang, Jeremy J Carver et al. (2016). Sharing and community curation of mass spectrometr... Nature Biotechnology.


All these libraries were curated and concatenated, and their redundancies resolved, with FragHub, giving a single library file for positive mode (POS) and one for negative mode (NEG).
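As a rough illustration of what concatenation and redundancy removal mean for ".msp" libraries, the sketch below parses MSP text, merges several libraries and drops records that share the same name, precursor m/z and fragment m/z values. It is not FragHub's actual algorithm, which performs far more curation; the field names (NAME, PRECURSORMZ), the duplicate criterion and the function names are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Spectrum:
    meta: Dict[str, str] = field(default_factory=dict)
    peaks: List[Tuple[float, float]] = field(default_factory=list)


def parse_msp(text: str) -> List[Spectrum]:
    """Parse MSP text in which records are separated by blank lines."""
    spectra: List[Spectrum] = []
    current = Spectrum()
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            # A blank line closes the current record, if it holds anything.
            if current.meta or current.peaks:
                spectra.append(current)
                current = Spectrum()
            continue
        if line[0].isdigit():
            # Peak line: "m/z intensity" (space or tab separated).
            mz, intensity = line.split()[:2]
            current.peaks.append((float(mz), float(intensity)))
        else:
            # Metadata line: "KEY: value"; keys are upper-cased.
            key, _, value = line.partition(":")
            current.meta[key.strip().upper()] = value.strip()
    if current.meta or current.peaks:
        spectra.append(current)
    return spectra


def merge_libraries(libraries: List[str], mz_decimals: int = 3) -> List[Spectrum]:
    """Concatenate libraries, keeping the first copy of each duplicate.

    Assumed duplicate criterion: same name (case-insensitive), same
    precursor m/z string and same rounded fragment m/z values.
    """
    seen = set()
    merged: List[Spectrum] = []
    for text in libraries:
        for spectrum in parse_msp(text):
            key = (
                spectrum.meta.get("NAME", "").lower(),
                spectrum.meta.get("PRECURSORMZ", ""),
                tuple(round(mz, mz_decimals) for mz, _ in spectrum.peaks),
            )
            if key not in seen:
                seen.add(key)
                merged.append(spectrum)
    return merged
```

In practice the POS and NEG libraries would be merged separately, one call per polarity.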
1.3. Study QC samples MS² Analysis
Feature detection and MS² spectrum extraction were carried out on one QC raw file with MSconvert from ProteoWizard (NB: this step can also be performed on specific QCs for subgroups or on different types of samples).

Features from the QC are identified in two stages, using Flash Entropy Search:
  1. Only with the in-house msp files generated in step 1.1 --> level of identification = 1
  2. Only with the GNPS-based spectral library msp files generated in step 1.2 --> level of identification = 2 or 3
These identification levels are based on the following reference:

Citation: Estelle Rathahao-Paris, Sandra Alves, Christophe Junot, Jean-Claude Tabet (2015). High resolution mass spectrometry for structural... Metabolomics.


All MS² spectra comparison, and therefore identification, is done manually with the following parameters:
  • Report top n hits = 5
  • Precursor m/z tolerance (in Da) = 0.01
  • Product ions m/z tolerance (in Da) = 0.02
  • Hits with identity_search-score < 0.75 are not checked

The output is a csv file containing all the hits found. Compounds with an identity_search-score < 0.75 are deleted. Identifications are then sorted based on the hits and on manual curation of the "experimental vs. theoretical" MS² spectra comparison.
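The score filter described above can be sketched as follows. This is only an illustration of the thresholding that precedes manual curation: the Hit structure, its field names and the "in-house"/"public" labels are hypothetical and do not reflect the actual csv layout produced by Flash Entropy Search.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class Hit(NamedTuple):
    feature_id: str  # hypothetical identifier of a QC feature
    compound: str    # name of the matched library compound
    score: float     # identity-search score, between 0 and 1
    library: str     # "in-house" (step 1.1) or "public" (step 1.2)


def shortlist_hits(
    hits: Iterable[Hit], min_score: float = 0.75, top_n: int = 5
) -> Dict[str, List[Hit]]:
    """Drop hits scoring below min_score and keep the top_n best per feature."""
    by_feature: Dict[str, List[Hit]] = defaultdict(list)
    for hit in hits:
        if hit.score >= min_score:
            by_feature[hit.feature_id].append(hit)
    return {
        feature_id: sorted(candidates, key=lambda h: h.score, reverse=True)[:top_n]
        for feature_id, candidates in by_feature.items()
    }


def candidate_level(hit: Hit) -> str:
    """Identification level suggested by the library the hit comes from."""
    return "1" if hit.library == "in-house" else "2 or 3"
```

The shortlist only narrows down what has to be inspected by eye; the final decision still rests on the manual comparison of the spectra.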

Software: MSconvert (ProteoWizard)
Developer: Chambers, M.C., MacLean, B., Burke, R., Amode, D., Ruderman, D.L., Neumann, S., Gatto, L., Fischer, B., Pratt, B., Egertson, J., ...

Software: Flash Entropy Search
Developer: Yuanyue Li

Citation: Chambers et al. (2012). A cross-platform toolkit for mass spectrometry and... Nature Biotechnology.

Citation: Yuanyue Li & Oliver Fiehn (2023). Flash entropy search to query all mass spectral li... Nature Methods.

2. Feature Detection from all study samples
All samples are injected in MS1 mode and converted to mzML format with MSconvert from ProteoWizard.
The mzML files are then transferred to Workflow4Metabolomics for analysis.

Software: Workflow4Metabolomics (doi: 10.1093/bioinformatics/btu813)

The output is a csv file with all the features detected in the samples, plus a folder containing a png file of the chromatographic peak for each detected feature.
3. Data Curation
  • Data Curation = Signal Normalization & Metabolite Exclusion:
- Normalization #1 (if possible): normalization to the mass, the number of cells or other parameters
- Normalization #2: normalization of metabolite areas to the total area of detected metabolites
- Calculation of the coefficient of variation (CV) for each metabolite in the QCs (analytical variability) and in the samples (biological variability)
- Removal of metabolites with analytical variability > biological variability
- Removal of metabolites with analytical variability > 30%
  • Signal Normalization and Metabolite Exclusion are repeated until there is no further exclusion
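The normalization and exclusion loop above can be sketched as follows. This is a minimal illustration, not the platform's actual script: the table layout (metabolite -> injection -> area), the function names and the assumption that all areas are strictly positive are choices made for the example, and Normalization #1 (mass, number of cells) is assumed to have been applied beforehand if relevant.

```python
from statistics import mean, stdev
from typing import Dict, List

# metabolite name -> {injection name -> peak area}
Table = Dict[str, Dict[str, float]]


def normalize_to_total_area(table: Table) -> Table:
    """Normalization #2: divide each area by the injection's total area."""
    injections = list(next(iter(table.values())).keys())
    totals = {
        inj: sum(areas[inj] for areas in table.values()) for inj in injections
    }
    return {
        met: {inj: area / totals[inj] for inj, area in areas.items()}
        for met, areas in table.items()
    }


def cv_percent(values: List[float]) -> float:
    """Coefficient of variation, in percent."""
    return 100.0 * stdev(values) / mean(values)


def curate(
    table: Table,
    qc_names: List[str],
    sample_names: List[str],
    max_qc_cv: float = 30.0,
) -> Table:
    """Repeat normalization and exclusion until no metabolite is removed."""
    kept = dict(table)
    while kept:
        # Totals change after each exclusion, so re-normalize the raw areas.
        normalized = normalize_to_total_area(kept)
        rejected = []
        for met, areas in normalized.items():
            cv_qc = cv_percent([areas[q] for q in qc_names])            # analytical
            cv_samples = cv_percent([areas[s] for s in sample_names])   # biological
            if cv_qc > max_qc_cv or cv_qc > cv_samples:
                rejected.append(met)
        if not rejected:
            return normalized
        for met in rejected:
            del kept[met]
    return {}
```

Re-normalizing from the raw areas at each pass is what makes the repetition necessary: removing a metabolite changes every injection's total, which can in turn change the CVs of the remaining metabolites.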
Fusion of modalities = sorting redundancies:
  • Only the best modality is kept for each metabolite, based on RT > dead volume (0-1 min) and/or the lower CV(metabolite) on the QCs
  • "Analysis Validation" is then carried out
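One possible reading of the modality selection rule is sketched below. The source states the criteria as "and/or", so the ordering used here (first prefer a retention time outside the 0-1 min dead volume, then the lowest CV on the QCs) is an assumption, as are the Candidate structure and the modality labels.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class Candidate(NamedTuple):
    metabolite: str
    modality: str   # e.g. "RP-POS", "RP-NEG", "HILIC-POS" (hypothetical labels)
    rt_min: float   # retention time, in minutes
    cv_qc: float    # CV of the metabolite on the QCs, in percent


def _rank(candidate: Candidate, dead_volume_end_min: float) -> Tuple[bool, float]:
    # False sorts before True, so candidates eluting after the dead volume win;
    # ties are broken by the lower CV on the QCs.
    return (candidate.rt_min <= dead_volume_end_min, candidate.cv_qc)


def pick_best_modality(
    candidates: Iterable[Candidate], dead_volume_end_min: float = 1.0
) -> Dict[str, Candidate]:
    """Keep a single modality per metabolite."""
    best: Dict[str, Candidate] = {}
    for cand in candidates:
        current = best.get(cand.metabolite)
        if current is None or _rank(cand, dead_volume_end_min) < _rank(
            current, dead_volume_end_min
        ):
            best[cand.metabolite] = cand
    return best
```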
Analysis Validation: the analyses were validated by observing the distribution of the QCs among the samples using Principal Component Analysis (PCA). For all PCAs, the data were log-transformed and UV-scaled.
--> A further quality control is done with a PCA displaying "blank vs. extraction blank vs. QC" to check that our identifications are not just noise.
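The pre-processing applied before PCA can be sketched as below: a log transform followed by UV (unit variance) scaling, i.e. centring each metabolite on its mean and dividing by its standard deviation. The base of the logarithm is not stated in the protocol, so log10 is an assumption here, as is the requirement that all values are strictly positive and that no metabolite is constant. The PCA itself would then be run with a dedicated tool or library.

```python
import math
from statistics import mean, stdev
from typing import List


def log_uv_scale(matrix: List[List[float]]) -> List[List[float]]:
    """Log-transform then UV-scale a matrix (rows = injections, columns = metabolites)."""
    logged = [[math.log10(value) for value in row] for row in matrix]
    columns = list(zip(*logged))
    centers = [mean(col) for col in columns]
    spreads = [stdev(col) for col in columns]  # sample standard deviation
    return [
        [(value - c) / s for value, c, s in zip(row, centers, spreads)]
        for row in logged
    ]
```

After this scaling every metabolite has mean 0 and standard deviation 1, so abundant and scarce metabolites weigh equally in the PCA.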
RESULTS FILES
The final file is a table with the following information:

A | B | C | D | E | F | G | H | I | J | K | L | M
Best modality kept | delta rt | delta ppm | Name ID | KEGG ID | HMDB ID | CHEBI ID | Several MetExplore ID | mz | rt | samples ... | QC ... | CV(QC)
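The protocol does not spell out how the "delta rt" and "delta ppm" columns are computed. A common convention, assumed here, is to compare the observed feature with the matched reference (for instance the library standard): a signed retention time difference, and a mass error expressed in parts per million of the reference m/z. The function and parameter names are hypothetical.

```python
def delta_ppm(observed_mz: float, reference_mz: float) -> float:
    """Signed mass error in parts per million, relative to the reference m/z."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def delta_rt(observed_rt_min: float, reference_rt_min: float) -> float:
    """Signed retention time difference, in minutes."""
    return observed_rt_min - reference_rt_min
```

For example, a feature observed at m/z 200.001 for a reference at m/z 200.000 has a mass error of about 5 ppm.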

4. Going further: Data Interpretation
MetaboAnalyst: pathways & statistical models
MetaboAnalyst is a comprehensive web-based platform designed for statistical, functional, and integrative analysis of metabolomics data. It helps researchers interpret complex datasets through visualization tools, pathway analysis, and biomarker discovery. The platform is widely used to translate raw metabolomics data into biological insights.
Software: MetaboAnalyst
Developer: Jianguo Xia, Nick Psychogios, Nelson Young, David S. Wishart

Citation: Jianguo Xia, Nick Psychogios, Nelson Young, David S. Wishart (2009). MetaboAnalyst: a web server for metabolomic data a... Nucleic Acids Research.

MetExplore: metabolic network
MetExplore is a web server dedicated to the exploration and analysis of metabolic networks. It allows users to visualize metabolic pathways, study network structures, and integrate omics data into genome-scale metabolic models. This tool supports systems biology approaches to better understand metabolism.

Software: MetExplore
Developer: MetaboHUB Staff (https://www.metabohub.fr/)

Citation: Ludovic Cottret, Clément Frainay, Maxime Chazalviel, Floréal Cabanettes, Yoann Gloaguen, Etienne Camenen, Benjamin Merlet, Stéphanie Heux, Jean-Charles Portais, Nathalie Poupin, Florence Vinson, Fabien Jourdan (2018). MetExplore: collaborative edition and exploration... Nucleic Acids Research.

FORUM: Metabolism Knowledge Network Portal
FORUM is a knowledge-sharing portal that connects researchers to curated resources, databases, and tools related to metabolism. It promotes collaboration and data exchange within the metabolism research community. The portal helps integrate dispersed metabolic knowledge into a unified network.
Software: FORUM Knowledge graph Database
Developer: Maxime Delmas, Olivier Filangi, Nils Paulhe, Florence Vinson, Christophe Duperier, William Garrier, Paul-Emeric Saunier...

Citation: Maxime Delmas, Olivier Filangi, Nils Paulhe, Florence Vinson, Christophe Duperier, William Garrier, Paul-Emeric Saunier, Yoann Pitarch, Fabien Jourdan, Franck Giacomoni, Clément Frainay (2021). FORUM: building a Knowledge Graph from public data... Bioinformatics.

5. FAIR Data: public repository for publication
MetaboLights: a database for metabolomics experiments and derived information
MetaboLights is an open-access repository for metabolomics experiments and their derived information. It enables researchers to store, share, and reuse metabolomics datasets along with standardized metadata. This database supports data transparency, reproducibility, and long-term preservation of metabolomics studies.
Software: MetaboLights
Developer: Ozgur Yurekten, Thomas Payne, Noemi Tejera, Felix Xavier Amaladoss, Callum Martin, Mark Williams, Claire O’Donovan

Citation: Ozgur Yurekten, Thomas Payne, Noemi Tejera, Felix Xavier Amaladoss, Callum Martin, Mark Williams, Claire O’Donovan (2024). MetaboLights: open data repository for metabolomic... Nucleic Acids Research.

Protocol references




Citations
Step 14
Jianguo Xia, Nick Psychogios, Nelson Young, David S. Wishart. MetaboAnalyst: a web server for metabolomic data a...
https://doi.org/10.1093/nar/gkp356
Step 15
Ludovic Cottret, Clément Frainay, Maxime Chazalviel, Floréal Cabanettes, Yoann Gloaguen, Etienne Camenen, Benjamin Merlet, Stéphanie Heux, Jean-Charles Portais, Nathalie Poupin, Florence Vinson, Fabien Jourdan. MetExplore: collaborative edition and exploration...
https://doi.org/10.1093/nar/gky301
Step 16
Maxime Delmas, Olivier Filangi, Nils Paulhe, Florence Vinson, Christophe Duperier, William Garrier, Paul-Emeric Saunier, Yoann Pitarch, Fabien Jourdan, Franck Giacomoni, Clément Frainay. FORUM: building a Knowledge Graph from public data...
https://doi.org/10.1093/bioinformatics/btab627
Step 17
Ozgur Yurekten, Thomas Payne, Noemi Tejera, Felix Xavier Amaladoss, Callum Martin, Mark Williams, Claire O’Donovan. MetaboLights: open data repository for metabolomic...
https://doi.org/10.1093/nar/gkad1045
Step 4
Mingxun Wang, Jeremy J Carver et al. Sharing and community curation of mass spectrometr...
https://doi.org/10.1038/nbt.3597
Step 6
Estelle Rathahao-Paris, Sandra Alves, Christophe Junot, Jean-Claude Tabet. High resolution mass spectrometry for structural...
https://doi.org/10.1007/s11306-015-0882-8
Step 6
Chambers et al. A cross-platform toolkit for mass spectrometry and...
https://doi.org/10.1038/nbt.2377
Step 6
Yuanyue Li & Oliver Fiehn. Flash entropy search to query all mass spectral li...
https://doi.org/10.1038/s41592-023-02012-9