License: This is an open access protocol distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited
Protocol status: Working
We use this protocol and it's working
Created: January 23, 2026
Last Modified: March 20, 2026
Protocol Integer ID: 239597
Keywords: LC-MS, metabolomics, data analysis, data interpretation, lipidomic data, metabohub, identified compound, sample preparation, sample, data
Funders Acknowledgements:
Agence Nationale de la Recherche
Grant ID: ANR-11-INBS-0010 MetaboHUB
ANR-21-ESRE-0035
Grant ID: ANR-21-ESRE-0035
Abstract
This protocol describes how metabolomic and lipidomic data produced at MetaboHUB-Tours are analysed. It follows on from the sample preparation and LC-MS parameter protocols.
The result is a list of identified compounds (with a stable signal in the sequence) in the samples, based on a confidence level for the identification.
To further interpret the data, software and web page suggestions are provided.
Guidelines
For all software used in this protocol, the original papers and website links are provided in the protocol.
Materials
A computer with at least 4 cores.
All the software required is listed in the protocol, with a link to the official website or repository.
Troubleshooting
Problem
MSconvert can create artefacts and false fragments during conversion
Solution
Not provided
Safety warnings
This protocol has been used only on LC-MS and LC-MS/MS data.
This protocol only concerns the analysis of data files, so no specific ethics statement is necessary.
Before start
Remember that at MetaboHUB Tours every sample is injected 3 times (positive reversed-phase, negative reversed-phase and positive HILIC), so some steps of the protocol need to be carried out for each modality.
Axel Dablanc, Solweig Hennechart, Amélie Perez, Guillaume Cabanac, Yann Guitton, Nils Paulhe, Bernard Lyan, Emilien L. Jamin, Franck Giacomoni, Guillaume Marti (2026). FragHub: A Mass Spectral Library Data Integration . Analytical Chemistry.
All these repositories were curated and concatenated, and redundancies were resolved with FragHub, yielding a single library file for POS and a single library file for NEG.
1.3. Study QC samples MS² Analysis
Feature detection and MS² spectra extraction were carried out on one QC raw file with MSconvert from ProteoWizard (NB: this step can also be performed on specific QCs for subgroups or on different types of samples).
Identification of features from the QC is carried out in two stages, using Flash Entropy:
Only with the in-house msp files, generated in step 1.1 --> level of identification = 1
Only with the GNPS-based spectral libraries msp files, generated in step 1.2 --> level of identification = 2 or 3
These identification levels are based on
Citation
Estelle Rathahao-Paris, Sandra Alves, Christophe Junot, Jean-Claude Tabet (2015). High resolution mass spectrometry for structural . Metabolomics.
All MS² spectra comparisons, and therefore identifications, are performed manually with the following parameters:
Report top n hits = 5
Precursor m/z tolerance (in Da) = 0.01
Product ions m/z tolerance (in Da) = 0.02
Hits with an identity_search-score < 0.75 are not checked
The output file is a CSV file containing all hits found. Compounds with an identity_search-score < 0.75 are deleted. Identifications are then sorted based on the hits and on manual curation of the "experimental vs. theoretical" MS² spectra comparison.
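The score filter and top-n selection described above could be sketched in Python as follows. This is only an illustration under stated assumptions: the hit fields `feature_id`, `compound`, `score` and `library`, the library labels and the helper `filter_hits` are hypothetical and do not reflect the actual Flash Entropy output format. Only the 0.75 threshold, the top-5 limit and the library-to-level mapping come from this protocol, and manual comparison of the spectra is still required afterwards.

```python
from collections import defaultdict

MIN_SCORE = 0.75  # hits scoring below this value are deleted (from the protocol)
TOP_N = 5         # "Report top n hits = 5" (from the protocol)

# Identification level attached to each library (step 1.3).
# The keys "in_house" and "public" are hypothetical labels.
LEVEL_BY_LIBRARY = {"in_house": "1", "public": "2 or 3"}


def filter_hits(hits):
    """Keep, per feature, at most TOP_N hits with score >= MIN_SCORE, best first."""
    by_feature = defaultdict(list)
    for hit in hits:
        if hit["score"] >= MIN_SCORE:
            by_feature[hit["feature_id"]].append(
                {**hit, "level": LEVEL_BY_LIBRARY[hit["library"]]}
            )
    return {
        feature: sorted(kept, key=lambda h: h["score"], reverse=True)[:TOP_N]
        for feature, kept in by_feature.items()
    }


# Hypothetical hits, for illustration only.
example_hits = [
    {"feature_id": "F1", "compound": "compound A", "score": 0.92, "library": "in_house"},
    {"feature_id": "F1", "compound": "compound B", "score": 0.60, "library": "public"},
    {"feature_id": "F2", "compound": "compound C", "score": 0.81, "library": "public"},
]
candidates = filter_hits(example_hits)
```

The remaining candidates are the ones that would then go through the manual "experimental vs. theoretical" spectra check.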
Software
MSconvert (ProteoWizard)
NAME
Chambers, M.C., MacLean, B., Burke, R., Amode, D., Ruderman, D.L., Neumann, S., Gatto, L., Fischer, B., Pratt, B., Egertson, J.,
The output files are a CSV file listing all features detected in the samples and a folder containing one PNG file of the chromatographic peak for each detected feature.
3. Data Curation
Data Curation = Signal Normalization & Metabolite Exclusion:
- Normalization #1 (if possible): normalization to the mass, the number of cells or other parameters
- Normalization #2: normalization of metabolite areas to the total area of detected metabolites
- Calculation of the coefficient of variation (CV) for each metabolite in the QCs (analytical variability) and in the samples (biological variability)
- Removal of metabolites with analytical variability > biological variability
- Removal of metabolites with analytical variability > 30%
Signal Normalization and Metabolite Exclusion are repeated until no further metabolites are excluded.
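The curation loop above could be sketched in Python as follows. This is a minimal sketch, not the implementation used at MetaboHUB-Tours: the data layout (a dictionary of metabolite areas per injection), the function names and the example values are assumptions, the CV is computed with the sample standard deviation, and the optional Normalization #1 is left out.

```python
from statistics import mean, stdev

MAX_QC_CV = 30.0  # percent; analytical variability threshold from the protocol


def normalize_to_total_area(areas):
    """Divide each area by the summed area of all metabolites in the same injection."""
    injections = next(iter(areas.values())).keys()
    totals = {inj: sum(row[inj] for row in areas.values()) for inj in injections}
    return {
        met: {inj: value / totals[inj] for inj, value in row.items()}
        for met, row in areas.items()
    }


def cv_percent(values):
    """Coefficient of variation in percent (sample standard deviation / mean)."""
    return 100.0 * stdev(values) / mean(values)


def curate(areas, qc_names, sample_names):
    """Repeat normalization and exclusion until no metabolite is removed."""
    areas = dict(areas)  # shallow copy so the caller's table is left untouched
    while areas:
        normalized = normalize_to_total_area(areas)
        rejected = []
        for met, row in normalized.items():
            cv_qc = cv_percent([row[name] for name in qc_names])       # analytical
            cv_bio = cv_percent([row[name] for name in sample_names])  # biological
            if cv_qc > MAX_QC_CV or cv_qc > cv_bio:
                rejected.append(met)
        if not rejected:
            return normalized
        for met in rejected:
            del areas[met]
    return {}


# Hypothetical peak areas, for illustration only.
qc_names = ["QC1", "QC2", "QC3"]
sample_names = ["S1", "S2", "S3"]
raw_areas = {
    "A": {"QC1": 100, "QC2": 100, "QC3": 100, "S1": 50, "S2": 100, "S3": 200},
    "B": {"QC1": 200, "QC2": 200, "QC3": 200, "S1": 400, "S2": 200, "S3": 100},
    "N": {"QC1": 10, "QC2": 100, "QC3": 50, "S1": 50, "S2": 50, "S3": 50},
}
curated = curate(raw_areas, qc_names, sample_names)
```

In this toy table, metabolite "N" is unstable in the QCs and is removed in the first pass; the remaining areas are then re-normalized to the new total before the criteria are checked again.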
Fusion of modalities = sorting redundancies:
Only the best modality is kept for each metabolite, chosen on the basis of an RT outside the dead volume (0-1 min) and/or the lowest CV(metabolite) in the QCs
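One possible reading of this selection rule is sketched below in Python. The "and/or" in the protocol leaves the exact priority open, so the ordering used here (prefer detections eluting after the dead volume, then take the lowest QC CV) is an assumption, as are the record fields `metabolite`, `modality`, `rt_min` and `cv_qc` and the example values.

```python
DEAD_VOLUME_END_MIN = 1.0  # the dead volume window is 0-1 min (from the protocol)


def best_modality(candidates):
    """Pick one record per metabolite among its detections in several modalities."""
    best = {}
    for record in candidates:
        # Sort key: peaks eluting after the dead volume come first (False < True),
        # then the lowest CV in the QCs wins.
        key = (record["rt_min"] <= DEAD_VOLUME_END_MIN, record["cv_qc"])
        name = record["metabolite"]
        if name not in best or key < best[name][0]:
            best[name] = (key, record)
    return {name: record for name, (_key, record) in best.items()}


# Hypothetical detections, for illustration only.
detections = [
    {"metabolite": "M1", "modality": "RP positive", "rt_min": 0.6, "cv_qc": 5.0},
    {"metabolite": "M1", "modality": "HILIC positive", "rt_min": 4.2, "cv_qc": 12.0},
    {"metabolite": "M2", "modality": "RP positive", "rt_min": 6.1, "cv_qc": 18.0},
    {"metabolite": "M2", "modality": "RP negative", "rt_min": 6.0, "cv_qc": 9.0},
]
selected = best_modality(detections)
```

Here "M1" is kept in HILIC positive because its reversed-phase peak falls in the dead volume, while "M2" is kept in the modality with the lower QC CV.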
"Analysis Validation" is then performed
Analysis Validation: the analyses were validated by observing the distribution of QCs among the samples using Principal Component Analysis (PCA). For all PCAs, the data were log-transformed and underwent UV scaling normalization.
--> A further quality control is performed with a PCA displaying "blank vs. extraction blank vs. QC" to check that our identifications are not merely noise.
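The pre-processing applied before each PCA (log transformation followed by UV scaling) could be sketched in Python as follows. The logarithm base is not stated in the protocol, so log10 is an assumption here, as are the function name and the table layout; the PCA itself would then be run on the scaled values with the software of your choice.

```python
from math import log10
from statistics import mean, stdev


def log_uv_scale(table):
    """Log10-transform each variable, then center it and scale it to unit variance.

    `table` maps a variable (metabolite) name to its list of areas, one per injection.
    Areas must be strictly positive and must not all be identical for a variable.
    """
    scaled = {}
    for name, values in table.items():
        logged = [log10(v) for v in values]
        center, spread = mean(logged), stdev(logged)
        scaled[name] = [(x - center) / spread for x in logged]
    return scaled


# Hypothetical areas, for illustration only.
example_table = {"A": [100.0, 1000.0, 10000.0], "B": [20.0, 40.0, 80.0]}
scaled_table = log_uv_scale(example_table)
```

After this step every variable has a mean of 0 and a standard deviation of 1, so that abundant and scarce metabolites weigh equally in the PCA.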
RESULTS FILES
The final file is a table with the following information:
A: Best modality kept
B: delta rt
C: delta ppm
D: Name ID
E: KEGG ID
F: HMDB ID
G: CHEBI ID
H: Several MetExplore ID
I: mz
J: rt
K: samples ...
L: QC ...
M: CV(QC)
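For readers unfamiliar with the "delta ppm" and "delta rt" columns, the usual definitions can be sketched in Python as below. The protocol does not state which reference values or sign convention are used, so taking the library value as the reference and reporting a signed difference are assumptions, as are the function names.

```python
def delta_ppm(measured_mz, reference_mz):
    """Signed mass error in parts per million, relative to the reference m/z."""
    return (measured_mz - reference_mz) / reference_mz * 1e6


def delta_rt(measured_rt, reference_rt):
    """Signed retention time difference, in the unit of the inputs."""
    return measured_rt - reference_rt


# Hypothetical values, for illustration only.
mass_error = delta_ppm(200.002, 200.0)  # about 10 ppm
rt_shift = delta_rt(5.25, 5.0)          # 0.25
```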
4. Going further : Data Interpretation
MetaboAnalyst : pathways & statistical models
MetaboAnalyst is a comprehensive web-based platform designed for statistical, functional, and integrative analysis of metabolomics data. It helps researchers interpret complex datasets through visualization tools, pathway analysis, and biomarker discovery. The platform is widely used to translate raw metabolomics data into biological insights.
Software
MetaboAnalyst
NAME
Jianguo Xia , Nick Psychogios , Nelson Young , David S. Wishart
MetExplore is a web server dedicated to the exploration and analysis of metabolic networks. It allows users to visualize metabolic pathways, study network structures, and integrate omics data into genome-scale metabolic models. This tool supports systems biology approaches to better understand metabolism.
FORUM is a knowledge-sharing portal that connects researchers to curated resources, databases, and tools related to metabolism. It promotes collaboration and data exchange within the metabolism research community. The portal helps integrate dispersed metabolic knowledge into a unified network.
Software
FORUM Knowledge graph Database
NAME
Maxime Delmas , Olivier Filangi , Nils Paulhe , Florence Vinson , Christophe Duperier , William Garrier , Paul-Emeric Saunier
MetaboLights : a database for metabolomics experiments and derived information
MetaboLights is an open-access repository for metabolomics experiments and their derived information. It enables researchers to store, share, and reuse metabolomics datasets along with standardized metadata. This database supports data transparency, reproducibility, and long-term preservation of metabolomics studies.
Software
MetaboLights
NAME
Ozgur Yurekten, Thomas Payne, Noemi Tejera, Felix Xavier Amaladoss, Callum Martin, Mark Williams, Claire O’Donovan
Ozgur Yurekten, Thomas Payne, Noemi Tejera, Felix Xavier Amaladoss, Callum Martin, Mark Williams, Claire O’Donovan (2026). MetaboLights: open data repository for metabolomic. Nucleic Acids Research.
Maxime Delmas , Olivier Filangi , Nils Paulhe , Florence Vinson , Christophe Duperier , William Garrier , Paul-Emeric Saunier , Yoann Pitarch , Fabien Jourdan , Franck Giacomoni , Clément Frainay. FORUM: building a Knowledge Graph from public data