Intracellular Metabolite Extraction, Peak Detection with El-MAVEN and Quantification Using Python

Bishal Dev Sharma; Daniel Olson

Aug 16, 2025

Version 2

DOI

https://dx.doi.org/10.17504/protocols.io.kqdg3wjbev25/v2

Bishal Dev Sharma¹,
Daniel Olson¹

¹Thayer School of Engineering, Dartmouth College

Bishal Dev Sharma: Conceptualization; Data curation; Formal analysis; Investigation; Methodology; Software; Validation; Visualization; Writing-original draft.
Daniel Olson: Conceptualization; Writing-Review and Editing; Supervision; Project Administration; Funding Acquisition

Lynd/Olson Lab

Bishal Dev Sharma

Dartmouth College

Protocol Citation: Bishal Dev Sharma, Daniel Olson 2025. Intracellular Metabolite Extraction, Peak Detection with El-MAVEN and Quantification Using Python. protocols.io https://dx.doi.org/10.17504/protocols.io.kqdg3wjbev25/v2
Version created by Bishal Dev Sharma

License: This is an open access protocol distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited

Protocol status: Working

We use this protocol and it's working

Created: August 13, 2025

Last Modified: August 16, 2025

Protocol Integer ID: 224622

Keywords: metabolites extraction, metabolomics, intracellular metabolites, python for data analysis, quantification of intracellular metaboli..., extracting intracellular metabolite, intracellular metabolite, microbial culture, microbial cultures this protocol, extraction, methanol, maven for peak detection, rapid filtration, El-MAVEN, comprehensive workflow for intracellular metabolite analysis, intracellular metabolite analysis, intracellular metabolite extraction, absolute quantification of metabolite, extraction of metabolite, metabolite dynamics across experimental condition, metabolite dynamic, metabolite integrity, metabolite, maven for precise peak detection, microbial culture, raw data, ms analysis, precise peak detection, external calibration curve

Funders Acknowledgements:

Center for Bioenergy Innovation (CBI), U.S.

Grant ID: ERKP886

Abstract

This protocol outlines a comprehensive workflow for intracellular metabolite analysis from microbial cultures. It begins with the rapid quenching and extraction of metabolites using a cold solvent mixture of acetonitrile, methanol, and water (2:2:1, v/v/v) to preserve metabolite integrity. After LC-MS analysis, raw data is processed in El-MAVEN for precise peak detection and alignment. The final step involves the absolute quantification of metabolites via a Python-based script that employs external calibration curves. For compounds lacking external standards, the protocol also includes exploratory data analysis, such as generating heatmaps or line plots, to visualize metabolite dynamics across experimental conditions. This approach enables both high-throughput quantification and qualitative exploration across large sample sets.

Attachments

Known BDS 2023_08_15...

2025_07_31 cthe intr...

1.1MB

2025_07_31 cthe intr...

139KB

2025_07_31 peaks wit...

164KB

2025_07_31 peaks wit...

480KB

Sample information A...

12KB

Standards informatio...

10KB

Materials

Acetonitrile (Sigma 271004)
Aluminum block
Ice bucket
KIMBLE KIMAX Graduated Filter Flask (27060)
KNF Lab Filtration Pump (UN86KTP)
Methanol (Sigma 34860)
Petri dish (Falcon 351007)
Sample Concentrator (Cole-Parmer EW-36620-40)
Sintered glass funnel
Stainless steel forceps
Any metabolites that you want to use as external standards

Before start

Always wear appropriate personal protective equipment (PPE) while performing this protocol in the lab, including a properly fitted lab coat, gloves, safety goggles, long pants, and closed-toe shoes. 

Before beginning intracellular metabolite extraction, ensure that the aluminum block is pre-cooled to -80 °C in a freezer.

Introduction and steps

Intracellular metabolite analysis provides a snapshot of cellular physiology under specific growth or environmental conditions, offering insights into pathway bottlenecks and reaction thermodynamics. Accurate measurement requires rapid sampling, effective quenching, and efficient extraction to preserve the in vivo metabolite state, particularly for intermediates with rapid turnover. This protocol describes the extraction of intracellular metabolites, detection of peaks in El-MAVEN from Liquid Chromatography-Mass Spectrometry (LC–MS) generated raw data files, and subsequent visualization and quantification of metabolite levels. The workflow is adaptable to diverse experimental designs, supporting both quantitative and comparative metabolomics studies in microbial systems.

This protocol is organized into the following key steps:
Extraction of intracellular metabolites using a cold solvent mixture.
Peak detection and alignment using El-MAVEN.
Quantification of metabolites using external calibration standards through a Python-based script.

Extraction of intracellular metabolites

1d 1h 5m

This section outlines the procedure for extracting intracellular metabolites from microbial cultures.

A YouTube video of metabolite extraction from Daniel Amador-Noguez's Lab at University of Wisconsin-Madison is available here: https://www.youtube.com/watch?v=WpzRXWkbok0

Grow the microbe of interest in a Falcon tube, conical flask, or bioreactor, depending on the requirements of your experiment.
3.1.1 For single time point experiments, grow the experiment cultures to mid-log phase (optical density at 600 nm [OD₆₀₀] ~0.6 – 0.8).

Note
The mid-log phase for a microbe can vary depending on factors such as microbial species, growth media, initial substrate concentrations, and other environmental conditions. The given OD range (~0.6–0.8) corresponds to the mid-log phase of Clostridium thermocellum grown with 5 g/L sugar.

3.1.2 For time-course experiments, collect samples at appropriate intervals throughout the growth period.

Pipette 1.6 mL of cold metabolite extraction buffer into a petri dish placed on top of a pre-chilled aluminum block maintained at -80 °C (Figure 1A, Figure 1B). 
The metabolite extraction buffer is stored at -20°C before use. In our experience, storage at 4°C has not resulted in any noticeable issues.
Figure 1. (A) Ice bucket (left) and pre-chilled aluminum block (right). (B) Ice bucket with pre-chilled aluminum block and petri dish on top, used to keep samples cold during extraction and quenching.

Note
A dry ice block can also be used instead of the pre-chilled aluminum block.

Set up the vacuum filtration unit (Figure 2).  The system consists of a conical filtration flask fitted with a sintered glass funnel, connected to a vacuum pump to facilitate the rapid removal of the liquid phase before quenching.
Once the vacuum is connected, use a squirt bottle to wash the sintered glass funnel thoroughly before use.

Figure 2: Vacuum filtration unit. This setup is used for rapid separation of cells from culture medium. 

Using stainless steel forceps, place the appropriate filter membrane on top of the sintered glass funnel. Rinse the filter thoroughly with Milli-Q water to remove any impurities.
Use a 0.45 µm Nylon membrane for Escherichia coli, Zymomonas mobilis or for cultures grown on substrate concentrations less than 5 g/L sugars.
Use a 3.0 µm Nylon membrane for Clostridium thermocellum and Thermoanaerobacterium saccharolyticum.

Note
Filter selection is critical, as the filter material significantly affects the efficiency of metabolite extraction.
An ideal filter should:
Retain most of the cells,
Absorb minimal culture media,
Filter quickly, and
Be chemically inert (i.e., not react with the metabolite extraction buffer).
For detailed guidance on filter selection for metabolite analysis, refer to Sharma et al., 2023.

Citation
Sharma BD, Olson DG, Giannone RJ, Hettich RL, Lynd LR (2023). Characterization and Amelioration of Filtration Difficulties Encountered in Metabolomic Studies of Clostridium thermocellum at Elevated Sugar Concentrations. https://doi.org/10.1128/aem.00406-23

Filter 3–5 mL of culture by vacuum filtration, ensuring that the product of OD₆₀₀ and volume (in mL) is approximately 2.5 or higher, to obtain sufficient metabolite signal for LC-MS analysis.
Note
For cultures at OD₆₀₀ ~0.45–0.50, extracting ~10 mL of sample generally provides stronger signal and more reliable detection.
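The biomass rule above (OD₆₀₀ × volume of about 2.5 or higher) can be turned into a quick calculation of the minimum culture volume to filter. The following is a minimal sketch; the helper name min_filter_volume_ml is hypothetical, and the default target of 2.5 is simply the threshold stated in the step above.

```python
def min_filter_volume_ml(od600: float, target_od_ml: float = 2.5) -> float:
    """Smallest culture volume (mL) for which OD600 x volume >= target_od_ml."""
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return target_od_ml / od600


# Example: a mid-log culture at OD600 = 0.7
volume = min_filter_volume_ml(0.7)
print(f"Filter at least {volume:.1f} mL")  # Filter at least 3.6 mL
```

This gives only the lower bound; as the note above says, filtering more than the minimum (for example ~10 mL at OD₆₀₀ ~0.45–0.50) generally gives a stronger signal.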

Immediately transfer the filter to the petri dish containing the cold extraction buffer, placing the cell-facing side in direct contact with the buffer.
Note: For extracellular metabolite analysis from culture supernatant, add 100–200 µL of clarified culture (clarified by centrifugation or syringe filtration) directly into the petri dish containing pre-chilled extraction buffer.

With the cell-facing side of the filter facing upward, gently homogenize the extraction buffer in the petri dish by pipetting up and down several times to ensure thorough mixing of the cells and solvent (Figure 3). Then, transfer the entire volume into a 2 mL Eppendorf tube placed on ice.

Figure 3. Filter placed in the petri dish with the cell-facing side up. The extraction buffer now contains the released intracellular metabolites.

Centrifuge the tube(s) at 4 °C to pellet any remaining cell debris (use a refrigerated centrifuge if available) (Figure 4).

Figure 4. Eppendorf tube after centrifugation of the metabolite extraction buffer. A visible pellet containing cell debris and proteins is seen at the bottom of the tube.

Carefully transfer the supernatant (clarified extract) to a new pre-chilled 2 mL Eppendorf tube.

Dry the required volume of the supernatant using a sample concentrator (Figure 5) to remove the solvent, which can interfere with LC-MS or plate reader-based assays. 

Make sure that the needles are inserted with the pointed end facing into the tube and that they do not touch the contents of the tube (Figure 6).

Once the metabolite extract is dried, a gel-like substance might be visible at the bottom of the tube (Figure 7).
Equipment
Name: Sample Concentrator
Brand: Techne
SKU: EW-36620-42
Link: https://www.coleparmer.com/i/cole-parmer-sc-200-sample-concentrator-gas-reservoir-and-stand-for-tubes/3662042

Equipment
Name: Sample Concentrator Needles
Type: Needles
Brand: Fisher Scientific
SKU: 3662097
Link: https://www.fishersci.com/shop/products/ss-127mm-lng-needle-100-pk/NC0911812


Figure 5. Sample concentrator setup. The concentrator is connected to a nitrogen tank, with the pressure regulator set to a maximum of 2 psi.

Figure 6. Samples being dried in 1.5 mL Eppendorf tubes using the sample concentrator.

Resuspend the dried sample in HPLC-grade water immediately before use, not earlier. Typically, samples are concentrated at this step: for example, if 400 µL of metabolite extract is dried, resuspend it in 100 µL of water.
Figure 7. Dried metabolite extract in a 1.5 mL Eppendorf tube. Typically, a gel-like substance might be visible at the bottom of the tube, representing the dried sample.

Note
Resuspended samples should be analyzed on the same day for optimal results. While HPLC-grade water is typically used for resuspension, the choice of liquid can vary depending on the downstream analysis. For instance, HPLC-grade water works well for plate reader assays. However, for improved mass spectrometry (MS) results, samples should be dissolved in an aqueous solvent for reversed-phase chromatography or in an organic solvent for normal-phase chromatography.
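Because drying and resuspension concentrate the sample, any concentration measured in the resuspended sample has to be divided by the concentration factor to recover the concentration in the original extract. The sketch below illustrates the arithmetic using the 400 µL to 100 µL example from this step; the function name and the measured value are hypothetical.

```python
def concentration_factor(dried_volume_ul: float, resuspension_volume_ul: float) -> float:
    """Fold-concentration achieved by drying an extract and resuspending it."""
    if dried_volume_ul <= 0 or resuspension_volume_ul <= 0:
        raise ValueError("Volumes must be positive")
    return dried_volume_ul / resuspension_volume_ul


# Example from the protocol: 400 uL of extract dried, resuspended in 100 uL
factor = concentration_factor(400, 100)  # 4.0-fold concentrated

# Hypothetical measured concentration in the resuspended sample (uM)
measured_um = 8.0
# Back-calculate the concentration in the original, undried extract
extract_um = measured_um / factor
print(f"{factor:.1f}x concentrated; extract concentration = {extract_um:.1f} uM")
```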

Both the dried samples and the undried metabolite extracts are stored at -80 °C until further use.
Note
Most damage to stored metabolites occurs during thawing rather than during storage itself. To minimize degradation, avoid repeated freeze-thaw cycles whenever possible. Additionally, dried samples are more stable than those stored in the metabolite extraction buffer.

Peak detection using El-MAVEN

This section outlines the basic steps for identifying different compound peaks from raw LC-MS data using El-MAVEN. Peak identification can be performed using either external standards or the default compound library included in the software. These steps are generally sufficient for detecting relevant peaks and exporting peak area data for downstream analysis.
For a more detailed understanding of the underlying algorithms and MAVEN’s processing pipeline, please refer to Clasquin et al., 2012, and Agrawal et al., 2019.
Citation
Agrawal S, Kumar S, Sehgal R, George S, Gupta R, Poddar S, Jha A, Pathak S (2019). El-MAVEN: A Fast, Robust, and User-Friendly Mass Spectrometry Data Processing Engine for Metabolomics. https://doi.org/10.1007/978-1-4939-9236-2_19

Citation
Clasquin MF, Melamud E, Rabinowitz JD (2012). LC-MS data processing with MAVEN: a metabolomic analysis and visualization engine. https://doi.org/10.1002/0471250953.bi1411s37

Download El-MAVEN software. 
This protocol uses El-MAVEN v.0.12.0.
Software
Name: El-MAVEN
Developer: Elucidata, Inc.
Repository: https://github.com/ElucidataInc/ElMaven/releases

Note
In our experience, El-MAVEN (v0.12.0) runs more stably on Windows than on macOS, with fewer crashes and smoother performance.

Launch El-MAVEN.
Navigate to File → Save Project. In the El-MAVEN interface (Figure 8), save the project with a descriptive filename (e.g., MyProjectName.emDB) in your working directory.
This .emDB file will store all your session information, including loaded files, peak detection parameters, and annotations.
Figure 8. Screenshot of the El-MAVEN interface showing the layout used for metabolite peak identification.
Enable Compound Widget Panel.
Ensure that the “Show Compound Widget” option (located on the far-right side of the El-MAVEN interface) is turned on. This enables the left panel to display both the Samples and Compounds tabs, which is essential for navigating and reviewing detected features (Figure 9).
Refer to the screenshot for visual guidance. 

Figure 9. Screenshot of the El-MAVEN interface with both the compounds list and the samples list (left panel) visible, enabling efficient navigation and comparison across compounds and samples during peak analysis.
Compound Libraries in the Compound Widget Dropdown.
In the dropdown menu of the Compounds Widget, you will see available compound libraries (Figure 10). By default, El-MAVEN includes libraries such as KNOWNS and SRM2.
In this protocol, we also use a custom compound library named “Known BDS 2023_08_15”, which was created based on the metabolite standards sent for analysis. This custom library allows for targeted peak detection and accurate identification of relevant intracellular metabolites.
Refer to the screenshot for selecting the appropriate library from the dropdown menu (Figure 10).

Figure 10. Dropdown menu displaying available compound libraries in El-MAVEN. KNOWNS and SRM2 are the default libraries, while Known BDS 2023_08_15 is a custom library created using external standards submitted for analysis. 
Key: 3-phosphoglycerate (3PG); Adenosine diphosphate (ADP); Adenosine diphosphate-D-glucose (ADP-D-glu); Adenosine monophosphate (AMP); Adenosine triphosphate (ATP); Acetyl coenzyme A (Acetyl-CoA); Coenzyme A (CoA); Dihydroxyacetone phosphate (DHAP); Fructose-6-phosphate (F6P); Fructose-1,6-bisphosphate (FBP); Glucose-1-phosphate (G1P); Glucose-6-phosphate (G6P); Guanosine diphosphate (GDP); Guanosine monophosphate (GMP); Guanosine triphosphate (GTP); Nicotinamide adenine dinucleotide, oxidized form (NAD+); Nicotinamide adenine dinucleotide, reduced form (NADH); Nicotinamide adenine dinucleotide phosphate, oxidized form (NADP+); Nicotinamide adenine dinucleotide phosphate, reduced form (NADPH); Phosphoenolpyruvate (PEP); Inorganic pyrophosphate (PPi)

Note
You can easily create a custom compound library in an Excel file by providing the metabolite/compound name and formula. For reference, see the attached Excel sheet, "Known BDS 2023_08_15.xlsx."
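A name-and-formula table like the one described in this note can also be generated programmatically. The sketch below writes such a table as comma-separated text using only the Python standard library. The column headers "compound" and "formula" are an assumption made for illustration; check the attached "Known BDS 2023_08_15.xlsx" for the exact headers used in our library. The molecular formulas listed are those of the neutral compounds.

```python
import csv
import io

# (compound name, neutral molecular formula) pairs for a few standards
compounds = [
    ("G6P", "C6H13O9P"),
    ("F6P", "C6H13O9P"),
    ("DHAP", "C3H7O6P"),
    ("PEP", "C3H5O6P"),
    ("3PG", "C3H7O7P"),
]


def write_library(rows, handle):
    """Write a two-column compound table; header names are assumed, not verified."""
    writer = csv.writer(handle)
    writer.writerow(["compound", "formula"])
    writer.writerows(rows)


# Write to an in-memory buffer here; use open(path, "w", newline="") for a file
buffer = io.StringIO()
write_library(compounds, buffer)
library_text = buffer.getvalue()
print(library_text)
```

Note that G6P and F6P share the same formula and therefore the same m/z, which is why they can only be distinguished by retention time during peak review.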

Load Raw Data Files.
Click Open (or go to File → Open) and navigate to the folder containing your LC-MS data files.
Select all the .mzXML files you want to analyze and click Open to load them into El-MAVEN.

Depending on your sample preparation and culture media, each biological sample may correspond to a single file or to multiple split files that have undergone selective filtering. These filtered files help remove large interfering peaks from media components that could otherwise affect metabolite quantification.

Note: You can select multiple files using Shift or Ctrl (Cmd on macOS) to load them all at once.

Peak Detection for Compounds
There are two main approaches to perform peak detection in El-MAVEN when analyzing compounds:
1. Automated Peak Detection
El-MAVEN can automatically detect and integrate peaks for all compounds across all samples using pre-set parameters.
Note
In our experience, running automated peak detection on a large number of samples can cause the software to crash or become unresponsive. To avoid issues, use this feature with caution or apply it to smaller datasets. Additionally, the software may detect a significant number of false peaks, especially if the data quality is suboptimal.
2. Manual Peak Detection (Compound-by-Compound)
You can manually review and select peaks one compound at a time. While this approach is more time-consuming, it offers significantly better control and accuracy, helping to ensure that the correct peak is selected for each compound across all samples.

Recommendation: For high-confidence analysis, especially when standards are used and software stability is a concern, we recommend performing manual peak review compound-by-compound.

Before starting peak detection, make sure to configure isotope settings by clicking on “Isotopes” in the top panel (Figure 11).
Figure 11. Top panel displaying various peak detection options available in El-MAVEN for metabolite analysis.
If you have isotope-labeled compounds, select the appropriate Isotopic Tracer from the list provided.
El-MAVEN currently supports C13, D2, N15, and S34 tracers (Figure 12).
Leave the remaining parameters at their default values unless you have a specific reason to modify them.
Figure 12. Checkbox option to enable or disable isotope detection in El-MAVEN.

Note
To check peak fidelity in El-MAVEN, verify whether the software detects the natural abundance of the 13C peak. Disabling the "Report Isotopes" option will prevent this information from being displayed. Therefore, it is recommended to leave this option enabled, even if your sample does not contain isotopes.
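As a rough guide for the fidelity check described in this note, the expected intensity of the natural-abundance ¹³C (M+1) peak relative to the parent (M+0) peak can be estimated from the number of carbon atoms in the compound. The sketch below uses a carbon-only binomial approximation with a natural ¹³C abundance of about 1.07%; it ignores contributions from other elements, and the function name is hypothetical.

```python
C13_NATURAL_ABUNDANCE = 0.0107  # approximate natural fraction of 13C


def expected_m1_to_m0(n_carbons: int, p: float = C13_NATURAL_ABUNDANCE) -> float:
    """Carbon-only estimate of the (M+1)/(M+0) intensity ratio.

    From the binomial distribution: P(M+1)/P(M+0) = n * p / (1 - p).
    """
    if n_carbons < 1:
        raise ValueError("n_carbons must be at least 1")
    return n_carbons * p / (1.0 - p)


# Example: G6P (C6H13O9P) has 6 carbon atoms
ratio = expected_m1_to_m0(6)
print(f"Expected M+1/M+0 for G6P: {ratio:.3f}")  # Expected M+1/M+0 for G6P: 0.065
```

A detected M+1 peak far from this estimate (for an unlabeled sample) suggests that the selected peak may not belong to the intended compound or that the signal is too weak for the isotope to be measured reliably.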

Automated Peak Detection
To begin automated peak detection, click “Peaks” on the top panel (Figure 13).

In the dialog box that opens, choose your Detection Method. The Automated Feature Detection option will detect all peaks present in the chromatogram, regardless of whether they are included in your compound library. However, for this protocol, we use the Compound Database Search method, which restricts detection to compounds listed in the selected library (Figure 13).

You can select your own compound database or use the default libraries such as KNOWNS or SRM2.

Note: All the compounds in our custom database (Known BDS 2023_08_15) are also present in the default KNOWNS library.
Figure 13. Dialog box for configuring the automated peak detection method in El-MAVEN.

Note
It is recommended to keep the Extracted Ion Chromatogram (EIC) window as narrow as possible to ensure accurate peak detection.

Next, navigate to the Group Filtering tab under the Peak Detection dialog. Here, we leave all parameters at their default values except for the Minimum Peak Intensity, which we set to 10,000 for AreaTop (Figure 14). This helps filter out low-intensity noise while preserving biologically relevant signals.

Figure 14. Dialog box for setting group filtering parameters within the automated peak detection method in El-MAVEN.
Once all parameters are set, click “Find Peaks” and wait until the Status reaches 100%.

Manual Peak Detection
To begin manual peak detection, go to the Compounds tab on the left panel and click on the name of the compound you wish to analyze manually (Figure 15; DHAP shown as reference). This will display the Extracted Ion Chromatogram (EIC) for the selected compound across all samples, allowing you to inspect and adjust the peak selections individually.

Figure 15. Manual detection of a compound using the compounds table (left panel), with DHAP shown as a reference.
Once the compound’s chromatogram is visible, double-click on the top of the desired peak to select it. This action will mark the peak and automatically populate the Bookmark Table at the bottom of the screen with the selected peak information (Figure 16).

Figure 16. Bookmarking a peak by double-clicking directly on the peak in the chromatogram.
Now, repeat this process for all the compounds listed in your table. Keep in mind that there may be batch-to-batch variation in retention time across your samples; however, if you have standards included, it becomes easier to locate and select the correct peaks based on their position in the standard. To ensure accurate peak selection across all samples, right-click on the peak in the Bookmark Table, and a dialogue box will appear. From there, select Edit peak-group. This will allow you to navigate through each individual sample or bulk-select multiple samples to verify that the peaks are correctly assigned (Figure 17). We recommend reviewing each compound one by one across all samples to ensure proper peak identification.

Figure 17. Manual inspection of a peak using the peak editor window, enabling review of a compound’s peak across individual samples to ensure accurate identification.
This step becomes especially important for compounds such as G6P, F6P, and G1P, whose peaks elute very close to one another. The software often struggles to accurately assign peaks for all of them unless you manually inspect each sample. These peaks typically appear in the order of G6P, F6P, and G1P. 

For example, in the case shown in Figure 18, the software fails to select the correct first peak for G6P because the second peak has a higher intensity and is mistakenly prioritized. This highlights the importance of manually reviewing peak selections for each compound across all samples to ensure precision.

Figure 18. Example of incorrect identification of closely eluting peaks in El-MAVEN, with glucose 6-phosphate (G6P) used as the reference compound.
To correct this, we go to the Edit peak-group section for G6P and manually select the appropriate peaks for each of the samples (Figure 19). As shown in Figure 19, the software had initially selected the second peak as G6P for the highlighted sample, which is incorrect. We adjust the peak boundary to include only the first peak, which corresponds to G6P (Figure 20). This process is repeated for all samples to ensure that the correct peak is consistently selected. After you check all the samples, press Apply edits to save your changes.

Figure 19. Example of incorrect peak detection as shown in the peak editor dialog box in El-MAVEN. The peak boundaries or apex may be misassigned, highlighting the need for manual verification.

Figure 20. Correction of incorrect peak detection by manually adjusting the peak boundaries for a specific compound. Glucose 6-phosphate (G6P) is shown as the reference.
We repeat this process for all the desired compounds in the dataset to ensure that the correct peaks are being captured across all samples. This thorough manual verification is crucial for maintaining data quality, especially when dealing with closely eluting compounds or batch-to-batch retention time variations.
Note
Since the software tends to crash frequently, we highly recommend saving your progress after completing peak selection for each compound. If the software crashes and you reopen the project file, you might notice that the bookmarked compounds have a suffix like “(1)” or “(2)” added to their names. You can safely ignore this, as we can account for it during downstream data analysis.
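The "(1)" or "(2)" suffixes mentioned in this note can be stripped during downstream analysis with a small regular expression. The following is a minimal sketch, not the exact code from the attached notebooks; the function name is hypothetical, and it assumes that no genuine compound name ends in a number in parentheses.

```python
import re

# Matches a trailing "(n)" marker, with optional surrounding whitespace
_DUPLICATE_SUFFIX = re.compile(r"\s*\(\d+\)\s*$")


def strip_duplicate_suffix(name: str) -> str:
    """Remove a trailing '(1)', '(2)', ... marker from a name."""
    return _DUPLICATE_SUFFIX.sub("", name).strip()


for raw in ["G6P (1)", "DHAP(2)", "NAD+", "Acetyl-CoA"]:
    print(f"{raw!r} -> {strip_duplicate_suffix(raw)!r}")
```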

Exporting the data
After you have reviewed all the compounds and samples of interest, export the data in .csv format. To do so, click on the CSV icon located at the top left of the Bookmark Table. This opens a dropdown menu with several export options:
Export selected groups allows you to export only the groups you have manually selected in the table.
Export all groups includes every group currently listed.
Export good groups exports only those groups that you (as the user) have marked as “good,” typically indicating well-integrated and accepted peaks.
Export excluding bad groups exports all groups except those you’ve marked as “bad,” helping you exclude poor-quality or misintegrated peaks.
Export bad groups exports only the groups flagged by you as “bad,” which is useful for troubleshooting or quality review.
It is important to note that the classification of a group as "good" or "bad" is entirely user-defined based on your review of the peak quality. We commonly choose Export all groups, as we manually review all the peaks and ensure that everything looks correct (Figure 21).
Figure 21. Dialog box for exporting peak data and related information from all samples in El-MAVEN.
Then, export the groups in Peaks Detailed Format Comma Delimited (*.csv) to a folder of your choice, as shown in Figure 22. This is the format we commonly use for downstream data analysis.

Figure 22. Dialog box for exporting peak data in CSV format to a user-specified folder.
Note: After you have identified and bookmarked all the compounds across your samples, your Bookmark Table may look like the one in Figure 23, with or without a (1) suffix at the end of the sample name.
Figure 23. Sample bookmark table displayed after identifying all desired compounds across the samples.

Note
Do not close the software immediately after pressing Save to export the samples. Depending on the number of samples, it may take a couple of minutes to export all the peaks. We generally allow up to 5 minutes.

Quantification and Data analysis using Python

This section outlines the steps for analyzing metabolite peaks from LC-MS data using Python. It includes Python-based workflows for quantifying peaks using external standards and for interpreting unidentified peaks by generating heat maps for comparative analysis.

This section is divided into two subsections:
Quantitation of compounds with external standards
Visualization of compounds without any external standard (peaks identified using KNOWNS/SRM library in El-MAVEN)

The sample analysis shown here uses metabolite data from Clostridium thermocellum LL1592 similar to those shown in Sharma et al., 2024.
Citation
Sharma BD, Hon S, Thusoo E, Stevenson DM, Amador-Noguez D, Guss AM, Lynd LR, Olson DG (2024). Pyrophosphate-free glycolysis in Clostridium thermocellum increases both thermodynamic driving force and ethanol titers. https://doi.org/10.1186/s13068-024-02591-5
All files and code used for this analysis are also available at: https://github.com/bishaldev/lcms-intracellular-metabolite-analysis


Note
Before starting data analysis, all CSV files were converted to Excel format to preserve additional sheets often used during preliminary analysis. However, the Python code works equally well with CSV files exported directly from El-MAVEN.
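Since the peak table may be either a CSV file exported directly from El-MAVEN or an Excel file converted from it, a small loader that accepts both can be convenient. The sketch below is one possible way to do this with pandas; the function name load_peak_table is hypothetical, and reading .xlsx files additionally requires an Excel engine such as openpyxl to be installed.

```python
from pathlib import Path

import pandas as pd


def load_peak_table(path):
    """Load a peak export from CSV or from the first sheet of an Excel file."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".csv":
        return pd.read_csv(path)
    if suffix in {".xlsx", ".xls"}:
        # Only the first sheet is read; additional sheets are ignored
        return pd.read_excel(path, sheet_name=0)
    raise ValueError(f"Unsupported file type: {suffix!r}")
```

For example, load_peak_table("peaks.csv") and load_peak_table("peaks.xlsx") would both return a pandas DataFrame, where "peaks" stands in for the name of your own export file.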

For simplicity, the quantification process has been divided into two sections: one with standards and one without. The code can be merged to analyze both types of peaks within a single Python file.

Quantitation of compounds with external standards
Import core Python libraries for data manipulation, numerical calculations, statistical analysis, and plotting.
Load LC–MS peak export from El-MAVEN for compounds measured against external standards.
Clean compound and sample names by removing file extensions, duplicate markers, and unnecessary text.
Mark which samples are standard injections so they can be distinguished from experimental samples.
Extract numeric IDs and sample type (e.g., standard or experimental) from filenames.
Keep only relevant peak data columns, and optionally merge technical replicates or split injections by taking an average or maximum.
Retain only unlabeled parent isotope peaks for quantitation.
Load a prepared spreadsheet containing known concentrations for each standard injection.
Merge the LC–MS peak data with the standards file by compound and sample type.
Add the prepared concentration values into the dataset based on the matching run ID.
Remove peaks with zero intensity, then fit standard curves for each compound relating peak area to known concentration (using a log–log interpolation).
Optionally, select only high-quality standard points for curve fitting and refit the curves for better accuracy.
Apply each compound’s standard curve to the experimental sample peaks to calculate concentrations, and flag values as either interpolated or extrapolated depending on whether they fall within the calibration range.
Load experimental metadata containing timepoints and sample details.
Filter to the relevant experiment and calculate a biomass correction factor based on sample size and cell density.
Merge the quantified metabolite data with metadata, and adjust concentrations using the biomass correction factor (and optional dilution adjustments).
Optionally, create time-course plots for individual compounds, marking extrapolated values.
Convert measured amounts to intracellular concentrations using known constants for extraction volume, target biomass, and estimated cell volume.
Create wide-format tables of metabolite concentrations over time and generate pathway-focused heatmaps.
Calculate and plot metabolite ratios such as F6P/FBP, adenylate charge, and NAD(P)+/NAD(P)H ratios.
Optionally, save all processed tables and generated plots.
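To illustrate the standard-curve steps in the list above, the sketch below fits a straight line in log–log space (log₁₀ concentration versus log₁₀ peak area) for one compound and then flags a sample as interpolated or extrapolated. This is a simplified stand-in, assuming an ordinary least-squares fit; the attached notebook may implement the log–log interpolation differently. All function names and standard values are hypothetical.

```python
import numpy as np


def fit_loglog_curve(areas, concentrations):
    """Fit log10(conc) = slope * log10(area) + intercept; return (slope, intercept)."""
    areas = np.asarray(areas, dtype=float)
    concentrations = np.asarray(concentrations, dtype=float)
    # Zero-intensity peaks cannot be log-transformed, so drop them first
    keep = (areas > 0) & (concentrations > 0)
    slope, intercept = np.polyfit(
        np.log10(areas[keep]), np.log10(concentrations[keep]), 1
    )
    return slope, intercept


def quantify(area, slope, intercept, area_min, area_max):
    """Convert a peak area to a concentration and flag the calibration range."""
    conc = 10 ** (slope * np.log10(area) + intercept)
    flag = "interpolated" if area_min <= area <= area_max else "extrapolated"
    return conc, flag


# Hypothetical standards for one compound (concentration in uM, peak area)
std_conc = [0.1, 1.0, 10.0, 100.0]
std_area = [2e3, 2e4, 2e5, 2e6]

slope, intercept = fit_loglog_curve(std_area, std_conc)
conc, flag = quantify(6e4, slope, intercept, min(std_area), max(std_area))
print(f"{conc:.2f} uM ({flag})")
```

In this made-up example the peak area is exactly proportional to concentration, so the fitted slope is 1 and a sample with an area of 6e4 corresponds to about 3 µM. Real standard curves typically deviate from a slope of 1, which is one reason extrapolated values should be treated with caution.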

Note
The code implementing the above steps can be found in the attached file "2025_07_31 cthe intracellular metabolites with standards.ipynb".
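As an illustration of the metabolite-ratio step, the adenylate energy charge is conventionally defined as (ATP + 0.5 × ADP) / (ATP + ADP + AMP). The sketch below computes it, together with a generic ratio helper, from already quantified concentrations; the function names and concentration values are hypothetical and are not taken from the attached notebook.

```python
def adenylate_energy_charge(atp: float, adp: float, amp: float) -> float:
    """Adenylate energy charge: (ATP + 0.5 * ADP) / (ATP + ADP + AMP)."""
    total = atp + adp + amp
    if total <= 0:
        raise ValueError("Total adenylate concentration must be positive")
    return (atp + 0.5 * adp) / total


def metabolite_ratio(numerator: float, denominator: float) -> float:
    """Ratio of two concentrations; NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


# Hypothetical intracellular concentrations (mM)
aec = adenylate_energy_charge(atp=3.0, adp=1.0, amp=0.5)
nadh_to_nad = metabolite_ratio(0.2, 2.0)
print(f"Energy charge = {aec:.2f}, NADH/NAD+ = {nadh_to_nad:.2f}")
```

Both quantities are ratios of concentrations measured in the same sample, so biomass and dilution corrections cancel out as long as they are applied identically to every compound.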

Visualization of compounds without any external standard (peaks identified using KNOWNS/SRM library in El-MAVEN)
Import core Python libraries and set plotting preferences for notebooks.
Load LC–MS peak export from El-MAVEN for compounds without external standards.
Clean compound and sample names by removing file extensions, duplicates, and unnecessary text.
Extract numeric IDs and sample types from filenames to help with filtering.
Optionally, combine split injections into single representative values.
Keep only unlabeled parent isotope peaks to avoid interference from labeled forms.
Load experimental metadata containing timepoints and sample information.
Filter to the relevant experiment and calculate biomass-based normalization factors.
Merge the metabolite data with metadata, apply the biomass normalization, and remove zero-intensity peaks.
Pivot the dataset into a matrix of compounds versus timepoints for visualization.
Select a subset of metabolites of interest for focused analysis.
Generate heatmaps (with or without clustering) to compare metabolite dynamics across timepoints.
Optionally, create simple line plots or ratio plots for exploratory interpretation.
Save processed tables and visualizations as needed.
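The pivoting step in the list above can be sketched with pandas as follows. The example reshapes a long-format table into a compound-by-timepoint matrix and expresses each compound as a log₂ fold change relative to its first timepoint, which is one common way to prepare data for a heatmap. The column names and values are hypothetical, and the scaling used in the attached notebook may differ.

```python
import numpy as np
import pandas as pd

# Hypothetical long-format table of biomass-normalized peak areas
long_df = pd.DataFrame(
    {
        "compound": ["G6P", "G6P", "G6P", "FBP", "FBP", "FBP"],
        "timepoint_h": [2, 4, 6, 2, 4, 6],
        "norm_area": [1.0e5, 2.0e5, 4.0e5, 8.0e5, 4.0e5, 2.0e5],
    }
)

# Rows = compounds, columns = timepoints; replicates (if any) are averaged
matrix = long_df.pivot_table(
    index="compound", columns="timepoint_h", values="norm_area", aggfunc="mean"
)

# log2 fold change of each compound relative to its first timepoint
log2_fc = np.log2(matrix.div(matrix.iloc[:, 0], axis=0))
print(log2_fc)
```

The resulting matrix can then be passed to a plotting function such as seaborn.heatmap, or seaborn.clustermap when clustering is desired.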

Note
The code implementing the above steps can be found in the attached file "2025_07_31 cthe intracellular metabolites without standards.ipynb".
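For compounds without standards, peak areas are only comparable between samples after accounting for how much biomass was extracted. The sketch below shows one simple form of biomass normalization, scaling each peak area to a common reference amount of biomass expressed as OD₆₀₀ × volume. The reference value of 2.5 OD·mL and the function names are assumptions made for illustration; the attached notebook may define its normalization factor differently.

```python
def biomass_factor(od600: float, volume_ml: float, reference_od_ml: float = 2.5) -> float:
    """Scaling factor that converts a sample to the reference biomass (OD x mL)."""
    biomass = od600 * volume_ml
    if biomass <= 0:
        raise ValueError("OD600 and volume must be positive")
    return reference_od_ml / biomass


def normalize_area(area: float, od600: float, volume_ml: float,
                   reference_od_ml: float = 2.5) -> float:
    """Peak area scaled to the reference amount of biomass."""
    return area * biomass_factor(od600, volume_ml, reference_od_ml)


# Example: 10 mL filtered at OD600 = 0.5 (5 OD.mL, twice the reference biomass)
normalized = normalize_area(2.0e5, od600=0.5, volume_ml=10)
print(f"Normalized peak area: {normalized:.0f}")  # Normalized peak area: 100000
```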

Protocol references

Agrawal, S., Kumar, S., Sehgal, R., George, S., Gupta, R., Poddar, S., Jha, A., & Pathak, S. (2019). El-MAVEN: A Fast, Robust, and User-Friendly Mass Spectrometry Data Processing Engine for Metabolomics. In A. D’Alessandro (Ed.), High-Throughput Metabolomics (Vol. 1978, pp. 301–321). Springer New York. https://doi.org/10.1007/978-1-4939-9236-2_19

Bennett, B. D., Yuan, J., Kimball, E. H., & Rabinowitz, J. D. (2008). Absolute quantitation of intracellular metabolite concentrations by an isotope ratio-based approach. Nature Protocols, 3(8), 1299–1311. https://doi.org/10.1038/nprot.2008.107

Clasquin, M. F., Melamud, E., & Rabinowitz, J. D. (2012). LC‐MS Data Processing with MAVEN: A Metabolomic Analysis and Visualization Engine. Current Protocols in Bioinformatics, 37(1). https://doi.org/10.1002/0471250953.bi1411s37

Sharma, B. D., Olson, D. G., Giannone, R. J., Hettich, R. L., & Lynd, L. R. (2023). Characterization and Amelioration of Filtration Difficulties Encountered in Metabolomic Studies of Clostridium thermocellum at Elevated Sugar Concentrations. Applied and Environmental Microbiology, 89(4), e00406-23. https://doi.org/10.1128/aem.00406-23

Sharma, B. D., Hon, S., Thusoo, E., Stevenson, D. M., Amador-Noguez, D., Guss, A. M., Lynd, L. R., & Olson, D. G. (2024). Pyrophosphate-free glycolysis in Clostridium thermocellum increases both thermodynamic driving force and ethanol titers. Biotechnology for Biofuels and Bioproducts, 17(1), 146. https://doi.org/10.1186/s13068-024-02591-5

Acknowledgements

We thank Eashant Thusoo (PhD Candidate, University of Wisconsin-Madison) for his valuable input on metabolite extraction and peak detection using El-MAVEN.

This work was supported by the Center for Bioenergy Innovation (CBI), U.S. Department of Energy, Office of Science, Biological and Environmental Research Program under Award Number ERKP886.

We also acknowledge the use of ChatGPT (OpenAI, 2025) for language editing support during the preparation of this protocol. The authors take full responsibility for the content, accuracy, and interpretation of the information presented.