License: This is an open access protocol distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited
Protocol status: Working
Created: December 07, 2015
Last Modified: November 21, 2017
Protocol Integer ID: 2079
Abstract
In this protocol, we perform metabolic profiling of metagenomic datasets by applying HUMAnN to the 20 samples taxonomically profiled in the other protocols in this collection.
Once the metabolic profile has been generated, the following protocols can be performed:
MetaPhlAn output merge and visualizations
GraPhlAn visualization of single and multiple samples
Taxonomic biomarker discovery with LEfSe
Guidelines
Metabolic profiles of the 20 HMP samples with HUMAnN can be generated with the following steps:
Perform a translated search (using blastx or usearch) against the KEGG DB. Because KEGG has become commercial since HUMAnN was developed, we are currently developing support for other data sources.
Place the translated BLAST results, labeled with KEGG gene identifiers, in the input directory. The files can optionally be gzipped or bzipped; several other formats can be enabled by editing the settings in the SConstruct file.
Run the scons command, optionally parallelizing multiple analyses using the -j flag. Results will be placed in the "output" directory.
We provide the HUMAnN output for users who want to run the downstream analysis pipeline but avoid the computationally intensive steps above. The 20 samples profiled with HUMAnN are available here.
Using HUMAnN output, you can perform the metabolic counterpart of the taxonomic pipeline presented in this tutorial. With a table of metabolic abundances, the previous protocols can be performed with little to no modification. We report below the first command to obtain the merged table of metabolic abundances.
REQUIREMENTS: HUMAnN, scons, the KEGG protein DB. HUMAnN can be obtained using Mercurial:
hg clone ssh://hg@bitbucket.org/chuttenh/humann
or using the direct links to the zip, gz, or bz2 archives.
Perform a translated search (using blastx or usearch) against the KEGG DB.
Note
Because KEGG has become commercial since HUMAnN was developed, we are currently developing support for other data sources.
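As a rough illustration of the translated search step, the sketch below wraps a BLAST+ blastx call in a small shell function. The function name, file names, and database name are hypothetical. Tabular output (-outfmt 6) is assumed here to be the format expected downstream, so check the HUMAnN documentation for the exact requirements before relying on it.

```shell
# Sketch only: translated search of one sample against a local KEGG protein database.
# Assumptions (not from the protocol): the function name and all file/database names.
#   $1 = nucleotide reads in FASTA format
#   $2 = BLAST protein database built from KEGG sequences
#   $3 = output file for the tabular hits
translated_search() {
    # -outfmt 6 asks blastx for tab-separated output, one line per alignment
    blastx -query "$1" -db "$2" -outfmt 6 -out "$3"
}
```

A call might look like `translated_search sample1.fasta kegg_proteins sample1.txt`, repeated once per sample.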
Place the translated BLAST results using KEGG gene identifiers in the input directory.
Note
The files can optionally be gzipped or bzipped; several other formats can be enabled by editing the settings in the SConstruct file.
Run the scons command, optionally parallelizing multiple analyses using the -j flag. Results will be placed in the "output" directory.
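The staging and run steps can be sketched as a single shell function. This is a minimal sketch, assuming HUMAnN was cloned into a local directory that contains the SConstruct file and an input directory. The function name, the directory path, and the job count are illustrative and not part of the protocol.

```shell
# Sketch only: stage translated-search results and run the HUMAnN pipeline.
# Assumptions (not from the protocol): the function name and argument layout.
#   $1 = path to the HUMAnN directory (contains SConstruct and input/)
#   $2 = number of parallel jobs to pass to scons
#   remaining arguments = result files to analyze
run_humann() {
    humann_dir="$1"
    jobs="$2"
    shift 2

    # Copy every result file into HUMAnN's input directory
    cp "$@" "$humann_dir/input/" || return 1

    # scons reads the SConstruct in the current directory; -j sets the job count
    ( cd "$humann_dir" && scons -j "$jobs" )
}
```

For example, `run_humann ./humann 4 sample1.txt.gz sample2.txt.gz` would copy two gzipped result files and run up to four jobs in parallel, leaving the results in the output directory.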
We provide the HUMAnN output for users who want to run the downstream analysis pipeline but avoid the computationally intensive steps above. The 20 samples profiled with HUMAnN are available here.
Using HUMAnN output, you can perform the metabolic counterpart of the taxonomic pipeline presented in this tutorial. We report below the first command to obtain the merged table of metabolic abundances.