Query regarding HUMAnN2

sprajendra · April 5, 2020, 6:27pm

I installed HUMAnN2 from Bioconda and downloaded the corresponding MetaPhlAn database (mpa_v20_m200), ChocoPhlAn database (full_chocophlan_plus_viral.v0.1.1.tar.gz), and UniRef90 database (uniref90_annotated.1.1.dmnd).
I merged the paired-end reads with FLASH.
I ran the command humann2 --input merged.fastq --output humann2_out.
Although my sample is a rice paddy soil sample (highly diverse), only 10 species were detected after the MetaPhlAn run completed successfully. Why?
Of those, only two were selected for functional annotation. Why?
The command has been running for the last 24 hours. How long does the analysis generally take to complete?
Am I doing something wrong, or is something missing?

Please suggest what I should do.

franzosa · April 6, 2020, 3:08pm

I think all of these issues are related to a lack of known species in your community relative to the tools’ databases. The reason why only a subset of the detected species are used for functional profiling is that some detected “species” are of the form s__Genus_unclassified, for which we do not have a species pangenome for mapping. As a result, most of your reads will be forwarded to translated search, which is considerably slower than nucleotide-level mapping (~2 hrs per 10M reads using 8 threads). Because of these issues, I’d recommend running with UniRef50 rather than UniRef90 in order to map more reads to homologs of known proteins.
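To put the runtime figure in context, here is a rough sketch in Python. It reads the quoted "~2 hrs per 10M reads using 8 threads" as the translated-search rate, and it assumes runtime scales linearly with read count and inversely with thread count, which is a simplification rather than a measured result. The helper names and the UniRef50 directory path are hypothetical. The `--protein-database` and `--threads` options are taken from the HUMAnN2 command-line help as I recall it, so please check `humann2 --help` for your installed version.

```python
# Rate quoted in the reply above: ~2 hours per 10 million reads on 8 threads.
HOURS_PER_10M_READS = 2.0
REFERENCE_THREADS = 8


def count_fastq_reads(path):
    """Count reads in an uncompressed FASTQ file, assuming 4 lines per record."""
    with open(path) as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


def estimate_hours(n_reads, threads=8):
    """Rough runtime estimate in hours.

    Assumes linear scaling with read count and inverse scaling with
    thread count (an assumption for illustration, not a benchmark).
    """
    return HOURS_PER_10M_READS * (n_reads / 10_000_000) * (REFERENCE_THREADS / threads)


def build_humann2_command(fastq, out_dir, uniref50_dir, threads=8):
    """Assemble a humann2 call whose translated search uses a UniRef50 database.

    uniref50_dir is a hypothetical path to a folder holding the UniRef50
    DIAMOND database; substitute wherever the database was downloaded.
    """
    return [
        "humann2",
        "--input", fastq,
        "--output", out_dir,
        "--protein-database", uniref50_dir,
        "--threads", str(threads),
    ]


if __name__ == "__main__":
    # Hypothetical example: 50 million merged reads, 8 threads.
    print(estimate_hours(50_000_000, threads=8))
    print(" ".join(build_humann2_command(
        "merged.fastq", "humann2_out", "/path/to/uniref50", threads=8)))
```

Under these assumptions, a 50M-read sample in which most reads fall through to translated search would take on the order of 10 hours with 8 threads, so a run longer than 24 hours is plausible for a deep soil metagenome on fewer threads.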

sprajendra · April 6, 2020, 3:53pm

Thanks @franzosa for your suggestions.
