Question about using MetaPhlAn3 output as input for HUMAnN3

Hello. I am trying to run HUMAnN3 on some samples which I have already performed MetaPhlAn3 analysis on. To avoid restarting a MetaPhlAn run from scratch when running HUMAnN, I provided the MetaPhlAn relative abundance output to HUMAnN using the --taxonomic-profile flag. Here is my code:

humann -i $input -o humann --remove-temp-output --taxonomic-profile $profile --threads 8 --metaphlan-options="--bowtie2db mpa_database"

When I look at the output, however, I am seeing a much lower number of selected species in the prescreen, which explain a low percentage of the predicted community composition:

Output files will be written to: /scratch/josh.kim/acemory/HOLA_data/humann
Decompressing gzipped file …
Removing spaces from identifiers in input file …
Found g__Bacteroides.s__Bacteroides_thetaiotaomicron : 1.31% of mapped reads
Found g__Bacteroides.s__Bacteroides_vulgatus : 1.21% of mapped reads
Found g__Erysipelatoclostridium.s__Erysipelatoclostridium_ramosum : 0.58% of mapped reads
Found g__Clostridium.s__Clostridium_butyricum : 0.46% of mapped reads
Found g__Klebsiella.s__Klebsiella_pneumoniae : 0.36% of mapped reads
Found g__Prevotella.s__Prevotella_copri : 0.20% of mapped reads
Found g__Enterococcus.s__Enterococcus_faecalis : 0.13% of mapped reads
Found g__Klebsiella.s__Klebsiella_quasipneumoniae : 0.12% of mapped reads
Found g__Escherichia.s__Escherichia_coli : 0.09% of mapped reads
Found g__Klebsiella.s__Klebsiella_variicola : 0.09% of mapped reads
Found g__Streptococcus.s__Streptococcus_salivarius : 0.05% of mapped reads
Found g__Streptococcus.s__Streptococcus_parasanguinis : 0.03% of mapped reads
Found g__Lactobacillus.s__Lactobacillus_rhamnosus : 0.02% of mapped reads
Total species selected from prescreen: 13
Selected species explain 4.64% of predicted community composition

I’m confused as to why I’m getting such a low percentage of predicted community composition compared to when I run MetaPhlAn within HUMAnN, which usually gives me more than 95% predicted community composition. Furthermore, will this impact the results of HUMAnN3 analysis? Thank you for any clarification or help.

Hmm… passing the taxonomic profile as an additional input should behave identically to having HUMAnN generate the profile by running MetaPhlAn itself. Have you tried running this sample the “traditional” way to see whether a different set of species is selected? The only other oddity I can point out is that when you use the --taxonomic-profile override, specifying additional MetaPhlAn options isn’t necessary and could, in principle, cause MetaPhlAn to be re-run in an unexpected mode.
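
Before comparing the two runs, it may help to check what the supplied profile itself contains. The following is only a rough sketch, not part of HUMAnN or MetaPhlAn: species_summary is a hypothetical helper. It assumes the profile is tab-separated with columns clade_name, NCBI_tax_id, relative_abundance, and that header lines start with '#', so check that against your own file first.

```shell
# Hypothetical helper: summarize species-level rows in a MetaPhlAn-style profile.
# Assumes tab-separated columns (clade_name, NCBI_tax_id, relative_abundance, ...)
# and '#'-prefixed header lines; adjust the column number if your file differs.
species_summary() {
    awk -F'\t' '
        /^#/ { next }                                   # skip header/comment lines
        $1 ~ /s__/ && $1 !~ /t__/ { n++; total += $3 }  # species rows only, no strain rows
        END { printf "%d species, %.2f%% total\n", n, total }
    ' "$1"
}
```

Calling species_summary "$profile" prints the number of species rows and their summed relative abundance. If that total is far from what you expect for the sample, the profile file, rather than HUMAnN, would be the first thing to look at.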