The bioBakery help forum

Diversity missing in profile output

I am trying to profile a marine sponge tissue metagenome (Illumina HiSeq reads) and I ran the following command:
metaphlan DNA_1.fq,DNA_2.fq --bowtie2out metaphlan.bowtie2.bz2 --nproc 24 --input_type fastq -o profiled_metaphlan.txt

It runs with the warnings:

WARNING: The metagenome profile contains clades that represent multiple species merged into a single representant.
An additional column listing the merged species is added to the MetaPhlAn output.

And the output:

#mpa_v30_CHOCOPhlAn_201901
#/tools/software/bioinfo-tools/miniconda/miniconda3/envs/metaphlan/bin/metaphlan DNA_1.fq,DNA_2.fq --bowtie2out metaphlan.bowtie2.bz2 --nproc 24 --input_type fastq -o profiled_metaphlan.txt
#SampleID Metaphlan_Analysis
#clade_name NCBI_tax_id relative_abundance additional_species
k__Eukaryota 2759 100.0
k__Eukaryota|p__Ascomycota 2759|4890 100.0
k__Eukaryota|p__Ascomycota|c__Saccharomycetes 2759|4890|4891 100.0
k__Eukaryota|p__Ascomycota|c__Saccharomycetes|o__Saccharomycetales 2759|4890|4891|4892 100.0
k__Eukaryota|p__Ascomycota|c__Saccharomycetes|o__Saccharomycetales|f__Saccharomycetaceae 2759|4890|4891|4892|4893 100.0
k__Eukaryota|p__Ascomycota|c__Saccharomycetes|o__Saccharomycetales|f__Saccharomycetaceae|g__Saccharomyces 2759|4890|4891|4892|4893|4930 100.0
k__Eukaryota|p__Ascomycota|c__Saccharomycetes|o__Saccharomycetales|f__Saccharomycetaceae|g__Saccharomyces|s__Saccharomyces_cerevisiae 2759|4890|4891|4892|4893|4930|4932 100.0 k__Eukaryota|p__Ascomycota|c__Saccharomycetes|o__Saccharomycetales|f__Saccharomycetaceae|g__Saccharomyces|s__Saccharomyces_sp_boulardii

However, I expected a lot more diversity in the sample, especially bacterial members. What could have gone wrong?

Thanks!

Any chance of an answer? Would the warning indicate a problem in the run that collapsed all of the expected diversity?

Hi Catarina,
Sorry for not catching this sooner. The warning just indicates that the markers for S. cerevisiae may also indicate the presence of S. boulardii, since the two species are extremely similar (within about 5% of each other).
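If you want to see exactly which clades carry merged species, a small script can pull them out of the profile. This is only a rough sketch: the function names `parse_profile` and `merged_species` are made up for illustration, and it assumes the columns are whitespace-separated in the order given by the `#clade_name` header line, with any merged species listed comma-separated in the fourth column.

```python
def parse_profile(text):
    """Parse profile text into (clade_name, tax_id, abundance, extra) rows.

    Assumes whitespace-separated columns in the order shown by the
    '#clade_name' header; 'extra' is the optional additional_species column.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split()
        extra = fields[3].split(",") if len(fields) > 3 else []
        rows.append((fields[0], fields[1], float(fields[2]), extra))
    return rows


def merged_species(rows):
    """Map each reported clade to the additional species merged into it."""
    merged = {}
    for clade, _tax_id, _abundance, extra in rows:
        if extra:
            # Keep only the last taxonomic level of each lineage string.
            merged[clade.split("|")[-1]] = [e.split("|")[-1] for e in extra]
    return merged


# Two rows taken from the output posted above, lineages shortened for readability.
example = """#clade_name NCBI_tax_id relative_abundance additional_species
k__Eukaryota 2759 100.0
k__Eukaryota|s__Saccharomyces_cerevisiae 2759|4932 100.0 k__Eukaryota|s__Saccharomyces_sp_boulardii
"""

for clade, extras in merged_species(parse_profile(example)).items():
    print(clade, "may also represent:", ", ".join(extras))
```

On the real file you would pass the contents of profiled_metaphlan.txt instead of the `example` string.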
In this case, it is possible that the bacterial community you are analyzing cannot be described by the species present in the MetaPhlAn database.
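One way to get a feel for this is to check how many reads hit any marker at all in the saved mapping file (metaphlan.bowtie2.bz2 in the command above). The sketch below is based on an assumption about that file's layout, namely bz2-compressed text with one `read_id<TAB>marker` line per aligned read and `#`-prefixed lines for metadata, so please check a few lines of your own file first. The function names `count_marker_hits` and `mapped_fraction` are hypothetical helpers, and the total read count has to be supplied by you.

```python
import bz2
from collections import Counter


def count_marker_hits(path):
    """Count aligned reads per marker in a bz2-compressed mapping file.

    Assumed layout (verify on your own file): one 'read_id<TAB>marker'
    line per aligned read; lines starting with '#' are treated as metadata.
    """
    hits = Counter()
    with bz2.open(path, "rt") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line or line.startswith("#"):
                continue
            parts = line.split("\t")
            if len(parts) < 2:
                continue  # ignore lines that do not match the assumed layout
            hits[parts[1]] += 1
    return hits


def mapped_fraction(hits, total_reads):
    """Fraction of all input reads that aligned to any marker."""
    return sum(hits.values()) / total_reads if total_reads else 0.0
```

For example, `mapped_fraction(count_marker_hits("metaphlan.bowtie2.bz2"), total_reads)` with `total_reads` set to the number of reads in your FASTQ files. A very small fraction spread over only a handful of markers would be consistent with most of the community not being represented in the database.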