bacterial abundances raised to 100%

Dear MetaPhlAn developers, I have just used the script and noticed a couple of possible issues, and I was wondering if you could help. Firstly, when running the script, we noticed that the database name is parsed with the following code:
```python
def get_database_name(self):
    """Gets database name

    Returns:
        str: the database name
    """
    return self.database.split('/')[-1][:-4]
```

This meant that when specifying the database we had to give the full path `metaphlan_db_vOct22/mpa_vOct22_CHOCOPhlAnSGB_202212.bt2` instead of just the path to the database directory, `metaphlan_db_vOct22`. I'm not sure if this was intentional?
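To illustrate what I mean, here is a minimal sketch that applies the same expression to both paths. The standalone function and variable names are my own for illustration; only the `split('/')[-1][:-4]` expression comes from the script.

```python
def get_database_name(database):
    # Same expression as in the script: take the last path component
    # and drop its final four characters (e.g. a ".bt2" suffix).
    return database.split('/')[-1][:-4]


file_path = "metaphlan_db_vOct22/mpa_vOct22_CHOCOPhlAnSGB_202212.bt2"
dir_path = "metaphlan_db_vOct22"

name_from_file = get_database_name(file_path)
name_from_dir = get_database_name(dir_path)

print(name_from_file)  # mpa_vOct22_CHOCOPhlAnSGB_202212
print(name_from_dir)   # metaphlan_db_vO
```

With the directory path alone, the last four characters of the directory name are cut off, so the resulting name does not match any database.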

Secondly, after successfully running the script, we noticed that the SGB to GTDB conversion raised the total bacterial abundance to 100%. In my samples a large percentage of reads were unclassified, so using the MetaPhlAn database we got this as an example:


However after converting to GTDB we got this:

#clade_name	relative_abundance
d__Bacteria	100.0
d__Bacteria;p__Fusobacteriota	50.12007
d__Bacteria;p__Proteobacteria	22.51449
d__Bacteria;p__Firmicutes_A	21.422109999999996
d__Bacteria;p__Bacteroidota	4.45885
d__Bacteria;p__Campylobacterota	1.48448

The samples no longer sum to 100 once the unclassified fraction is taken into account. I think that the ratios between the phyla abundances are probably correct, but they need to be rescaled in proportion to the true bacterial abundance, which should not be 100 but rather 100 - UNCLASSIFIED. Let me know if this seems correct or if you think the error came from elsewhere.
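The rescaling I have in mind could be sketched roughly as follows. This is only an illustration, not the script's actual logic: `rescale_profile` is a hypothetical helper, and the unclassified value of 40 is a made-up placeholder rather than the real figure from my sample.

```python
def rescale_profile(profile, unclassified):
    """Scale abundances so they sum to (100 - unclassified) instead of 100."""
    factor = (100.0 - unclassified) / 100.0
    return {clade: abundance * factor for clade, abundance in profile.items()}


# GTDB profile as produced by the conversion (values from the output above).
gtdb_profile = {
    "d__Bacteria": 100.0,
    "d__Bacteria;p__Fusobacteriota": 50.12007,
    "d__Bacteria;p__Proteobacteria": 22.51449,
    "d__Bacteria;p__Firmicutes_A": 21.422109999999996,
    "d__Bacteria;p__Bacteroidota": 4.45885,
    "d__Bacteria;p__Campylobacterota": 1.48448,
}

# Hypothetical placeholder, NOT the real unclassified percentage of my sample.
HYPOTHETICAL_UNCLASSIFIED = 40.0

rescaled = rescale_profile(gtdb_profile, HYPOTHETICAL_UNCLASSIFIED)
for clade, abundance in rescaled.items():
    print(f"{clade}\t{abundance:.5f}")
```

With this placeholder, `d__Bacteria` would come out at 60 rather than 100, and the ratios between the phyla would stay the same.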

Thanks in advance.