Inquiry regarding abnormal taxonomy in MetaPhlAn4 DB

Hello,

I’m using MetaPhlAn4 with the database mpa_vJan21_CHOCOPhlAnSGB_202103 and found an unexpected taxonomy in the database.

During my analysis I noticed an odd result, so I inspected the original MetaPhlAn DB file using Python.

Code:

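import bz2
import pickle
import pandas as pd
db = pickle.load(bz2.open(db_path, 'r'))  # db_path: path to the database .pkl file
db_tax_keys = pd.Series(db["taxonomy"].keys())  # full lineage strings
db_tax_keys[db_tax_keys.str.contains("CFGB12541")].tolist()  # lineages containing this class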

Returns:

['k__Bacteria|p__Bacteroidetes|c__CFGB12541|o__OFGB12541|f__FGB12541|g__GGB35550|s__GGB35550_SGB48439|t__SGB48439',
 'k__Bacteria|p__Firmicutes|c__CFGB12541|o__OFGB12541|f__FGB12541|g__GGB35551|s__GGB35551_SGB53794|t__SGB53794']

The class CFGB12541 is in both phyla Bacteroidetes and Firmicutes.

I was wondering why the same class is in different phyla.

Is this taxonomy intended in the MetaPhlAn DB, is it an error, or is there something I missed?

Best regards,

Hi @1112
For really uncharacterized SGBs (those unknown at the FGB level), we infer the phylum from the closest reference genome in the database. In this case, it seems that the closest reference genome for SGB48439 belongs to the Bacteroidetes phylum, while the closest one for SGB53794 belongs to Firmicutes. This is one of the limitations of our taxonomic assignment approach.
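
For anyone who wants to check how many other clades are affected, here is a minimal sketch that flags any clade label appearing under more than one immediate parent. The function name find_conflicting_clades is hypothetical and not part of MetaPhlAn; the only assumption about the data is that lineages are "|"-separated strings like the keys of db["taxonomy"] shown above.

```python
from collections import defaultdict


def find_conflicting_clades(taxonomy_keys):
    """Return {clade: sorted parent labels} for clades with more than one parent.

    Hypothetical helper (not a MetaPhlAn function). Assumes each key is a
    '|'-separated lineage string such as 'k__Bacteria|p__Firmicutes|c__...'.
    """
    parents = defaultdict(set)
    for lineage in taxonomy_keys:
        levels = lineage.split("|")
        # Pair every level with the level directly above it.
        for parent, child in zip(levels, levels[1:]):
            parents[child].add(parent)
    return {clade: sorted(p) for clade, p in parents.items() if len(p) > 1}


# The two lineages reported in this thread.
example = [
    "k__Bacteria|p__Bacteroidetes|c__CFGB12541|o__OFGB12541|f__FGB12541"
    "|g__GGB35550|s__GGB35550_SGB48439|t__SGB48439",
    "k__Bacteria|p__Firmicutes|c__CFGB12541|o__OFGB12541|f__FGB12541"
    "|g__GGB35551|s__GGB35551_SGB53794|t__SGB53794",
]
conflicts = find_conflicting_clades(example)
print(conflicts)
```

On the full database this could be called as find_conflicting_clades(db["taxonomy"].keys()). Only the immediate parent is compared, so the order and family below the conflicting class are not reported a second time.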

Thank you for your response!

I’ll just handle them as unclassified taxa.
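
A rough sketch of what that could look like, in case it helps someone else. It rests on an assumption drawn only from the two lineages above: that a class label starting with "c__CFGB" marks an SGB whose phylum was inferred rather than known. The function name mask_inferred_phylum and the placeholder "p__unclassified" are my own choices, not MetaPhlAn conventions.

```python
def mask_inferred_phylum(lineage, placeholder="p__unclassified"):
    """Replace the phylum label when the class looks like an uncharacterized bin.

    Hypothetical helper. ASSUMPTION: a class label starting with 'c__CFGB'
    means the phylum was inferred from the closest reference genome.
    """
    levels = lineage.split("|")
    if not any(level.startswith("c__CFGB") for level in levels):
        return lineage  # characterized lineage: leave untouched
    return "|".join(
        placeholder if level.startswith("p__") else level for level in levels
    )


masked = mask_inferred_phylum(
    "k__Bacteria|p__Firmicutes|c__CFGB12541|o__OFGB12541|f__FGB12541"
)
print(masked)
```

Lineages without a "c__CFGB" class are returned unchanged, so the function can be applied to a whole profile without touching characterized taxa.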