Inconsistent taxonomy in vOct22 version

There are some inconsistent taxonomic entries in the mpa_vOct22_CHOCOPhlAnSGB_202212 database:

“CFGB10299”, “CFGB1787”, “CFGB2830”, “CFGB2992”, “CFGB4862”, “CFGB9382”

“FGB10299”, “FGB1787”, “FGB2830”, “FGB2992”, “FGB4862”, “FGB9382”

“Emergencia”, “GGB79996”, “Lawsonibacter”

I think that in all of these cases the higher taxonomic levels are not consistent.

Sorry @SilasK
What do you mean by inconsistent taxonomies? Could you provide a full taxonomy example?

The same genus name is listed under a different family, order, or phylum in different entries.

It may be linked to newly defined genera, and to some already defined ones.
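
To make the problem concrete, here is a minimal sketch of how such conflicts could be detected. It uses base R only and a toy table with made-up placeholder names. The helper `find_inconsistent()` and the column names are assumptions for illustration, not part of MetaPhlAn or phyloseq.

```r
# Toy taxonomy table with placeholder names (illustration only, not real data).
tax <- data.frame(
  Phylum = c("PhylumX", "PhylumX", "PhylumX", "PhylumY"),
  Order  = c("Order1",  "Order2",  "Order1",  "Order3"),
  Family = c("Family1", "Family2", "Family1", "Family3"),
  Genus  = c("GenusA",  "GenusA",  "GenusB",  "GenusC"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the names at `rank` that occur with more than
# one distinct combination of the higher ranks listed in `parents`.
find_inconsistent <- function(tax, rank, parents) {
  # Keep one row per distinct (parents..., rank) lineage.
  lineages <- unique(tax[, c(parents, rank), drop = FALSE])
  # A name that still appears more than once has conflicting parent lineages.
  counts <- table(lineages[[rank]])
  sort(names(counts)[counts > 1])
}

inconsistent_genera <- find_inconsistent(tax, "Genus", c("Phylum", "Order", "Family"))
print(inconsistent_genera)
```

In this toy table only GenusA is flagged, because it appears under two different order/family combinations. With a phyloseq object, the input could presumably be built with `as.data.frame(as(tax_table(pseq), "matrix"))`, assuming the rank columns are named as above.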

Please make a consistent taxonomy!

This R code did the job for me:



    library(phyloseq)  # provides tax_table(); pseq is a phyloseq object

    # Give all of these CFGB classes the same phylum label
    tax_table(pseq)[tax_table(pseq)[, "Class"] %in% c("CFGB10299", "CFGB1787", "CFGB2830", "CFGB2992", "CFGB4862", "CFGB9382"), "Phylum"] <- "Unclassified_Phylum"

    # Lawsonibacter: use a single order and family for every entry
    tax_table(pseq)[tax_table(pseq)[, "Genus"] == "Lawsonibacter", "Order"] <- "Eubacteriales"
    tax_table(pseq)[tax_table(pseq)[, "Genus"] == "Lawsonibacter", "Family"] <- "Oscillospiraceae"

    # Emergencia: use a single order and family for every entry
    tax_table(pseq)[tax_table(pseq)[, "Genus"] == "Emergencia", "Order"] <- "Eubacteriales"
    tax_table(pseq)[tax_table(pseq)[, "Genus"] == "Emergencia", "Family"] <- "Eubacteriales_Family_XIII_Incertae_Sedis"


Dear @SilasK
For known species (kSGBs) and for uSGBs that are known at least at the family level, the taxonomy is based entirely on the NCBI database, so inconsistent taxonomies between different species are out of our control.
For uSGBs known only up to the phylum level, the phylum assignment is based on the reference genomes closest to the SGB. In some borderline cases, this approach can produce inconsistencies in which the same unknown genus or family is assigned to different phyla in different SGBs.