Which NCBI taxdump version was used for the Metaphlan4 database?

Which version of the NCBI taxdump (Index of /pub/taxonomy/taxdump_archive) was used for the taxonomy in metaphlan4?

It appears that a fairly old taxdump was used, since taxid=541000 is still in use, but this taxid was merged into taxid=216675 on March 15, 2021 (see Ruminococcaceae - Taxonomy - NCBI).

I’m asking because I’m comparing taxonomic lineages of 2 datasets: one in which the taxids are defined by Metaphlan4 and one in which the taxids are defined using the 2022-12-14 taxdump version. It seems that a lot of clades do not overlap when they should. For example, the Ruminococcus taxonomic lineage does not overlap between the 2 datasets, which is likely due to the change of the Ruminococcus taxonomy in NCBI on March 15, 2021 (Ruminococcaceae => Oscillospiraceae).

Some taxids in the Metaphlan4 output (e.g., 3353859 and 7415705) don’t appear to be in the NCBI taxonomy database, regardless of the release version.
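
For the lineage comparison described above, one possible workaround is to pass the older taxids through the `merged.dmp` file that ships with a newer taxdump, which lists old taxid to new taxid pairs in the form `old<TAB>|<TAB>new<TAB>|`. The following is only a minimal sketch: the helper names `load_merged` and `update_taxid` are hypothetical, and the single sample record is made up from the 541000 => 216675 merge mentioned above rather than copied from a real file. Taxids that were never in NCBI (such as 3353859) would simply pass through unchanged.

```python
import io

def load_merged(handle):
    """Parse merged.dmp-style lines ('old\t|\tnew\t|') into {old: new}."""
    merged = {}
    for line in handle:
        fields = [f.strip() for f in line.split("|")]
        if len(fields) >= 2 and fields[0] and fields[1]:
            merged[int(fields[0])] = int(fields[1])
    return merged

def update_taxid(taxid, merged):
    """Follow merge records until the taxid is no longer listed as merged."""
    seen = set()
    while taxid in merged and taxid not in seen:
        seen.add(taxid)  # guard against an unexpected cycle
        taxid = merged[taxid]
    return taxid

# Illustrative record only (assumption based on the merge described above).
# With a real taxdump, open its merged.dmp file instead of this StringIO.
sample = io.StringIO("541000\t|\t216675\t|\n")
merged = load_merged(sample)

old_taxids = [541000, 3353859]
updated = [update_taxid(t, merged) for t in old_taxids]
print(updated)
```

Applying the same mapping to both datasets before rebuilding lineages should put them on a common taxonomy version, although it cannot help with taxids that are absent from every NCBI release.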

Hi @nick-youngblut
The taxdump for the Jan21 database is from Feb 03, 2021.

Thanks @aitor.blancomiguez! That’s really helpful