I’m using HUMAnN version 3.6 with all the latest databases for MetaPhlAn, ChocoPhlAn and UniRef90.
When available, the output files give the species associated with each gene family. This information can be very useful, but the species name often doesn’t match the NCBI name, which makes comparison with species names from other sources very difficult. So I was wondering whether a mapping file between “NCBI TaxID” and “Species name” is available.
In this topic I found links to two files containing the information I need (mpa_v30_taxonomy.txt and mpa_v30_CHOCOPhlAn_201901_taxonomy.txt.gz), but they seem to be out of date: many species available in ChocoPhlAn are missing from these mapping files. I’m using ChocoPhlAn v201901_v31.
Is an up-to-date version available somewhere?
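To make the use case concrete, here is a minimal sketch of how the species name could be pulled out of a stratified gene family label before looking it up in a mapping file. It assumes the usual `UniRef90_ID|g__Genus.s__Species` layout of stratified rows; the function name `species_from_feature` and the UniRef90 ID in the usage line are made up for illustration.

```python
from typing import Optional


def species_from_feature(feature: str) -> Optional[str]:
    """Return the species part of a stratified feature label, or None.

    Assumes labels shaped like 'UniRef90_ID|g__Genus.s__Species'.
    Unstratified rows (no '|') and 'unclassified' rows yield None.
    """
    if "|" not in feature:
        return None  # community-level row, no taxon attached
    _, taxon = feature.split("|", 1)
    # Everything after '.s__' is treated as the species name.
    _, marker, species = taxon.partition(".s__")
    return species if marker else None


# Hypothetical label, used only to show the call.
print(species_from_feature("UniRef90_X0X000|g__Acetobacter.s__Acetobacter_ascendens"))
```

The returned name is what I would then like to translate into an NCBI TaxID.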
Thanks a lot for your answer! This is exactly the file I needed. However, it raises two questions for me:
Some species are still missing from this file. For example, the ChocoPhlAn folder where the genomes were downloaded contains the species Acetobacter_ascendens, but it is absent from the file you shared. Of the 12772 species I have in ChocoPhlAn, 9685 are present in the file and 3087 are not.
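A comparison like this can be sketched as a simple set difference. This is only an illustration: it assumes the species names have already been collected into two lists (one from the ChocoPhlAn folder, one from the mapping file) using the same spelling convention, and the helper name `split_by_coverage` is hypothetical.

```python
def split_by_coverage(chocophlan_species, mapped_species):
    """Split ChocoPhlAn species into those present in / absent from the mapping.

    Both arguments are iterables of species names written the same way
    (for example 'Acetobacter_ascendens').
    """
    chocophlan = set(chocophlan_species)
    mapped = set(mapped_species)
    found = sorted(chocophlan & mapped)    # species covered by the mapping file
    missing = sorted(chocophlan - mapped)  # species with no entry in the mapping file
    return found, missing


# Toy input, not real data.
found, missing = split_by_coverage(
    ["Acetobacter_ascendens", "Escherichia_coli"],
    ["Escherichia_coli"],
)
print(missing)
```

On the real data, the lengths of the two returned lists would correspond to the 9685 and 3087 counts above.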