Availability of a mapping from NCBI TaxID to species name

Hello,

I’m using HUMAnN version 3.6 with all the latest databases for MetaPhlAn, ChocoPhlAn and UniRef90.
The output files report, when available, the species each gene family is associated with. This information can be very useful, but the species name often doesn’t match the NCBI name, which makes comparing species names with other sources very difficult. So I was wondering whether a mapping file between “NCBI TaxID” and “Species name” is available.

In this topic I found links to two files containing the information I need (mpa_v30_taxonomy.txt and mpa_v30_CHOCOPhlAn_201901_taxonomy.txt.gz), but they seem to be out of date: many species available in ChocoPhlAn are missing from these mapping files. I’m using ChocoPhlAn v201901_v31.
Is an up-to-date version available somewhere?

Thank you very much in advance,
Younous

I’m not sure about ChocoPhlAn v201901_v31, but if you switch to the current version, I think this might be what you’re looking for?

Thanks a lot for your answer! This is exactly the file I needed. However, it raises two questions:

  1. Some species are still missing from this file. For example, the ChocoPhlAn folder where the genomes were downloaded contains the species Acetobacter_ascendens, but it is absent from the file you shared. Of the 12772 species I have in ChocoPhlAn, 9685 are present in the file and 3087 are not.
  2. Running humann_databases --download chocophlan full ./chocophlan downloads the file http://huttenhower.sph.harvard.edu/humann_data/chocophlan/full_chocophlan.v201901_v31.tar.gz. Is there a vJan21_CHOCOPhlAnSGB_202103 version? Maybe that would correspond to the content of the file you shared.
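
For anyone who wants to reproduce this kind of count, here is a minimal sketch of the comparison. It rests on two assumptions that are not confirmed anywhere in this thread: that the ChocoPhlAn file names embed the species as `s__Genus_species` (for example `g__Acetobacter.s__Acetobacter_ascendens.centroids.v201901_v31.ffn.gz`), and that each line of the mapping file contains a taxonomy string with an `s__Genus_species` field. The function names and paths are hypothetical.

```python
import gzip
import re
from pathlib import Path

# Assumption: species appear as "s__Name", terminated by ".", "|" or whitespace.
# Species names that themselves contain a "." would be truncated by this pattern.
SPECIES_RE = re.compile(r"s__([^.|\s]+)")


def species_in_text(lines):
    """Collect every species name found in an iterable of strings."""
    found = set()
    for line in lines:
        found.update(SPECIES_RE.findall(line))
    return found


def compare(pangenome_names, mapping_lines):
    """Return (species present in the mapping, species missing from it)."""
    in_db = species_in_text(pangenome_names)
    in_map = species_in_text(mapping_lines)
    return sorted(in_db & in_map), sorted(in_db - in_map)


def compare_on_disk(chocophlan_dir, mapping_path):
    """Same comparison, reading file names and the mapping file from disk."""
    names = [p.name for p in Path(chocophlan_dir).iterdir()]
    # Open gzip-compressed and plain-text mapping files transparently.
    opener = gzip.open if str(mapping_path).endswith(".gz") else open
    with opener(mapping_path, "rt") as handle:
        return compare(names, handle)
```

Calling `compare_on_disk("./chocophlan", "mapping_file.txt")` (with the real path to the mapping file) would return the two sorted lists; their lengths are the "present" and "missing" counts, provided the assumptions above hold for your files.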

Hmm, I’m not sure then, sorry! I’m also interested in the answer so I’ll be following along to see what the dev team says.

It sounds like you might be using MetaPhlAn 4 with HUMAnN 3.6? If so, please see this discussion: