Conversion of HUMAnN taxonomy outputs to GTDB

Hi,

I’m running HUMAnN v3.9 using the mpa_vJun23_CHOCOPhlAnSGB_202403 database, and I have generated output files such as:

Gene families file (genefamilies.tsv)
Reactions file (reactions.tsv)
Pathway abundance file (pathabundance.tsv)

I understand that HUMAnN relies on MetaPhlAn’s taxonomic profiling, and the taxonomy used in the stratified outputs is based on SGB (Species-level Genome Bins).

I’ve already used the sgb_to_gtdb_profile.py script (provided with MetaPhlAn 4) to convert MetaPhlAn taxonomic profiles to GTDB format.

Now, I would like to ask:

Can the same sgb_to_gtdb_profile.py script or a similar approach be applied to convert the taxonomy in the HUMAnN output files to GTDB taxonomy?
If not directly, is there a recommended workflow or mapping file that can help translate the g__Genus|s__Species labels in the HUMAnN stratified outputs to GTDB taxonomy?

Any suggestions or tools would be greatly appreciated!

Thank you.

Boram

The MetaPhlAn conversion works by assigning GTDB taxonomy to the SGBs. HUMAnN's stratified outputs also report taxonomy in terms of SGBs (more specifically, s__SPECIES.t__SGB), so you could swap the SGB-based taxonomy in the HUMAnN output for the GTDB-equivalent species names. We do not have a script for this, but the mapping file used by the MetaPhlAn conversion script could be applied to this task manually.
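
For illustration, here is a minimal Python sketch of that manual swap. It is not an official HUMAnN or MetaPhlAn tool, and it rests on assumptions you should check against your own files: that the mapping file is a two-column TSV (SGB ID, then a semicolon-separated GTDB lineage ending in s__...), that the SGB IDs in it look like SGB1234, and that stratified rows carry a t__SGB1234 token after the "|". The function names are hypothetical.

```python
import csv
import re

# Assumption: stratified labels contain a token such as "t__SGB1234".
SGB_PATTERN = re.compile(r"t__(SGB\d+)")


def load_sgb_to_gtdb(path):
    """Read an assumed two-column TSV: SGB ID, GTDB lineage."""
    mapping = {}
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) < 2 or row[0].startswith("#"):
                continue  # skip comments and malformed rows
            mapping[row[0]] = row[1]
    return mapping


def relabel_feature(feature, mapping):
    """Replace the taxon part of 'FEATURE|TAXON' with the GTDB species.

    Rows that are unstratified, unclassified, or missing from the
    mapping are returned unchanged.
    """
    name, sep, taxon = feature.rpartition("|")
    if not sep:
        return feature  # unstratified row, nothing to convert
    match = SGB_PATTERN.search(taxon)
    if match is None:
        return feature  # e.g. "unclassified"
    lineage = mapping.get(match.group(1))
    if lineage is None:
        return feature  # SGB not present in the mapping file
    species = lineage.split(";")[-1]  # last rank, assumed to be s__...
    return f"{name}|{species}"


def convert(in_path, mapping_path, out_path):
    """Rewrite the first column of a HUMAnN table, row by row."""
    mapping = load_sgb_to_gtdb(mapping_path)
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.startswith("#"):
                dst.write(line)  # keep the header line as-is
                continue
            feature, sep, rest = line.rstrip("\n").partition("\t")
            dst.write(relabel_feature(feature, mapping) + sep + rest + "\n")
```

You would call it as convert("genefamilies.tsv", "path/to/sgb_to_gtdb_mapping.tsv", "genefamilies_gtdb.tsv"), where the mapping path is a placeholder for wherever your MetaPhlAn installation keeps that file. Note that several SGBs can map to the same GTDB species, so the output may contain repeated feature|species rows that you would need to sum afterwards.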