Cannot input custom taxonomic profile

Trying to use --taxonomic-profile in HUMAnN 3.0 yields this error:
ERROR: The MetaPhlAn2 taxonomic profile provided was not generated with the expected database version. Please update your version of MetaPhlAn2 to v3.0.

Is there a way to convert MetaPhlAn2 reports to MetaPhlAn3? My MPA file was generated via KrakenTools from a Kraken report. From what I have read in HUMAnN v3 joint taxonomic profile, they are not exactly the same…

The first line of a MetaPhlAn3 output file specifies the software version and ChocoPhlAn database version. You could try adding this line to your file. HUMAnN 3 will likely complain if it comes across a clade_name that is not in the database.

Hi, thanks a lot for your input. I have managed to figure that out and sharpened my (very basic) bash skills to convert the report this way:

awk '{printf("%s\t\n", $0)}' temp.MPA.TXT | awk 'BEGIN{printf("#mpa_v30_CHOCOPhlAn_201901\n")}1' - > bugs_list.MPA.TXT; rm temp.MPA.TXT
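
For anyone reusing this, the same two steps (append a trailing tab to every line, then put the header on top) can probably be done in one awk call. This is only a sketch: the two input rows are made up for illustration, the file names follow the command above, and unlike that command it does not delete temp.MPA.TXT.

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)"

# Hypothetical two-line input standing in for the KrakenTools MPA file.
printf 'k__Bacteria\t10\nk__Bacteria|p__Proteobacteria\t7\n' > temp.MPA.TXT

# Print the header once, then every input line followed by a trailing tab.
awk 'BEGIN { print "#mpa_v30_CHOCOPhlAn_201901" }
     { printf "%s\t\n", $0 }' temp.MPA.TXT > bugs_list.MPA.TXT

# Show the result: header first, then the original rows.
cat bugs_list.MPA.TXT
```

Whether the trailing empty column is actually required was not checked here; it is kept only to match the command above.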

HUMAnN accepts that input. The prescreen output, however, looks somewhat crazy:
Found g__Burkholderia.s__Burkholderia_sp._PAMC_28687 : 303.00% of mapped reads
Found g__Burkholderia.s__Burkholderia_sp._PAMC_26561 : 204.00% of mapped reads
Found g__Bradyrhizobium.s__Bradyrhizobium_erythrophlei : 202.00% of mapped reads
Found g__Lichenicola.s__Lichenicola_cladoniae : 144.00% of mapped reads
Found g__Granulicella.s__Granulicella_sp._WH15 : 35.00% of mapped reads
Found g__Granulicella.s__Granulicella_sp._5B5 : 32.00% of mapped reads
Found g__Granulicella.s__Granulicella_tundricola : 33.00% of mapped reads
Found g__Granulicella.s__Granulicella_mallensis : 30.00% of mapped reads
Found g__Terriglobus.s__Terriglobus_roseus : 28.00% of mapped reads
Found g__Acidisarcina.s__Acidisarcina_polymorpha : 19.00% of mapped reads
Found g__Paenibacillus.s__Paenibacillus_sp._E222 : 159.00% of mapped reads
Found g__Physcomitrium.s__Physcomitrium_patens : 1032.00% of mapped reads
Found g__Homo.s__Homo_sapiens : 155.00% of mapped reads
Total species selected from prescreen: 13
Selected species explain 2376.00% of predicted community composition

I imagine this is caused by Kraken, whose profile includes reads spanning more than just the marker genes. I'm trying to figure out whether that could affect the HUMAnN pipeline downstream.
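
One quick check: the 13 values in the prescreen log add up to exactly 2376.00, the reported total, so the second column seems to be read as a percentage as-is. If that column actually holds read counts (an assumption, not something confirmed in this thread), a possible workaround is to rescale the species-level rows so they sum to 100 before adding the header. A minimal sketch follows. The three counts are taken from the log above, but the lineage strings are shortened and hypothetical, and the file names are placeholders.

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)"

# Hypothetical input: three species rows plus one higher-level row.
printf 'k__Bacteria|g__Granulicella|s__Granulicella_mallensis\t30\n' > counts.MPA.TXT
printf 'k__Bacteria|g__Terriglobus|s__Terriglobus_roseus\t28\n' >> counts.MPA.TXT
printf 'k__Bacteria|g__Acidisarcina|s__Acidisarcina_polymorpha\t19\n' >> counts.MPA.TXT
printf 'k__Bacteria\t77\n' >> counts.MPA.TXT

# The file is read twice: the first pass sums the species rows (those
# containing "s__"), the second pass prints them rescaled to percentages.
# Non-species rows are dropped (a design choice of this sketch).
awk -F'\t' '
    NR == FNR { if ($1 ~ /s__/) total += $2; next }
    $1 ~ /s__/ && total > 0 { printf "%s\t%.5f\n", $1, 100 * $2 / total }
' counts.MPA.TXT counts.MPA.TXT > rescaled.MPA.TXT

cat rescaled.MPA.TXT
```

This only makes the numbers look like relative abundances. It does not address the other concern raised earlier in the thread, that clade names missing from the ChocoPhlAn database may still be rejected.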