I am trying to run HUMAnN using a script created by a coworker.
Unfortunately I get the following error:
ERROR: The MetaPhlAn taxonomic profile provided was not generated with the database version v3 or vOct22 . Please update your version of MetaPhlAn to at least v3.0 or if you are using MetaPhlAn v4 please use the database vOct22.
I am using MetaPhlAn version 4.0.6 (1 Mar 2023) and humann v3.8 and the database should be vOct22.
These are the MetaPhlAn options used when running HUMAnN:
--metaphlan-options "--bowtie2db {metaphlan4_db} --bowtie2out {bowtie_bz2} --nproc 20 --input_type fastq -o {profile_MPH4} --biom {file_biom} --unclassified_estimation -t rel_ab_w_read_stats" | tee -a {humannlog}
As mentioned the script has been created by a co-worker and I am using the same environment and the database is placed in the same folder on the cluster.
While troubleshooting she has run one of the samples (without problem) but I keep getting the mentioned error.
I have compared our humann_config files and can find no difference, and I am at a loss as to what to do next. (I had to change the path for the database_folders to do this.)
(I want to replicate some data and updating the database, which was my last idea, is not really ideal)
Have you experienced this kind of problem before or do you have any suggestions of how to move forward?
It’s possible that your MetaPhlAn marker database might’ve updated spontaneously such that it is now out of step with your version of HUMAnN. The version of the marker database is listed in the header of MetaPhlAn’s output (the lines starting with #). If you can check what that is, I can advise on which version of HUMAnN 3 to use. You can also specify that MetaPhlAn use a specific marker index via the --index flag.
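For anyone wanting to script that header check, here is a minimal Python sketch. It only relies on the point above that the header lines start with #; the function names and the sample profile are made up for illustration and are not part of MetaPhlAn or HUMAnN.

```python
from typing import Iterable, List


def header_lines(lines: Iterable[str]) -> List[str]:
    """Collect the leading '#' lines of a profile, stopping at the first data line."""
    header = []
    for line in lines:
        if not line.startswith("#"):
            break
        header.append(line.rstrip("\n"))
    return header


def mentions_tag(header: List[str], tag: str) -> bool:
    """Return True if any header line contains the given version tag."""
    return any(tag in line for line in header)


# Made-up profile content, for illustration only (not real MetaPhlAn output).
example_profile = [
    "#mpa_vOct22_example_index\n",
    "#clade_name\trelative_abundance\n",
    "k__Bacteria\t100.0\n",
]

if __name__ == "__main__":
    header = header_lines(example_profile)
    for line in header:
        print(line)
    print("mentions vOct22:", mentions_tag(header, "vOct22"))
```

To use it on a real profile, pass an open file handle, for example `header_lines(open(path))`, where `path` is your own profile file.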
Thank you so much for the response! I’ve managed to resolve the issue.
Looking back, I am not completely certain where the mistake was introduced, but it seems that the problem was due to a misspelling in the path to my database (--bowtie2db). Because of this, MetaPhlAn didn’t recognize the path and defaulted to its own settings specified through HUMAnN.
It seems that, since I hadn’t manually downloaded the database, HUMAnN reverted to a default database, which turned out to be incompatible with the version of MetaPhlAn I was using. My coworker, however, seems to have had a default path to the correct database, so she was able to run the code without issues (even though it contained the same misspelling).
One concern I have is that MetaPhlAn automatically reverted to default settings when it encountered an issue with the specified parameters. This meant that options we had customized, such as --biom {file_biom}, --unclassified_estimation, -t rel_ab_w_read_stats, and --nproc 20, were not applied when running the code. In the case of my coworker it ran without problem and if I hadn’t had any problems we might not have realized that our custom settings were not implemented.
It would be very helpful if an error could be raised when a setting is incorrect, rather than reverting silently to default values. This would prevent misunderstandings and ensure the correct settings are applied.
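In the meantime, a wrapper script can fail loudly by itself before HUMAnN is ever called. The sketch below is only a pre-flight check on our side, assuming the script is written in Python; `require_existing_dir` is a hypothetical helper, not something provided by HUMAnN or MetaPhlAn, and it only confirms that the directory exists, not that it holds a valid database.

```python
from pathlib import Path


def require_existing_dir(path_str: str) -> Path:
    """Return the path if it is an existing directory, otherwise raise.

    Hypothetical helper: it catches misspelled paths early instead of
    letting a downstream tool fall back to its defaults.
    """
    path = Path(path_str).expanduser()
    if not path.is_dir():
        raise FileNotFoundError(f"--bowtie2db directory not found: {path}")
    return path


# Usage in a wrapper script (metaphlan4_db is the variable from our own script):
#   db_dir = require_existing_dir(metaphlan4_db)
#   ...then build the --metaphlan-options string from db_dir
```

A misspelled path then stops the run with a FileNotFoundError instead of continuing silently with defaults.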
While troubleshooting, we also realised that our default paths seemed to differ, but we could not work out how to check this. The humann_config file allows us to update the default paths for HUMAnN, but we couldn’t find the corresponding configuration for MetaPhlAn when it is run as part of HUMAnN. (Everything is run on a cluster using conda environments.) There is no rush, but understanding how to access and view these config files would be nice.
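One thing we may try for comparing environments is to look at where the MetaPhlAn Python package is installed in each conda environment. The sketch below only locates an installed package's folder. The idea that MetaPhlAn keeps its default database in a `metaphlan_databases` subfolder of that package is our assumption and may not hold for every version, so treat the printed path as a place to look rather than a confirmed setting.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def package_dir(name: str) -> Optional[Path]:
    """Return the install folder of a top-level Python package, or None if absent."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).parent


def candidate_db_dir(package: str = "metaphlan",
                     subfolder: str = "metaphlan_databases") -> Optional[Path]:
    """Guess a default database folder inside the package (assumed layout)."""
    pkg = package_dir(package)
    return None if pkg is None else pkg / subfolder


if __name__ == "__main__":
    guess = candidate_db_dir()
    if guess is None:
        print("metaphlan is not importable in this environment")
    else:
        status = "exists" if guess.is_dir() else "does not exist"
        print(f"{guess} ({status})")
```

Running this inside each person's conda environment and comparing the output should at least show whether the two installations point at different locations.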