HUMAnN 4 + MetaPhlAn 4

I fixed this… SEE SOLUTION BELOW

Hi all, sorry to bother you again.

MetaPhlAn version 4.1.1 (11 Mar 2024)
humann v4.0.0.alpha.1

I need to run MetaPhlAn 4 + HUMAnN 4 (I want to run StrainPhlAn after that).

I had to install diamond, bowtie2, and glpk.
I also provided the paths to the databases with humann_config.
I am using an SRR file from a human metagenome as a demo.

I used this for MetaPhlAn 4:

metaphlan RAW/SRR14076335_1.fastq.gz --input_type fastq -s SAMS/SRR14076335.sam.bz2 --bowtie2out BOWTIE2/SRR14076335.bowtie2.bz2 -o BUGS/SRR14076335_profile.tsv --add_viruses --unclassified_estimation  --index mpa_vOct22_CHOCOPhlAnSGB_202403 --bowtie2db ./CHOCO/mpa_vOct22_CHOCOPhlAnSGB_202403

Then I uncompressed the SAM file:

bzip2 -d ./SAMS/SRR*

Then I ran HUMAnN:

humann --input ./SAMS/SRR14076335.sam.bz2 --output ./OUT/SRR14076335 --threads 22 --metaphlan-options "--bowtie2db ./CHOCO" --nucleotide-database "./HUMANn/chocophlan"

After that, I got this in the gene families file:

Gene Family HUMAnN v4.0.0.alpha.1 Adjusted CPMs SRR14076335
READS_UNMAPPED 461337.0000000000

Edited to add the solution

My fault on this one. I added -t rel_ab_w_read_stats to the MetaPhlAn run, removed --metaphlan-options from the HUMAnN run, and passed the resulting profile to HUMAnN with --taxonomic-profile:

metaphlan RAW/SRR14076335_1.fastq.gz --input_type fastq -s SAMS/SRR14076335.sam.bz2 --bowtie2out BOWTIE2/SRR14076335.bowtie2.bz2 -o BUGS/SRR14076335_profile.tsv --add_viruses --unclassified_estimation  --index mpa_vOct22_CHOCOPhlAnSGB_202403 --bowtie2db ./CHOCO/ -t rel_ab_w_read_stats --nproc 6 

humann --input ./RAW/SRR14076335_1.fastq.gz  --taxonomic-profile ./BUGS/SRR14076335_profile.tsv  --output ./FromBUGS/  --nucleotide-database "./HUMANn/chocophlan" --threads 2
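For anyone adapting the two commands above to more than one sample, here is a minimal sketch that only prints them (a dry run) so they can be checked before anything is executed. The function name print_pipeline and its sample argument are hypothetical; every path and flag is copied from the solution above and will need adjusting to your own layout.

```shell
#!/bin/sh
# Hypothetical helper: print the two commands from the solution for one sample.
# Nothing is executed here; the commands are only written to standard output.
print_pipeline() {
    sample="$1"
    cat <<EOF
metaphlan RAW/${sample}_1.fastq.gz --input_type fastq -s SAMS/${sample}.sam.bz2 --bowtie2out BOWTIE2/${sample}.bowtie2.bz2 -o BUGS/${sample}_profile.tsv --add_viruses --unclassified_estimation --index mpa_vOct22_CHOCOPhlAnSGB_202403 --bowtie2db ./CHOCO/ -t rel_ab_w_read_stats --nproc 6
humann --input ./RAW/${sample}_1.fastq.gz --taxonomic-profile ./BUGS/${sample}_profile.tsv --output ./FromBUGS/ --nucleotide-database "./HUMANn/chocophlan" --threads 2
EOF
}

# Print the commands for the demo sample used in this thread.
print_pipeline SRR14076335
```

Once the printed commands look right, they can be run by piping the output to sh, for example: print_pipeline SRR14076335 | sh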

Thanks for following up with your solution!

Hi,

I have a problem running the demo file provided with HUMAnN v4.0.0.alpha.1 while using MetaPhlAn 4.2 and the newest database: mpa_vJan25_CHOCOPhlAnSGB_202503_VSG

When I submitted the job I got an error, because the flag --bowtie2out has been renamed to --mapout in MetaPhlAn, but this is not yet updated in HUMAnN. I adjusted the prescreen.py code to account for that, so that part is working now.
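For wrapper scripts that call MetaPhlAn directly, one way to cope with the rename is to check which of the two flag names the installed version advertises. The following is a minimal sketch, assuming the --help text lists the flag literally. pick_mapout_flag is a hypothetical helper, not part of MetaPhlAn or HUMAnN, and it does not change what HUMAnN passes to MetaPhlAn internally. It reads the help text from standard input so it can be tried without MetaPhlAn installed.

```shell
#!/bin/sh
# Hypothetical helper: read `metaphlan --help` text on stdin and print the
# name of the mapping-output flag it mentions. As reported in this thread,
# MetaPhlAn 4.2 uses --mapout where earlier 4.x releases used --bowtie2out.
pick_mapout_flag() {
    help_text="$(cat)"
    if printf '%s\n' "$help_text" | grep -q -e '--mapout'; then
        printf '%s\n' "--mapout"
    elif printf '%s\n' "$help_text" | grep -q -e '--bowtie2out'; then
        printf '%s\n' "--bowtie2out"
    else
        # Neither flag name was found in the help text.
        printf '%s\n' "unknown mapping-output flag" >&2
        return 1
    fi
}

# Intended use (requires MetaPhlAn on PATH):
#   flag="$(metaphlan --help | pick_mapout_flag)"
```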

However, now I receive the following error:

PM - humann.search.prescreen - ERROR: The relative abundance and coverage were not found in the MetaPhlAn taxonomic profile

It seems that HUMAnN expects different column headers. The headers differ when I run the demo file through MetaPhlAn and HUMAnN independently:

Output:
#clade_name NCBI_tax_id relative_abundance additional_species

Expected:
#clade_name clade_taxid relative_abundance coverage estimated_number_of_reads_from_the_clade

The flag -t rel_ab_w_read_stats in MetaPhlAn resolves that error, as described above:

humann --input ./05_HUMAnN/demo_humann_v4.fastq --output ./output_humann_demo_humann_2025DB_flag2 --metaphlan-options "--db_dir ./python3.9/site-packages/metaphlan/metaphlan_databases -t rel_ab_w_read_stats"
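Before handing an existing profile to HUMAnN, it can be checked for the columns named in the error message. This is a minimal sketch that only looks at the #clade_name header line shown above. profile_has_read_stats is a hypothetical helper, and it mirrors the wording of the error rather than HUMAnN's actual parsing code.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the profile's "#clade_name" header line
# contains both the relative_abundance and coverage columns.
profile_has_read_stats() {
    # Take the first line that starts with "#clade_name" (empty if none).
    header="$(grep '^#clade_name' "$1" | head -n 1)"
    case "$header" in *relative_abundance*) ;; *) return 1 ;; esac
    case "$header" in *coverage*) ;; *) return 1 ;; esac
}

# Intended use (the path is illustrative):
#   profile_has_read_stats BUGS/SRR14076335_profile.tsv || echo "re-run MetaPhlAn with -t rel_ab_w_read_stats"
```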

Next, it seems that the newest (2025) database does not work with HUMAnN, which expects a different database version:

humann.search.prescreen - ERROR: The MetaPhlAn taxonomic profile provided does not contain the database version vOct22_CHOCOPhlAnSGB_202403 in any of its header lines.

Does anyone have an idea how to solve this?

The HUMAnN 4 alpha is designed to work with MetaPhlAn’s vOct22_CHOCOPhlAnSGB_202403 marker set. It sounds like there is also an interface change in MetaPhlAn 4.2 that we will accommodate in the next HUMAnN 4 release. For now I would stick with an earlier MetaPhlAn 4 + the aforementioned marker database when running HUMAnN 4.

I'm just trying to get to grips with this too. MetaPhlAn 4 can use the new database for taxonomy, but HUMAnN 4 uses the old Oct22 database, so I can't use the metaphlan.tsv files that I already generated by running MetaPhlAn? Would I have to run it all again from the fastq files with HUMAnN and let HUMAnN 4 refer to the Oct22 MetaPhlAn database? I have been trying this for a couple of weeks now, since I returned from the bioBakery workshop. When I went, I was using an older MetaPhlAn and HUMAnN and everything worked. I upgraded both, and now they don't seem to work together.

I have replied to the new post you created here:

I still can't get anything to work. HUMAnN will not run; it just says Running metaphlan …

ERROR: The MetaPhlAn taxonomic profile provided does not contain the database version vOct22_CHOCOPhlAnSGB_202403 in any of its header line
This is more than frustrating. I have half a pipeline, using the HUMAnN 4 alpha and MetaPhlAn 4.1 with the vOct22 database for MetaPhlAn.
I was at the bioBakery workshop and upgraded to HUMAnN 4 there, and it has not worked since.
humann v4.0.0.alpha.1
MetaPhlAn version 4.1.1 (11 Mar 2024)

Hi,

Could you head -n1 the MetaPhlAn taxonomy profile? Which database version does it state in the first line of the file? From the error, it looks like you used a newer database than the vOct22_CHOCOPhlAnSGB_202403.
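As a rough way to run that check, the sketch below tests whether any header line (a line starting with #) mentions a given database version. profile_mentions_db is a hypothetical helper that follows the wording of the error message, not HUMAnN's own code, and the path in the usage comment is only an illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if any header line (starting with "#") of the
# profile contains the given database version as a fixed string.
profile_mentions_db() {
    profile="$1"
    version="$2"
    grep '^#' "$profile" | grep -q -F -e "$version"
}

# Intended use (the path is illustrative):
#   head -n1 BUGS/SRR14076335_profile.tsv
#   profile_mentions_db BUGS/SRR14076335_profile.tsv vOct22_CHOCOPhlAnSGB_202403 && echo "database version matches"
```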

Did you specify the MetaPhlAn database that should be used when running MetaPhlAn? I think you can do that using --index, to prevent it from using the newer database that you also might have installed.

From the --help:
-x INDEX, --index INDEX
Specify the id of the database version to use. If “latest”, MetaPhlAn will get the latest version. If an index name is provided, MetaPhlAn will try to use it, if available, and skip the online check. If the database files are not found on the local MetaPhlAn installation they will be automatically downloaded [default latest]

If this doesn’t work, it would be helpful to see the commands you’re using. Without them it is difficult to reproduce or solve an error.

Best, Barbara