The bioBakery help forum

Humann3 sam file input

When I use an aligned .sam file from bwa as input for Humann3, I just get a single line saying UNMAPPED in the _genefamilies.tsv output file, whereas when I run the same command with Humann2, I get a fair number of mapped genes.
humann --input input.sam --output out_dir
vs
humann2 --input input.sam --output out_dir

Is this a known problem?

Thanks

Sorry for the delay. Can you clarify how you made the SAM file with bwa? In particular what were you mapping against? It’s possible that your index sequence headers were not formatted in a way that HUMAnN could understand.
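If it helps, one quick way to check which reference names ended up in the SAM file is to read the `SN:` field of the `@SQ` header lines, since those names come from the index sequence headers. The snippet below is only a rough sketch: the function name `sam_reference_names` is made up for illustration, and it assumes an uncompressed SAM file with its header intact.

```python
def sam_reference_names(sam_lines):
    """Collect reference sequence names (SN: tags) from SAM @SQ header lines."""
    names = []
    for line in sam_lines:
        # Header lines start with "@" and precede all alignment records.
        if not line.startswith("@"):
            break
        if line.startswith("@SQ"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names
```

Calling it as `sam_reference_names(open("mapped.sam"))` and printing the first few names should show exactly what HUMAnN has to parse.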

I used
bwa mem reference.fasta 1_fastq.gz 2.fastq.gz > mapped.sam

The reference is here
https://megares.meglab.org/download/megares_v2.00/megares_drugs_database_v2.00.fasta

It seems to work with Humann2 but not Humann3. Would the output from Humann2 essentially be identical to that from Humann3?

Hmm, that file doesn’t appear to be in the 2.0 format (which also works in 3.0). See here for how sequence headers need to be structured to work with HUMAnN:

The other option is to use generic headers and then build a separate ID mapping file to tell HUMAnN which species/functions each sequence belongs to.
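As a rough illustration of that second option, the Python sketch below renames every FASTA record to a generic ID and collects one mapping row per sequence. Several things here are assumptions rather than anything confirmed in this thread: the function names and the `seq` prefix are made up, the column layout (reference ID, gene family, gene length, tab-separated) is my recollection of what the HUMAnN manual describes for ID mapping files and should be checked against the manual, and replacing `|` with `__` in the labels is just a precaution because HUMAnN uses `|` as a delimiter in its own output.

```python
def build_id_mapping(fasta_lines, prefix="seq"):
    """Return (renamed FASTA lines, mapping rows) for an iterable of FASTA lines.

    Each mapping row is [generic_id, label, sequence_length].
    """
    renamed = []
    rows = []
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            generic_id = f"{prefix}{len(rows) + 1}"
            # Keep only the first whitespace-delimited token of the header,
            # falling back to the generic ID if the header is empty.
            tokens = line[1:].split()
            label = tokens[0] if tokens else generic_id
            # Assumption: avoid "|" in labels, since HUMAnN treats it as a delimiter.
            label = label.replace("|", "__")
            rows.append([generic_id, label, 0])
            renamed.append(">" + generic_id)
        elif rows:
            # Sequence line: add to the running length of the current record.
            rows[-1][2] += len(line)
            renamed.append(line)
    return renamed, rows


def mapping_to_tsv(rows):
    """Format mapping rows as tab-separated text, one sequence per line."""
    return "".join("\t".join(str(field) for field in row) + "\n" for row in rows)
```

You would then write the renamed lines to a new FASTA, rebuild the bwa index, realign against it, and save the `mapping_to_tsv` output to a file. As far as I know, that file is passed to HUMAnN with the `--id-mapping` option, but please confirm against `humann --help` for your installed version.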