HUMAnN2 failing after temp files produced

Hi - Thanks for the detailed follow-up, and sorry to hear you are still having install issues. Is it possible you have two versions of MetaPhlAn installed on your system - one installed with conda (at v3.0.4) and another installed by some other method? This could account for the errors you are seeing. To double check, look at the directories in your $PATH to see whether more than one "metaphlan" executable appears.
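One quick way to do that check is a short Python sketch (the function name here is illustrative) that scans every $PATH directory for an executable with a given name:

```python
import os

def executables_on_path(name):
    """Return the full path of every executable called `name` on $PATH."""
    hits = []
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            hits.append(candidate)
    return hits

# More than one hit here would mean two metaphlan installs could be
# shadowing each other on this system:
print(executables_on_path("metaphlan"))
```

The first entry in the returned list is the one the shell will actually run.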

Thank you,
Lauren

Hi Rachel, sorry about this. The version you have installed is 3.0.4 but I forgot to update the version string. I’ll push an updated build and correct the version number.

Hi both,

Thanks for this. There isn't another version of metaphlan, as I cleared everything out before re-installing humann3. It sounds like I am actually working with v3.0.4 on these data, so I'm not sure why I'm not getting any species assignments. Is it possible that the kneaddata.fastq output wasn't the correct one to use as the input for humann3? (See further up in the discussion.)

Many thanks

Hi - Thanks for checking. With the latest version of MetaPhlAn installed, do you still see the error from before (`read_fastx.py: command not found`)? I think that error is why no species were being identified.

Thank you,
Lauren

Hi, running `read_fastx.py` (with the biobakery3 environment activated) doesn't really do anything.

If I do `less read_fastx.py`, I get the following:

```python
#!/bin/sh
'''exec' /home/rantwis/miniconda3/envs/biobakery3/bin/python "$0" "$@"
' '''
# -*- coding: utf-8 -*-
import re
import sys
from metaphlan.utils.read_fastx import main

if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
    sys.exit(main())
```

Does this shed any light?

Hi - That is great, you do have the `read_fastx.py` executable installed. If you run the MetaPhlAn command that previously generated the read_fastx.py error, does it now run without error? Here is the command to try:

```
$ /home/rantwis/miniconda3/envs/biobakery3/bin/metaphlan /home/rantwis/seqdata/clean_danish/Ash10.unmapped_kneaddata.fastq -t rel_ab -o /home/rantwis/seqdata/clean_danish/outputs/Ash10.unmapped_kneaddata_humann_temp/Ash10.unmapped_kneaddata_metaphlan_bugs_list.tsv --input_type fastq --bowtie2out /home/rantwis/seqdata/clean_danish/outputs/Ash10.unmapped_kneaddata_humann_temp/Ash10.unmapped_kneaddata_metaphlan_bowtie2.txt
```

If that runs okay, then try running HUMAnN again.

Thank you,
Lauren

It’s saying:

```
OSError: "[Errno 2] No such file or directory: 'seqdata/clean_danish/outputs/Ash10.unmapped_kneaddata_humann_temp/Ash10.unmapped_kneaddata_metaphlan_bowtie2.txt'"
Fatal error running BowTie2.
```

Is there any chance we could set up a Zoom call, please? I've been working on this since March and we can't seem to fix it! Perhaps it would be easier if I could share my screen?

Many thanks

Hi - Sorry you are seeing an error. It looks like the path in your command might not be valid. Can you try the following (with any input fastq file)? If it works, I think you should be okay to run HUMAnN.

```
$ metaphlan Ash10.unmapped_kneaddata.fastq -t rel_ab -o Ash10.unmapped_kneaddata_metaphlan_bugs_list.tsv --input_type fastq --bowtie2out Ash10.unmapped_kneaddata_metaphlan_bowtie2.txt
```
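As a side note, one possible cause of the earlier `--bowtie2out` error is that the relative output directory doesn't exist from the current working directory; a small Python guard (the path below is a placeholder, not your real one) can rule that out before running MetaPhlAn:

```python
import os

# Placeholder path for illustration. Bowtie2 cannot write its --bowtie2out
# file if the parent directory is missing, so create it up front.
bowtie2out = "outputs/humann_temp/metaphlan_bowtie2.txt"
os.makedirs(os.path.dirname(bowtie2out), exist_ok=True)
```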

Sorry to hear you have been debugging this for so long. I think you almost have it working! If not and you are still stuck please feel free to ping me on the post. I think with one more iteration you should be all set!

Thanks,
Lauren

Hi Lauren,
thanks for this. I’ve run that and the Ash10.unmapped_kneaddata_metaphlan_bugs_list.tsv is still coming up as:

```
#SampleID	Metaphlan_Analysis
#clade_name	NCBI_tax_id	relative_abundance	additional_species
UNKNOWN	-1	100.0
```

Do you know what the problem could be?
Thank you

Hi - Did this run have any errors? If not, how many reads are in your input file and what are the read lengths? Either not having enough reads or having reads that are too short could result in an “unknown” result.
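A quick way to answer those questions is a short FASTQ sanity check; this is a sketch (the two-read demo file below is made up so the example is self-contained) that counts reads and reports the min/max read length:

```python
def fastq_stats(path):
    """Return (read_count, min_length, max_length) for a FASTQ file."""
    count, min_len, max_len = 0, None, 0
    with open(path) as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # sequence lines are every 4th line, offset 1
                count += 1
                length = len(line.rstrip("\n"))
                min_len = length if min_len is None else min(min_len, length)
                max_len = max(max_len, length)
    return count, min_len, max_len

# Tiny two-read demo file so the sketch runs on its own:
with open("demo.fastq", "w") as handle:
    handle.write("@r1\nACGTACGT\n+\nIIIIIIII\n"
                 "@r2\nACGTACGTACGT\n+\nIIIIIIIIIIII\n")

print(fastq_stats("demo.fastq"))  # (2, 8, 12)
```

Pointing `fastq_stats` at the real kneaddata output would show whether the reads fall below MetaPhlAn's length cutoff.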

Thank you,
Lauren

There were no errors. I've had a look and there are around 3 million reads, 50-100 bp long. Are these too short?
Thanks

Hello - I just checked, and the minimum read length for MetaPhlAn is 70 nt. If most of your reads are shorter than this cutoff, that could be why you are seeing all "unknown" in the results.

Hi @fbeghini - I just wanted to loop you in to get your input. Do you have any notes/suggestions?

Thank you,
Lauren

Hi @lauren.j.mciver and @ra123,

Yes, that makes sense: if not enough reads are retained because of their length, the profile will come out entirely UNKNOWN. You can try running MetaPhlAn with `--read_min_len 49` to keep all reads of at least 49 bp, and then feed the MetaPhlAn profile output to HUMAnN via `--taxonomic-profile`.
Otherwise, you can run HUMAnN directly with `--metaphlan-options="--read_min_len 49"`.

Thank you @fbeghini!

Ok thanks - it looks like nothing is kept if I include this qualifier. Are the functional assignments still reliable with read lengths less than 50, or are the data all junk?!
Many thanks

Functional assignments should still be OK, since we filter on database sequence coverage: the genes we initially identify as "present" need to be covered at >50% of sites by the short reads to make their way into the output. This helps to boost specificity relative to individual alignments of short reads.
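As a rough illustration of that coverage filter (this is a sketch of the idea, not HUMAnN's actual implementation):

```python
def covered_fraction(gene_length, alignments):
    """Fraction of gene sites covered by at least one aligned read.

    alignments: list of (start, end) intervals, 0-based, half-open.
    """
    covered = [False] * gene_length
    for start, end in alignments:
        for site in range(max(start, 0), min(end, gene_length)):
            covered[site] = True
    return sum(covered) / gene_length

# A 100-site gene hit by three short reads covering sites 0-89:
frac = covered_fraction(100, [(0, 40), (30, 70), (50, 90)])
print(frac)  # 0.9, so this gene would pass a >50% coverage filter
```

A gene hit by only one or two scattered short reads would fall well below the 50% threshold and be dropped, which is why isolated spurious alignments don't inflate the output.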

Ok great, many thanks.

Hello, I use MetaPhlAn version 3.0.7 (09 Dec 2020) and ran:

```
time parallel -j 3 'humann --input {} --output temp/humann3/ --metaphlan-options="--read_min_len 49"' ::: temp/concat/*.fq > temp/log
```

but p144C_metaphlan_bugs_list.tsv is still coming up as:

```
#SampleID	Metaphlan_Analysis
#clade_name	NCBI_tax_id	relative_abundance	additional_species
UNKNOWN	-1	100.0
```

Do you know how to solve it?
Thank you.

Have you tried running one of these jobs NOT in parallel just to check that it isn’t an issue with the parallelization syntax? If the same thing happens, it would seem that your reads aren’t mapping to MetaPhlAn marker genes, either because the reads are not microbial OR they are not from protein-coding regions (this happens, for example, if you try to analyze 16S sequences with MetaPhlAn).

Thank you. A single job has the same problem. The reads are from someone else's lecture example, so I will try analysing my own reads to check whether the same problem occurs again.

Thanks