HUMAnN 3 only generated the temp folder and none of the 3 main output files

Hi members in the Huttenhower Lab,

When running HUMAnN v3.6, only the temp folder was generated and none of the 3 main output files, as shown below:
[screenshot: Picture1]

The error says:

#CPU threads: 6
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Temporary directory: /nobackup/denxyu/zijian/metaphlan4/01_mpa/humann_output/merged_trimmed_006D_humann_temp/tmpbu6dzawx
Percentage range of top alignment score to report hits: 1
Opening the database… [0.164s]
Database: /nobackup/denxyu/new_humann3/humann_databases/uniref/uniref90_201901b_full (type: Diamond database, sequences: 87296736, letters: 29247941583)
Block size = 2000000000
Opening the input file… [0.043s]
Opening the output file… [0s]
Loading query sequences… Error: Error reading input stream at line 6: Invalid character (*) in sequence

CRITICAL ERROR: Error executing: /nobackup/denxyu/miniconda/envs/humann/bin/diamond blastx --query /nobackup/denxyu/zijian/metaphlan4/01_mpa/humann_output/merged_trimmed_006H_humann_temp/merged_trimmed_006H_bowtie2_unaligned.fa --evalue 1.0 --threads 6 --top 1 --outfmt 6 --db /nobackup/denxyu/new_humann3/humann_databases/uniref/uniref90_201901b_full --out /nobackup/denxyu/zijian/metaphlan4/01_mpa/humann_output/merged_trimmed_006H_humann_temp/tmpykvlw7q0/diamond_m8_odufijgj --tmpdir /nobackup/denxyu/zijian/metaphlan4/01_mpa/humann_output/merged_trimmed_006H_humann_temp/tmpykvlw7q0

Error message returned from diamond :
diamond v2.1.6.160 (C) Max Planck Society for the Advancement of Science
Documentation, support and updates available at http://www.diamondsearch.org
Please cite: http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)

When I checked the *_bowtie2_unaligned.fa file, it did contain * characters. I have run HUMAnN successfully before, but it failed with a different batch of sequencing data. Could you please give me some suggestions? Thank you very much in advance.

* will sometimes occur as a STOP codon in protein sequences, but I don’t think it should be occurring in your sequencing reads. Can you share a portion of the reads file containing the *(s)? That appears to be what is tripping up DIAMOND.

Hi Franzosa,
Thank you very much for your reply.
I double-checked the data and found that the * was introduced during the cutadapt step. After discarding the lines containing *, HUMAnN ran successfully. Thanks again for your response.
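
The exact commands used for this were not posted in the thread. As a minimal Python sketch of one way to do it, assuming plain 4-line FASTQ records and that anything other than A/C/G/T/N on the sequence line should be rejected: the function below drops the whole record (header, sequence, separator, quality) rather than single lines, so the output stays valid FASTQ. The name `filter_fastq` and the demo reads are made up for illustration.

```python
import io

# Assumption: only these characters are acceptable on a sequence line.
VALID_BASES = set("ACGTNacgtn")


def filter_fastq(src, dst):
    """Copy 4-line FASTQ records from src to dst, skipping any record whose
    sequence line contains a character outside VALID_BASES (such as '*').
    Returns a (kept, dropped) tuple of record counts."""
    kept = dropped = 0
    while True:
        record = [src.readline() for _ in range(4)]
        if not record[0]:  # end of file
            break
        if not record[3]:
            raise ValueError("truncated FASTQ record at end of input")
        sequence = record[1].strip()
        if set(sequence) <= VALID_BASES:
            dst.writelines(record)
            kept += 1
        else:
            dropped += 1
    return kept, dropped


# Tiny in-memory demo: the second read has a '*' in its sequence.
demo = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nAC*T\n+\nIIII\n")
out = io.StringIO()
print(filter_fastq(demo, out))  # (1, 1)
```

For real files, pass handles from `open(path)` or `gzip.open(path, "rt")` instead of the `StringIO` objects. If the reads are paired and the pairing matters downstream, the same records would need to be removed from both mate files.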

Hi @Xia
I am facing the same issue. What do you mean by discarding the lines with *? Can you share exactly how you did this?
Any help would be appreciated. Thank you

Hi @franzosa, I have the same error. I am using version 3.8 and I have the temp folder but not the 3 main output files.
For instance, my error says (path names have been changed):
CRITICAL ERROR: Error executing: /install/software/restart/py3/naga//bin/diamond blastx --query /path/to/humann_temp/sample_bowtie2_unaligned.fa --evalue 1.0 --threads 2 --top 1 --outfmt 6 --db /databases/humann3.8/uniref/uniref90_201901b_full --out /H3/sample_L001_h3/humann_temp/tmpb3yiqtuw/diamond_m8_2v9dte3w --tmpdir /path/to/humann_temp/tmpb3yiqtuw

Error message returned from diamond :
diamond v2.1.6.160 (C) Max Planck Society for the Advancement of Science
Documentation, support and updates available at http://www.diamondsearch.org
Please cite: http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)

#CPU threads: 2
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Temporary directory: /path/to/humann_temp/tmpb3yiqtuw
Percentage range of top alignment score to report hits: 1
Opening the database… [0.815s]
Database: /databases/humann3.8/uniref/uniref90_201901b_full (type: Diamond database, sequences: 87296736, letters: 29247941583)
Block size = 2000000000
Opening the input file… [0.055s]
Opening the output file… [0.022s]
Loading query sequences… Error: Error reading input stream at line 70270: Invalid character (*) in sequence

CRITICAL ERROR: Error executing: /install/software/restart/py3/naga//bin/diamond blastx --query /path/to/humann_temp/sample_bowtie2_unaligned.fa --evalue 1.0 --threads 2 --top 1 --outfmt 6 --db /databases/humann3.8/uniref/uniref90_201901b_full --out /path/to/humann_temp/tmpoqsvteve/diamond_m8_263c0d62 --tmpdir /path/cutfastq/H3/sample_h3/humann_temp/tmpoqsvteve

When I checked the line given in the error, I could not find any *.

Can you please help me with how to proceed in this case? Any suggestions will be appreciated.
Thank you

From the earlier reply, it sounds like some QC that was done to the sequencing reads included * characters in the output, which are then tripping up DIAMOND. Have you checked your FASTQ input to make sure it doesn’t have * characters on the sequence lines?
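
One way to run that check is sketched below in Python, assuming plain 4-line FASTQ records. It looks only at the sequence line of each record, because * is also a legitimate quality character (Phred 9 in Phred+33 encoding), so a plain search for * over the whole file would also match quality lines. The function name and the demo reads are made up for illustration.

```python
import io


def find_bad_sequence_lines(handle, allowed="ACGTNacgtn"):
    """Yield (line_number, read_header, bad_characters) for every FASTQ
    sequence line that contains a character outside `allowed`.
    Assumes strict 4-line records: header, sequence, '+', quality."""
    allowed = set(allowed)
    header = ""
    for line_number, line in enumerate(handle, start=1):
        position = line_number % 4
        if position == 1:  # header line of the record
            header = line.rstrip("\n")
        elif position == 2:  # sequence line of the record
            bad = set(line.strip()) - allowed
            if bad:
                yield line_number, header, "".join(sorted(bad))


# Demo: r1 has '*' only in its quality string (fine); r2 has it in the sequence.
demo = io.StringIO("@r1\nACGT\n+\nII*I\n@r2\nAC*T\n+\nIIII\n")
print(list(find_bad_sequence_lines(demo)))  # [(6, '@r2', '*')]
```

The line number in the DIAMOND error presumably refers to the file passed via `--query` (the `*_bowtie2_unaligned.fa` file in the temp folder), not to the original FASTQ, so the two line numbers are not expected to match.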