Humann3 only generated temp folder and no 3 main output files

Xia · May 2, 2023, 11:05am

Hi members in the Huttenhower Lab,

When I ran HUMAnN v3.6, only the temp folder was generated; none of the 3 main output files were produced.

The error says:

#CPU threads: 6
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Temporary directory: /nobackup/denxyu/zijian/metaphlan4/01_mpa/humann_output/merged_trimmed_006D_humann_temp/tmpbu6dzawx
Percentage range of top alignment score to report hits: 1
Opening the database… [0.164s]
Database: /nobackup/denxyu/new_humann3/humann_databases/uniref/uniref90_201901b_full (type: Diamond database, sequences: 87296736, letters: 29247941583)
Block size = 2000000000
Opening the input file… [0.043s]
Opening the output file… [0s]
Loading query sequences… Error: Error reading input stream at line 6: Invalid character (*) in sequence

CRITICAL ERROR: Error executing: /nobackup/denxyu/miniconda/envs/humann/bin/diamond blastx --query /nobackup/denxyu/zijian/metaphlan4/01_mpa/humann_output/merged_trimmed_006H_humann_temp/merged_trimmed_006H_bowtie2_unaligned.fa --evalue 1.0 --threads 6 --top 1 --outfmt 6 --db /nobackup/denxyu/new_humann3/humann_databases/uniref/uniref90_201901b_full --out /nobackup/denxyu/zijian/metaphlan4/01_mpa/humann_output/merged_trimmed_006H_humann_temp/tmpykvlw7q0/diamond_m8_odufijgj --tmpdir /nobackup/denxyu/zijian/metaphlan4/01_mpa/humann_output/merged_trimmed_006H_humann_temp/tmpykvlw7q0

Error message returned from diamond :
diamond v2.1.6.160 (C) Max Planck Society for the Advancement of Science
Documentation, support and updates available at http://www.diamondsearch.org
Please cite: http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)

When I checked the *_bowtie2_unaligned.fa file, it does contain * characters. I have run HUMAnN successfully before, but it fails with this different batch of sequencing data. Could you please give me some suggestions? Thank you very much in advance.

franzosa · May 4, 2023, 5:55pm

* will sometimes occur as a STOP codon in protein sequences but I don’t think it should be occurring in your sequencing reads? Can you share a portion of the reads file containing the *(s)? That appears to be what is tripping up DIAMOND.

Xia · May 7, 2023, 8:44pm

Hi Franzosa,
Thank you very much for your reply.
I double-checked the data and found that the * characters were introduced during the cutadapt step. After discarding the lines containing *, HUMAnN ran successfully. I appreciate your response.
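
For anyone wondering how such reads might be discarded: the following is only a minimal sketch, not necessarily what was done here. It assumes an uncompressed FASTQ file with exactly 4 lines per record, and the function names are hypothetical. It drops whole records rather than single lines, because removing one line would break the 4-line record structure. It also checks only the sequence line, since * is a legitimate quality character in Phred+33 encoding.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str, str]]:
    """Yield (header, sequence, separator, quality) from 4-line FASTQ records."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # tolerate stray blank lines between records
        seq = handle.readline().rstrip("\n")
        sep = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        yield header.rstrip("\n"), seq, sep, qual


def drop_reads_with_asterisk(src: TextIO, dst: TextIO) -> Tuple[int, int]:
    """Copy records from src to dst, skipping any whose sequence contains '*'.

    Returns (kept, dropped). Only the sequence line is inspected, because
    '*' can legitimately appear on the quality line.
    """
    kept = dropped = 0
    for header, seq, sep, qual in read_fastq(src):
        if "*" in seq:
            dropped += 1
            continue
        dst.write(f"{header}\n{seq}\n{sep}\n{qual}\n")
        kept += 1
    return kept, dropped
```

It could be called with two open file handles, for example `drop_reads_with_asterisk(open("in.fastq"), open("out.fastq", "w"))`, where both file names are placeholders. The cleaned file would then be given to HUMAnN in place of the original.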

Dhrati_Patangia · May 7, 2024, 4:03pm

Hi @Xia
I am facing the same issue. What do you mean by discarding the lines with *? Can you share exactly how you did this?
Any help would be appreciated. Thank you

Dhrati_Patangia · May 20, 2024, 1:33pm

Hi @franzosa, I have the same error. I am using version 3.8 and I have the temp folder but not the 3 main output files.
For instance my error says (path names have been changed):
CRITICAL ERROR: Error executing: /install/software/restart/py3/naga//bin/diamond blastx --query /path/to/humann_temp/sample_bowtie2_unaligned.fa --evalue 1.0 --threads 2 --top 1 --outfmt 6 --db /databases/humann3.8/uniref/uniref90_201901b_full --out /H3/sample_L001_h3/humann_temp/tmpb3yiqtuw/diamond_m8_2v9dte3w --tmpdir /path/to/humann_temp/tmpb3yiqtuw

Error message returned from diamond :
diamond v2.1.6.160 (C) Max Planck Society for the Advancement of Science
Documentation, support and updates available at http://www.diamondsearch.org
Please cite: http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)

#CPU threads: 2
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Temporary directory: /path/to/humann_temp/tmpb3yiqtuw
Percentage range of top alignment score to report hits: 1
Opening the database… [0.815s]
Database: /databases/humann3.8/uniref/uniref90_201901b_full (type: Diamond database, sequences: 87296736, letters: 29247941583)
Block size = 2000000000
Opening the input file… [0.055s]
Opening the output file… [0.022s]
Loading query sequences… Error: Error reading input stream at line 70270: Invalid character (*) in sequence

CRITICAL ERROR: Error executing: /install/software/restart/py3/naga//bin/diamond blastx --query /path/to/humann_temp/sample_bowtie2_unaligned.fa --evalue 1.0 --threads 2 --top 1 --outfmt 6 --db /databases/humann3.8/uniref/uniref90_201901b_full --out /path/to/humann_temp/tmpoqsvteve/diamond_m8_263c0d62 --tmpdir /path/cutfastq/H3/sample_h3/humann_temp/tmpoqsvteve

When I checked the line given in the error, I could not find any *.

Can you please help me with how to proceed in this case? Any suggestions will be appreciated.
Thank you

franzosa · June 20, 2024, 9:19pm

From the earlier reply, it sounds like some QC that was done to the sequencing reads included * characters in the output, which are then tripping up DIAMOND. Have you checked your FASTQ input to make sure it doesn’t have * characters on the sequence lines?
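
One way to run that check is sketched below. It is a rough illustration with a hypothetical function name, assuming uncompressed input and, for FASTQ, exactly 4 lines per record. It reports the line number and text of every sequence line containing *, and skips FASTQ quality lines, where * is a valid character. Note that the line number in the DIAMOND error most likely refers to the file passed via --query (the *_bowtie2_unaligned.fa file in the temp folder), not to the original FASTQ, so the FASTA mode may be the more useful one for locating the reported line.

```python
import io
from typing import Iterator, TextIO, Tuple


def asterisk_sequence_lines(handle: TextIO, fmt: str) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, text) for sequence lines that contain '*'.

    fmt is "fastq" (4-line records; sequence is the 2nd line of each record)
    or "fasta" (every line not starting with '>' is treated as sequence).
    Line numbers are 1-based.
    """
    if fmt not in ("fasta", "fastq"):
        raise ValueError("fmt must be 'fasta' or 'fastq'")
    for number, line in enumerate(handle, start=1):
        text = line.rstrip("\n")
        if fmt == "fastq":
            is_sequence = number % 4 == 2
        else:
            is_sequence = not text.startswith(">")
        if is_sequence and "*" in text:
            yield number, text
```

For example, `list(asterisk_sequence_lines(open("sample.fastq"), "fastq"))` (file name is a placeholder) would list the offending reads; an empty list would suggest the * characters are introduced later in the workflow.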
