I am trying to run HUMAnN 3 using:
MetaPhlAn version 3.0.1 (25 Jun 2020)
Bowtie2 version 2.3.5.1
When I ran the pipeline with the demo.fastq file, the process completed without problems, but now that I am using my own data, the process is killed while bowtie2 is running. Is this a disk space problem? I have nearly 500 GB of free space. How much space do I need to run HUMAnN?
Hello, thanks for the detailed file listing! It looks like the run you posted made it through the nucleotide search portion with bowtie2, since I see the *_aligned.[sam/tsv] files, but it likely got stuck in the next step of the workflow. I think the large tmp output files you see are from the translated search portion with DIAMOND. A compressed input file of 16 GB is a lot of reads, which is great! However, you might have a lot of alignments in the translated search portion of the run, which could make those files very large (it looks like they are ~80 GB). With 500 GB of disk space, if you run just a few jobs at a time and also set the option to remove the intermediate output files, --remove-temp-output (if you do not need these intermediate alignment files for future reference), this should solve the issue of running out of disk space.
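If it helps to see where the space is going, a couple of quick checks like these should show it (just a sketch; the output path and the temp-folder name pattern are based on your listing and might differ for your run):

$ # free space on the volume holding the output folder
$ df -h humann_results/
$ # size of the temporary working folder(s) humann creates
$ du -sh humann_results/*_humann_temp*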
Hi Lauren, thank you for your answer. Unfortunately, I got the same error.
I used:
humann --threads 16 -i myfastq.fastq.gz -o humann_results/ --remove-temp-output
and this is what I see on the screen:
Output files will be written to: /home/ubuntu/sample/35/humann_results
Decompressing gzipped file …
Removing spaces from identifiers in input file …
Running metaphlan …
Found g__Dactylococcopsis.s__Dactylococcopsis_salina : 54.70% of mapped reads
Found g__Paraburkholderia.s__Paraburkholderia_fungorum : 10.53% of mapped reads
Found g__Halorubrum.s__Halorubrum_sp_AJ67 : 8.76% of mapped reads
Found g__Paraburkholderia.s__Paraburkholderia_insulsa : 8.30% of mapped reads
Found g__Halorubrum.s__Halorubrum_tebenquichense : 7.33% of mapped reads
Found g__Cutibacterium.s__Cutibacterium_acnes : 7.09% of mapped reads
Found g__Halorubrum.s__Halorubrum_hochstenium : 1.09% of mapped reads
Found g__Phormidium.s__Phormidium_willei : 0.64% of mapped reads
Found g__Phormidium.s__Phormidium_sp_OSCR : 0.62% of mapped reads
Found g__Coleofasciculus.s__Coleofasciculus_chthonoplastes : 0.51% of mapped reads
Found g__Halothece.s__Halothece_sp_PCC_7418 : 0.45% of mapped reads
Total species selected from prescreen: 11
Selected species explain 100.00% of predicted community composition
Creating custom ChocoPhlAn database …
Running bowtie2-build …
Running bowtie2 …
Killed.
Even though I used --remove-temp-output, the temp files were still created anyway:
$ ls humann_results/
35_merge_humann_temp_mir4tggp
Hi - Thanks for the follow-up info! I agree with you that I don't think it is a disk space issue. I think your run may be getting killed because it is running out of memory when it is processing the bowtie2 results. If you are running a couple of runs at once, try running just one at a time and see if this helps!
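If you want to confirm it is the out-of-memory killer, checks along these lines should show it (a rough sketch; the exact kernel messages depend on your system, and dmesg may need sudo):

$ # look for kernel out-of-memory kills around the time the run died
$ dmesg | grep -i -E 'killed process|out of memory'
$ # watch memory usage while the next run is going
$ free -h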