Hi. I just started using HUMAnN 4.0, and I’m running it on Slurm through the wmgx workflow. The HUMAnN output files written to scratch have a number inserted in the middle of the file name (for example, S01_2_genefamilies.tsv), but the workflow expects the name without it (S01_genefamilies.tsv). When the workflow tries to copy the files to the main output directory, the copy fails with a "No such file or directory" error. KneadData and MetaPhlAn both run correctly, and HUMAnN produces the correct output data, just under the wrong file names. I’ve included the details below, and I can email log files if needed.
Best,
Artemis
tool versions:
biobakery_workflows v3.1
kneaddata v0.12.0
MetaPhlAn version 4.1.0 (23 Aug 2023)
humann v4.0.0.alpha.1
command:
export HOST="human"
export KNEADDATA_DB_HUMAN_GENOME=/data/databases/kneaddata_2023/${HOST}
export METAPHLAN_DB=/data/databases/biobakery/bb4/metaphlan
export CHOCOPHLAN_DB=/data/databases/biobakery/bb4/humann/chocophlan_v4_alpha
export METAPHLAN_INDEX=mpa_vOct22_CHOCOPhlAnSGB_202403
biobakery_workflows wmgx --input $1 --output $2 --threads 10 --pair-identifier _R1 \
--grid-jobs 10 --grid slurm --grid-scratch $2/scratch --grid-partition="defq" \
--grid-tasks="humann,10000,75000,10" \
--grid-environment="
source ~/miniforge3/etc/profile.d/conda.sh
conda activate ~/miniforge3/envs/biobakery4
export KNEADDATA_DB_HUMAN_GENOME=/data/databases/kneaddata_2023/${HOST}" \
--contaminate-databases ${KNEADDATA_DB_HUMAN_GENOME}/ \
--skip-nothing --remove-intermediate-output --bypass-strain-profiling \
--qc-options="--max-memory=1000m --run-trf \
--trimmomatic=~/biobakery_workflows_databases/Trimmomatic-0.39/ \
--trf ~/miniforge3/envs/biobakery4/bin/" \
--taxonomic-profiling-options="--add_viruses --bowtie2db=${METAPHLAN_DB} \
--index ${METAPHLAN_INDEX} --unclassified_estimation -t rel_ab_w_read_stats" \
--functional-profiling-options="--nucleotide-database ${CHOCOPHLAN_DB} \
--protein-database /data/databases/biobakery/bb4/humann/uniref90/uniref/ --remove-stratified-output --memory-use minimum "
output example for one sample:
$ ls biobakery4_out/scratch/humann/main/S01*
biobakery4_out/scratch/humann/main/S01_2_genefamilies.tsv
biobakery4_out/scratch/humann/main/S01_3_reactions.tsv
biobakery4_out/scratch/humann/main/S01_4_pathabundance.tsv
biobakery4_out/scratch/humann/main/S01.log
error file output:
$ cat biobakery4_out/slurm_files/task_61_*.err
cp: cannot stat ‘biobakery4_out//scratch/humann/main/S01_genefamilies.tsv’: No such file or directory
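possible stopgap:
To make the mismatch concrete, here is a rough sketch of a manual workaround: symlink each numbered file to the name the copy step looks for. It assumes the only difference is the `_<number>_` infix before genefamilies, reactions, or pathabundance. The function name `link_expected_names` is made up for illustration, and I have not confirmed that a rerun of the workflow would pick these links up.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of biobakery_workflows or humann):
# for every <sample>_<N>_<type>.tsv in a directory, create a symlink
# named <sample>_<type>.tsv, which is the name the cp step expects.
link_expected_names() {
    local dir="$1" f base new
    for f in "$dir"/*_[0-9]*_*.tsv; do
        # Skip the literal pattern when the glob matches nothing.
        if [ ! -e "$f" ]; then
            continue
        fi
        base=$(basename "$f")
        # Drop the numeric infix, but only in front of a known output type.
        new=$(printf '%s\n' "$base" \
            | sed -E 's/_[0-9]+_(genefamilies|reactions|pathabundance)\.tsv$/_\1.tsv/')
        # Link only when the name changed and the target name is still free.
        if [ "$new" != "$base" ] && [ ! -e "$dir/$new" ]; then
            ln -s "$base" "$dir/$new"
        fi
    done
}
```

For the run above it would be called as `link_expected_names biobakery4_out/scratch/humann/main`. Symlinks leave the original humann files untouched, so the links can be deleted again once the naming is handled in the workflow itself.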