Pass MetaPhlAn --input_type into HUMAnN 3

Hello,
I am running HUMAnN 3 and it reports an error that --input_type is not specified for MetaPhlAn. I already specified the HUMAnN option "--input-format fastq.gz". My understanding is that MetaPhlAn is embedded in HUMAnN 3, and I did not call it directly in my code. I tried multiple ways to pass the MetaPhlAn --input_type option to HUMAnN, but none of them worked. How do I solve this? Thanks!

I tried

  • --metaphlan-options "--input_type fastq" (not working)
  • --metaphlan-options="--input_type fastq" (not working)
  • --metaphlan-options --input_type fastq" (not working)

You shouldn’t need to pass that flag at all (--metaphlan-options is mostly intended for “expert” reconfiguration of MetaPhlAn during a HUMAnN run). If you pass a fastq.gz to HUMAnN as --input it should handle everything for you.

Hello Eric,
Thank you for your answer. However, I specified --input-format fastq.gz in my run and it still reports the same MetaPhlAn format error shown below.

HUMAnN command (this is a test, so I set a 20-minute time limit; a full run usually takes about 20 hours):

#!/bin/bash
#SBATCH --time=0:20:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --mem=100G
#SBATCH --job-name=humann_S1

export OMP_NUM_THREADS=4
conda init bash
export CONDA3PATH=/mnt/home/shenyike/anaconda3
module load Conda/3
conda activate biobakery3

cd /mnt/gs21/scratch/shenyike/metagenome_GLWA/Humann_input

humann_config --update database_folders nucleotide /mnt/home/shenyike/.local/lib/python3.6/site-packages/metaphlan/metaphlan_databases/chocophlan

humann_config --update database_folders protein /mnt/home/shenyike/.local/lib/python3.6/site-packages/metaphlan/metaphlan_databases/uniref

humann_config --update database_folders utility_mapping /mnt/home/shenyike/.local/lib/python3.6/site-packages/metaphlan/metaphlan_databases/utility_mapping

humann -i S1_QC.fastq.gz -o humann_results/humann_S1 --threads 4 --input-format fastq.gz

conda deactivate

scontrol show job $SLURM_JOB_ID

Error message:

HUMAnN configuration file updated: database_folders : nucleotide = /mnt/home/shenyike/.local/lib/python3.6/site-packages/metaphlan/metaphlan_databases/chocophlan
HUMAnN configuration file updated: database_folders : protein = /mnt/home/shenyike/.local/lib/python3.6/site-packages/metaphlan/metaphlan_databases/uniref
HUMAnN configuration file updated: database_folders : utility_mapping = /mnt/home/shenyike/.local/lib/python3.6/site-packages/metaphlan/metaphlan_databases/utility_mapping
Output files will be written to: /mnt/gs21/scratch/shenyike/metagenome_GLWA/Humann_input/humann_results/humann_S1
Decompressing gzipped file ...


Running metaphlan ........

CRITICAL ERROR: Error executing: /mnt/home/shenyike/anaconda3/envs/biobakery3/bin/metaphlan /mnt/gs21/scratch/shenyike/metagenome_GLWA/Humann_input/humann_results/humann_S1/S1_QC_humann_temp/tmp42duzuia/tmp8ruex3tz -t rel_ab -o /mnt/gs21/scratch/shenyike/metagenome_GLWA/Humann_input/humann_results/humann_S1/S1_QC_humann_temp/S1_QC_metaphlan_bugs_list.tsv --input_type error --bowtie2out /mnt/gs21/scratch/shenyike/metagenome_GLWA/Humann_input/humann_results/humann_S1/S1_QC_humann_temp/S1_QC_metaphlan_bowtie2.txt --nproc 4

Error message returned from metaphlan :
usage: metaphlan --input_type {fastq,fasta,bowtie2out,sam} [--force]
                 [--bowtie2db METAPHLAN_BOWTIE2_DB] [-x INDEX]
                 [--bt2_ps BowTie2 presets] [--bowtie2_exe BOWTIE2_EXE]
                 [--bowtie2_build BOWTIE2_BUILD] [--bowtie2out FILE_NAME]
                 [--min_mapq_val MIN_MAPQ_VAL] [--no_map] [--tmp_dir]
                 [--tax_lev TAXONOMIC_LEVEL] [--min_cu_len]
                 [--min_alignment_len] [--add_viruses] [--ignore_eukaryotes]
                 [--ignore_bacteria] [--ignore_archaea] [--stat_q]
                 [--perc_nonzero] [--ignore_markers IGNORE_MARKERS]
                 [--avoid_disqm] [--stat] [-t ANALYSIS TYPE]
                 [--nreads NUMBER_OF_READS] [--pres_th PRESENCE_THRESHOLD]
                 [--clade] [--min_ab] [-o output file] [--sample_id_key name]
                 [--use_group_representative] [--sample_id value]
                 [-s sam_output_file] [--legacy-output] [--CAMI_format_output]
                 [--unknown_estimation] [--biom biom_output] [--mdelim mdelim]
                 [--nproc N] [--install] [--force_download]
                 [--read_min_len READ_MIN_LEN] [-v] [-h]
                 [INPUT_FILE] [OUTPUT_FILE]
metaphlan: error: argument --input_type: invalid choice: 'error' (choose from 'fastq', 'fasta', 'bowtie2out', 'sam')

Have you tried running the HUMAnN command without --input-format? Normally the format is detected automatically (and the right file is then passed to MetaPhlAn). This looks like it could be arising because MetaPhlAn doesn’t have fastq.gz in its input types list (but HUMAnN will decompress to fastq and pass it that way).

Hello Eric,
Running without --input-format worked a year ago. When I tried it this time, it reported the error below, so I specified --input-format. I just tried again without the flag; here is the error:

HUMAnN configuration file updated: database_folders : nucleotide = /mnt/home/shenyike/microbiomedatabases/chocophlan
HUMAnN configuration file updated: database_folders : protein = /mnt/home/shenyike/microbiomedatabases/uniref
HUMAnN configuration file updated: database_folders : utility_mapping = /mnt/home/shenyike/microbiomedatabases/utility_mapping
Output files will be written to: /mnt/gs21/scratch/shenyike/metagenome_GLWA/Humann_input/humann_results/humann_S1
CRITICAL ERROR: Unable to determine the input file format. Please provide the format with the --input_format argument.
JobId=20350572 JobName=humann_S1

Is it possible that while the files have the fastq.gz extension they are not actually in that format? Can you check the first few lines with zcat FILE_NAME | head?
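A check along these lines would go one step further than eyeballing the output. It is only a sketch: check_fastq_gz is a made-up helper name (not part of HUMAnN or MetaPhlAn), and it inspects just the first record, so it can miss problems further into the file.

```shell
# Hypothetical helper, not part of HUMAnN or MetaPhlAn.
# Prints "fastq", "not_gzip", or "not_fastq" for the given file.
check_fastq_gz() {
    file="$1"

    # gzip -t tests the compressed stream without writing any output.
    if ! gzip -t "$file" 2>/dev/null; then
        echo "not_gzip"
        return 1
    fi

    # A FASTQ record is four lines: '@' header, sequence, '+' separator, qualities.
    first4=$(gzip -dc "$file" | head -n 4)
    header=$(printf '%s\n' "$first4" | sed -n '1p')
    plus=$(printf '%s\n' "$first4" | sed -n '3p')

    case "$header" in
        @*) ;;
        *) echo "not_fastq"; return 1 ;;
    esac
    case "$plus" in
        +*) echo "fastq"; return 0 ;;
        *) echo "not_fastq"; return 1 ;;
    esac
}
```

You could then run check_fastq_gz S1_QC.fastq.gz before submitting the job; anything other than "fastq" suggests the file is not what its extension claims.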

Thank you, Eric!
Yes, it was a file format issue. I noticed that I forgot to add the .fastq.gz extension to the output file name when I concatenated the forward and reverse reads with cat in a loop. Error solved. Thank you!
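A loop along these lines keeps the extension on the concatenated file. This is a sketch only: the _R1/_R2 file naming, the concat_reads function name, and the two directory arguments are assumptions for illustration, not the actual script used here. It relies on the fact that gzip files concatenated with cat still form a valid gzip stream.

```shell
# Hypothetical helper: concatenate paired gzipped reads per sample.
# Assumes inputs are named <sample>_R1.fastq.gz and <sample>_R2.fastq.gz.
concat_reads() {
    in_dir="$1"
    out_dir="$2"
    mkdir -p "$out_dir"

    for fwd in "$in_dir"/*_R1.fastq.gz; do
        # Skip the literal pattern when no files match.
        [ -e "$fwd" ] || continue

        sample=$(basename "$fwd" _R1.fastq.gz)
        rev="$in_dir/${sample}_R2.fastq.gz"
        if [ ! -e "$rev" ]; then
            echo "missing reverse reads for $sample" >&2
            continue
        fi

        # Write the output with the full .fastq.gz extension so that
        # downstream tools can recognize the format from the file name.
        cat "$fwd" "$rev" > "$out_dir/${sample}.fastq.gz"
    done
}
```

For example, concat_reads raw_reads Humann_input would write one <sample>.fastq.gz per pair into Humann_input (both directory names are placeholders).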

Great, glad you worked it out! :)