Gzip fastq as input

Hi, I am using MetaPhlAn version 3.0.14. The help text does not mention the gzip file format. Can gzip-compressed FASTQ files be used directly as input for MetaPhlAn? If so, should --input_type be set to fastq? For example, part of the command would look like this:
metaphlan xxx_1.fastq.gz,xxx_2.fastq.gz --input_type fastq

The run finishes successfully, but for some samples I get this message: “WARNING: The metagenome profile contains clades that represent multiple species merged into a single representant.
An additional column listing the merged species is added to the MetaPhlAn output.” Is this normal? Can it be ignored, or should I do something about it?


Hi @farmer2020
Yes, it is possible to run MetaPhlAn 3.0.14 with FASTQ files compressed with gzip, and the --input_type should still be fastq.
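As a rough illustration of the point above, here is a minimal Python sketch that checks whether a file is really gzip-compressed (by its magic bytes) and assembles the command line from the question. The helper names and the output file names (xxx.bowtie2.bz2, xxx_profile.txt) are hypothetical. The --bowtie2out option is included on the assumption that MetaPhlAn 3 wants it when several comma-separated input files are given; please check `metaphlan --help` for your version.

```python
import gzip
import shlex


def is_gzipped(path):
    """Return True if the file starts with the gzip magic bytes 0x1f 0x8b."""
    with open(path, "rb") as handle:
        return handle.read(2) == b"\x1f\x8b"


def build_metaphlan_command(reads, bowtie2out, profile):
    """Assemble (but do not run) a MetaPhlAn command for FASTQ input.

    reads      -- list of FASTQ paths, plain or gzip-compressed
    bowtie2out -- hypothetical path for the intermediate bowtie2 mapping
                  (assumed to be needed for comma-separated inputs)
    profile    -- hypothetical path for the output profile
    """
    return [
        "metaphlan",
        ",".join(reads),          # multiple inputs are comma-separated
        "--input_type", "fastq",  # stays "fastq" even for .fastq.gz files
        "--bowtie2out", bowtie2out,
        "-o", profile,
    ]


# Usage: print the command so it can be inspected before running it.
command = build_metaphlan_command(
    ["xxx_1.fastq.gz", "xxx_2.fastq.gz"],
    "xxx.bowtie2.bz2",
    "xxx_profile.txt",
)
print(shlex.join(command))
```

The command is only printed here; it could be handed to `subprocess.run(command, check=True)` once the paths are real.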
The warning you are describing is normal. MetaPhlAn 3 includes markers describing species groups, covering 1328 species that were unlikely to be distinguishable in metagenomic samples (see "Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3", eLife). When some of these species groups are detected, MetaPhlAn reports that warning so the user is aware of it.
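If you want to see which clades triggered the warning, a small sketch like the following could list the rows where the additional column is filled in. It assumes the profile is tab-separated, that comment lines start with "#", and that the merged species sit in the fourth column; the clade names in the example are made-up placeholders, not real results.

```python
def merged_clades(profile_text):
    """Return (clade_name, merged_species) pairs for rows whose
    fourth column (assumed to list the merged species) is non-empty."""
    pairs = []
    for line in profile_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split("\t")
        if len(fields) >= 4 and fields[3].strip():
            pairs.append((fields[0], fields[3].strip()))
    return pairs


# Placeholder profile: names and abundances are invented for illustration.
example_profile = (
    "#clade_name\tNCBI_tax_id\trelative_abundance\tadditional_species\n"
    "k__Bacteria|s__Species_A\t1\t60.0\t\n"
    "k__Bacteria|s__Species_B\t2\t40.0\tk__Bacteria|s__Species_C\n"
)

for clade, extra in merged_clades(example_profile):
    print(clade, "also represents", extra)
```

For a real run, the text would come from reading the profile file written by MetaPhlAn instead of the placeholder string.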

Hi Aitor, many thanks for your fast reply.