Hi,
I’m currently trying to perform a community analysis of milk samples. I received the data, trimmed the reads with Trimmomatic, and then ran MetaPhlAn. However, every sample returns a profile that is 100% UNKNOWN.
I used the following script:
#!/bin/bash
module load miniconda
conda activate metaphlan
# Folder holding the MetaPhlAn bowtie2 database
METAPHLAN_DB=meta_database/
for f in *.fastq.gz
do
    # Sample name = file name without the .fastq.gz extension
    name=$(basename "$f" .fastq.gz)
    metaphlan --nproc "$SLURM_CPUS_PER_TASK" --bowtie2out "${name}.bt2.bz2" --input_type fastq --bowtie2db "$METAPHLAN_DB" "$f" > "${name}_profile.txt"
done
And I received this output:
#mpa_v30_CHOCOPhlAn_201901
#/home/ccastillo/.conda/envs/metaphlan/bin/metaphlan L006_S3_L001_R1_001.fastq.gz --bowtie2out metagenome.bowtie2.bz2 --nproc 5 --bowtie2db meta_db --input_type fastq --unknown_estimation -o profiled_me$
#SampleID Metaphlan_Analysis
#clade_name NCBI_tax_id relative_abundance additional_species
UNKNOWN -1 100.0
So I decided to run MetaPhlAn on just one sample, using the following command:
metaphlan L006_S3_L001_R1_trimmed.fastq.gz --bowtie2out metagenome.bowtie2.bz2 --nproc 5 --bowtie2db meta_db --input_type fastq --unknown_estimation -o profiled_metagenome.txt
I still received the same output. I then tried the raw sample (before trimming) using the following command:
metaphlan L006_S3_L001_R1_001.fastq.gz --bowtie2out metagenome.bowtie2.bz2 --nproc 5 --bowtie2db meta_db --input_type fastq --unknown_estimation -o profiled_metagenome.txt
And the same problem persisted.
Having read that read length might be an issue, I ran the following command to count the reads at each length:
gunzip -c L006_S3_L001_R1_trimmed.fastq.gz | awk 'NR%4 == 2 {lengths[length($0)]++} END {for (l in lengths) {print l, lengths[l]}}' > count.txt
After running head and tail on the output file, I noticed that most reads are 250 to 251 nt long, which apparently isn’t a problem:
head
36 3
37 6
38 30
39 31
40 12
41 10
42 7
43 11
44 10
45 17
tail
243 159
244 317
245 336
246 292
247 621
248 2385
249 8040
250 37725
251 94589
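As a side note, awk does not guarantee the iteration order of `for (l in lengths)`, so head and tail are only dependable if the table is sorted by length. A minimal sketch of a sorted variant follows; the function name `read_length_hist` is my own and not part of any tool.

```shell
#!/bin/bash
# Hypothetical helper: print "length count" for a gzipped FASTQ, sorted by read length.
read_length_hist() {
    # Every 4th line starting at line 2 is a sequence line in a 4-line FASTQ record.
    gzip -dc "$1" \
        | awk 'NR % 4 == 2 { n[length($0)]++ } END { for (l in n) print l, n[l] }' \
        | sort -n
}

# Only run on the sample if it is present in the current folder.
fq="L006_S3_L001_R1_trimmed.fastq.gz"
if [ -f "$fq" ]; then
    read_length_hist "$fq" > count.txt
fi
```

With the table sorted, `head count.txt` shows the shortest reads and `tail count.txt` the longest.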
So, right now, I’m out of ideas. I was wondering whether it has something to do with the database, and whether I’m specifying it incorrectly.
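One way to rule out the database is to check that the folder passed to `--bowtie2db` actually contains a complete index. The sketch below rests on assumptions: that the index is named `mpa_v30_CHOCOPhlAn_201901` (taken from the header of my output), that it consists of the six usual bowtie2 `.bt2` files plus a `.pkl` file, and that the folder is `meta_db`. The function name `check_metaphlan_db` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: list expected index files that are missing or empty.
# Arguments: database folder, index name (both assumptions, adjust as needed).
check_metaphlan_db() {
    local db_dir=$1
    local index=$2
    local missing=0
    local suffix
    for suffix in 1.bt2 2.bt2 3.bt2 4.bt2 rev.1.bt2 rev.2.bt2 pkl; do
        # -s is true only if the file exists and has a size greater than zero.
        if [ ! -s "$db_dir/$index.$suffix" ]; then
            echo "missing or empty: $db_dir/$index.$suffix"
            missing=$((missing + 1))
        fi
    done
    if [ "$missing" -gt 0 ]; then
        return 1
    fi
    echo "all expected files present in $db_dir"
}

# Assumed folder and index name; prints a hint if anything is missing.
check_metaphlan_db meta_db mpa_v30_CHOCOPhlAn_201901 \
    || echo "database folder looks incomplete"
```

If files are reported missing, the database download or index build probably did not finish. It would also be worth checking that the batch script and the single-sample commands point at the same folder.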
This is the version I’m currently using:
MetaPhlAn version 3.0.14 (19 Jan 2022)
Thanks for any help you could give me!