Hi !
I am writing this message because I have some questions regarding MetaPhlAn3, and I was hoping you could help me with that.
I am running MetaPhlAn3 on fastq files using the following parameters: --read_min_len 70, --min_mapq_val 5, --add_viruses, to obtain bowtie2 and sam outputs.
Then, I used the bowtie2 output to do other analyses (like “-t reads_map”).
I was first surprised to see that the SAM files still contain reads with MAPQ < 5. Are the parameters not applied to the SAM output ? Yet, it seems that it excluded reads smaller than 70bp.
Secondly, when I discarded manually reads with MAPQ<5 in the SAM, I did not find the same numbers of reads than the one in the “reads_map” file and I am not sure to understand why.
For example, to give you an order of the difference, for one individual, I obtain 149352 reads in the SAM file, 104497 reads when filtering on the MAPQ<5, while I have 105108 reads in the “reads_map” file.
If you have any explanation for that, I would appreciate it !
Thank you in advance for your answer
Anthony