Nreads of bowtie2 file

What is the meaning of ‘#nreads’ and ‘#avg_read_length’ at the end of the bowtie2 file? ‘#nreads’ does not match the number of reads in the input file. For example:

  • version:3.1.0
B1_2_8078375__2.8078373 363265__G6AXV1__HMPREF0673_01458
#nreads 15972026
#avg_read_length        148.32879642194422

Hi @song
These are the number of reads in your input and their average length (counting only the reads that passed the MetaPhlAn quality controls). They are used internally by MetaPhlAn when running the profiling directly from the bowtie2out files.
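If you want to read those two values back out of a bowtie2out file yourself, here is a minimal sketch. It assumes the footer lines look exactly like the ones in the example above (a `#` followed by a key, whitespace, and a value); the function names `parse_footer` and `read_footer` are made up for illustration and are not part of MetaPhlAn.

```python
import bz2


def parse_footer(lines):
    """Collect '#key value' footer lines into a dict of strings.

    Assumes the footer format shown in the example above; alignment
    lines (which do not start with '#') are skipped.
    """
    meta = {}
    for line in lines:
        if line.startswith("#"):
            parts = line[1:].split()
            if len(parts) == 2:
                meta[parts[0]] = parts[1]
    return meta


def read_footer(path):
    """Hypothetical helper: read the footer from a plain or .bz2 file."""
    opener = bz2.open if path.endswith(".bz2") else open
    with opener(path, "rt") as handle:
        return parse_footer(handle)


example = [
    "B1_2_8078375__2.8078373 363265__G6AXV1__HMPREF0673_01458\n",
    "#nreads 15972026\n",
    "#avg_read_length        148.32879642194422\n",
]
meta = parse_footer(example)
print(int(meta["nreads"]), float(meta["avg_read_length"]))
```

The values are kept as strings in the dict so that nothing is lost in conversion; convert them with `int` and `float` where needed.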

Thank you for your answer. But the number of reads in the actual input file is different from the bowtie2 file. Why does this happen? If possible, please test an example and check.

The number of reads in the bowtie2 file is the number of reads that pass the MetaPhlAn quality control (length > 70 nt).
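To see why the two counts can differ, here is a rough sketch that counts only the reads above a length cutoff in a FASTQ file and averages their lengths. It is not MetaPhlAn's own code: it assumes a plain 4-line-per-record FASTQ, takes the 70 nt cutoff from the reply above, and treats the comparison as strictly greater than, which is an assumption. The name `count_long_reads` is hypothetical.

```python
def count_long_reads(fastq_lines, min_len=70):
    """Return (number of reads longer than min_len, their average length).

    Assumes 4 lines per FASTQ record, with the sequence on the 2nd line.
    The strict '>' comparison is an assumption, not confirmed behaviour.
    """
    n_reads = 0
    total_len = 0
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # sequence line of each record
            seq_len = len(line.strip())
            if seq_len > min_len:
                n_reads += 1
                total_len += seq_len
    avg_len = total_len / n_reads if n_reads else 0.0
    return n_reads, avg_len


# Three toy reads of length 80, 50 and 100; only two exceed 70 nt.
records = [
    "@r1", "A" * 80, "+", "I" * 80,
    "@r2", "A" * 50, "+", "I" * 50,
    "@r3", "C" * 100, "+", "I" * 100,
]
n_reads, avg_len = count_long_reads(records)
print(n_reads, avg_len)
```

With an input like this, the raw file has three reads but the filtered count is two, which is the kind of gap described in the question.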

Thank you very much!

Does the number of lines in the bowtie2_out.bz2 result file represent the total number of reads from the filtered fastq file that aligned to the database? In the .profile file, is the ‘nreads’ value supposed to reflect the total number of reads in the original fastq file? And for a specific taxon in the profile file, should its absolute abundance be calculated by multiplying its proportion by the total number of reads in the fastq file, or by the number of lines in the bowtie2_out.bz2 file?