[BUG] Running MetaPhlAn from SAM input fails due to the type of the variable "n_metagenome_reads"

Hi,

This bug is related to this post here, but the failure occurs earlier in the code.

When running MetaPhlAn on aligned sequencing data provided in SAM format, the function map2bbh throws an error: the variable n_metagenome_reads is initialised as None in this function, and further along the code attempts to compare it to an integer.

Traceback (most recent call last):
  File "/mnt/archgen/users/huebner/mp4_test/conda/2e0ba2a4937cd8e361b55a3afec62c75_/bin/metaphlan", line 10, in <module>
    sys.exit(main())
  File "/mnt/archgen/users/huebner/mp4_test/conda/2e0ba2a4937cd8e361b55a3afec62c75_/lib/python3.9/site-packages/metaphlan/metaphlan.py", line 1089, in main
    markers2reads, n_metagenome_reads, avg_read_length = map2bbh(pars['inp'], pars['min_mapq_val'], pars['input_type'], pars['min_alignment_len'], pars['subsampling'], pars['subsampling_seed'])
  File "/mnt/archgen/users/huebner/mp4_test/conda/2e0ba2a4937cd8e361b55a3afec62c75_/lib/python3.9/site-packages/metaphlan/metaphlan.py", line 887, in map2bbh
    elif n_metagenome_reads < 10000:
TypeError: '<' not supported between instances of 'NoneType' and 'int'

Commenting out lines 887 and 888 allows MetaPhlAn to run successfully.
I also think that the same type of comparison in line 868, subsampling >= n_metagenome_reads, will likely throw the same error.
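
The failure can be reproduced outside MetaPhlAn. The snippet below is a minimal sketch, not the actual MetaPhlAn code: `check_read_count` is a hypothetical helper, and the threshold of 10000 is simply taken from the traceback above. It shows why the comparison fails when the read count is None and how an explicit guard could avoid it.

```python
def check_read_count(n_metagenome_reads, min_reads=10000):
    """Hypothetical helper (not part of MetaPhlAn): classify a read count safely."""
    if n_metagenome_reads is None:
        # The total read count is unknown, so skip the integer comparison
        return "unknown"
    if n_metagenome_reads < min_reads:
        return "low"
    return "ok"


# Reproduce the error from the traceback: None cannot be ordered against an int
try:
    None < 10000
except TypeError as err:
    print(err)

# With the guard in place, a missing count no longer raises
print(check_read_count(None))
print(check_read_count(5000))
```

The same kind of guard would presumably be needed before the comparison in line 868 as well.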

Platform (please complete the following information):

  • Version 4.0.2

Hi @alexhbnr
Thanks for reporting this. I will push a fix for this problem in the next version.

Hi @alexhbnr
The new version 4.0.3 is now available in conda and fixes the uncaught error. However, to run metaphlan on a SAM file, it is now mandatory to include the --nreads parameter in the command.
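
If the read count for --nreads is not at hand, it could be obtained from the original FASTQ file. The following is only a sketch under assumptions: `count_fastq_reads` is a hypothetical helper, it assumes well-formed FASTQ with exactly four lines per record, and whether this count matches what MetaPhlAn expects for a given dataset (for example, paired-end input) should be checked against the documentation.

```python
import gzip


def count_fastq_reads(path):
    """Hypothetical helper: count reads in a FASTQ file (four lines per record)."""
    # Open gzip-compressed files transparently, based on the file extension
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    # Each FASTQ record occupies four lines: header, sequence, '+', qualities
    return n_lines // 4
```

The returned number could then be passed as the value of --nreads when profiling the corresponding SAM file.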

Hi Aitor,

Thank you very much for addressing this so quickly!