Hi Biobakery Team!
I am noticing that the sample2markers.py script consumes very large amounts of memory. With an input sam.bz2 of 134M, it ends up using 80GB of memory. Is this a bug? Or is there a way of executing it on fragments of the alignment and recombining the results, to make it more scalable?
Hi @nickp60
Thanks for reporting this, we have never experienced such high RAM consumption when executing sample2markers.py. Could it be possible to share the input sam file so we can get a better idea of what is going on?
In that case I think it is expected: the memory consumption of sample2markers.py grows linearly with the number of cores used. However, if you are interested, we are currently working on a new version that should speed up the process while keeping memory consumption stable. You can check an alpha version of the code in this branch of the mpa repository: GitHub - biobakery/MetaPhlAn at sample2markers_speedup
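Since memory scales with the number of workers, one workaround in the meantime is to run with a single worker process. A minimal sketch (the flag names here are assumptions based on recent releases; please confirm against `sample2markers.py --help` for your installed version):

```shell
# Hypothetical invocation, for illustration only; verify flag names with
# `sample2markers.py --help`. Fewer worker processes should mean lower
# peak memory, since consumption grows roughly linearly with the core count.
CMD="sample2markers.py -i sample.sam.bz2 -o consensus_markers/ -n 1"
echo "$CMD"   # inspect, then run with: eval "$CMD"
```

Trading wall-clock time for a smaller memory footprint this way may let the 134M input finish within your node's RAM limit.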