I am running maaslin3, version ‘0.99.7’, on a Nimbus server. I have tried “cores = 4”, “cores = 8” and “cores = 16” in the R script; nohup.out ends at “INFO::Creating cluster of 16 R processes” and then the job is killed. If I change cores to 1, it runs fine.
The command I am using to run the script is “nohup Rscript My_Rscript.r &”. I also tried running it without “nohup” and “&”. It got stuck at the same “INFO::Creating cluster of 16 R processes” line and almost killed the terminal connection.
Do you have any idea what might be causing this? Any suggestion would help. Thanks in advance!
Parallelization is a known challenge in MaAsLin. We use the pbapply package since it provides a progress bar, but sometimes memory usage with parallelization shoots through the roof for reasons that aren’t clear. We’ve tried other packages like future, but these don’t seem to solve the issue. Also, in my testing, the objects within maaslin3 aren’t very large (a few MB at most), so the memory explosion is a mystery to me. In 0.99.8, I’ve added a few lines that log the memory size of each object going into the parallelization step when you specify verbosity = "DEBUG", so maybe that’ll help with diagnosing the issue. I’ve also made a few adjustments that might improve things. Would you mind running it again with the new version and letting me know? If that doesn’t work and you can figure out what’s going wrong, I’d be happy to accept a pull request.
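As a rough illustration of that kind of size check, here is a base-R sketch. It is not the code in maaslin3: the object names and sizes are made up, and it assumes a socket (PSOCK) cluster, where each worker gets its own serialized copy of whatever is exported to it.

```r
# Hypothetical stand-ins for the inputs to a parallel step; not maaslin3 internals.
inputs <- list(
  abundances = matrix(runif(200 * 500), nrow = 200),
  metadata   = data.frame(id = seq_len(200), group = rep(c("a", "b"), 100)),
  formula    = y ~ group
)

# Size of each object in bytes, as reported by utils::object.size().
sizes_bytes <- vapply(inputs, function(x) as.numeric(object.size(x)), numeric(1))

# Assumed worker count, for illustration only.
n_workers <- 4

for (name in names(sizes_bytes)) {
  message(sprintf("%-10s %8.2f MB", name, sizes_bytes[[name]] / 1024^2))
}

# Rough estimate of the data copied to the workers: total size times worker count.
# This ignores the baseline memory of each worker's R session and loaded packages.
message(sprintf("approx. copied to %d workers: %.2f MB",
                n_workers, n_workers * sum(sizes_bytes) / 1024^2))
```

If the sizes reported this way are small but the job is still killed, the exported objects are probably not the main cause, and the per-process overhead of each worker is the next thing to look at.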
In the meantime, is using 1 core feasible? We’ve found that multiple cores only speed things up when the model itself is very complex. Otherwise, the overhead of parallelization typically outweighs the advantages (especially if you have a lot of features).
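To get a feel for that overhead on your own machine, a small base-R timing comparison can help. This is only a sketch using the parallel package directly rather than maaslin3 or pbapply; cheap_task is a hypothetical stand-in for a fast per-feature model fit, and the worker count of 2 is arbitrary.

```r
library(parallel)

# Hypothetical stand-in for a fast per-feature model fit.
cheap_task <- function(i) sqrt(i)
inputs <- seq_len(2000)

# Sequential baseline.
seq_time <- system.time(seq_res <- lapply(inputs, cheap_task))

# Parallel run, including the cost of starting and stopping the worker processes.
par_time <- system.time({
  cl <- makeCluster(2)
  par_res <- parLapply(cl, inputs, cheap_task)
  stopCluster(cl)
})

message(sprintf("sequential: %.3f s elapsed", seq_time[["elapsed"]]))
message(sprintf("parallel:   %.3f s elapsed", par_time[["elapsed"]]))
```

Both runs return the same results; only the elapsed times differ. Replacing cheap_task with something closer to the cost of your actual model should show roughly where parallelization starts to pay off.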