Hi. I have species-level shotgun data. I am applying the following basic transformations:
- Converting count to relative abundance
- Dropping species with <10% prevalence in my sample
Are these the right steps to take before I set up my data? Is there any other important step that I am missing?
Thanks.
Sanyog
Hi Sanyog,
Both MaAsLin 2 and MaAsLin 3 will automatically convert your counts to relative abundances, and our advice in MaAsLin 3 is not to drop the low-prevalence species (though in MaAsLin 2, dropping those below 10% prevalence was the default and was applied automatically).
Will
Hi Will,
This is great! Thanks a lot for your response.
Sanyog
Hi Will. How can I increase the speed of my analyses? Does increasing the number of cores speed things up? Any other valid steps?
Sanyog
What’s the formula you’re using and the dataset size? Using more cores might (but probably won’t) help.
Will
Hi Will. The species-level shotgun file has 214 rows and 4097 columns. The metadata file has 214 rows and 91 columns. In my main model, I am including 13 variables from the metadata file (4 continuous, 8 binary, 1 factor with 3 levels). It currently takes about 45 minutes to run 1 model.
If by 1 model you mean 1 full MaAsLin run, that sounds about right for that size of data. You have a pretty complicated model, and 4097 is a lot of species. It may be worth checking whether many of those are rare species that should be screened out by the prevalence/abundance screens (e.g. requiring at least 10% of samples to have at least 0.1% abundance of the species).
Will
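For anyone who wants to check how many species would survive a screen like the one described above before starting a long run, here is a minimal Python sketch. It is not MaAsLin's own code: the function names `relative_abundance` and `species_passing_screen` are hypothetical, the table is assumed to have samples in rows and species in columns (as in the 214 x 4097 file above), and the thresholds are read as "at least" (>=), which may differ slightly from how MaAsLin treats values exactly at the cutoff.

```python
def relative_abundance(counts):
    """Convert a samples-by-species count table (list of rows) to per-sample proportions."""
    rel = []
    for row in counts:
        total = sum(row)
        # Leave all-zero samples as zeros instead of dividing by zero.
        rel.append([c / total if total > 0 else 0.0 for c in row])
    return rel


def species_passing_screen(counts, min_abundance=0.001, min_prevalence=0.10):
    """Return column indices of species whose relative abundance is at least
    min_abundance in at least min_prevalence of the samples."""
    rel = relative_abundance(counts)
    n_samples = len(rel)
    n_species = len(rel[0]) if rel else 0
    keep = []
    for j in range(n_species):
        # Count samples where this species reaches the abundance threshold.
        n_present = sum(1 for i in range(n_samples) if rel[i][j] >= min_abundance)
        if n_samples and n_present / n_samples >= min_prevalence:
            keep.append(j)
    return keep


# Made-up toy table: 20 samples (rows) x 3 species (columns).
toy_counts = (
    [[980, 10, 10]]          # species 1 and 2 both at 1%
    + [[990, 0, 10]] * 3     # species 2 at 1% in three more samples
    + [[9995, 5, 0]]         # species 1 detected, but only at 0.05%
    + [[1000, 0, 0]] * 15    # only species 0 present
)

kept = species_passing_screen(toy_counts)
print(f"{len(kept)} of {len(toy_counts[0])} species pass the screen: {kept}")
```

In the toy table, species 1 reaches 0.1% abundance in only 1 of 20 samples (5%), so it is dropped, while species 2 reaches it in 4 of 20 (20%) and is kept. Running the same count on the real table gives a rough idea of how much the screen would shrink the 4097 species.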