MaAsLin2 on Galaxy

Hi,

I am trying to run MaAsLin2 in R and I wonder how MaAsLin2 works. For example, I have a relative abundance table containing 6 taxonomic levels (Kingdom, Phylum, …, Species; rows are taxa and columns are samples), so for each sample the total sum will be around 6.0 (each level sums to 1). Can I throw this whole table into MaAsLin2, or do I have to run it level by level? I assume that if the model runs row by row, then it doesn’t matter if I throw the whole table in (with relative abundance), but it would matter if the model works on absolute abundance and considers the relationship between rows.

I also have a metadata table with several fixed effects. Is running the model with each fixed effect separately the same as running it with all fixed effects at the same time?

Thanks so much!

Wenting

Hi @wy14940 ,

This is a matter of preference. It's most common to run MaAsLin2 on a single taxonomic level rather than all levels at once. There are several reasons for this, including the fact that the abundances of higher-level taxa depend on the abundances of lower-level taxa (i.e. genus abundance depends on the abundance of the species within that genus). I would determine what taxonomic resolution you are interested in and run it at that level (often this is the species or SGB level).
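To make the dependence between levels concrete, here is a minimal Python sketch (not MaAsLin2 code) using a made-up two-level table. The taxon names, the `g__`/`s__` prefixes, and the helper functions are assumptions for illustration only. It shows that a merged table sums to roughly the number of levels per sample, that splitting by level restores a sum of about 1, and that a genus row is just the sum of its species rows.

```python
# Toy merged table: rows are taxa, columns are two samples.
# Taxon names use a made-up "g__"/"s__" prefix convention for illustration.
merged = {
    "g__A": [0.6, 0.3],
    "g__B": [0.4, 0.7],
    "g__A|s__A1": [0.5, 0.1],
    "g__A|s__A2": [0.1, 0.2],
    "g__B|s__B1": [0.4, 0.7],
}


def column_sums(table):
    """Sum each sample (column) over all rows of the table."""
    n_samples = len(next(iter(table.values())))
    return [sum(row[j] for row in table.values()) for j in range(n_samples)]


def split_by_level(table):
    """Group rows by the rank prefix of the last lineage component."""
    levels = {}
    for taxon, values in table.items():
        rank = taxon.split("|")[-1].split("__")[0]  # "g" or "s"
        levels.setdefault(rank, {})[taxon] = values
    return levels


levels = split_by_level(merged)
print("merged column sums:", column_sums(merged))        # about 2 per sample
print("species column sums:", column_sums(levels["s"]))  # about 1 per sample

# Genus A is fully determined by its species, so the rows are not independent.
genus_a_from_species = [
    a1 + a2 for a1, a2 in zip(merged["g__A|s__A1"], merged["g__A|s__A2"])
]
print("g__A from species:", genus_a_from_species, "vs row:", merged["g__A"])
```

One practical consequence: if a total-sum scaling step is applied to the merged table (I believe this is MaAsLin2's default normalization, but check the documentation for your version), every value gets divided by the inflated column total rather than by the total for its own level.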

Running fixed effects together vs. separately will give you different results. Running your model with all fixed effects will essentially tell you the impact of each effect after accounting for the other effects in the model, whereas running each one on its own will just tell you the impact of that effect without considering or controlling for other things that might be influencing the results. So this really comes down to the biological question at hand.
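A small worked example may help here. The sketch below is plain Python, not MaAsLin2, and the variables `treatment`, `age`, and `abundance` are hypothetical. The data are constructed so that abundance depends only on age, while age happens to be higher in the treated group. Fitting treatment alone picks up an apparent effect; fitting both together attributes everything to age.

```python
def centered(values):
    """Subtract the mean from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def slope_alone(x, y):
    """Ordinary least squares slope for y ~ x (one predictor)."""
    xc, yc = centered(x), centered(y)
    return dot(xc, yc) / dot(xc, xc)


def slopes_together(x1, x2, y):
    """OLS slopes for y ~ x1 + x2 via the 2x2 normal equations on centered data."""
    a, b, yc = centered(x1), centered(x2), centered(y)
    s11, s22, s12 = dot(a, a), dot(b, b), dot(a, b)
    s1y, s2y = dot(a, yc), dot(b, yc)
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Hypothetical metadata: treated samples also tend to be older.
treatment = [0, 0, 0, 0, 1, 1, 1, 1]
age = [1, 2, 1, 2, 3, 4, 3, 4]
# Made-up response that depends on age only, not on treatment.
abundance = [2.0 * a for a in age]

print("treatment alone:", slope_alone(treatment, abundance))
print("treatment, age together:", slopes_together(treatment, age, abundance))
```

In this toy data the treatment slope is 4 when fitted alone and 0 once age is in the model. Real data are noisier, and MaAsLin2 also applies its own normalization and transformation steps, so treat this only as an illustration of why the two approaches answer different questions.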

Hope that is helpful.

Cheers,
Jacob Nearing

Thank you for your reply, Jacob! You are absolutely right that higher-level abundance depends on lower-level abundance. Please correct me if I misunderstood what you said: if I am interested in the genus and species levels, and I have already computed relative abundance for each level separately (i.e., the relative abundance of genus X is the sum of the relative abundances of all species under genus X), is it still not okay to merge the two levels into one table and run it with MaAsLin2? I agree that it’s common to run the model on only one level at a time, but I am curious whether the results will be different (e.g., a genus is significant when I run the model on the genus level only, but it is no longer significant when I run the merged data).

Best wishes,
Wenting

Hi @wy14940 ,

You will get different q-values because you will be running more tests, so the FDR calculation will be different.

It's currently debated whether correcting across all levels or correcting at each individual taxonomic level is more appropriate; the answer depends on who you talk to.
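To see why the q-values change, here is a minimal Python sketch of the Benjamini-Hochberg adjustment (as far as I know this is the correction MaAsLin2 uses by default, but that is an assumption worth checking against your settings). The p-values, taxon names, and the 0.1 cutoff are all made up for illustration. The genus-level p-values are identical in both runs; only the number of tests differs.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Made-up p-values; taxon names are hypothetical.
genus_p = {"g__A": 0.01, "g__B": 0.04, "g__C": 0.30}
species_p = {"s__A1": 0.20, "s__A2": 0.50, "s__B1": 0.60,
             "s__B2": 0.70, "s__C1": 0.80, "s__C2": 0.90}

# Correction within the genus level only (3 tests).
q_genus_only = dict(zip(genus_p, bh_adjust(list(genus_p.values()))))

# Correction across the merged genus + species table (9 tests).
merged_p = {**genus_p, **species_p}
q_merged = dict(zip(merged_p, bh_adjust(list(merged_p.values()))))

cutoff = 0.1  # hypothetical significance threshold
for genus in genus_p:
    print(genus,
          "genus-only q =", round(q_genus_only[genus], 3),
          "merged q =", round(q_merged[genus], 3),
          "| significant alone:", q_genus_only[genus] < cutoff,
          "| significant merged:", q_merged[genus] < cutoff)
```

With these numbers, `g__B` has q of about 0.06 when only genera are tested and about 0.18 in the merged table, so it passes the hypothetical 0.1 cutoff in one run and not the other even though its p-value never changed.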

I personally just pick a single taxonomic level and run my analysis with that level alone.

Cheers,
Jacob Nearing

Thank you! This really helps!