Thank you for creating and maintaining such a useful tool!
I have a question regarding a MaAsLin2 analysis of humann3 data from metagenomic sequencing performed in two different bird populations. I ran the following association test using the variables “Phase” (factor w/ 2 levels, “breeding” or “staging”) and “Mihat” (numeric):
fit <- Maaslin2(analysis_method = "NEGBIN", normalization = "CSS", transform = "NONE", input_data = data_df, input_metadata = map2, output = file.path(Save_fp, "Maaslin2_path_abundance"), fixed_effects = c("Phase", "Mihat"), min_prevalence = 0.5)
But there is a bit of confusion among colleagues about interpreting the results, especially in light of Himel’s response to this question.
The significant_results.tsv file looks like this:
feature | metadata | value | coef | stderr | N | N.not.0 | pval | qval |
---|---|---|---|---|---|---|---|---|
PWY0.1586 | Mihat | Mihat | -0.359878416 | 0.04065671 | 20 | 15 | 8.62E-19 | 2.78E-16 |
PWY.6151 | Mihat | Mihat | 0.341375963 | 0.043378182 | 20 | 17 | 3.55E-15 | 5.72E-13 |
PWY.6270 | Mihat | Mihat | 0.335625435 | 0.054723826 | 20 | 15 | 8.62E-10 | 3.08E-08 |
GLUCOSE1PMETAB.PWY | Phase | Breeding | -0.873864422 | 0.144191751 | 20 | 13 | 1.36E-09 | 4.37E-08 |
PWY.5103 | Mihat | Mihat | 0.289748826 | 0.048018476 | 20 | 15 | 1.60E-09 | 4.68E-08 |
Is it correct to say that:
(1) In metadata = Mihat rows, features that have a negative coef are negatively correlated with Mihat?
The first pathway, PWY0.1586, has a negative coef, while the second (PWY.6151) has a positive coef, yet when you examine the graphs that were generated, both slopes are negative. Is the sign of the coef related to the direction of the association? If not, how should we be interpreting this?
(2) In metadata = Phase rows, features that have a negative coef are negatively correlated with the factor level in the “value” column of that row?
For example, there is a negative coef in the GLUCOSE1PMETAB.PWY row. Does this mean that GLUCOSE1PMETAB.PWY is negatively associated with the “breeding” phase? Or does it mean that “breeding” is the reference factor level, and therefore a negative coef means the pathway is negatively associated with the opposite factor level, i.e. staging?
Similar to above, if you examine the plots, the pathway abundances all seem to be elevated in the “staging” group regardless of the sign of the coef.
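To make question (2) concrete, here is a minimal base-R sketch of how we would check which level R treats as the reference. It uses a made-up data frame (`toy_map`, hypothetical) rather than our actual `map2`, and it assumes MaAsLin2 relies on R's default treatment-contrast coding for factors, which we have not verified:

```r
# Toy metadata standing in for map2; the values are invented for illustration.
toy_map <- data.frame(
  Phase = factor(c("breeding", "staging", "breeding", "staging")),
  Mihat = c(0.2, 1.5, 0.7, 2.1)
)

# By default factor() sorts levels alphabetically, and the first level is
# the reference under treatment contrasts.
default_levels <- levels(toy_map$Phase)
print(default_levels)

# The design matrix names the Phase column after the NON-reference level.
cols_default <- colnames(model.matrix(~ Phase + Mihat, data = toy_map))
print(cols_default)

# relevel() sets the reference level explicitly instead of relying on sorting.
toy_map$Phase <- relevel(toy_map$Phase, ref = "staging")
cols_releveled <- colnames(model.matrix(~ Phase + Mihat, data = toy_map))
print(cols_releveled)
```

If the same coding applies inside MaAsLin2, the level shown in the “value” column would be the non-reference level, and the coef would describe that level relative to the reference. We would appreciate confirmation of whether that reading is right.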
Could you help us better understand how to interpret and draw conclusions from these outputs?
Thanks so much!