Trying to understand coef column (and how to convert it to fold change)

Hi everyone!

I just performed an expression analysis with transcripts, and I’m trying to understand what exactly the “coef” column means. I read this in the Tutorial:

coef : the model coefficient value (effect size).

So, OK, it’s the effect size, but I’m used to reading about fold changes and log2 fold changes in this kind of analysis. So how can this coefficient be transformed to a fold change (assuming it isn’t already a fold change, which is a possibility I don’t rule out)?

If it helps, I used these parameters:
transform = "LOG", analysis_method = "LM", correction = "BH", normalization = "TSS", standardize = FALSE

Thank you very much in advance.


Hello, I am also interested in this post. Thanks a lot, Gabri

Hi @dts (and @mrgambero),
Thanks for the question! In a previous response, we showed how to convert the coef to log2 fold change. This is also something that we are thinking of implementing more clearly in the next major iteration of MaAsLin. We haven’t done it to date because it is confusing to implement/interpret within the multivariable infrastructure of MaAsLin. In the future, we are thinking of allowing a call for the main variable of interest within the model which will allow the results of a log2 fold change to be more interpretable.
I hope this helps!
Kelsey

Thanks @Kelsey_Thompson !

I am sorry to be annoying, but I still have doubts about how to do that.
That answer explains how to convert the coefficient to a log2 fold change in the case of a GLM with a Poisson distribution. But I usually use either “lm” or “negative binomial”. Would it be the same?

I think in case of lm, I do not need to do the following step:
fc<- exp(fit$coefficients[2]) ## Antilog coef #2

What about negative binomial?

Thanks for considering this implementation for the future.
I think it is a good idea to have it as a fold change, so it is easier to quantify.
Now we have this coefficient but we do not really know how to biologically interpret it.

I thank you in advance and for your awesome work!!
Gabri

Hi @mrgambero - it’s the same as the Poisson GLM for the negative binomial. In fact, you can use the same formula for any GLM with a log link. Thanks!
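
For anyone who wants to see the log-link conversion on a toy case, here is a minimal sketch using base R's glm() on simulated counts. The data, group labels, and variable names are made up for illustration, and this is not MaAsLin 2's internal code. With a log link the coefficient is on the natural-log scale, so log2(exp(coef)) is the same as coef / log(2).

```r
# Simulated counts for two groups; values are illustrative only.
set.seed(1)
group  <- factor(rep(c("control", "case"), each = 20),
                 levels = c("control", "case"))
counts <- c(rpois(20, lambda = 5), rpois(20, lambda = 20))

# Poisson GLM with a log link: the coefficient is on the natural-log scale.
fit       <- glm(counts ~ group, family = poisson(link = "log"))
coef_case <- unname(coef(fit)["groupcase"])

fc     <- exp(coef_case)   # fold change, case vs control
log2fc <- log2(fc)         # identical to coef_case / log(2)

# With a single two-level factor, the fitted fold change equals the
# ratio of the observed group means.
observed_log2fc <- log2(mean(counts[group == "case"]) /
                        mean(counts[group == "control"]))

print(c(coef = coef_case, fold_change = fc,
        log2fc = log2fc, observed = observed_log2fc))
```

The same arithmetic should carry over to other log-link models such as the negative binomial, although with several covariates the coefficient is an adjusted effect rather than a simple ratio of group means.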


Thank you both! It helped a lot. I had seen that previous response, but I was not sure at all whether it would apply to the parameters I was using (in fact, I deduce from @himel.mallick’s answer that this formula is not appropriate for a linear model, right?). I have now changed analysis_method to CPLM and transform to NONE, and by following the formula I believe I get the log2 fold change correctly.

Thanks again, and keep on doing your excellent job.

Thanks, all. Just to add to the rationale for not doing a similar back-transformation for linear models: with a log2 transformation in place (the default in MaAsLin 2, similar to limma), the coefficients can be interpreted as the log2 fold changes themselves, as explained here. Note that the interpretation is not quite the same for a linear model without a log2 transformation. This goes back to @Kelsey_Thompson’s comment on why we opted to report coefficients instead: they are generally more universal across models and much easier to interpret in our multi-model, multivariable setup.
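
To make the linear-model case concrete, here is a small sketch with base R's lm() on simulated, strictly positive abundances. The data and names are illustrative assumptions, and the model has a single two-level factor rather than a full multivariable setup. After a log2 transform, the fitted coefficient is the difference in mean log2 abundance between the groups, so it is already on the log2 fold-change scale.

```r
# Simulated positive abundances for two groups; values are illustrative only.
set.seed(2)
group <- factor(rep(c("control", "case"), each = 15),
                levels = c("control", "case"))
abundance <- c(rlnorm(15, meanlog = 0,   sdlog = 0.5),
               rlnorm(15, meanlog = 1.5, sdlog = 0.5))

# Log2-transform first, then fit an ordinary linear model.
log2_abundance <- log2(abundance)
fit            <- lm(log2_abundance ~ group)
coef_case      <- unname(coef(fit)["groupcase"])

# The coefficient is the difference of the group means on the log2 scale.
diff_of_means <- mean(log2_abundance[group == "case"]) -
                 mean(log2_abundance[group == "control"])

print(c(coef = coef_case, diff_of_mean_log2 = diff_of_means))
```

No exp() or log2() step is applied to the coefficient here, because the response was log2-transformed before fitting.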

Hello Himel.mallick.

I am so sorry, but you got me even more confused.
When you say “log2 transformation in place (default in MaAsLin 2)”, does that mean that if I use MaAsLin with default parameters (so analysis_method = "LM"), the “coef” column is already a log2 fold change?
So I do NOT need to do any further operation on the data, not even the x <- log2(coef) step?

Is that correct?

I am sorry to be annoying. I just want to make sure I am reporting and explaining the value correctly.

Thanks!
Gabri

Hi @mrgambero - that is correct. This is simply because the data is already log2-transformed and based on the math provided above, it will be approximately the same as the log2 fold-change value as shown below.

Actual log2(FC) = log2(mean(Group1) / mean(Group2))
MaAsLin 2 coefficient or “log2(FC)” for the default model = mean(log2(Group1)) - mean(log2(Group2))

Does this make sense?
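
A tiny numeric sketch of why the two quantities are only approximately equal. The numbers are hypothetical and chosen so the log2 values are whole numbers. The log2 ratio of arithmetic group means and the difference of mean log2 values differ because the second one is really a log2 ratio of geometric means.

```r
# Hypothetical abundances for two groups.
group1 <- c(8, 16, 64)
group2 <- c(1, 2, 4)

# Log2 ratio of arithmetic means (the "actual" fold change in the usual sense).
log2fc_arithmetic <- log2(mean(group1) / mean(group2))

# Difference of mean log2 values (what a linear model on log2 data estimates).
log2fc_lm <- mean(log2(group1)) - mean(log2(group2))

# Equivalent form: log2 ratio of geometric means.
geometric_mean   <- function(x) exp(mean(log(x)))
log2fc_geometric <- log2(geometric_mean(group1) / geometric_mean(group2))

print(c(arithmetic = log2fc_arithmetic,
        lm_style   = log2fc_lm,
        geometric  = log2fc_geometric))
```

The more skewed the values within a group, the further apart the arithmetic and geometric versions tend to be.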

It makes sense to me. Thanks again for your detailed explanations.

yes! thanks!
Better to be sure I was not mistaking pears for apples!

Thanks a lot!

Hi again.

I’m still a little bit confused about this. I just performed a 16S analysis and a differential abundance analysis of taxa with these parameters:

min_abundance = 0.0,
min_prevalence = 0.0,
normalization = "TSS",
transform = "LOG",
analysis_method = "LM",
max_significance = 0.05,
correction = "BH",
standardize = FALSE

One taxon is clearly more abundant in one type of sample (94%, 64% and 98% in one type; 0.6%, 0.08% and 3.4% in the other). The results are:

|coef|0.599906572791737|
|stderr|0.05919489753717|
|pval|0.000533675113693|
|qval|0.029992541389525|

As said in this thread, the coef value should correspond to the log2 fold change, but I honestly believe it does not (with any other tool the log2 fold change is between 6 and 7).

So my question is straightforward, how do you go from the coef value to the log2 fold change?

Thanks!

Any help? @himel.mallick I would really appreciate your help with this when you have a moment. Thank you!

Hi @dts - can you clarify what you mean by 'with any other tool the log 2 fold change is between 6 and 7'? The only comparable tools in my mind would be those based on linear models such as limma. Did you have a chance to take a look at the corresponding plot? Does the plot signify a huge effect size?

Thanks,
Himel
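
As a rough sanity check on the numbers quoted in this thread, here is a sketch that uses only the three rounded relative abundances reported per group (94%, 64%, 98% versus 0.6%, 0.08%, 3.4%), so it cannot reproduce the real fit exactly. It compares the reported coef with what a few candidate transforms would give by hand. The last candidate, a natural-log log(x + 1) transform, is purely an assumption to test. Nothing in this thread confirms which transform a given MaAsLin 2 version applies, so the transform code of the installed version is the place to check.

```r
# Rounded relative abundances quoted in the thread, as proportions.
high <- c(0.94, 0.64, 0.98)
low  <- c(0.006, 0.0008, 0.034)
reported_coef <- 0.599906572791737   # coef value quoted in the thread

# Candidate 1: difference of mean log2 values (linear model on log2 data).
diff_mean_log2 <- mean(log2(high)) - mean(log2(low))

# Candidate 2: log2 ratio of arithmetic means.
log2_ratio_means <- log2(mean(high) / mean(low))

# Candidate 3 (assumption, not confirmed): difference of mean log(x + 1).
diff_mean_log1p <- mean(log(high + 1)) - mean(log(low + 1))

print(round(c(reported         = reported_coef,
              diff_mean_log2   = diff_mean_log2,
              log2_ratio_means = log2_ratio_means,
              diff_mean_log1p  = diff_mean_log1p), 3))
```

On these rounded values, the two log2-based candidates land in the 6 to 7.3 range mentioned for other tools, while the log(x + 1) candidate comes out close to the reported 0.6. That is only suggestive, but it may be worth confirming which transform was actually applied before interpreting coef as a log2 fold change.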

Just to be even more explicit. If you have transform = "NONE" in the call to Maaslin2, then the following code will get you a significance-filtered results table with a log2fc column (using dplyr):

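library(dplyr)  # provides %>%, mutate() and filter()

# fit_data is the object returned by the Maaslin2() call
sig_res_fit <- fit_data$results %>% 
  mutate(log2fc = log2(exp(coef)), .before = pval) %>%  # natural-log coef to log2 scale
  filter(qval <= 0.25)  # default max_significance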
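
Putting the thread's rules of thumb in one place, here is a hypothetical helper (coef_to_log2fc is not part of MaAsLin 2). It assumes that LOG means a log2 transform, as described above, and that NEGBIN and CPLM are fitted with a log link. For any other combination it returns NA rather than guessing.

```r
# Hypothetical helper summarising the conversions discussed in this thread.
# Assumptions: transform = "LOG" is log2; "NEGBIN" and "CPLM" use a log link.
coef_to_log2fc <- function(coef, analysis_method, transform) {
  log_link_methods <- c("NEGBIN", "CPLM")
  if (analysis_method == "LM" && transform == "LOG") {
    # Linear model on log2 data: coef is already on the log2 scale.
    coef
  } else if (analysis_method %in% log_link_methods && transform == "NONE") {
    # Log-link model: coef is a natural log, so rescale to log2.
    coef / log(2)
  } else {
    # No simple fold-change interpretation covered in this thread.
    rep(NA_real_, length(coef))
  }
}

# Usage with made-up coefficient values.
print(coef_to_log2fc(c(0.5, -1.2), "LM", "LOG"))
print(coef_to_log2fc(log(8), "CPLM", "NONE"))
print(coef_to_log2fc(0.3, "LM", "NONE"))
```

Returning NA for uncovered combinations is a deliberate choice, since a linear model on untransformed data gives a difference in abundance, not a ratio.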