I’m trying to run the CLR normalisation option in Maaslin, and in the significant results output file the ‘N not 0’ column is incorrect. It says one of my taxa is present in only 6 samples, but it is present in 209 samples!
Because of this, I am concerned about the accuracy of the output, and I am wondering whether I am doing something incorrectly.
I input a table of relative abundance data with the taxa as columns and the samples as rows.
Also please note I haven’t done anything with the zeros in the relative abundance table. I would really appreciate your guidance. Thanks!
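One way to double-check the ‘N not 0’ value independently of Maaslin is to count the non-zero samples per taxon directly from the input table. The snippet below is only a minimal sketch in base R: `rel_abund` and its values are a hypothetical toy table laid out as described above (samples as rows, taxa as columns), not the real data.

```r
# Hypothetical toy table: samples as rows, taxa as columns (relative abundances).
rel_abund <- data.frame(
  taxon_A = c(0.60, 0.00, 0.25, 0.10),
  taxon_B = c(0.40, 1.00, 0.00, 0.90),
  taxon_C = c(0.00, 0.00, 0.75, 0.00),
  row.names = c("S1", "S2", "S3", "S4")
)

# Number of samples in which each taxon has a non-zero abundance,
# computed on the untransformed input.
n_not_zero <- colSums(rel_abund != 0)
print(n_not_zero)
```

Comparing these counts with the ‘N not 0’ column would show whether the reported number reflects the raw table or the values after transformation.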
for (p in variable_list) {
  output_file <- paste0('maaslin_allweeks_CLR_confounders', p)
Thanks Jacob - could you please also confirm for me whether the Maaslin CLR transform option adds a pseudocount to the relative abundance data to eliminate zeros prior to the CLR transformation? Or is that something I need to do prior to uploading the data into Maaslin?
Hi, following up on the pseudo-count question: how are the pseudo-counts determined? Say my data already contains very small relative abundances (e.g. 6.851430e-08); is the pseudo-count determined proportionally to this?
Will Maaslin2 apply the +1 shift even for relative abundance data in the 0-1 range? Do you think that is appropriate? And will the shifted values be divided by the total sum for CLR, since CLR is a transformation for compositional data?
Yes, when using CLR a pseudo-count of 1 is added. Whether or not that is appropriate depends on your data, as CLR transformations with pseudo-counts can sometimes cause issues when the covariate of interest is associated with read depth.
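To make the effect of a +1 pseudo-count concrete, here is a rough sketch in base R. It assumes the transform is the textbook CLR (log of each value minus the sample's mean log) applied to `x + 1`, as described above; the `clr` helper and the toy sample are hypothetical and are not taken from the Maaslin2 source.

```r
# Textbook CLR: log of each value minus the mean log across the sample
# (equivalently, log of the value over the sample's geometric mean).
clr <- function(x) log(x) - mean(log(x))

# Hypothetical sample, first as raw counts and then as relative abundances.
counts <- c(0, 10, 90, 900)
props  <- counts / sum(counts)

clr_counts <- clr(counts + 1)  # pseudo-count of 1 on count data
clr_props  <- clr(props + 1)   # pseudo-count of 1 on 0-1 data

print(round(clr_counts, 3))
print(round(clr_props, 3))

# CLR is scale invariant: dividing the shifted values by their total
# before transforming gives the same result as not dividing.
shifted <- props + 1
print(all.equal(clr(shifted), clr(shifted / sum(shifted))))
```

With relative abundances, `x + 1` always lies between 1 and 2, so the CLR values in a sample can differ by at most log(2), and the gap between a zero and a small non-zero abundance such as 6.851430e-08 becomes negligible. Because CLR is scale invariant, re-dividing the shifted values by their total sum would not change the output.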
I’m not sure I understand your second question but the formula used for the normalization can be found here: