Hi there,
I am using MaAsLin 2 to test the association between taxonomy and binary clinical outcomes. In the clinical outcome metadata, I have several missing values (shown as “NA”). After I ran MaAsLin 2, the output file still showed the total number of samples, i.e., the samples with NA values had not been removed. And in the box plots showing significant taxonomic associations, in addition to 0 and 1, there is another box for NA. It seems MaAsLin 2 can remove missing values automatically when performing the association analysis; could you let me know how to do that? Thank you.
Zhaozhong
Hi Zhaozhong (@hellofuture) - MaAsLin 2, by default, uses na.exclude, which keeps the NAs in the final data frame and any associated output (e.g. residuals) but does not use them in the fitting process. In other words, your final coefficient estimates are always based on the non-missing values. If you don’t want to see the NAs in the final plots, please remove those samples from the metadata before running MaAsLin 2 so that they don’t show up in the box plots. Again, this changes only whether the NAs appear in the per-feature visualizations, not the final results or conclusions.
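For anyone who wants to do that pre-filtering outside of R, here is a minimal sketch using only the Python standard library. It assumes a tab-separated metadata file with one row per sample; the column names (`sample`, `outcome`, `age`) and the helper `drop_missing_samples` are hypothetical and only for illustration, not part of MaAsLin 2.

```python
import csv
import io


def drop_missing_samples(metadata_tsv, column, missing=("NA", "")):
    """Return the metadata table without rows whose `column` value is missing.

    `metadata_tsv` is the text of a tab-separated table with a header row.
    The values treated as missing ("NA" and the empty string) are an
    assumption; adjust them to match how your file encodes missing data.
    """
    reader = csv.DictReader(io.StringIO(metadata_tsv), delimiter="\t")
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        value = (row.get(column) or "").strip()
        if value not in missing:
            writer.writerow(row)
    return out.getvalue()


# Hypothetical three-sample metadata table; S2 has a missing outcome.
example = "sample\toutcome\tage\nS1\t0\t34\nS2\tNA\t51\nS3\t1\t47\n"
filtered = drop_missing_samples(example, "outcome")
print(filtered)
```

The filtered text can be written to a new file and passed to MaAsLin 2 as the metadata input. Depending on how your inputs are set up, you may also want to subset the feature table to the same samples.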
Hi Himel, thank you for the explanation!