Incorrect counts of N and N.not.0


I am running MaAsLin2 like this:

fit_data <- Maaslin2(input_data = df_input_data,
                     input_metadata = df_input_metadata,
                     output = paste0(read, "_", threshold, "_new"),
                     fixed_effects = c("Diagnosis"),
                     reference = c("Diagnosis,Ctrl"),
                     normalization = "CLR",
                     transform = "LOG",
                     analysis_method = "LM",
                     plot_heatmap = FALSE,
                     plot_scatter = FALSE,
                     max_significance = 0.05)

As a result, I get a significant_results.tsv file in which the total counts (N) are correct but the non-zero counts (N.not.0) are not, e.g.:

feature	metadata	value	coef	stderr	N	N.not.0	pval	qval
K03637	Diagnosis	PD	1	3.70074341541719e-17	87	1	1.11773196541154e-49	2.33233403449207e-47
K00077	Diagnosis	PD	1	7.40148683083438e-17	87	1	8.94185572329222e-49	1.67928050483428e-46
K06959	Diagnosis	PD	1	1.11022302462516e-16	87	1	3.01787630661113e-48	5.15233791255973e-46
K03660	Diagnosis	PD	1	1.48029736616688e-16	87	1	7.15348457863382e-48	1.03340338759033e-45
K05540	Diagnosis	PD	1	1.48029736616688e-16	87	1	7.15348457863382e-48	1.03340338759033e-45
K02189	Diagnosis	PD	1	2.96059473233375e-16	87	1	5.72278766290709e-47	7.67671087924251e-45

For example, for the KEGG KOs above, my data includes at least 26 such KEGG KOs per sample, meaning at least 26 non-zero entries per sample. I obviously cannot rely on these results if the counts are incorrect.
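One way to confirm the expected counts independently of MaAsLin2 is to count non-zero entries directly in the input table. The sketch below uses a small made-up table (`toy_input` and its values are hypothetical stand-ins for `df_input_data`) and assumes samples are rows and features are columns; if the table is oriented the other way, `rowSums` would be used instead of `colSums`.

```r
# Toy stand-in for df_input_data (hypothetical values):
# rows are samples, columns are features.
toy_input <- data.frame(
  K03637 = c(5, 0, 12, 3),
  K00077 = c(0, 0, 7, 1),
  row.names = paste0("sample", 1:4)
)

# Number of samples with a non-zero value, per feature.
n_not_zero <- colSums(toy_input != 0)

# Total number of samples, for comparison with the N column.
n_total <- nrow(toy_input)

print(n_not_zero)
print(n_total)
```

Comparing these per-feature counts with the N.not.0 column shows which features are affected and by how much.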

I’ve tried multiple ways to fix it, but I still cannot find the error or what could be causing it.
Please let me know if you have any idea :slight_smile: Thank you!

Hi @paulinemaligne

This seems to be a lingering issue that we are currently working on solving. For now, I would suggest filtering out all NA metadata values before running the command, to see whether this solves the issue.
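A minimal sketch of that filtering step, assuming the metadata and feature tables share sample IDs as row names and that `Diagnosis` is the only metadata column used in the model. The `toy_metadata` and `toy_input` tables and their values are hypothetical stand-ins for `df_input_metadata` and `df_input_data`.

```r
# Toy stand-ins for the metadata and feature tables (hypothetical values).
toy_metadata <- data.frame(
  Diagnosis = c("PD", "Ctrl", NA, "PD"),
  row.names = paste0("sample", 1:4)
)
toy_input <- data.frame(
  K03637 = c(5, 0, 12, 3),
  row.names = paste0("sample", 1:4)
)

# Keep only samples whose Diagnosis value is not NA.
keep <- !is.na(toy_metadata$Diagnosis)
metadata_clean <- toy_metadata[keep, , drop = FALSE]

# Subset the feature table to the same samples, in the same order.
input_clean <- toy_input[rownames(metadata_clean), , drop = FALSE]

print(rownames(metadata_clean))
```

The cleaned tables would then be passed as `input_data` and `input_metadata` in place of the originals.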


Hi Jacob,

Thanks for your response! I’ve tried it, but it doesn’t help :frowning:

Hi @paulinemaligne

Could you try running your command without the “LOG” transform? CLR normalization already “logs” your data, so the extra transform is not needed and may be what is causing the issue.
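To see why a second log can be a problem, here is a small base-R illustration with made-up counts. It uses the textbook CLR definition and a plain natural log, which is an assumption for illustration and not necessarily how MaAsLin2 implements either step internally. In the call above, skipping the transform would mean setting `transform = "NONE"`.

```r
# Hypothetical counts for one sample across four features.
counts <- c(10, 200, 3, 50)

# Centered log-ratio: log of each value minus the mean of the logs.
clr <- log(counts) - mean(log(counts))

# CLR values sum to zero, so some of them are negative.
print(round(clr, 3))

# A plain logarithm is undefined for the negative entries,
# so applying it again produces NaN for those features.
double_logged <- suppressWarnings(log(clr))
print(is.nan(double_logged))
```

Because CLR output is centered around zero, any further log-type transform has to deal with negative values in some way, which is a plausible source of unexpected downstream counts.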

Jacob Nearing